Journal Mining, API Use, Better Conversation, and Apollo 11 Source
73 Million Journal Articles for Text Mining (BoingBoing) — The JNU Data Depot is a joint project between rogue archivist Carl Malamud, bioinformatician Andrew Lynn, and a research team from New Delhi’s Jawaharlal Nehru University: together, they have assembled 73 million journal articles from 1847 to the present day and put them into an airgapped repository that they’re offering to noncommercial third parties who want to perform textual analysis on them to “pull out insights without actually reading the text.”
The O’Reilly Data Show Podcast: Roger Chen on the fair value and decentralized governance of data.
In this episode of the Data Show, I spoke with Roger Chen, co-founder and CEO of Computable Labs, a startup focused on building tools for the creation of data networks and data exchanges. Chen has also served as co-chair of O’Reilly’s Artificial Intelligence Conference since its inception in 2016. This conversation took place the day after Chen and his collaborators released an interesting new white paper, Fair value and decentralized governance of data. Current-generation AI and machine learning technologies rely on large amounts of data, and to the extent they can use their large user bases to create “data silos,” large companies in large countries (like the U.S. and China) enjoy a competitive advantage. With that said, we are awash in articles about the dangers posed by these data silos. Privacy
Weird Algorithms, Open Syllabi, Conversational AI, and Quantum Computing
30 Weird Chess Algorithms (YouTube) — An intricate and lengthy account of several different computer chess topics from my SIGBOVIK 2019 papers. We conduct a tournament of fools with a pile of different weird chess algorithms, ostensibly to quantify how well my other weird program to play color- and piece-blind chess performs. On the way we “learn” about mirrors, arithmetic encoding, perversions of game tree search, spicy oils, and hats.
Open Syllabus Project — as FastCompany explains, the 6M+ syllabi from courses around the world tell us about changing trends in subjects. Not sure how I feel that four of the textbooks I learned on are still in the top 20 (Cormen, Tanenbaum, Silberschatz, Stallings).
Plato — Uber open-sourced its flexible platform for developing conversational AI agents. See also their blog post.
Margaret Hamilton, WeChat Censorship, Refactoring, and Ancient Games
Margaret Hamilton Interview (The Guardian) — I found a job to support our family at the nearby Massachusetts Institute of Technology (MIT). It was in the laboratory of Prof Edward Lorenz, the father of chaos theory, working on a system to predict weather. He was asking for math majors. To take care of our daughter, we hired a babysitter. Here I learned what a computer was and how to write software. Computer science and software engineering were not yet disciplines; instead, programmers learned on the job. Lorenz’s love for software experimentation was contagious, and I caught the bug.
How WeChat Censors Images in Private Chats (BoingBoing) — WeChat maintains a massive index of the MD5 hashes of every image that Chinese censors have prohibited. When a user sends another user an image that matches one of these hashes, it’s recognized and blocked
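The hash lookup described in the WeChat item can be sketched in a few lines of Python. This is a minimal illustration of exact-match MD5 blocklisting under stated assumptions, not WeChat's actual code: the function names (`md5_hex`, `build_blocklist`, `is_blocked`) and the sample byte strings are hypothetical, and a real system would hash actual image files.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def build_blocklist(prohibited_images):
    """Build a set of MD5 digests from an iterable of prohibited image bytes.

    A set gives constant-time membership checks on average, which matters
    when the index is very large.
    """
    return {md5_hex(image) for image in prohibited_images}


def is_blocked(image_bytes: bytes, blocklist) -> bool:
    """Return True when the image's digest appears in the blocklist."""
    return md5_hex(image_bytes) in blocklist


# Hypothetical stand-ins for image file contents.
blocklist = build_blocklist([b"prohibited-image-bytes"])

exact_copy = b"prohibited-image-bytes"
altered_copy = b"prohibited-image-bytes."  # one extra byte

blocked_exact = is_blocked(exact_copy, blocklist)
blocked_altered = is_blocked(altered_copy, blocklist)
```

Because MD5 is computed over the raw bytes, this kind of lookup only matches byte-identical files: in the sketch, `blocked_exact` is true while `blocked_altered` is false, since changing a single byte produces a different digest.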
Quantum TiqTaqToe, Social Media and Depression, Incidents, and Unity ML
Introducing a new game: Quantum TiqTaqToe — This experience was essential to the birth of Quantum TiqTaqToe. In my quest to understand Unity and Quantum Games, I set out to implement a “simple” game to get a handle on how all the different game components worked together. Having a game based on quantum mechanics is one thing; making sure it is fun to play requires an entirely different skill set.
A look at how guidelines from regulated industries can help shape your ML strategy.
As companies use machine learning (ML) and AI technologies across a broader suite of products and services, it’s clear that new tools, best practices, and new organizational structures will be needed. In recent posts, we described requisite foundational technologies needed to sustain machine learning practices within organizations, and specialized tools for model development, model governance, and model operations/testing/monitoring.
What cultural and organizational changes will be needed to accommodate the rise of machine learning and AI? In this post, we’ll address this question through the lens of one highly regulated industry: financial services. Financial services firms have a rich tradition of being early adopters of many new technologies, and AI is no exception:
Alongside health care, another heavily regulated sector, financial services
Climbing Robot, Programming and Programming Languages, Media Player, and Burnout Shops
NASA Climbing Robot — a four-limbed robot named LEMUR (Limbed Excursion Mechanical Utility Robot) can scale rock walls, gripping with hundreds of tiny fishhooks in each of its 16 fingers and using artificial intelligence to find its way around obstacles.
Releasing Fast and Slow — Our research shows that: rapid releases are more commonly delayed than their non-rapid counterparts; however, rapid releases have shorter delays; rapid releases can be beneficial in terms of reviewing and user-perceived quality; rapidly released software tends to have a higher code churn, a higher test coverage, and a lower average complexity; challenges in rapid releases are related to managing dependencies and certain code aspects—e.g., design debt.
Embracing Innovation in Government (OECD) — a global review that explores how governments are innovating and taking steps to make innovation a routine and integrated practice across the globe.
Museum Copyright, Twitter Apprenticeship, AI Regulation, and Computational Biology
The Great Wave: What Hokusai’s Masterpiece Tells Us About Museums, Copyright, and Online Collections Today — If we consider the customer journey of acquiring a digital image of “The Great Wave” from our 14 museums, a definite trend emerges—the more open the policy of a museum is, the easier it is to obtain its pictures. Like the other open access institutions in our sample group, The Art Institute of Chicago’s collections website makes the process incredibly simple: clicking once on the download icon triggers the download of a high-resolution image. In contrast, undertaking the same process on the British Museum’s website entails mandatory user registration and the submission of personal data.