Four short links: 15 November 2018

Punish Online Criminals, Fake Fingerprints, Implementing Identity, and Project Visbug
  1. USA Needs to Pursue Malicious Cyber Actors -- a report that argues that the United States currently lacks a comprehensive overarching strategic approach to identify, stop, and punish cyberattackers. (1) There is a burgeoning cybercrime wave. (2) There is a stunning cyber enforcement gap. (3) There is no comprehensive U.S. cyber enforcement strategy aimed at the human attacker. This is definitely a golden age of online crime.
  2. DeepMasterPrints: Generating MasterPrints for Dictionary Attacks via Latent Variable Evolution -- MasterPrints are real or synthetic fingerprints that can fortuitously match with a large number of fingerprints, thereby undermining the security afforded by fingerprint systems. Previous work by Roy, et al., generated synthetic MasterPrints at the feature level. In this work, we generate complete image-level MasterPrints known as DeepMasterPrints, whose attack accuracy is found to be much superior than that of
    Continue reading "Four short links: 15 November 2018"

Four short links: 14 November 2018

ML Risk, IGF Session, Feature Engineering, and Solving Snake
  1. Managing Risk in Machine Learning Projects (Ben Lorica) -- Considerations for a world where ML models are becoming mission critical.
  2. Transcripts of 2018 IGF -- Internet Governance Forum session transcripts.
  3. Featuretools -- open source Python framework for automated feature engineering.
  4. Solving Snake -- fun exploration of different algorithms you might use to play the Snake game.

Four short links: 13 November 2018

Ways of Working, Too-Smart AI, Wi-Fi Vision, and Materials Science AI
  1. Internet-Era Ways of Working -- an elegant brief summary of how we do software in 2018, from Tom Loosemore's team.
  2. Examples of AI Gaming the System -- a list of examples of AIs learning more than was intended. Neural nets evolved to classify edible and poisonous mushrooms took advantage of the data being presented in alternating order and didn't actually learn any features of the input images. (via BoingBoing)
  3. Using Wi-Fi to “See” Behind Closed Doors is Easier than Anyone Thought (MIT TR) -- if all you are interested in is the movement of people. Humans also reflect and distort this Wi-Fi light. The distortion, and the way it moves, would be clearly visible through Wi-Fi eyes, even though the other details would be smeared. This crazy Wi-Fi vision would clearly reveal whether anybody was behind
    Continue reading "Four short links: 13 November 2018"

Managing risk in machine learning

Considerations for a world where ML models are becoming mission critical. In this post, I share slides and notes from a keynote I gave at the Strata Data Conference in New York last September. As the data community begins to deploy more machine learning (ML) models, I wanted to review some important considerations. Let’s begin by looking at the state of adoption. We recently conducted a survey that garnered more than 11,000 respondents; our main goal was to ascertain how enterprises were using machine learning. One of the things we learned was that many companies are still in the early stages of deploying ML:
[Figure: deploying machine learning]
As for why companies are holding back, a survey we conducted earlier this year found that they cited a lack of skilled people, a “skills gap,” as the main challenge to adoption. Interest on the part of companies means the demand side
[Figures: machine learning training; machine learning model deployment capabilities; machine learning tools; criteria for a classifier to be fair; ML reliability and safety; ML skills and teams]
Continue reading "Managing risk in machine learning"

Four short links: 12 November 2018

Gov Open Source, Bruce Sterling, Robot Science, and Illustrated TLS 1.3
  1. FDA MyStudies App -- open source from government, designed to facilitate the input of real-world data directly by patients which can be linked to electronic health data supporting traditional clinical trials, pragmatic trials, observational studies, and registries.
  2. Bruce Sterling Interview -- on architecture, design, science fiction, futurism, and involuntary parks. (via Cory Doctorow)
  3. Inventing New Materials with AI (MIT TR) -- using machine learning to generate hypotheses for new materials, to be explored and tested by actual humans.
  4. The New Illustrated TLS Connection -- Every byte explained and reproduced. A revised edition in which we dissect the new manner of secure and authenticated data exchange, the TLS 1.3 cryptographic protocol.

Four short links: 9 November 2018

Counting Computers, New Software, Unix History, and Tencent Framework
  1. How Many Computers Are In Your Computer? -- So, a desktop or smartphone can reasonably be expected to have anywhere from 15 to several thousand computers in the sense of a Turing-complete device which can be programmed and which is computationally powerful enough to run many programs from throughout computing history and which can be exploited by an adversary for surveillance, exfiltration, or attacks against the rest of the system. Which is why security folks sometimes sleep poorly at night.
  2. Some Notes on Running New Software in Production (Julia Evans) -- The playbook for understanding the software you run in production is pretty simple. Here it is: (1) Start using it in production in a non-critical capacity (by sending a small percentage of traffic to it, on a less critical service, etc); (2) Let that bake for a few weeks. (3)
    Continue reading "Four short links: 9 November 2018"
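
The first step of the playbook above, sending a small percentage of traffic to the new software, can be sketched in a few lines of Python. This is only an illustration and not something taken from the linked post: the function name `routes_to_new`, the `request_id` key, and the hash-bucket scheme are all assumptions made for the example.

```python
import hashlib


def routes_to_new(request_id: str, percent: float) -> bool:
    """Decide whether a request goes to the new software.

    Hypothetical helper: hashes the request id into one of 10,000
    buckets and routes the lowest `percent` percent of buckets to the
    new system, so the same id always gets the same answer.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < percent * 100


if __name__ == "__main__":
    ids = [f"req-{i}" for i in range(10_000)]
    share = sum(routes_to_new(i, 5) for i in ids) / len(ids)
    print(f"share routed to new software: {share:.1%}")
```

Hashing a stable key instead of rolling a random number per request means a given user or request id stays on one side of the split, and raising the percentage only adds traffic to the new system without moving anyone back.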

Lessons learned while helping enterprises adopt machine learning

The O’Reilly Data Show Podcast: Francesca Lazzeri and Jaya Mathew on digital transformation, culture and organization, and the team data science process. In this episode of the Data Show, I spoke with Francesca Lazzeri, an AI and machine learning scientist at Microsoft, and her colleague Jaya Mathew, a senior data scientist at Microsoft. We conducted a couple of surveys this year—“How Companies Are Putting AI to Work Through Deep Learning” and “The State of Machine Learning Adoption in the Enterprise”—and we found that while many companies are still in the early stages of machine learning adoption, there’s considerable interest in moving forward with projects in the near future. Lazzeri and Mathew spend a considerable amount of time interacting with companies that are beginning to use machine learning and have experiences that span many different industries and applications. I wanted to learn some of the processes
Continue reading "Lessons learned while helping enterprises adopt machine learning"

Four short links: 8 November 2018

Approximate Graph Pattern Mining, Ephemeral Containers, SaaS Metrics, and Edge Neural Networks
  1. ASAP: Fast, Approximate, Graph Pattern Mining at Scale (Usenix) -- we present A Swift Approximate Pattern-miner (ASAP), a system that enables both fast and scalable pattern mining. ASAP is motivated by one key observation: in many pattern mining tasks, it is often not necessary to output the exact answer [...] an approximate count is good enough. (via Morning Paper)
  2. Binci -- tackling the same problem space as Docker Compose, but aimed at ephemeral containers rather than long-running ones (e.g., for test/CI systems).
  3. Metrics for Investors (Andrew Chen) -- detailed take on the metrics through which investors view SaaS businesses.
  4. How to Fit Large Neural Networks on the Edge -- This blog explores a few techniques that can be used to fit neural networks in memory-constrained settings. Different techniques are used for the “training” and
    Continue reading "Four short links: 8 November 2018"
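
The ASAP item above rests on the observation that an approximate count is often good enough. A small Python sketch of that idea follows. It is not ASAP's algorithm: it swaps in a much simpler generic estimator that keeps each edge with probability `p`, counts triangles in the sampled graph, and rescales by `1 / p**3`. The function names are made up for the example.

```python
import random
from itertools import combinations


def count_triangles(edges):
    """Exact triangle count for an undirected graph given as (u, v) pairs."""
    unique = {tuple(sorted(e)) for e in edges if e[0] != e[1]}
    adj = {}
    for u, v in unique:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each triangle is seen once per edge, so three times in total.
    return sum(len(adj[u] & adj[v]) for u, v in unique) // 3


def estimate_triangles(edges, p, rng):
    """Approximate triangle count from an edge sample.

    A triangle survives sampling only if all three of its edges are
    kept, which happens with probability p**3, hence the rescaling.
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    sampled = [e for e in edges if rng.random() < p]
    return count_triangles(sampled) / p**3


if __name__ == "__main__":
    complete_graph = list(combinations(range(60), 2))
    print("exact:   ", count_triangles(complete_graph))
    print("estimate:", estimate_triangles(complete_graph, 0.5, random.Random(0)))
```

The trade-off is the usual one: a smaller `p` touches fewer edges but makes the estimate noisier, which is acceptable only when the task tolerates an approximate answer.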

Four short links: 7 November 2018

Summarizing Text, Knowledge Database, AI Park, and Approximate Regexes
  1. Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting -- Inspired by how humans summarize long documents, we propose an accurate and fast summarization model that first selects salient sentences and then rewrites them abstractively (i.e., compresses and paraphrases) to generate a concise overall summary. We use a novel sentence-level policy gradient method to bridge the non-differentiable computation between these two neural networks in a hierarchical way, while maintaining language fluency. Source code available.
  2. KBPedia -- a comprehensive knowledge structure for promoting data interoperability and knowledge-based artificial intelligence, [which] combines seven "core" public knowledge bases (Wikipedia, Wikidata, schema.org, DBpedia, GeoNames, OpenCyc, and UMBEL) into an integrated whole. Now has a serious open source offering.
  3. Baidu Opens AI Park in Beijing -- autonomous buses, smart walkways that track people's steps using facial recognition, intelligent pavilions equipped with the company's conversational DuerOS system, and
    Continue reading "Four short links: 7 November 2018"

140 live online training courses opened for November, December, and January

Get hands-on training in deep learning, Python, Kubernetes, blockchain, security, and many other topics. Learn new topics and refine your skills with 140 live online training courses we opened up for November, December, and January on our learning platform.

Artificial intelligence and machine learning

  Artificial Intelligence for Big Data, November 28-29
  Essential Machine Learning and Exploratory Data Analysis with Python and Jupyter Notebook, December 3
  Deep Learning for Machine Vision, December 4
  Beginning Machine Learning with Scikit-Learn, December 5
  Managed Machine Learning Systems and Internet of Things, December 5-6
  Natural Language Processing (NLP) from Scratch, December 7
  Machine Learning in Practice, December 7
  Deep Learning with TensorFlow, December 12
  Getting Started with Machine Learning, December 12
  Essential Machine Learning and Exploratory Data Analysis with Python and Jupyter Notebook, January 7-8
  Artificial Intelligence: AI for Business, January 9
  Managed Machine
Continue reading "140 live online training courses opened for November, December, and January"

Kubernetes’ scheduling magic revealed

Understanding how the Kubernetes scheduler makes scheduling decisions is critical to ensure consistent performance and optimal resource utilization. Kubernetes is an industry-changing technology that allows massive scale and simplicity for the orchestration of containers. Most of us happily push thousands of deployments and pods to Kubernetes every day. Have you ever wondered what sorcery is at play in Kubernetes to determine where all those pods will be created in the Kubernetes cluster? All of this is made possible by the kube-scheduler. All scheduling in Kubernetes is based on a few key pieces of information. First, the scheduler uses information about the worker node to determine the node's total capacity. Running kubectl describe node <node> will give you all the information you need to understand regarding
Continue reading "Kubernetes’ scheduling magic revealed"
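
The excerpt above is cut off, but the capacity idea it starts to describe can be illustrated with a short Python sketch. It assumes the scheduler compares a pod's resource requests against what is left of a node's allocatable resources. The `Resources` class and the `remaining` and `fits` functions are hypothetical names for this example and are not part of any Kubernetes API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """CPU in millicores and memory in MiB (illustrative units)."""
    cpu_millicores: int
    memory_mib: int


def remaining(allocatable, scheduled):
    """Resources left on a node after subtracting already-scheduled requests."""
    used_cpu = sum(r.cpu_millicores for r in scheduled)
    used_mem = sum(r.memory_mib for r in scheduled)
    return Resources(allocatable.cpu_millicores - used_cpu,
                     allocatable.memory_mib - used_mem)


def fits(allocatable, scheduled, pod):
    """True if the pod's requests fit into the node's remaining resources."""
    free = remaining(allocatable, scheduled)
    return (pod.cpu_millicores <= free.cpu_millicores
            and pod.memory_mib <= free.memory_mib)


if __name__ == "__main__":
    node = Resources(cpu_millicores=4000, memory_mib=8192)
    running = [Resources(1500, 2048), Resources(2000, 4096)]
    print(remaining(node, running))
    print(fits(node, running, Resources(500, 1024)))
    print(fits(node, running, Resources(600, 1024)))
```

The sketch covers only the "does it fit" question for a single node. Choosing among several nodes that all fit is a separate step that this example does not attempt.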

Four short links: 6 November 2018

People Don't Change, Open Access, Event Database, and Apple Maps
  1. People Don't Change -- interesting and entertaining talk to remind you that modern people with their selfies and mobile phone obsessions aren't new special creatures unlike the people of the past. The first half is non-technical similarities, and the second half kicks into how the same human drives behind our tech obsessions can be found (with different tech) in the past. (via Daniel Siegel)
  2. Bill and Melinda Gates Foundation Endorses European Open-Access Plan (Nature) -- the Wellcome Trust, which funds over a billion pounds of research each year, will only permit publication in subscription journals if there's simultaneous release in PubMed Central. The Gates Foundation, which is already strongly pro-OA, is bringing its requirements in line with the new European Plan S. (via Slashdot)
  3. EventStore -- open source, functional database with complex event processing in JavaScript.
  4. Apple's New
    Continue reading "Four short links: 6 November 2018"

Four short links: 5 November 2018

Probabilistic Model Checker, Notebooks to Docs, AWS 12-Factor Apps, and AI Physicist
  1. Stormchecker -- A modern model checker for probabilistic systems. Test your models of your distributed system.
  2. MonoCorpus -- a note-taking app for software and machine learning engineers meant to encourage learning, sharing, and easier development. Increase documentation for yourself and your team without slowing your velocity. Take notes as part of your process instead of dedicating time to writing them. An interesting use for notebooks.
  3. Odin -- Deploy your 12-factor applications to AWS easily and securely with Odin, an AWS Step Function based on the step framework that deploys services as auto-scaling groups (ASGs).
  4. Toward an AI Physicist for Unsupervised Learning -- We investigate opportunities and challenges for improving unsupervised machine learning using four common strategies with a long history in physics: divide-and-conquer, Occam's Razor, unification, and lifelong learning. Instead of using one model to learn everything, we
    Continue reading "Four short links: 5 November 2018"

Four short links: 2 November 2018

Colorizing Photos, Evolving Space Invaders, Is It Too Late?, and Decision-Making
  1. DeOldify -- Deep learning-based project for colorizing and restoring old images. Impressive, and open source.
  2. InvaderZ -- Space invaders, but the invaders evolve with a genetic algorithm.
  3. The Best Way to Predict the Future is to Create It. But Is It Already Too Late? -- Alan Kay lecture. If we've done things with technology that got us in a bit of a pickle, doing things with technology will probably only make that worse. When Alan Kay speaks, I listen.
  4. Farsighted -- new book by Steven Johnson, on powerful tools for honing the important skill of complex decision-making. Shades of Algorithms to Live By, but Johnson is a good writer and a good thinker, so this promises to be much more.