Four short links: 25 June 2019

This post is by Nat Torkington from All - O'Reilly Media

Analog Deep Learning, Low-Trust Internet, Media Literacy, and Psych Experiments

  1. The Next Generation of Deep Learning: Analog Computing (IEEE) — Further progress in compute efficiency for deep learning training can be made by exploiting the more random and approximate nature of deep learning work flows. In the digital space that means to trade off numerical precision for accuracy at the benefit of compute efficiency. It also opens the possibility to revisit analog computing, which is intrinsically noisy, to execute the matrix operations for deep learning in constant time on arrays of nonvolatile memories. (Paywalled paper)
  2. The Internet is Increasingly a Low-Trust Society (Wired) — Zeynep Tufekci nails it. Social scientists distinguish high-trust societies (ones where you can expect most interactions to work) from low-trust societies (ones where you have to be on your guard at all times). People break rules in high-trust societies, of course, but laws, regulations, and norms

    Continue reading “Four short links: 25 June 2019”

Four short links: 24 June 2019

This post is by Nat Torkington from All - O'Reilly Media

Wacky Timestamps, Computers and Spies, Surveillance Capitalism, and Twitter Adventures

  1. NTFS Timestamps — a 64-bit value representing the number of 100-nanosecond intervals since January 1, 1601 (UTC). WTAF?
  2. Computers Changed Spycraft (Foreign Policy) — so much has changed, e.g., dead letter drops: It is easy for Russian counterintelligence to track the movements of every mobile phone in Moscow, so if the Canadian is carrying her device, observers can match her movements with any location that looks like a potential site for a dead drop. They could then look at any other phone signal that pings in the same location in the same time window. If the visitor turns out to be a Russian government official, he or she will have some explaining to do.
  3. Netflix Records All of your Bandersnatch Choices, GDPR Request Reveals (Verge) — that’s some next-level meta.
  4. Being Beyoncé’s Assistant for the Day (Twitter) — a choose-your-own-adventure
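
The NTFS timestamp definition in item 1 can be sketched as a pair of conversions. This is a minimal illustration, not any library's API: the function names and the `NTFS_EPOCH` constant are made up for the example, and because Python's `datetime` only resolves microseconds, the last decimal digit of the 100-nanosecond count is dropped on the way in.

```python
from datetime import datetime, timedelta, timezone

# Epoch used by NTFS timestamps: January 1, 1601 (UTC).
NTFS_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

# One second is 10,000,000 intervals of 100 nanoseconds.
TICKS_PER_SECOND = 10_000_000


def filetime_to_datetime(ticks: int) -> datetime:
    """Convert a count of 100-ns intervals since 1601 to an aware UTC datetime.

    Hypothetical helper: sub-microsecond precision is truncated.
    """
    return NTFS_EPOCH + timedelta(microseconds=ticks // 10)


def datetime_to_filetime(dt: datetime) -> int:
    """Convert an aware datetime to a count of 100-ns intervals since 1601."""
    delta = dt - NTFS_EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * TICKS_PER_SECOND + delta.microseconds * 10


if __name__ == "__main__":
    unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    print(datetime_to_filetime(unix_epoch))
    print(filetime_to_datetime(116_444_736_000_000_000).isoformat())
```

Integer arithmetic is used throughout so that large tick counts do not lose precision to floating point.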

    Continue reading “Four short links: 24 June 2019”

Four short links: 21 June 2019

This post is by Nat Torkington from All - O'Reilly Media

Private Computation, Robot Framework, 3D Objects, and Self-Supervised Learning

  1. Private Join and Compute (Google) — This functionality allows two users, each holding an input file, to privately compute the sum of associated values for records that have common identifiers. (via Wired)
  2. PyRobot — from CMU and Facebook. PyRobot is a framework and ecosystem that enables AI researchers and students to get up and running with a robot in just a few hours, without specialized knowledge of the hardware or of details such as device drivers, control, and planning.
  3. PartNet — a consistent, large-scale data set of 3D objects annotated with fine-grained, instance-level, and hierarchical 3D part information. Our data set consists of 573,585 part instances over 26,671 3D models covering 24 object categories. This data set enables and serves as a catalyst for many tasks such as shape analysis, dynamic 3D scene modeling and simulation, affordance analysis, and others.
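
To make the Private Join and Compute description in item 1 concrete, here is a plaintext sketch of the quantity being computed: the sum of associated values over records whose identifiers appear in both inputs. It is deliberately not private and is not Google's protocol or API. The real system performs this computation cryptographically so that neither party sees the other's records. The function name and sample data are hypothetical.

```python
def intersection_sum(ids_a: set[str], values_b: dict[str, int]) -> tuple[int, int]:
    """Return (number of common identifiers, sum of B's values for them).

    Plaintext reference only: both inputs are visible to this function,
    which is exactly what a private protocol is designed to avoid.
    """
    common = ids_a.intersection(values_b)  # identifiers present on both sides
    return len(common), sum(values_b[identifier] for identifier in common)


if __name__ == "__main__":
    # Hypothetical inputs: party A holds identifiers, party B holds identifier -> value.
    party_a = {"alice", "bob", "carol"}
    party_b = {"bob": 20, "carol": 5, "dave": 100}
    print(intersection_sum(party_a, party_b))
```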

    Continue reading “Four short links: 21 June 2019”

Four short links: 20 June 2019

This post is by Nat Torkington from All - O'Reilly Media

Model Governance, Content Moderators, Interactive Fiction, and End-User Probabilistic Programming

  1. Model Governance and Model Operations — models built or tuned for specific applications (in reality, this means models + data) will need to be managed and protected.
  2. Bodies in Seats — the story of Facebook’s 30,000 content moderators: contractors, low pay (as little as $28,800 a year), and a lot of PTSD for everyone. “Nobody’s prepared to see a little girl have her organs taken out while she’s still alive and screaming.” Moderators were told they had to watch at least 15 to 30 seconds of each video.
  3. Dialog — a domain-specific language for creating works of interactive fiction. Inspired by Inform and Prolog, they say.
  4. End-User Probabilistic Programming — We examine the sources of uncertainty actually encountered by spreadsheet users, and their coping mechanisms, via an interview study. We examine spreadsheet-based interfaces and technology to help

    Continue reading “Four short links: 20 June 2019”

Enabling end-to-end machine learning pipelines in real-world applications

This post is by Ben Lorica from All - O'Reilly Media

The O’Reilly Data Show Podcast: Nick Pentreath on overcoming challenges in productionizing machine learning models.

In this episode of the Data Show, I spoke with Nick Pentreath, principal engineer at IBM. Pentreath was an early and avid user of Apache Spark, and he subsequently became a Spark committer and PMC member. Most recently his focus has been on machine learning, particularly deep learning, and he is part of a group within IBM focused on building open source tools that enable end-to-end machine learning pipelines.

Continue reading Enabling end-to-end machine learning pipelines in real-world applications.

Four short links: 19 June 2019

This post is by Nat Torkington from All - O'Reilly Media

Voice2Face, DIY Minivac, Cloud Metrics, and Envoy for Mobile

  1. Speech2Face: Learning the Face Behind a Voice — complete with an interesting ethics discussion up front. I wonder where this was intended to go: after all, it can’t perfectly reconstruct faces, so what you get is a stereotype based on the voice. Meh.
  2. Minivac 601 Replica (Instructables) — Created by information theory pioneer Claude Shannon as an educational toy for teaching digital circuits, the Minivac 601 Digital Computer Kit was billed as an electromechanical digital computer system.
  3. Nines Are Not Enough: Meaningful Metrics for Clouds — We show that this problem shares some similarities with the challenges of applying statistics to make decisions based on sampled data. We also suggest that defining guarantees in terms of defense against threats, rather than guarantees for application-visible outcomes, can reduce the complexity of these problems.
  4. Announcing Envoy Mobile (Lyft Engineering) — as Simon Willison

    Continue reading “Four short links: 19 June 2019”

What are model governance and model operations?

This post is by Ben Lorica, Harish Doddi, David Talby from All - O'Reilly Media

A look at the landscape of tools for building and deploying robust, production-ready machine learning models.

Our surveys over the past couple of years have shown growing interest in machine learning (ML) among organizations from diverse industries. A few factors are contributing to this strong interest in implementing ML in products and services. First, the machine learning community has conducted groundbreaking research in many areas of interest to companies, and much of this research has been conducted out in the open via preprints and conference presentations. We are also beginning to see researchers share sample code written in popular open source libraries, and some even share pre-trained models. Organizations now also have more use cases and case studies from which to draw inspiration—no matter what industry or domain you are interested in, chances are there are many interesting ML applications you can learn from. Finally, modeling tools are improving, and

Demand for tools for managing ML in the enterprise

Continue reading “What are model governance and model operations?”

Four short links: 18 June 2019

This post is by Nat Torkington from All - O'Reilly Media

JavaScript Spreadsheets, Pessimism, Privacy Policies, and AI Ethics

  1. jExcel — a lightweight vanilla JavaScript plugin to create amazing web-based interactive tables and spreadsheets compatible with Excel or any other spreadsheet software. You can create an online spreadsheet table from a JS array, JSON, CSV, or XLSX files. You can copy from Excel and paste straight to your jExcel spreadsheet and vice versa. It is very easy to integrate any third-party JavaScript plugins to create your own custom columns, custom editors, and customize any feature into your application.
  2. Why Are We So Pessimistic? (Brookings) — The belief or perception that things are much worse than they really are is widespread, and I believe it comes with significant detrimental impacts on societies.
  3. We Read 150 Privacy Policies. They Were an Incomprehensible Disaster (NYT) — Only Immanuel Kant’s famously difficult “Critique of Pure Reason” registers a more challenging readability score than Facebook’s privacy

    Continue reading “Four short links: 18 June 2019”

The quest for high-quality data

This post is by Ihab Ilyas, Ben Lorica from All - O'Reilly Media

Machine learning solutions for data integration, cleaning, and data generation are beginning to emerge.

“AI starts with ‘good’ data” is a statement that receives wide agreement from data scientists, analysts, and business owners. There has been a significant increase in our ability to build complex AI models for predictions, classifications, and various analytics tasks, and there’s an abundance of (fairly easy-to-use) tools that allow data scientists and analysts to provision complex models within days. As model building becomes easier, the problem of high-quality data becomes more evident than ever. A recent O’Reilly survey found that those with mature AI practices (as measured by how long they’ve had models in production) cited “Lack of data or data quality issues” as the main bottleneck holding back further adoption of AI technologies.

data bottleneck holding back further adoption of AI technologies

Even with advances in building robust models, the reality is that noisy data and incomplete data remain the biggest hurdles to

Continue reading “The quest for high-quality data”

Four short links: 17 June 2019

This post is by Nat Torkington from All - O'Reilly Media

Multiverse Databases, Detecting Photoshopping, Simulation Platform, and Tail-Call Optimization: The Musical

  1. Towards Multiverse Databases (Morning Paper) — The central idea behind multiverse databases is to push the data access and privacy rules into the database itself. The database takes on responsibility for authorization and transformation, and the application retains responsibility only for authentication and correct delegation of the authenticated principal on a database call. Such a design rules out an entire class of application errors, protecting private data from accidentally leaking.
  2. Detecting Photoshopped Fakes (Verge) — Adobe worked with Berkeley researchers to develop software that can spot Photoshopping in an image. (via BoingBoing).
  3. Open Sourcing AI Habitat (Facebook) — a new simulation platform created by Facebook AI that’s designed to train embodied agents (such as virtual robots) in photo-realistic 3D environments. […] To illustrate the benefits of this new platform, we’re also sharing Replica, a data set of

    Continue reading “Four short links: 17 June 2019”

Four short links: 14 June 2019

This post is by Nat Torkington from All - O'Reilly Media

Information Operations, Game Creator, History Lessons, and Physical Pen Testing

  1. Information Operations on Twitter: Principles, Process, and Disclosure (Twitter) — We believe that people and organizations with the advantages of institutional power and which consciously abuse our service are not advancing healthy discourse but are actively working to undermine it. By making this data open and accessible, we seek to empower researchers, journalists, governments, and members of the public to deepen their understanding of critical issues impacting the integrity of public conversation online, particularly around elections. This transparency is core to our mission. Twitter is leading in this area; it’s great to see. I hope this makes others lift their game.
  2. Create 3D Games with Friends, No Experience Required (Google) — Our prototype is called Game Builder, and it is free on Steam for PC and Mac.
  3. Five Lessons from History — all are relevant to business as well as

    Continue reading “Four short links: 14 June 2019”

Highlights from the O’Reilly Software Architecture Conference in San Jose 2019

This post is by Jenn Webb from All - O'Reilly Media

Experts explore software architecture security, design heuristics, Next Architecture, and more.

Experts from across the software architecture world are coming together in San Jose for the O’Reilly Software Architecture Conference. Below you’ll find links to highlights from the event.

Security and deception: Lessons from a professional liar

Michael Carducci takes an entertaining look at why humans are so easy to fool, and he explores what we can do to overcome our weaknesses and build more secure software.

Cultivate your personal design heuristics

Rebecca Wirfs-Brock explores how you can grow as a designer by becoming conscious of your heuristics.

Architect as storyteller

Nathaniel Schutta explains why an architect’s job is to be a storyteller.

Next Architecture

Chris Guzikowski discusses the convergence of microservices, cloud, containers, and

Continue reading “Highlights from the O’Reilly Software Architecture Conference in San Jose 2019”