Kick off:
The ethics debate on artificial intelligence is heating up, and all the big tech leaders are chiming in. This week, Satya Nadella, Microsoft’s CEO, wrote in Slate that “the most productive debate we can have isn’t one of good versus evil: The debate should be about the values instilled in the people and institutions creating this technology.” And Eric Schmidt wrote an essay in Fortune basically saying, don’t freak out.
One way people try to defuse worries about AI gone wrong is to say that AI will merely support things humans already do, so a human can always check that the AI is behaving. I think this kind of control will prove to be an illusion: when things go wrong, AI will have to make decisions without a human check, and those are exactly the moments when the choices are hardest. Driverless cars are a good example. What happens when a Tesla gets into a wreck, as one did this week? What should a driverless car do if it has to choose between saving its driver and saving bystanders? This is a hard, foundational problem for AI: how can it make choices that are inherently moral? In a crash there is no time to ask a human what to do. MIT has a good new study out in Science Magazine about people’s preferences for what driverless cars should do; the researchers found that the public generally takes a utilitarian view. Good explainer here.
This Week:
Many banking regulations of the last twenty years require financial institutions to dig into the backgrounds of their customers. One way they do that is by subscribing to databases that track suspected terrorists and other high-risk individuals. This week one of the major databases of that information leaked, a reminder to everyone using data in their business that it could be hacked at any time. Now that’s some sensitive data.
Speaking of sensitive data: stolen medical data is trading on the dark web. I spoke on a panel organized by Swissnex on Thursday night, and another panelist argued that “data itself has no intrinsic value. It’s like air.” Illicit datasets trading at $400,000 on the dark web, however, seem to argue the opposite. Maybe data is only truly valuable when very few other people have it.
In its most basic form, companies “collect” data from users all the time. This week Facebook won an important ruling in Belgium allowing it to continue collecting data even from people who are not users of its site. The ruling has big repercussions for many kinds of websites.
Along these lines, Google took a step this week towards showing us what data it has about us.
In Industry:
Data in jails. On Thursday, the White House launched a justice initiative to oversee how data is used to decide who should be in jail, and ultimately to avoid overcrowded jails. This is bubbling up at the state level, too: in Wisconsin, the state Supreme Court is set to rule on whether a computer algorithm can be used to estimate the likelihood of reoffending, a factor in sentencing decisions.
All the mapping applications out there – everything from Waze to many in-car navigation systems – got news this week: Google is adding lots of new satellite data to Google Maps.
For data scientists in every industry: here’s a great blog post about a new ML technique called lda2vec for summarizing text in a way that is not only usable by computers but also interpretable by people. It’s a great writeup with some nice diagrams that give you a good sense of how this stuff actually works.
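To give a flavor of the interpretability idea: lda2vec represents documents as mixtures over a handful of topic vectors that live in the same space as word vectors, so a topic can be read off by listing its nearest words. Here’s a toy sketch of that idea in plain Python – not the actual lda2vec implementation, and all the vectors and vocabulary are made up for illustration:

```python
# Toy illustration of the interpretability idea behind lda2vec.
# Topic vectors share the word-embedding space, so each topic can be
# described to a human by its nearest words. Vectors here are invented
# 2-d examples; real word2vec embeddings have hundreds of dimensions.
import math

# Hypothetical word vectors for a tiny vocabulary.
word_vecs = {
    "bank":    (0.9, 0.1),
    "loan":    (0.8, 0.2),
    "credit":  (0.85, 0.15),
    "neuron":  (0.1, 0.9),
    "synapse": (0.2, 0.8),
    "cortex":  (0.15, 0.85),
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nearest_words(topic_vec, k=2):
    """Interpret a topic vector by listing its closest words in embedding space."""
    ranked = sorted(word_vecs,
                    key=lambda w: cosine(topic_vec, word_vecs[w]),
                    reverse=True)
    return ranked[:k]

# Two hypothetical "learned" topic vectors: a finance-ish and a neuro-ish direction.
topics = [(1.0, 0.0), (0.0, 1.0)]

# A document vector is a weighted mixture of topic vectors; the mixture
# weights (here 70/30) are what a person reads as "what this document is about".
weights = (0.7, 0.3)
doc_vec = tuple(sum(w * t[i] for w, t in zip(weights, topics)) for i in range(2))

for i, t in enumerate(topics):
    print(f"topic {i} nearest words: {nearest_words(t)}")
print("document topic mixture:", weights)
```

The point of the sketch is the last step: instead of an opaque dense document vector, you get a small set of weights over topics, each of which a human can label by inspecting its nearest words.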
Quirky Corner:
U.S. Customs wants to know your Twitter handle.
What’s happening at Ufora:
I was part of a couple of great gatherings this week. On Tuesday, I presented at the Artificial Intelligence meetup in New York. It was a great crowd, interested in Pyfora, our open-source data platform. The other talk was by neuroscientist Jeremy Freeman, who gave a great overview of recent advances in neural nets. On Thursday, I joined a smart panel hosted by Swissnex about the use of data and data science in finance. Based on the audience questions, I’d say anxiety around privacy and the ethical use of data is running high.

Braxton McKee is the technical lead and founder of Ufora, a software company that has built an adaptively distributed, implicitly parallel runtime. Before founding Ufora with backing from Two Sigma Ventures and others, Braxton led the ten-person MBS/ABS Credit Modeling team at Ellington Management Group, a multi-billion-dollar mortgage hedge fund. He holds a BS (Mathematics), MS (Mathematics), and MBA from Yale University.