How would you explain what probabilistic programming is to an Excel user or to a company executive?
AP) Probabilistic programming uses programming languages to enable probabilistic machine learning applications. In probabilistic machine learning, you use a probabilistic model of your domain along with inference and learning algorithms to predict the future, infer past causes of current observations, and learn from experience to produce better predictions. Using programming languages, probabilistic programming enables probabilistic machine learning applications to be developed with far less effort than before by providing an expressive language for representing models and general-purpose inference and learning algorithms that automatically apply to models written in the language.
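To make the idea of "a probabilistic model plus general-purpose inference" concrete, here is a minimal sketch in plain Python. It is not Figaro code and not the author's method: the rain/sprinkler model, all probabilities, and the names `joint` and `query` are illustrative assumptions. The model is written as ordinary code, and one generic routine answers both a prediction query and a "past cause" query without being tailored to the model.

```python
from itertools import product

# Hypothetical toy model; every number below is an illustrative assumption.
P_RAIN = 0.2
P_SPRINKLER = 0.1


def p_wet(rain, sprinkler):
    """Probability that the grass is wet given its two possible causes."""
    if rain and sprinkler:
        return 0.99
    if rain:
        return 0.9
    if sprinkler:
        return 0.8
    return 0.05


def joint(rain, sprinkler, wet):
    """Joint probability of one complete assignment to all three variables."""
    p = P_RAIN if rain else 1 - P_RAIN
    p *= P_SPRINKLER if sprinkler else 1 - P_SPRINKLER
    pw = p_wet(rain, sprinkler)
    p *= pw if wet else 1 - pw
    return p


def query(target, evidence):
    """P(target is True | evidence), computed by enumerating every world.

    Exhaustive enumeration is the simplest possible inference algorithm;
    it only scales to tiny models and stands in here for the smarter
    algorithms a real probabilistic programming system would provide.
    """
    names = ("rain", "sprinkler", "wet")
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        if any(world[name] != value for name, value in evidence.items()):
            continue  # this world contradicts what was observed
        p = joint(**world)
        denominator += p
        if world[target]:
            numerator += p
    return numerator / denominator


# Predict the future: how likely is wet grass before observing anything?
p_wet_prior = query("wet", {})

# Infer a past cause: the grass is wet, so how likely is it that it rained?
p_rain_given_wet = query("rain", {"wet": True})

print(f"P(wet) = {p_wet_prior:.4f}")
print(f"P(rain | wet) = {p_rain_given_wet:.4f}")
```

The same `query` function would work unchanged if the model functions were rewritten, which is the separation between model and inference that the answer above describes.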
What is the holy grail, the long-term vision of probabilistic programming?
AP) Our long-term vision is to provide a clear, English-like language that a domain expert who has little knowledge of machine learning can use to describe data. A probabilistic programming system would automatically learn a probabilistic model of the domain, without the user needing to choose or configure inference algorithms.
We’ve recently been hearing that big data is a big headache and often big noise. Is probabilistic programming the ibuprofen of big data headaches?
AP) Probabilistic programming is sometimes called “big model” rather than “big data”. The idea is that you can use richer, more detailed models than you would be able to use otherwise. This applies no matter how much data you have.
You are leading a consulting company that offers solutions to clients based on PP. How difficult is it to sell it when everybody lives in the mania of deep learning?
AP) We don’t find it hard to sell probabilistic programming. There are several features in particular that set it apart from deep learning methods. (1) It’s easy to include domain knowledge (and lots of it) in models; (2) probabilistic programming can work well in domains where you don’t have a lot of data; (3) probabilistic programming models are explainable and understandable, whereas deep learning models can be hard to interpret; (4) probabilistic programming can predict outputs that belong to rich data types of variable size, such as sentences or social networks.
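Points (1) and (2) can be illustrated with a small sketch. It assumes a hypothetical scenario that is not from the interview: an expert believes a component fails about 10% of the time, and only three trials have been observed. The expert's belief is encoded as a Beta prior, and the standard Beta-Binomial update combines it with the data. The function name and all numbers are assumptions chosen for illustration.

```python
def posterior_mean(prior_a, prior_b, successes, failures):
    """Mean of the Beta posterior after observing Bernoulli outcomes.

    A Beta(prior_a, prior_b) prior updated with the observed counts gives
    a Beta(prior_a + successes, prior_b + failures) posterior.
    """
    a = prior_a + successes
    b = prior_b + failures
    return a / (a + b)


# Hypothetical domain knowledge: "fails roughly 10% of the time",
# encoded as Beta(2, 18), whose mean is 2 / (2 + 18) = 0.10.
PRIOR_A, PRIOR_B = 2, 18

# Hypothetical tiny data set: 2 component failures in 3 trials.
# "Success" here means the event being counted, i.e. a component failure.
observed_failures, observed_ok = 2, 1

# Estimate from the data alone (relative frequency).
data_only = observed_failures / (observed_failures + observed_ok)

# Estimate that combines the expert prior with the same data.
with_prior = posterior_mean(PRIOR_A, PRIOR_B, observed_failures, observed_ok)

print(f"data only:  {data_only:.3f}")
print(f"with prior: {with_prior:.3f}")
```

With so few observations, the data-only estimate swings to an extreme, while the estimate that includes the prior moves only modestly away from the expert's belief. As more data arrives, the two estimates converge.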
Tell us about your new book, Practical Probabilistic Programming. Who is the target audience? How come we got a practical book about PP before an academic book on the subject was published?
AP) Practical Probabilistic Programming aims to help users, who could be programmers, students, or experts in other areas, understand and use probabilistic programming. I wrote a practical book because my interest over the last few years has been in developing practical tools and applications. The reason we got a practical book rather than an academic book is that I’m not in academia anymore, so I feel no pressure to write a theoretical textbook. As a small company, we’re interested in developing applications our customers can use, and I wanted to share that experience.
Can you tell us about a success story with PP? Perhaps one where deep learning would have failed 🙂
AP) We developed an application that learns the lineage of malware samples. When someone writes a new piece of malware, they often borrow code they or someone else has written in the past. So malware has a lineage or family history. We built an application that extracts all sorts of features of malware, clusters the malware into families, and then computes the most likely lineage of each family. It’s this last step that used probabilistic programming. We combined separate probabilistic models of the timestamps of each malware sample in the family, along with a connecting model of the lineage of each family, and ran our inference algorithms to compute the most likely lineage. The lineages we learned using our probabilistic reasoning technique were much better than ones we got with our previous algorithm. And this is the kind of probabilistic reasoning application that would have been really difficult to program without probabilistic programming.
How hard do you think it is to teach PP to a domain expert?
AP) With the current state of the art, a domain expert can pick up the basics and be writing simple probabilistic programs in a couple of hours. We’ve had some Figaro novices (Figaro is our probabilistic programming system) develop quite impressive applications in a short time using simple models. However, it takes longer to obtain the knowledge and experience to put together sophisticated applications, particularly in knowing how to set up inference algorithms to work on a given problem. This is why our current research focus is on making inference optimization as automatic as possible. You just press a button and we automatically figure out how to decompose and optimize the problem. We’re also working on developing a language that doesn’t require any knowledge of programming languages to use.
Avi Pfeffer, Principal Scientist, Charles River Analytics
Dr. Avi Pfeffer is a leading researcher on a variety of computational intelligence techniques including probabilistic reasoning, machine learning, and computational game theory. Avi has developed numerous innovative probabilistic representation and reasoning frameworks, such as probabilistic programming, which enables the development of probabilistic models using the full power of programming languages, and statistical relational learning, which provides the ability to combine probabilistic and relational reasoning. He is the lead developer of Charles River Analytics’ Figaro probabilistic programming language. As an Associate Professor at Harvard, he developed IBAL, the first general-purpose probabilistic programming language. While at Harvard, he also produced systems for representing, reasoning about, and learning the beliefs, preferences, and decision making strategies of people in strategic situations. Avi received his Ph.D. in computer science from Stanford University and his B.A. in computer science from the University of California, Berkeley.