Our past Technical Chair interviewed Hussein Mehanna, Engineering Director – Core ML, Facebook, regarding his upcoming presentation, Applying Deep Learning at Facebook Scale, scheduled for 09/23/16 at MLconf Atlanta.
One of the criticisms of deep learning models has been the cost of inference. In your talk you will explain how you reduced inference time. Does this mean shallow models no longer hold any advantage over deep models?
HM) No, I don’t think so. In fact, one of the tricks growing in popularity these days is using deep, expensive models to learn and then using those to teach shallow models (dark knowledge transfer). There seems to be a theory that you need more capacity and complexity during learning, but that this can be reduced at inference time. At times it even improves accuracy. So I think shallow models will stay.
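For readers unfamiliar with the trick Mehanna describes, here is a minimal sketch of knowledge distillation in the spirit of Hinton et al.’s “dark knowledge” paper: a small student network is trained against the temperature-softened outputs of a large teacher. The temperature and loss weighting below are illustrative assumptions, not Facebook’s settings.

```python
# Minimal sketch of knowledge distillation ("dark knowledge"): train a
# small student to match the temperature-softened outputs of a large
# teacher, plus the usual hard-label loss. Hyperparameters are
# illustrative assumptions.
import tensorflow as tf

def distillation_loss(teacher_logits, student_logits, labels,
                      temperature=4.0, alpha=0.5):
    # Soft targets: the teacher's softened class probabilities.
    soft_targets = tf.nn.softmax(teacher_logits / temperature)
    soft_loss = tf.nn.softmax_cross_entropy_with_logits(
        labels=soft_targets, logits=student_logits / temperature)
    # Hard loss: ordinary cross-entropy on the true labels.
    hard_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=labels, logits=student_logits)
    # The temperature**2 factor keeps gradient scales comparable,
    # as suggested in the distillation paper.
    return (alpha * (temperature ** 2) * tf.reduce_mean(soft_loss)
            + (1 - alpha) * tf.reduce_mean(hard_loss))
```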
Do we need to compromise accuracy to make deep learning inference fast?
HM) Not necessarily; at times it may even improve generalization, since complex models can overfit. That said, figuring out how to reduce the computational load of a model is still non-trivial. Making this simpler and more automatic is something that will help the industry.
TensorFlow, Torch, Theano, MXNet, CNTK… Can you help us survive the Babel of deep learning platforms? Can you tell the MLconf audience what to choose, or how to choose?
HM) Yes, I probably can. That said, the MLconf audience should feel happy, because diversity increases the chances that they get tools close to their needs. We are in a creative-chaos phase in AI, but things are converging.
I have noticed that some deep learning platforms are good with dense data and others with sparse. At Facebook you are dealing with both. How did you manage to unify both under one platform?
HM) Good question – I will need to check with our legal team before I answer that. All I can say now is that we treat both as first-class citizens and we are investing in algorithms that operate at the intersection. This is especially beneficial for sparse scenarios, since traditional deep learning has focused on dense data; it’s easier to get hold of images than social data.
You implemented deep learning at scale. What is the gap between theory and practice? What are the tricks that make the difference that you don’t find written in a paper?
HM) That’s a fantastic question. Any ML algorithm is really dependent on the data; if you change the data, you change the problem completely. That’s the biggest difference, in my opinion, between academia and industry. It makes a lot of sense for academia to standardize its datasets, but most of those don’t represent what industry uses. Think about the overlap between the ImageNet dataset and the data seen by a system that needs to recognize consumer products – probably very small. The other difference is that industrial systems receive continuous improvements that accumulate over time, so baselines in industry are much more tuned.
What was the most surprising fact that you have discovered about deep learning? Can you share a paper with us that had a great influence on you?
HM) I am going to seem a bit biased towards Facebook AI Research, but I adore character-level deep learning for NLP. The fact that you can learn from raw textual input with no preprocessing, just as you would with images, is extremely powerful. In my early college days, I just could not bear all the special rules that riddled NLP, and I always believed there was a better solution. That paper provides a good basis for one; we now have more sophisticated work in the team, but it was a great start.
Hussein Mehanna, Engineering Director – Core ML, Facebook
I am the Director of the Core Machine Learning group at Facebook. Our team focuses on building state-of-the-art ML/AI platforms combined with applied research in event prediction and text understanding. We work closely with product teams in Ads, Feed, Search, Instagram and others to improve their user experiences.
In 2012, I joined Facebook as the original developer on the Ads ML platform, which quickly developed into a Facebook-wide platform serving more than 30 teams. Prior to Facebook, I worked at Microsoft on search query alterations and suggestions in Bing and on communication technologies in Lync. I hold a master’s degree in Speech Recognition from the University of Cambridge, UK, where I worked on noise robustness modeling.
Guest Blog Post: "Why I Hate Chatbots"
Our Co-Founder Shon Burton, CEO of HiringSolved, recently shared his musings on bots and what his team has been working on to streamline the recruiting process.
The last chatbot I wrote was called “Larry”. It was written in 1999 to simulate a conversation with my boss at the time, Larry Rocha. “Larry’s” main function was called RochaBrain(), and it contained conditional triggers and responses which allowed my workmates, including Larry himself, to enjoy a good laugh while interacting with it. It was just believable enough to be pretty funny for a minute or two, albeit in a limited context.
So why the sudden hate for chatbots? Let me explain. As a kid growing up in the 80’s, I was fascinated with AI. I’ve collected and modified different versions of Eliza, Racter, ALICE and more. Computer programs that could hold a human-like conversation were not only fun to play with; they were, and still are, regarded as one of the primary indicators of artificial intelligence and even consciousness. The famous Turing test, a thought experiment proposed by the great Alan Turing over 60 years ago, is still referenced today as a primary metric for artificial intelligence.
If Turing himself were alive today, he might have different ideas about how to define and measure artificial intelligence. The concepts of “thought”, “intelligence” and “consciousness” are entirely philosophical in nature in that they defy rigorous scientific definition. These ideas are entertaining to ponder but not very practical. Chatbots, in my view, are similar: entertaining but not practical. The high-water mark for a Turing-test-certified chatbot is a casual conversation in which the software’s responses are indistinguishable from a human’s. I think that we can do better.
AI needs to be better than a human. More useful. More like Alexa and Siri on steroids, rather than HAL or C3PO. Shifty, insecure and homicidal, HAL was far too human. Even Tony Stark’s J.A.R.V.I.S is too human. Too opinionated, too chatty. Data from STTNG is also too human for some of the same reasons. On the topic of Star Trek, the computer intelligence powering the Enterprise D strikes the right balance. It communicates very well but it doesn’t exhibit any of the human eccentricities or flaws which would impede its usefulness.
So what is a better AI and how do we build it? Useful AI is an exercise in UX design. The main advantage that artificial intelligence has over human intelligence is its ability to quickly access and process massive amounts of information. Combine this ability with a human-level understanding of language and you have something interesting: a conversational interface to knowledge and computation. Human creativity will still be required, but it will be greatly magnified by this new tool.
We’re experimenting with a tool like this at HiringSolved. We call it RAI. We think of it as Siri for recruiting. RAI is a closed-domain tool working on a retrieval-based model with access to HiringSolved’s massive index of talent data. RAI is not ground-breaking from a Turing test perspective. It won’t fool anyone into thinking it’s human, but for practical recruiting tasks it’s already quite useful. By combining the practical conversational interface of tools like Siri and Alexa with a Wolfram Alpha-inspired ‘knowledge engine’, we’ve developed a recruiting assistant with which a non-expert user can conduct a talent search conversationally.
Current systems require the user to construct searches using keywords and Boolean operators. RAI simply allows the user to say, “I need to find female javascript developers with reactjs experience within 25 miles of Austin, Texas,” and RAI produces results. From there, RAI can help the user construct and send a message to the selected candidates. This isn’t going to replace a human recruiter, but it does reduce the number of steps involved in recruiting, letting recruiters be more personal in their approach. If you would like to experiment with RAI, sign up here. We’re releasing user accounts in small batches next month.
Shon Burton is CEO and Founder of HiringSolved, a market-leading HR tech startup focusing on the talent acquisition and talent search market. Prior to HiringSolved, Shon founded MLconf along with Courtney Burton. In his spare time Shon enjoys long walks on the beach with Courtney, playing Legos with his daughter and spending hours restoring ancient, obtuse computers.
Interview with Dr. Le Song, Assistant Professor in the College of Computing, Georgia Institute of Technology
Our past Technical Chair interviewed Dr. Le Song, Assistant Professor in the College of Computing, Georgia Institute of Technology, regarding his upcoming presentation, Understanding Deep Learning for Big Data, scheduled for 09/23/16 at MLconf Atlanta.
You have done a lot of work on kernel methods. A year ago there were indications that kernel methods are not dead and could match or outperform deep nets. Is that the case? Is it time for them to retire?
LS) I think kernel methods and deep learning are cousins. The field needs to combine them rather than throwing either away.
In fact, they share many similarities; for example, both try to learn nonlinear functions. One can design kernel functions to capture problem structure, just as one can choose the architecture of a deep learning model according to the problem at hand. More interestingly, the feature maps of kernel functions have a one-to-one correspondence with the activation units in deep learning. For instance, the arc-cosine kernel is an infinite combination of rectified linear units. Hence researchers also call kernel methods infinite neural networks.
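For the curious, that correspondence can be made precise: the order-one arc-cosine kernel of Cho and Saul equals the expected product of two rectified linear units sharing a random Gaussian weight vector (a standard result, reproduced here for illustration):

$$
k_1(x, y) \;=\; 2\,\mathbb{E}_{w \sim \mathcal{N}(0, I)}\!\left[\max(0, w^\top x)\,\max(0, w^\top y)\right] \;=\; \frac{1}{\pi}\,\|x\|\,\|y\|\,\bigl(\sin\theta + (\pi - \theta)\cos\theta\bigr),
$$

where $\theta$ is the angle between $x$ and $y$. Averaging ReLU features over infinitely many random weights yields a closed-form kernel, which is the sense in which kernel methods are “infinite neural networks”.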
The main algorithmic difference between kernel methods and deep learning is that the parameters of the kernels are typically fixed and only the classifiers are learned, while the parameters of the activation units in deep learning are learned together with the classifiers. This difference allows deep learning models to be more flexible given the same number of parameters. In the face of big data, this also means that deep learning is more scalable given the same model flexibility. Very few works have compared deep learning seriously with kernel methods in the big data regime, simply because nobody knows how to scale up kernel methods without sacrificing model flexibility. We did some fundamental work on scaling up kernel methods, and we found matching performance between kernel methods and deep learning. Especially nowadays, when researchers are talking about learning the kernel functions, the line between kernel methods and deep learning really blurs.
The main theoretical difference between kernel methods and deep learning is that there is an almost complete set of theory to explain the behavior of kernel methods, while there is almost no satisfactory theory to explain the behavior of deep learning. When researchers talk about kernel methods, they mean not just the set of algorithms but also the set of unique tools and theories developed to analyze and provide guarantees for kernel methods. Researchers, including us, are working on explaining the generalization ability of deep learning models. In fact, we are now working on using tools from kernel methods to understand the generalization ability of deep learning models.
I think the next big thing will be kernel methods + deep learning + graphical models.
Deep nets have high information capacity versus linear models. If the data is noisy then all this capacity is filled with noise. Are there any practical methods for deciding where it is worth using a deep net versus a simpler model?
LS) Use cross-validation.
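Dr. Song’s one-line answer is easy to put into practice. Here is a minimal sketch of letting cross-validation decide whether a deep model earns its extra capacity on your data; the specific models and dataset below are illustrative assumptions.

```python
# A minimal sketch of "use cross-validation": compare a simple linear
# model against a small neural network on held-out folds and let the
# scores decide. Model choices and dataset here are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = load_breast_cancer(return_X_y=True)

candidates = [
    ("linear model", LogisticRegression(max_iter=1000)),
    ("neural net", MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=1000)),
]

for name, model in candidates:
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validation
    print(f"{name}: mean accuracy {scores.mean():.3f} (+/- {scores.std():.3f})")
```

If the deep model’s cross-validated score does not beat the simple one, the extra capacity is likely being spent on noise.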
Given the success of deep nets, data scientists and researchers always wonder whether it is time to give up traditional ML. Is there a reason we should keep investing in other schools of ML, like Bayesian modeling, graphical models, etc.?
LS) Definitely, we should keep investing in other schools of ML, like Bayesian modeling and graphical models, and we should keep teaching students these topics. Beyond the possibility of new and better methods emerging from these areas, they can also inspire deep learning models. We recently invented a completely new deep learning architecture based on the belief propagation algorithm from graphical models. Our method is now the best method for learning representations of discrete structures, such as taxonomies, drug molecules and social networks.
Given all this research in deep learning, there seems to be a plethora of architectures. Is there a map or a guideline for choosing the right architecture? Do we just have to follow the rule “do whatever Google does”?
LS) A good architecture for a particular problem is heavily based on domain knowledge and previous successful models. For instance, convolutional neural networks are based on earlier image feature extractors such as pyramid features, and recurrent neural networks are based on earlier successful sequence models such as hidden Markov models. Google definitely has more computing resources to blindly search for a better architecture for a set of canonical problems, but your problem may be different from those canonical problems.
Neural nets have been there for decades. Why did it take so long to blast off? What do you think people missed in the 80s?
LS) I think neural networks are simply more scalable than other nonlinear models out there. For big data, you need both flexibility and scalability. Neural networks have both.
Apart from the deep learning architecture labyrinth, scientists also have to navigate the platform labyrinth (TensorFlow, MXNet, Theano, Torch, Keras, etc.). Do you have any preference or advice?
LS) I prefer TensorFlow and MXNet. You can program in Python and easily interface with other programs. MXNet is very efficient and faster than TensorFlow, and it also has high-speed multi-GPU and multi-machine automatic parallelization.
Dr. Le Song, Assistant Professor in the College of Computing, Georgia Institute of Technology
Le Song is an assistant professor in the College of Computing, Georgia Institute of Technology. He received his Ph.D. in Machine Learning from the University of Sydney and NICTA in 2008, and then conducted his post-doctoral research in the Department of Machine Learning, Carnegie Mellon University, between 2008 and 2011. Before he joined the Georgia Institute of Technology, he was a research scientist at Google. His principal research direction is machine learning, especially nonlinear methods and probabilistic graphical models for large-scale and complex problems arising from artificial intelligence, social network analysis, healthcare analytics, and other interdisciplinary domains. He is the recipient of the NSF CAREER Award ’14, AISTATS ’16 Best Student Paper Award, IPDPS ’15 Best Paper Award, NIPS ’13 Outstanding Paper Award, and ICML ’10 Best Paper Award. He has also served as area chair for leading machine learning conferences such as ICML, NIPS and AISTATS, and as action editor for JMLR.
This Week in Data by Braxton Mckee (CEO, Ufora) & MLconf Alumni Speaker, Issue #9
Kick Off
Every day, data models run by various companies are putting you into various categories. Rarely do you get to see their findings on you. So it’s kind of cool that there’s a way to get inside Facebook’s data model. People are paying attention to this because of the media firestorm over how Facebook’s news feed algorithm seems to be biased against conservative political news. You can go to facebook.com/ads/preferences and see exactly what the social media company thinks about you. (Full explainer here.) As more companies integrate data models into their businesses, I expect we will see a broader push – perhaps even legislation – requiring this sort of transparency.
We also continue to see Facebook dealing with backlash over possible political leanings in its news feeds. On Friday the company announced it is reducing the role of human editors even further, in favor of having algorithms select trending topics. It’s fascinating that tech companies get to deflect concerns about subjective human choices by pointing to their machines and models; at the same time, people should remember that, at some point, humans put some of the rules into many of these models.
In the News
“Is big data in big trouble?”, asks TechCrunch. The answer is that some data companies like Tableau are overhyped and now as they miss their earnings forecasts, some investors are disappointed. At the heart of this, I think, is an over-investment in Hadoop-backed technologies. Hadoop moved the needle a lot in data analysis, but even as it is becoming outmoded (by Spark and other platforms that work better on distributed computing) companies are still doubling down on Hadoop. I think the Hadoop train will come to a halt at some point.
On Amazon Kinesis: Analyzing streaming data is a crucial part of bringing machine learning to bear on real-world problems. Amazon has had a system called Kinesis for a while now, but it officially took the “Beta” label off of it this week. Amazon has done a good job of serving both developers (by allowing them to use Kinesis infrastructure to build powerful applications) and business analysts (by allowing users to query data streams using SQL in real time). The ideas in Kinesis aren’t new, but Amazon’s implementations are rock-solid, and they’ve done a good job integrating the service with all of the other infrastructure services that Amazon provides. A nice writeup here.
It’s hard competing with the biggest tech companies. So Rackspace, an early cloud-computing company that went public in 2008, has found. This week it went private again.
In Industry
I’ve been following self-driving cars with interest in this newsletter because I think they’re one of the most visible ways we’re seeing artificial intelligence and data models turn into products that will be right in front of ordinary consumers. So I wanted to note that the world’s first network of self-driving taxis began operating this past week in Singapore. How this network fares will be held up as an example in many driverless-car debates, I’m sure.
You all have experienced the clogged-up Internet, where downloads take forever. One driver of the problem is all the images and streaming movies we are watching. Cool to see how Google is looking to artificial intelligence to work on better image compression.
This is a great series of charts that shows you which companies and which industries are investing the most in artificial intelligence research and patents and which types of AI are most commonly being pursued. Fujitsu? Who knew.
In Research
A cool paper on using data about two effects to figure out whether A causes B or B causes A just from the data. Normally this is hard to do, because you have to actually run experiments to see if A is really causing B or is just correlated with it. This paper contains some new techniques for inferring the direction from the data itself, by seeing how noise in A affects noise in B.
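To give a feel for the idea (this is a toy sketch of the general additive-noise intuition, not the paper’s method): fit a regression in each direction and check in which direction the residual noise looks more independent of the input.

```python
# Toy sketch of causal-direction inference from observational data:
# regress each variable on the other with a coarse nonparametric fit,
# then score how strongly the residual spread still depends on the
# input. The direction with the lower score is the more plausible
# causal direction. This is an illustration, not the paper's algorithm.
import numpy as np

rng = np.random.default_rng(0)
a = rng.uniform(-2, 2, 5000)
b = np.tanh(2 * a) + 0.3 * rng.normal(size=5000)   # ground truth: A -> B

def dependence_score(x, y, bins=30):
    # Piecewise-constant regression of y on x over quantile bins,
    # then correlate the squared residuals with (x - mean)^2.
    edges = np.quantile(x, np.linspace(0, 1, bins + 1)[1:-1])
    idx = np.digitize(x, edges)
    means = np.array([y[idx == i].mean() for i in range(bins)])
    resid = y - means[idx]
    return abs(np.corrcoef((x - x.mean()) ** 2, resid ** 2)[0, 1])

print("A -> B residual dependence:", dependence_score(a, b))  # low
print("B -> A residual dependence:", dependence_score(b, a))  # higher
```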
Professor José Daniel García at the University of Madrid is working on some cool stuff: his REPARA project aims to automatically rewrite programs so that they can run on multiple graphics cards simultaneously. This has been a consistent interest of mine: how can we build systems that take software that’s already written, figure out how the programs work, and then automatically rewrite them so that they run much faster? I think of it as AI for computer programming itself.
I’m finding myself increasingly interested in statistical natural language processing, where computers try to understand written text by processing huge volumes of data (like news articles), but without any prior knowledge about language or grammar. So, I was excited to see that the Google Brain team released model code for their cutting-edge news article summarizer. The models are constructed in TensorFlow, Google’s open-source machine learning framework. It’s great to see so much great AI research being done out in the open!
I really liked this technical blog on optimization techniques. In particular I loved the following graphics by Alec Radford showing how different methods of optimization interact with different kinds of functions:
Animations that may help your intuitions about the learning process dynamics. Top: contours of a loss surface and the time evolution of different optimization algorithms. Notice the “overshooting” behavior of momentum-based methods, which make the optimization look like a ball rolling down a hill. Bottom: a visualization of a saddle point in the optimization landscape, where the curvature along different dimensions has different signs (one dimension curves up and another down). Notice that SGD has a very hard time breaking symmetry and gets stuck at the top. Conversely, algorithms such as RMSprop will see very low gradients in the saddle direction. Due to the denominator term in the RMSprop update, this will increase the effective learning rate along this direction, helping RMSprop proceed. Images credit: Alec Radford.
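For reference, here is a minimal sketch of the RMSprop update that caption refers to; the hyperparameters are typical defaults, not values from the post.

```python
# Minimal RMSprop update: a running average of squared gradients sits
# in the denominator, so directions with persistently small gradients
# get a larger effective step size -- which is what helps RMSprop
# escape the saddle point in the animation above.
import numpy as np

def rmsprop_step(w, grad, cache, lr=1e-3, decay=0.9, eps=1e-8):
    cache = decay * cache + (1 - decay) * grad ** 2   # running avg of grad^2
    w = w - lr * grad / (np.sqrt(cache) + eps)        # per-dimension scaling
    return w, cache
```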
Quirky Corner
Dumbing down data – One of the risks of bubbles in certain parts of our economy is that you get a bunch of people working in those areas who become “experts” but, in fact, don’t really know what they are doing, and who would quickly be exposed as charlatans if there were less money floating around the area. There’s a degree to which I think this has happened in the data space. How many “data scientists” are there now who wouldn’t have passed an advanced math degree program? This book by Cathy O’Neil, a former math professor at Barnard, exposes a lot of the shoddy, non-mathematical thinking used by data scientists catering to businesses, and makes the argument that the mainstreaming of data leads to bad results for society, including more inequality. A worthwhile read.
Keep some skepticism when you hear businesses pitching you on fancy new types of things you are doing with fancy new labels. Om Malik, the tech writer and founder of GigaOm, had a good essay in The New Yorker this week about how all these new terms – “artificial intelligence,” for instance – are just a continuation of other things we’ve long been doing.
Biking, not golf – I often marvel at how many fewer of my peers golf than in my parents’ generation. So check out this article, which says that cycling is the new golf for the tech industry. I’d totally prefer two hours on my bike to time on a golf cart.
What’s happening at Ufora
My colleague Ronen Hilewicz gave a talk at the Women in Machine Learning and Data Science meetup in New York about scaling up machine learning algorithms. His main point? You shouldn’t have to rewrite your program to get it to work with huge datasets. You can watch his talk here. (Password is 5LjHhe6N)
We are up to our necks in consulting work so I am changing this newsletter to become a monthly update. That means I’ll bring you the very best stuff I see at the end of the month. I’m excited to correspond with any of you individually anytime.
Braxton McKee is the technical lead and founder of Ufora, a software company that has built an adaptively distributed, implicitly parallel runtime. Before founding Ufora with backing from Two Sigma Ventures and others, Braxton led the ten-person MBS/ABS Credit Modeling team at Ellington Management Group, a multi-billion dollar mortgage hedge fund. He holds a BS (Mathematics), MS (Mathematics), and M.B.A. from Yale University.
Interview With Alex Korbonits, Data Scientist, Remitly
Our past Technical Chair interviewed Alex Korbonits, Data Scientist at Remitly, about his thoughts on artificial intelligence as it relates to the arts.
What kind of features do you need to use when you train a model that generates art?
AK) Art is perceived. Since AlexNet in 2012, deep neural networks have come to be synonymous with the state of the art in computer vision and a range of other perception tasks, pushing the boundaries of what machine learning models can achieve much further than ever before. Hand-crafted features are out and distributed feature representations are in. In other words, you don’t create specific features; instead, you create the model architecture. Stephen Merity made an excellent point about this recently in a blog post: “In deep learning, architecture engineering is the new feature engineering”.
In that sense, the question then becomes: what kinds of architectures do you need to use when you train a model that generates art? First of all, if you want to generate art you’d better use a generative model :). After that your choice of architecture should follow the data it’s modeling. For visual art you probably would want convolutions. For music you’d want recurrence. For film you’d want both.
Give us an overview of generative models that have been successful in generating art.
AK) There have been so many exciting and interesting projects in this space within the last year alone that I sadly have to limit myself to just the ones I know.
First of all, Google’s “Deep Dream” took the world by storm last summer. Google took their well-understood discriminative classifier, the Inception network built for the 2014 ImageNet competition (also known as GoogLeNet in an homage to Yann LeCun’s LeNet), and decided to go deeper, as it were, by building a generative visualization tool to give intuition for the kinds of features/concepts the classifier was learning at different layers of the network. They wrote up a blockbuster Google Research blog post about it, along with releasing a GitHub repo and an IPython notebook demonstrating Deep Dream. Entire startups and homespun projects were created around Deep Dream to take any image and effectively create a version of it reminiscent of a scene from Fear and Loathing in Las Vegas. The horror, the horror.
Second, Andrej Karpathy released his own code in a GitHub repo for a character-level recurrent neural network implementation (called char-rnn) in Torch, whose creative properties he demonstrated in a very popular blog post. Among other things, he trained a character-level LSTM (long short-term memory network) on different corpora, including War and Peace, the complete works of Shakespeare, an open-source set of LaTeX files of papers in algebraic geometry, and the source code for Linux. This kicked off a series of humorous web applications, such as an ironic clickbait headline generator reminiscent of BuzzFeed, as well as myriad Twitter bots generating sophisticated fake tweets for any given well-known persona whom it is easy to mock ;). Traditionally, a lot of the generative models used for this kind of thing were simple Markov chains, but LSTMs learn long-term dependencies that are comparatively impressive. Even an LSTM trained on James Joyce’s Ulysses doesn’t look too far off from the real thing.
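To show how little machinery a char-rnn needs, here is a minimal Keras sketch of the same idea. Karpathy’s original is in Torch; this version, the corpus file name, and all sizes are illustrative assumptions.

```python
# Minimal character-level LSTM in the spirit of char-rnn: predict the
# next character from the previous seq_len characters. Corpus file,
# layer sizes, and training settings are illustrative assumptions.
import numpy as np
from tensorflow.keras.layers import LSTM, Dense, Embedding
from tensorflow.keras.models import Sequential

text = open("shakespeare.txt").read()          # any character corpus
chars = sorted(set(text))
char_to_ix = {c: i for i, c in enumerate(chars)}

seq_len = 40
starts = range(0, len(text) - seq_len - 1, seq_len)
X = np.array([[char_to_ix[c] for c in text[i:i + seq_len]] for i in starts])
y = np.array([char_to_ix[text[i + seq_len]] for i in starts])

model = Sequential([
    Embedding(len(chars), 64),                 # one learned vector per character
    LSTM(256),                                 # captures long-range dependencies
    Dense(len(chars), activation="softmax"),   # next-character distribution
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(X, y, batch_size=128, epochs=10)
```

Sampling repeatedly from the softmax output, feeding each sampled character back in, is what produces the fake Shakespeare.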
Third, an interesting area of using generative models to create art is music. There’s a lot of work to be done here in terms of richness and training on audio recordings, but some low-hanging fruit has already been found. A couple of implementations I know of thus far: (1) using an LSTM (even Karpathy’s char-rnn) to take music with a text encoding (such as MIDI files) and generate music from training data (which I had some fun playing with a year ago, training on MIDI files of Beethoven piano sonatas, but didn’t post); and (2) an awesome post at Daniel Johnson’s blog hexahedria, wherein he describes an implementation of what he calls a “Biaxial Recurrent Neural Network for Music Composition” that seems to generate results far superior to mine.
Fourth, I would be remiss not to discuss the well-known “A Neural Algorithm of Artistic Style”. This has now spawned an app called Prisma that, while not quite as white-hot popular as Pokemon Go, is nonetheless very much in vogue as you read this. A generative model is trained on a base image and a style image, and its output is a composition that resembles the base image in the “style” of the style image. Now you can even transfer style while keeping the original colors, which was not possible in the original implementation. Functionally this means that, for example, I can take a normal photograph of the Space Needle in Seattle and generate a version of it inspired by a favorite cubist painting by Braque.
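For a concrete sense of the mechanics, here is a rough sketch of the objective optimized in neural style transfer: match deep features of the content image while matching feature correlations (Gram matrices) of the style image. The layer names and weights below are illustrative assumptions, not the paper’s exact settings.

```python
# Rough sketch of the style transfer objective from "A Neural Algorithm
# of Artistic Style": content loss on deep activations plus style loss
# on Gram matrices of shallower activations. Layer names and weights
# are illustrative, not the paper's exact configuration.
import tensorflow as tf

def gram_matrix(features):
    # features: (height, width, channels) activations from one conv layer
    f = tf.reshape(features, (-1, features.shape[-1]))
    return tf.matmul(f, f, transpose_a=True) / tf.cast(tf.shape(f)[0], tf.float32)

def style_transfer_loss(gen, content, style, alpha=1.0, beta=1e3):
    # gen/content/style: dicts mapping layer name -> activations per image
    content_loss = tf.reduce_mean(tf.square(gen["conv4"] - content["conv4"]))
    style_loss = tf.add_n([
        tf.reduce_mean(tf.square(gram_matrix(gen[l]) - gram_matrix(style[l])))
        for l in ("conv1", "conv2", "conv3")])
    return alpha * content_loss + beta * style_loss
```

Gradient descent is then run on the pixels of the generated image itself, not on network weights.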
Last, and perhaps most recently, we saw on Ars Technica that a group of artists released a short film called Sunspring whose script was entirely composed by “an AI” (which I have a hunch was trained on a derivative of Karpathy’s char-rnn), trained on a large corpus of science fiction scripts and presumably given different plot-related prompts/primers to generate the short sequences that made up the final script. We now see real artists using state-of-the-art machine learning models to assist in the creative process. How cool is that?
Can deep learning embeddings provide dimensions that are associated with art measures that humans pick up on?
AK) Yes.
A recent paper, “Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings”, explores this idea. First, the authors determine semantically meaningful directions within the embeddings to locate and expose bias. They then exploit the existence of these directions to combat bias. Super cool.
In this sense, I think it is totally possible to create or discover dimensions/directions within an embedding that are associated with human-interpretable art measures. You could stay within the medium you’ve embedded to explore and interpret the space – e.g., paintings – or you could marry your embedding of paintings with word embeddings (perhaps via descriptions of your paintings or painting metadata) to help understand how you’re moving around it. Chris Moody from Stitch Fix gave a great talk at Data Day Seattle this summer combining word embeddings and images of clothing items to make recommendations to customers. There’s no reason you couldn’t do this with art. For example, you could explain the artistic/aesthetic differences in style between two similar paintings by different artists (such as cubist portraits by Picasso and Braque) by comparing color palette, brushstrokes, shading, or other aspects you want to inspect along the relevant directions in your embedding.
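Here is a small sketch of the mechanic being described: find a semantic direction in a pretrained word embedding and project words onto it, in the spirit of the debiasing paper. The vector file name is an assumption; any pretrained word2vec-format vectors would do.

```python
# Sketch of finding a semantic direction in a word embedding and
# projecting words onto it, as in the debiasing paper discussed above.
# The pretrained vector file name is an assumption.
import numpy as np
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors.bin", binary=True)

# A candidate "gender" direction, as in the paper: she - he.
direction = vectors["she"] - vectors["he"]
direction /= np.linalg.norm(direction)

# Project occupation words onto the direction to expose bias.
for word in ["programmer", "homemaker", "nurse", "engineer"]:
    score = np.dot(vectors[word], direction)
    print(f"{word}: {score:+.3f}")
```

The same projection trick would work on an embedding of paintings, with directions found from metadata or paired descriptions instead of word pairs.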
Is there a word2vec embedding for art so that you can transform a piece of music by just adding a vector to it?
AK) Mozart + Metallica – Beatles = Beethoven? As long as you can take a piece of music and properly model it, there’s no reason in principle why you couldn’t embed it into some kind of metric space (either as a single point or, perhaps more intuitively, as a sequence of points) that you could then manipulate to transform the piece along specific directions. E.g., you could modulate the key of a piece or a specific passage from major to minor, or move along a direction that changes the instrument playing from clarinet to oboe.
I haven’t come across music2vec yet but I eagerly await its arrival. I imagine that this would come about via sequence-to-sequence models such as LSTMs since music, like text, is inherently sequential.
For visual art embeddings you could use convolutional neural networks.
I think autoencoders could be useful here too.
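As a purely hypothetical sketch of how such an embedding could arise, here is a tiny LSTM autoencoder that compresses a note sequence into a fixed-size vector; the bottleneck vector is what you would manipulate arithmetically. All sizes are illustrative assumptions, and no existing “music2vec” model is implied.

```python
# Hypothetical sketch: an LSTM autoencoder over note sequences whose
# bottleneck vector could serve as a "music vector" for arithmetic.
# All dimensions are illustrative assumptions.
from tensorflow.keras import Model, layers

n_notes, seq_len, dim = 128, 64, 32

inp = layers.Input(shape=(seq_len,))
x = layers.Embedding(n_notes, 64)(inp)
code = layers.LSTM(dim)(x)                          # the "music vector"
x = layers.RepeatVector(seq_len)(code)
x = layers.LSTM(64, return_sequences=True)(x)
out = layers.Dense(n_notes, activation="softmax")(x)

autoencoder = Model(inp, out)                       # trained to reconstruct input
encoder = Model(inp, code)                          # use for vector arithmetic
autoencoder.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
```

After training, the Mozart + Metallica – Beatles analogy would amount to arithmetic on encoder outputs followed by a nearest-neighbor lookup, exactly as with word vectors.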
If we can computerize art by massively generating it automatically, don’t we create an inflation? What is the value of artificial art? Isn’t art supposed to be rare and unique?
AK) This question was posed 80 years ago in a very famous essay titled “The Work of Art in the Age of Mechanical Reproduction” by Walter Benjamin, wherein the art-historical properties of prints of famous works of art are examined. At issue was the value of art in an age where it was – all of a sudden – possible to print a poster of Picasso’s Les Demoiselles d’Avignon and sell it to anyone trying to find something to put up on the walls of their bedroom, dorm room, apartment, home, office, restaurant… you name it. The mechanical reproduction of art doesn’t devalue art itself simply by virtue of increasing access to and awareness of rare and unique art. However, it does shift the typical experience of art from active engagement to passive consumption.
Another famous take on this question is Clement Greenberg’s essay “Avant-Garde and Kitsch”, wherein two diametrically opposed categories of art – avant-garde art and kitsch art – are described and contrasted to highlight the purposes and properties of each.
Greenberg paraphrases Aristotle in suggesting that if all art and literature are imitation (of reality), then avant-garde art is the imitation of imitation: it is art concerned with the process of creating art for art’s own sake, independent of external meaning. He then contrasts this with kitsch by saying that kitsch is concerned not with the process of art but with the effect of art (on a consumer). At first glance, it would seem as though generating art with artificial intelligence is sort of both avant-garde and kitsch. However, the process is very mechanistic and literally formulaic. We’re not at a point where ML models are all of a sudden generating art that is as original or as emblematic of artistic genius as, e.g., the first Pollock. Sure, generative machine learning models trained on art are mimetic w.r.t. the process of creating art (e.g., DRAW: A Recurrent Neural Network For Image Generation), but these models create art that has the effect of looking like art we already know about. I.e., it’s definitely kitschy to generate a Van Gogh-styled version of a landscape photo someone took on their cell phone. Or a Warhol or Lichtenstein selfie. At this point, it’s kitschy to generate a work of art end-to-end from a model.
Let me be clear: kitschy art generated by a model is not artificial art. It’s art. Just because it is possible to massively generate art in an automatic way does not mean that all other art is thereby devalued. If anything, I would argue that creating huge quantities of kitschy art helps highlight the uniqueness, rarity, and value of art that is not automatically generated.
We’re beginning to see the power of artists using artificial intelligence as part of the overall creative/artistic process. As we saw with Sunspring, it’s possible to use AI as a tool in this process, not as a replacement that automates the process itself. You can use AI as part of the process of creating art and still be avant-garde. AI is adding value to the creation of new art and is highlighting the importance of art in a society that otherwise seems increasingly and singularly obsessed with STEM.
This is an exciting time where new tools are being developed and artists are trying them out. Can’t wait to see what comes next.
Alex Korbonits is a Data Scientist at Remitly, Inc., where he works extensively on feature extraction and putting machine learning models into production. Outside of work, he loves Kaggle competitions, is diving deep into topological data analysis, and is exploring machine learning on GPUs. Alex is a graduate of the University of Chicago with degrees in Mathematics and Economics.
How The Brightest Minds In AI Are Using Gaming to Reinforce Learning
Kurzweil’s Law of Accelerating Returns can be summed up by saying that “humans use the best tools (technology) available to create the next generation of technology”, enabling exponential technological improvement. Open-source projects and the proliferation of knowledge sharing continue to accelerate these effects. We’re excited and inspired by some of these recent developments in gaming and AI!
Here are a few interesting projects at the intersection of AI and Gaming:
Earlier this summer, Microsoft announced that the open-source code for Project Malmo, a Minecraft-based testing ground for artificial intelligence, is available on GitHub. The project is an effort to help researchers study AI tasks using reinforcement learning. Another interesting aspect of the public project is the creation of bots that can talk to each other and to humans.
No Man’s Sky, a long-awaited and much-hyped game known for being procedurally generated by an algorithm rather than written by a human, is stealing many precious hours from excited gamers this week. (Trust me, my husband and MLconf Co-Founder must have dedicated half of last weekend to this game!) After years of hype and anticipation, the game is now available on PlayStation 4 and Microsoft Windows. Check out this article in the Atlantic, which paints a picture of the complexity of a game that generates full planets only once a human arrives to play on them.
ViZDoom is a Doom-based AI research platform for reinforcement learning from raw visual information. It allows for the development of AI bots that play Doom using only the screen buffer. ViZDoom is primarily intended for research in machine visual learning, and in deep reinforcement learning in particular.
Google DeepMind, known for defeating Lee Sedol at Go with AlphaGo and for creating an algorithm that can learn to play 49 arcade games, has also been chipping away at “Montezuma’s Revenge”, with 50+ million frames under its belt. In their recent research paper, the contributors explain that the winning ingredient in this project was intrinsic motivation: they fed the exploration system internal digital rewards, similar to the human experience of adrenaline, which encouraged the system to explore more of the game and thus be more successful.
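The core idea is simple to sketch: augment the game’s own reward with a bonus that is large for rarely seen states and decays as states become familiar. The count-based bonus below is a toy stand-in for the pseudo-count method in DeepMind’s paper, with an illustrative bonus weight.

```python
# Toy sketch of an intrinsic-motivation exploration bonus: the agent
# receives extra reward for visiting novel states, and the bonus decays
# as a state becomes familiar. This count-based version is a simplified
# stand-in for the pseudo-count method in DeepMind's paper.
import math
from collections import defaultdict

visit_counts = defaultdict(int)

def augmented_reward(state, extrinsic_reward, beta=0.05):
    visit_counts[state] += 1
    bonus = beta / math.sqrt(visit_counts[state])   # novelty bonus decays
    return extrinsic_reward + bonus
```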
*Side Note: Ali Eslami, one of the authors of another popular paper by Google DeepMind researchers, Attend, Infer, Repeat: Fast Scene Understanding with Generative Models will be presenting at MLconf SF on 11/11/16, don’t miss it! Mention “Ali18” and save on tickets.
Upcoming MLconf Atlanta speaker Chris Fregly, Co-Founder at PipelineIO, and his team have been helping a few large gaming companies apply their massive datasets to more ML/AI use cases, including the following:
- Level-up recommendations: e.g., whom a player should fight next to reach the next level
- In-game purchase recommendations: e.g., extra weapons, health, food
- Chat-log NLP analysis between players: e.g., to prevent bullying and improve player engagement/retention
- Cheat-pattern classification analysis: e.g., to prevent online gambling fraud
Although the technology is evolving quickly, it’s still not a perfect science. According to Fregly, the problem of labeling still exists: at this point, it still takes a human to identify cheating versus non-cheating in a gamer’s behavior.
As mentioned before, we’re in a time of enormous acceleration and it’s exciting! Gaming is just one field in which development and learning in artificial intelligence is improving. We’re excited to have some of these leaders involved in the MLconf community and we anticipate additional blog posts to follow as this work evolves.
Courtney Burton is the Founder of Sessions Events and MLconf- The Machine Learning Conference.