This spring, we’ve seen an influx of bot releases across industries, from lean startups to some of the largest companies in existence. According to VentureBeat, just this week Microsoft announced that it has a relationship assistant in the works. Per the article and announcement, Microsoft will incorporate a mobile-friendly “relationship assistant” into its Dynamics CRM customer relationship management software, designed to recommend actions for salespeople to take. “This can make customer service more efficient and improve customer satisfaction,” Nadella said. From the tone of the announcement, the tool is meant to help humans with efficiency, not take their jobs.
Microsoft isn’t the only tech giant to have recently made such an announcement. Early last week, Salesforce announced the launch of an AI platform, Einstein, that will work within Sales Cloud and Marketing Cloud. Salesforce also announced a new research division focused on deep learning, led by Richard Socher, formerly co-founder and chief executive of the AI startup MetaMind, which Salesforce acquired earlier this year. The tone of this announcement is similar to Microsoft’s: these AI enhancements are meant to help sales and marketing teams be more efficient, not to replace them. More details can be found in this VentureBeat article.
The tech behemoths aren’t the only ones releasing business-minded bots; some smaller startups are making waves with their bot work in the recruiting industry. On September 3rd, our co-founder, Shon Burton, posted his guest blog post “Why I Hate Chatbots.” In it, Burton described a new recruiting AI tool, “RAI,” and explained how he believes sourcers and recruiters will benefit from using it. Burton states: “It won’t fool anyone into thinking it’s human, but for practical recruiting tasks it’s already quite useful. By combining the practical conversational interface of tools like Siri and Alexa, with a Wolfram Alpha inspired ‘knowledge engine’, we’ve developed a recruiting assistant with which a non-expert user can conduct a talent search conversationally.” This week, the press release followed: https://hiringsolved.com/hello/rai
Also this week, Burton was featured in a Fast Company interview about RAI, focusing specifically on RAI’s Diversity Boost feature, which changes the relevance algorithm within a search rather than applying a simple filter, helping recruiters find a diverse pool of candidates to interview for open positions.
Other new projects worth mentioning: Mya, a job-seeker-facing chatbot that came out of private beta this summer to assist with the job application process, and MLconf sponsor Talla. When asked to participate in this blog, Talla provided a few words:
“At Talla we launched a few bots and realized that the bot-as-an-application model is limited. Therefore, we’ve taken our product up to the platform level. Now, our bot is just the UI for an intelligent layer that lives across your HR applications, and can automate many different workflows.
A very simple interface can be used to do many things, like create custom employee onboarding flows that are delivered conversationally, answer basic benefits questions, and send simple polls throughout the organization within a company’s existing chat platform. All of these workflows are based on common core technologies, like automating or scheduling messages, tapping into common data sources, like HR systems, and performing NLP on structured or unstructured text.”
Though these various bots aim to solve different problems in relationship management, whether in sales, marketing, recruiting, or onboarding, they all seem designed to assist, enhance, and raise efficiency, not to replace. We’ll keep our eyes out for developments within these projects. Stay tuned!
Interview with Sergey Razin, Ph.D., Chief Technology Officer, SIOS Technology
What is topological behavior analysis?
Topological Behavior Analysis (TBA) is the real-time algorithmic analysis of computer data originating from complex virtualization and cloud environments. It derives from Topological Data Analysis and uses K-means as its foundation.
Computer environments have many different layers that generate a large volume of statistical data, from the user experience layer (i.e., the press of a button) down to the data on the storage system, with many layers in between (cell phone towers, providers, networks, servers, etc.). All that data needs to be ingested, modeled (trained), and used to answer, in an automated fashion, the variety of questions IT/DevOps may have, such as:
- Is there a problem?
- What is the root cause?
- What should I do about it?
K-means provides the ability to abstract and define the behavior of workloads and their impact on the infrastructure in the form of clusters (versus individual time series, which would not scale). It also captures seasonal behaviors, which are essential for understanding patterns specific to the industry where the computer environment is used (e.g., sales fluctuations in retail).
Combining K-means with Topological Data Analysis makes it possible to detect anomalies based on multi-dimensional models that learn the interplay between the features of the statistical data representing the behavior.
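To make the clustering step concrete, here is a minimal sketch of the general approach described above, using scikit-learn. The feature names, cluster count, and threshold are illustrative assumptions, not SIOS’s actual implementation:

```python
# Minimal sketch: cluster workload behavior with k-means, then flag
# anomalies by distance to the nearest cluster center. Feature names, k,
# and the threshold are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Each row is one observation of a workload: e.g. IOPS, latency (ms),
# CPU %, network throughput (MB/s), sampled over time.
rng = np.random.default_rng(0)
normal = rng.normal(loc=[500, 5, 40, 80], scale=[50, 1, 5, 10], size=(1000, 4))

scaler = StandardScaler()
X = scaler.fit_transform(normal)

# Cluster the behavior into a handful of recurring "modes" (e.g. idle,
# steady-state, batch job) instead of tracking every metric individually.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

# Distance to the nearest centroid serves as a simple anomaly score.
distances = np.min(kmeans.transform(X), axis=1)
threshold = np.percentile(distances, 99)

def is_anomalous(sample):
    """Flag a new observation whose behavior falls outside learned clusters."""
    z = scaler.transform(np.atleast_2d(sample))
    return np.min(kmeans.transform(z), axis=1) > threshold

print(is_anomalous([2000, 50, 95, 5]))  # a very unusual workload sample
```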
How do you combine k-means with Monte Carlo simulation for TBA?
While developing a product feature that predicts performance issues within a computer environment (virtualization, cloud, etc.), we developed an algorithm that applies Monte Carlo Simulation on top of K-means-based models.
Once again, this approach uses K-means as the foundation for modeling the behaviors of the workload and its impact on the computer environment. From the learned behavior encapsulated in clusters, which also captures the seasonal behaviors of the data, we derive a prediction of the behavior by:
- Deriving the predicted expected behavior of the workload and its impact on the components of the infrastructure (such as compute, network, storage) by applying Monte Carlo Simulation.
- Performing a “stacking” function: once the prediction of expected behavior is derived for each individual workload, we stack the predicted behaviors to determine whether they will reach the capacity of the infrastructure (whether at the compute, network, or storage layer).
Leveraging K-means and Monte Carlo Simulation, we can accurately predict performance issues within the compute environment.
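A hedged sketch of the two steps just described: sample each workload’s predicted behavior via Monte Carlo, then “stack” the predictions against a capacity limit. All numbers and distributions here are illustrative assumptions, not the product’s actual models:

```python
# Step 1: Monte Carlo sampling of each workload's learned behavior.
# Step 2: "stack" the simulated demands against infrastructure capacity.
# All numbers and distributions are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(42)

# Learned behavior per workload: mean/std of predicted IOPS demand,
# e.g. derived from the k-means cluster each workload maps to.
workload_models = {
    "web-frontend": (400.0, 60.0),
    "database":     (900.0, 150.0),
    "batch-etl":    (300.0, 200.0),
}
STORAGE_CAPACITY_IOPS = 2000.0
N_SIMULATIONS = 10_000

# Step 1: simulate each workload's expected behavior independently.
samples = {
    name: rng.normal(mu, sigma, N_SIMULATIONS)
    for name, (mu, sigma) in workload_models.items()
}

# Step 2: stack the simulated demands and estimate the probability that
# the combined load exceeds the capacity of this layer.
stacked = np.sum(list(samples.values()), axis=0)
p_exceed = np.mean(stacked > STORAGE_CAPACITY_IOPS)
print(f"P(combined demand exceeds capacity) ~= {p_exceed:.3f}")
```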
What are the challenges in predicting workloads in servers?
I have mentioned a couple of issues in my prior responses, but let me summarize:
(a) The amount of data (Big Data);
(b) The interplay and statistical dependency between the features;
(c) The dimensionality of the features;
(d) The real-time nature of the problem, which demands real-time decisions to avoid failure of critical applications;
(e) The seasonality of the behavior;
(f) The dynamic nature of the environment, which moves workloads within the environment and across geographies, as well as the dynamic nature of the workloads themselves, which depend on user interaction and application changes.
While (a)–(e) can be addressed through the algorithms mentioned earlier, (f) requires almost weather-forecast-like analysis.
First, there is a prediction of the future based on learned behaviors. This is analogous to a 7-day weather forecast. However, like a weather forecast, severe storms (or issues in a computer environment) can start and move rapidly, affecting both the forecast and the recommendations that may be made as a result.
That is why in addition to forecasting the future, it is important to identify issues and provide recommendations (automated) in real time on how to address such issues without affecting parts of the system that were not affected by the storm and therefore should continue with the previously forecasted recommendations.
That’s where forecasting based on Monte Carlo Simulations needs to work in unison with Topological Behavior Analysis and causality algorithms (mentioned later), in real time, to track all the dynamic changes in the environment.
Why can’t you use time series modeling?
Unfortunately, time series modeling is the state of the art for most tools in the IT space. This is the case because most IT tools were built with a Computer Science approach rather than a Data Science approach. Before virtualization and cloud computing became popular, understanding and optimizing computing environments was seen as an infrastructure problem instead of a data problem, and expertise in data and statistical modeling was not a requirement, or even considered. As a result, most IT tools were built with a solid knowledge of Computer Science and the IT space (i.e., architecture, design patterns, etc.), and time series analysis became the apogee of the machine learning implemented in IT tools today, simply because it is easy to implement and understand. However, time series analysis cannot address challenges (a)–(f) in my response to the previous question.
- The amount of data radiating from all the layers of the IT operations environment makes it simply impossible to deal with individual data points; a higher level of abstraction capable of representing the behavior (such as clusters) is required, which relates directly to challenges (a) and (c) mentioned earlier.
- Time series modeling cannot capture the multi-dimensionality, interplay, and uncertainty within the features of the data (especially at scale) that is required to accurately identify the meaningful anomalies within the IT operations environment.
- Finally, some important data is not time series data but may include other features (such as data related to changes in the infrastructure, configuration, and code).
As a result, I identified a gap and an opportunity to develop a new solution that addresses all of the challenges mentioned earlier and will ultimately deliver my vision of a self-driving datacenter: one based on data and data science, eliminating the human guesswork used today.
Why isn’t deep learning an option?
Deep learning is an option.
Today we are just scratching the surface of applying statistical modeling to IT operations data (which is not limited to metrics, but can also include code changes in the application, etc.). Our causality algorithm is already a network (a Bayesian-like network) driven by posterior and conditional probabilities, though still a pretty “shallow” model. However, we are in the process of experimenting with TensorFlow to introduce “deep”-er networks into our analysis, which will enable us to address larger-scale and more complex use cases (especially relevant to change management, networking, and security, where there are a lot of features to be explored).
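As a toy illustration of the kind of “shallow” posterior-and-conditional-probability reasoning described here (not SIOS’s actual algorithm), consider ranking candidate root causes for an observed symptom with Bayes’ rule; the priors and likelihoods below are made-up numbers:

```python
# Toy illustration of shallow Bayesian-style causality reasoning (not
# SIOS's actual algorithm): given an observed symptom, rank candidate
# root causes by posterior probability using Bayes' rule. All priors
# and likelihoods are made-up numbers for illustration.

priors = {                 # P(cause)
    "bad_code_change":     0.05,
    "storage_saturation":  0.10,
    "network_partition":   0.02,
}
likelihoods = {            # P(high_latency | cause)
    "bad_code_change":     0.70,
    "storage_saturation":  0.90,
    "network_partition":   0.60,
}

# P(cause | symptom) is proportional to P(symptom | cause) * P(cause);
# normalizing over the candidate causes gives the posterior.
unnormalized = {c: likelihoods[c] * priors[c] for c in priors}
total = sum(unnormalized.values())
posterior = {c: p / total for c, p in unnormalized.items()}

for cause, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{cause}: {p:.2f}")
```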
In addition, our current platform operates on-premises, and our goal is to push it into the “cloud,” which would give us access to more compute capacity (for compute-intensive operations, including GPU) and more data, both of which are essential for “deeper” models.
For example, one of the complex use cases (applied in performance and security analysis) is identifying bad code changes that cause a problem and predicting whether the bad code can cause security, reliability, and performance problems. As use cases grow in scale and complexity, deep learning models will allow us to determine the right features and to discover issues and their root cause(s) more dynamically and accurately.
As CTO, Sergey is responsible for driving product strategy and innovation at SIOS Technology Corp. A noted authority in advanced analytics and machine learning, Sergey pioneered the application of these technologies in the areas of IT security, media, and speech recognition. He is currently leading the development of innovative solutions based on these technologies that enable simple, intelligent management of applications in complex virtual and cloud environments.
Prior to joining SIOS, Sergey was an architect for EMC storage products and the EMC CTO office, where he drove initiatives in the areas of network protocols, cloud and storage management, metrics, and analytics. Sergey has also served as a Principal Investigator (PI) and a leader in research, development, and architecture in the areas of big data analytics, speech recognition, telephony, and networking.
Sergey holds a PhD in computer science from the Moscow State Scientific Center of Informatics. He also holds a BS in computer science from the University of South Carolina.
Interview with John Melas-Kyriazi, Senior Associate at Spark Capital
Our past Technical Chair interviewed John Melas-Kyriazi, Senior Associate at Spark Capital, regarding his thoughts on the intersection of machine learning and venture capital.
Previously, you have stated that big companies already own the data and are not willing to share it. There is a big move for open data from universities, governments, hospitals, etc. Do you see an opportunity for startups to mine this data and come up with cool products?
JM-K) Yes, I do think there’s an interesting opportunity here.
Startups typically don’t bring proprietary data to the table — they’re startups, after all — so they have a few different strategies for building their own datasets. Many startups generate data through the use of their product (think user-generated content on Waze, or genetic data from 23andMe) that becomes a core competitive advantage over time. Another strategy, which is relevant to this question, is to aggregate third-party data that’s traditionally been locked in silos. Just imagine what interesting machine learning applications you could build on top of research data from universities, or across patient data from many different medical providers, to take two examples. However, this is difficult to pull off. The key challenge for a startup is getting permission to use that data, which can often be sensitive, from the relevant data owners.
Now, fully open data access sounds great on paper, but it would be a blessing and a curse for startups. It would become easier for startups to access that data; however, if one startup can, others can too, and any interesting new dataset would attract a flock of entrepreneurs and engineers competing to build the best applications. Low barriers to entry would make it difficult (although of course not impossible) for any one startup to create a truly outsized impact.
Data is hard to collect and algorithms are free, but putting them together into an application that solves a specific enterprise problem is still not easy. Do you believe we are going to see a shift toward application-oriented startups? Will we see the same explosion of app companies that we saw in the ’80s and ’90s, when databases became a standard in the enterprise world?
JM-K) It’s hard to compare one period of innovation to another, but I agree that we will continue to see a tremendous amount of activity from application-layer startups that leverage data and machine learning. As the tools for building these types of companies become cheaper and easier to use, and as relevant training data becomes easier to access, the benefits of machine learning technology will continue to become democratized and more widely used by smart software engineers.
Further, I think that machine learning technology will ultimately get woven into the fabric of many/most existing applications. While ML-native startups are roaring onto the scene, existing software companies will take a number of different strategies to get up to speed: 1) acquire startups with substantial machine learning IP and talent; 2) aggressively recruit machine learning engineers and data scientists; 3) build internal competency and leverage the growing portfolio of open source machine learning tools.
What is your opinion about data trading? We trade all sorts of commodities at high volumes. Are we going to see the data-markets grow?
JM-K) As we move from deterministic (rule-based) software to increasingly probabilistic methods in programming, data will continue to increase in value to a wider audience of developers and companies. I have no doubt that markets for data will continue to grow in importance, and we will start to see more businesses focused on brokering data sales, building online data marketplaces and collaborative data-oriented communities.
Established tech companies like Apple, Google, and Salesforce have acquired a substantial number of machine learning startups over the past five years. Will this trend continue?
JM-K) Consolidation in the machine learning space is natural given the massive talent gap that currently exists in the market. A few years ago, established tech companies were acqui-hiring teams of mobile engineers by the handful. Now, data science and machine learning are hot, and the easiest way to add machine learning talent to your company is to acquire a startup with a highly-functioning ML team.
Additionally, I do believe that many machine learning startups will face serious long-term defensibility challenges if they do not have best-in-class data. For some, joining forces with a tech company that brings superior data to the table is a laudable and logical outcome.
John Melas-Kyriazi is a senior associate at Spark Capital. John is interested in the AI and machine learning space and as a firm, Spark Capital has invested in a number of companies focused on AI/ML, including Cruise Automation and Sift Science. Before joining Spark, John left a Ph.D. program at Stanford to help run StartX, a startup accelerator program affiliated with Stanford University. John received a B.S. in Engineering Physics and an M.S. in Materials Science & Engineering from Stanford.
Interview with Hussein Mehanna, Engineering Director – Core ML, Facebook
Our past Technical Chair interviewed Hussein Mehanna, Engineering Director – Core ML, Facebook, regarding his upcoming presentation, Applying Deep Learning at Facebook Scale, scheduled for 09/23/16 at MLconf Atlanta.
One of the criticisms of deep learning models has been the complexity of inference. In your talk, you will explain how you reduced inference time. Does this mean shallow models no longer have any advantage over deep models?
HM) No, I don’t think so. In fact, one of the tricks growing in popularity these days is using deep, expensive models to learn and then using those to teach shallow models (dark knowledge transfer). There seems to be a theory that during learning you need more capacity and complexity, but that can be reduced at inference; at times it even improves accuracy. So I think shallow models will stay.
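A minimal sketch of the deep-teaches-shallow trick Mehanna describes (often called knowledge distillation): train a larger “teacher,” then fit a small “student” to the teacher’s soft predictions. The dataset and architectures below are placeholder assumptions:

```python
# Sketch of knowledge distillation: a large "teacher" network learns the
# task, then a small "student" is trained on the teacher's soft probability
# outputs rather than the hard 0/1 labels. Dataset and architectures are
# placeholder assumptions.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier, MLPRegressor

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# "Deep" teacher: extra capacity helps during learning.
teacher = MLPClassifier(hidden_layer_sizes=(256, 256, 256), max_iter=300,
                        random_state=0).fit(X, y)

# Soft targets carry more information than hard labels (relative confidence).
soft_targets = teacher.predict_proba(X)[:, 1]

# "Shallow" student: cheap at inference time, trained to mimic the teacher.
student = MLPRegressor(hidden_layer_sizes=(16,), max_iter=500,
                       random_state=0).fit(X, soft_targets)

student_preds = (student.predict(X) > 0.5).astype(int)
agreement = (student_preds == teacher.predict(X)).mean()
print(f"student agrees with teacher on {agreement:.1%} of samples")
```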
Do we need to compromise accuracy to make deep learning inference fast?
HM) Not necessarily; at times it may even improve generalization, since complex models overfit. That said, figuring out how to reduce the computational load of a model is still non-trivial. Making this simpler and more automatic is something that will help the industry.
TensorFlow, Torch, Theano, MXNet, CNTK… Can you help us survive the Babel of deep learning platforms? Can you advise the MLconf audience on what to choose, or how to choose?
HM) Yes, I probably can. That said, the MLconf audience should feel happy, because diversity increases the chances that they get tools closer to their needs. We are in a creative-chaos phase in AI, but things are converging.
I have noticed that some deep learning platforms are good with dense data and others with sparse data. At Facebook you are dealing with both. How did you manage to unify both under one platform?
HM) Good question – I will need to check with our legal team before I answer that. All I can say now is that we treat both as first-class citizens, and we are investing in algorithms that operate at the intersection. This is majorly beneficial for sparse scenarios, since traditional deep learning has been dense-focused, as it’s easier to get hold of images than social data.
You implemented deep learning at scale. What is the gap between theory and practice? What are the tricks that make the difference that you don’t find written in a paper?
HM) That’s a fantastic question. Any ML algorithm is really dependent on the data: if you change the data, you change the problem completely. That’s the biggest difference, in my opinion, between academia and industry. It makes a lot of sense for academia to standardize its datasets, but most of those don’t represent what industry uses. Think about the overlap between the ImageNet dataset and the data for a system that needs to recognize consumer products: probably very different. The other difference is that industrial systems receive continuous improvements that accumulate over time, so baselines in industry are much more tuned.
What was the most surprising fact that you have discovered about deep learning? Can you share a paper with us that had a great influence on you?
HM) I am going to seem a bit biased towards Facebook AI Research, but I adore character-level deep learning for NLP. The fact that you can learn from raw textual input with no preprocessing, as you would with images, is just extremely powerful. In my early college days, I could not bear all the special rules that riddled NLP, and I always believed there was a better solution. This paper provides a good basis for that; we now have more sophisticated stuff in the team, but that paper was a great start.
Hussein Mehanna, Engineering Director – Core ML, Facebook
I am the Director of the Core Machine Learning group at Facebook. Our team focuses on building state of the art ML/AI Platforms combined with applied research in event prediction and text understanding. We work closely with product teams in Ads, Feed, Search, Instagram and others to improve their user experiences.
In 2012, I joined Facebook as the original developer on the Ads ML platform, which quickly developed into a Facebook-wide platform serving more than 30 teams. Prior to Facebook, I worked at Microsoft on search query alterations and suggestions in Bing and on communication technologies in Lync. I hold a master’s degree in Speech Recognition from the University of Cambridge, UK, where I worked on noise robustness modeling.
Guest Blog Post: "Why I Hate Chatbots"
Our co-founder and the CEO of HiringSolved, Shon Burton, recently shared his musings on bots and what his team has been working on to streamline the recruiting process.
The last chatbot I wrote was called “Larry.” It was written in 1999 to simulate a conversation with my boss at the time, Larry Rocha. “Larry’s” main function was called RochaBrain(), and it contained conditional triggers and responses that allowed my workmates, including Larry himself, to enjoy a good laugh while interacting with it. It was just believable enough to be pretty funny for a minute or two, albeit in a limited context.
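For flavor, a trigger-and-response bot of that era needs little more than a lookup table and a fallback. This is a hypothetical reconstruction, not the original code; the rules, responses, and the rocha_brain name are invented for illustration:

```python
# A hypothetical reconstruction of a trigger/response chatbot in the
# spirit of "Larry": conditional triggers mapped to canned responses.
# The rules, responses, and function name are invented for illustration.
import random

TRIGGERS = [
    # (keywords that fire the rule, possible responses)
    (("status", "update"),   ["Circle back with me EOD.", "Ping me after standup."]),
    (("deadline", "late"),   ["That ship has sailed.", "Let's take this offline."]),
    (("raise", "promotion"), ["Let's revisit that next quarter."]),
]
FALLBACK = ["Interesting. Tell me more.", "Let's put a pin in that."]

def rocha_brain(message):
    """Return the first matching canned response, else a fallback."""
    text = message.lower()
    for keywords, responses in TRIGGERS:
        if any(word in text for word in keywords):
            return random.choice(responses)
    return random.choice(FALLBACK)

print(rocha_brain("Any update on the status of my promotion?"))
```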
So why the sudden hate for chatbots? Let me explain. As a kid growing up in the ’80s, I was fascinated with AI. I’ve collected and modified different versions of Eliza, Racter, Alice, and more. Computer programs that could hold a human-like conversation were not only fun to play with; they were, and still are, regarded as one of the primary indicators of artificial intelligence and even consciousness. The famous Turing test, a thought experiment proposed by the great Alan Turing over 60 years ago, is still referenced today as a primary measuring tool and metric for artificial intelligence.
If Turing himself were alive today, he might have different ideas about how to define and measure artificial intelligence. The concepts of “thought,” “intelligence,” and “consciousness” are entirely philosophical in nature in that they defy rigorous scientific definition. These ideas are entertaining to ponder but not very practical. Chatbots, in my view, are similar: entertaining but not practical. The high-water mark for a Turing-test-certified chatbot is a casual conversation in which the software’s responses are indistinguishable from a human’s. I think we can do better.
AI needs to be better than a human. More useful. More like Alexa and Siri on steroids than HAL or C-3PO. Shifty, insecure, and homicidal, HAL was far too human. Even Tony Stark’s J.A.R.V.I.S. is too human: too opinionated, too chatty. Data from ST:TNG is also too human, for some of the same reasons. On the topic of Star Trek, the computer intelligence powering the Enterprise-D strikes the right balance. It communicates very well, but it doesn’t exhibit any of the human eccentricities or flaws that would impede its usefulness.
So what is a better AI, and how do we build it? Useful AI is an exercise in UX design. The main advantage that artificial intelligence has over human intelligence is its ability to quickly access and process massive amounts of information. Combine this ability with a human-level understanding of language and you have something interesting: a conversational interface to knowledge and computation. Human creativity will still be required, but it will be greatly magnified by this new tool.
We’re experimenting with a tool like this at HiringSolved. We call it RAI. We think of it as Siri for recruiting. RAI is a closed domain tool working on a retrieval based model with access to HiringSolved’s massive index of talent data. RAI is not ground-breaking from a Turing Test perspective. It won’t fool anyone into thinking it’s human, but for practical recruiting tasks it’s already quite useful. By combining the practical conversational interface of tools like Siri and Alexa, with a Wolfram Alpha inspired ‘knowledge engine’, we’ve developed a recruiting assistant with which a non-expert user can conduct a talent search conversationally.
Current systems require the user to construct searches using keywords and boolean operators. RAI simply allows the user to say, “I need to find female javascript developers with reactjs experience within 25 miles of Austin, Texas,” and RAI produces results. From there, RAI can help the user construct and send a message to the selected candidates. This isn’t going to replace a human recruiter, but it does reduce the number of steps involved in recruiting, letting recruiters be more personal in their approach. If you would like to experiment with RAI, sign up here. We’re releasing user accounts in small batches next month.
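To illustrate the shape of the problem (not RAI’s actual implementation), here is a hedged sketch of turning a conversational query into structured search fields; the patterns and skill lexicon are toy assumptions:

```python
# Hedged sketch (not RAI's implementation): pull structured search fields
# out of a conversational recruiting query, replacing the keyword-and-
# boolean string a recruiter would otherwise write by hand.
import re

def parse_talent_query(text):
    """Extract simple skill/location/radius fields with toy patterns."""
    query = {"skills": [], "location": None, "radius_miles": None}

    # Toy skill lexicon; a real system would use an NLP entity model.
    for skill in ("javascript", "reactjs", "python", "java"):
        if skill in text.lower():
            query["skills"].append(skill)

    radius = re.search(r"within (\d+) miles of ([\w\s,]+)", text, re.I)
    if radius:
        query["radius_miles"] = int(radius.group(1))
        query["location"] = radius.group(2).strip()
    return query

q = parse_talent_query(
    "I need to find female javascript developers with reactjs "
    "experience within 25 miles of Austin, Texas"
)
print(q)
# {'skills': ['javascript', 'reactjs'], 'location': 'Austin, Texas',
#  'radius_miles': 25}
```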
Shon Burton is CEO and Founder of HiringSolved, a market-leading HR tech startup focused on the talent acquisition and talent search market. Prior to HiringSolved, Shon founded MLconf along with Courtney Burton. In his spare time, Shon enjoys long walks on the beach with Courtney, playing Legos with his daughter, and spending hours restoring ancient, obtuse computers.
Interview With Alex Korbonits, Data Scientist, Remitly
Our past Technical Chair interviewed Alex Korbonits, Data Scientist at Remitly, about his thoughts on artificial intelligence as it relates to the arts.
What kind of features do you need to use when you train a model that generates art?
AK) Art is perceived. Since AlexNet in 2012, deep neural networks have come to be synonymous with the state of the art in computer vision and a range of other perception tasks, pushing the boundaries of what machine learning models can achieve much further than ever before. Hand-crafted features are out and distributed feature representations are in. In other words, you don’t create specific features; instead, you create the model architecture. Stephen Merity made an excellent point about this recently in a blog post: “In deep learning, architecture engineering is the new feature engineering.”
In that sense, the question then becomes: what kinds of architectures do you need to use when you train a model that generates art? First of all, if you want to generate art you’d better use a generative model :). After that your choice of architecture should follow the data it’s modeling. For visual art you probably would want convolutions. For music you’d want recurrence. For film you’d want both.
Give us an overview of generative models that have been successful in generating art.
AK) There have been so many exciting and interesting projects in this space within the last year alone that I sadly have to limit myself to just the ones I know.
First of all, Google’s “Deep Dream” took the world by storm last summer. Google took their well-understood discriminative classifier, the Inception network built for the 2014 ImageNet competition (also known as GoogLeNet, in homage to Yann LeCun’s LeNet), and decided to go deeper, as it were, by building a generative visualization tool to give intuition for the kinds of features/concepts the classifier was learning at different layers of the network. They wrote up a blockbuster Google Research blog post about it, along with releasing a GitHub repo and an IPython notebook demonstrating Deep Dream. Entire startups and homespun projects were created around Deep Dream to take any image and effectively create a version of it reminiscent of a scene from Fear and Loathing in Las Vegas. The horror, the horror.
Second, Andrej Karpathy released his own code in a GitHub repo for a character-level recurrent neural network implementation (called char-rnn) in Torch, whose creative properties he demonstrated in a very popular blog post. Among other things, he trained a character-level LSTM (long short-term memory network) on different corpora, including War and Peace, the complete works of Shakespeare, an open-source set of LaTeX files of papers in algebraic geometry, and the source code of Linux. This kicked off a series of humorous web applications, such as an ironic clickbait-headline generator reminiscent of BuzzFeed, as well as myriad Twitter bots generating sophisticated fake tweets for any well-known persona who is easy to mock ;). Traditionally, a lot of the generative models used for this kind of thing were simple Markov chains, but LSTMs learn long-term dependencies that are comparatively impressive. Even training an LSTM on James Joyce’s Ulysses doesn’t look too far off from the real thing.
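That Markov-chain baseline is worth seeing concretely. Below is a minimal character-level Markov chain sketch (stdlib only; the corpus filename is a placeholder): it samples each character from counts conditioned on the previous few characters, which is exactly the fixed-order lookup that a char-RNN’s learned, longer-range memory improves on:

```python
# Minimal character-level Markov chain: sample each character from counts
# conditioned on the previous `order` characters. A char-RNN replaces this
# fixed-order lookup table with a learned network that captures far longer
# dependencies.
import random
from collections import defaultdict

def train_char_markov(text, order=4):
    """Map each length-`order` context to the characters that follow it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model

def generate(model, seed, length=200, order=4):
    """Extend the seed one sampled character at a time."""
    out = seed
    for _ in range(length):
        choices = model.get(out[-order:])
        if not choices:
            break
        out += random.choice(choices)
    return out

corpus = open("shakespeare.txt").read()  # any plain-text corpus
model = train_char_markov(corpus)
print(generate(model, seed="The "))
```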
Third, an interesting area for generative models in art is music. There’s a lot of work to be done here in terms of richness and training on audio recordings, but some low-hanging fruit has already been picked. A couple of implementations I know of thus far: (1) using an LSTM (even Karpathy’s char-rnn) on music with a text encoding (such as MIDI files) to generate music from training data, which I had some fun playing with a year ago, training on MIDI files of Beethoven piano sonatas, though I never posted the results; and (2) an awesome post on Daniel Johnson’s blog, hexahedria, wherein he describes an implementation of what he calls a “Biaxial Recurrent Neural Network for Music Composition” that seems to generate results far superior to mine.
Fourth, I would be remiss not to discuss the well-known “A Neural Algorithm of Artistic Style.” This has now spawned an app called Prisma that, while not quite as white-hot popular as Pokémon Go, is nonetheless very much in vogue as you read this. A generative model is trained on a base image and a style image, and its output is a composition that resembles the base image in the “style” of the style image. You can now even transfer style while keeping the original colors, which was not possible in the original implementation. Functionally this means that, for example, I can take a normal photograph of the Space Needle in Seattle and generate a version of it inspired by a favorite cubist painting by Braque.
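The core trick of that paper is that “style” is represented by Gram matrices of convolutional feature maps: correlations between channels, independent of where features appear spatially. Here is a numpy sketch of that computation; real use pairs it with a pretrained CNN and gradient descent on the generated image:

```python
# Sketch of the style representation from "A Neural Algorithm of Artistic
# Style": Gram matrices of feature maps capture which channels co-activate,
# independent of spatial position. Random arrays stand in for CNN
# activations here.
import numpy as np

def gram_matrix(features):
    """features: (channels, height, width) activation map from one layer."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)       # each row: one channel's map
    return flat @ flat.T / (c * h * w)      # channel-by-channel correlations

def style_loss(gen_features, style_features):
    """Squared difference between Gram matrices of two feature maps."""
    diff = gram_matrix(gen_features) - gram_matrix(style_features)
    return np.sum(diff ** 2)

rng = np.random.default_rng(0)
style = rng.normal(size=(64, 32, 32))       # stand-in for CNN activations
generated = rng.normal(size=(64, 32, 32))
print(style_loss(generated, style))
```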
Last, and perhaps most recently, we saw on Ars Technica that a group of artists released a short film called Sunspring whose script was entirely composed by “an AI” (which I have a hunch was trained on a derivative of Karpathy’s char-rnn), trained on a large corpus of science fiction scripts and presumably given different plot-related prompts/primers to generate the short sequences that made up the final script. We now see real artists using state-of-the-art machine learning models to assist in the creative process. How cool is that?
Can deep learning embeddings provide dimensions that are associated with art measures, that humans pick up?
AK) Yes.
A recent paper, “Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings” explores this idea. First, the authors determine semantically meaningful directions within the embeddings to locate and expose bias. They then exploit the existence of these directions to combat bias. Super cool.
In this sense, I think it is totally possible to create/discover dimensions/directions within an embedding that are associated with human-interpretable art measures. You could stay within the medium you’ve embedded to explore and interpret the space — e.g., paintings — or you could marry up your embedding of paintings with some word embeddings (perhaps via descriptions of your paintings or painting metadata) to help understand how you’re moving around it. Chris Moody from Stitch Fix gave a great talk at Data Day Seattle this summer combining word embeddings and images of clothing items to make recommendations to customers. There’s no reason you couldn’t do this with art. For example, you could explain the artistic/aesthetic differences in style between two similar paintings by different artists (such as cubist portraits by Picasso and Braque) by comparing color palette, brushstrokes, shading, or other aspects you want to inspect along the relevant directions in your embedding.
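As a concrete illustration of working with directions in an embedding, in the spirit of the debiasing paper, here is a numpy sketch; the toy 4-d vectors are made up for illustration rather than taken from real embeddings:

```python
# Sketch of the direction-based debiasing idea: estimate a semantic "bias
# direction" from definitional word pairs, then remove a word's component
# along that direction. Toy 4-d vectors are made up for illustration; a
# real application would load word2vec/GloVe vectors.
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

emb = {
    "he":         np.array([ 1.0, 0.2, 0.5, 0.1]),
    "she":        np.array([-1.0, 0.2, 0.5, 0.1]),
    "programmer": np.array([ 0.4, 0.9, 0.3, 0.7]),
}

# The bias direction is estimated from a definitional pair.
g = normalize(emb["he"] - emb["she"])

def neutralize(v, direction):
    """Remove the component of v that lies along the bias direction."""
    return v - np.dot(v, direction) * direction

debiased = neutralize(emb["programmer"], g)
print("before:", np.dot(emb["programmer"], g), "after:", np.dot(debiased, g))
```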
Is there a word2vec embedding for art so that you can transform a piece of music by just adding a vector to it?
AK) Mozart + Metallica – Beatles = Beethoven? As long as you can take a piece of music and properly model it, there’s no reason in principle why you couldn’t embed a specific piece of music into some kind of metric space (either as a single point or perhaps more intuitively as a sequence of points) that you could then manipulate within that space to transform along specific directions. E.g., you could modulate the key of a piece or a specific passage from major to minor, or perhaps along a direction that changes the instrument playing from clarinet to oboe.
I haven’t come across music2vec yet but I eagerly await its arrival. I imagine that this would come about via sequence-to-sequence models such as LSTMs since music, like text, is inherently sequential.
For visual art embeddings you could use convolutional neural networks.
I think autoencoders could be useful here too.
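For intuition, the arithmetic the question imagines already works for word vectors. A short sketch with gensim and pretrained GloVe vectors (the model name comes from the gensim-data catalog); a hypothetical music2vec would allow the same manipulations on embedded pieces or passages:

```python
# Word-vector analogy arithmetic with gensim and pretrained GloVe vectors.
# The model name is from the gensim-data catalog; it downloads on first use.
import gensim.downloader as api

model = api.load("glove-wiki-gigaword-50")

# The classic analogy: king - man + woman ~= queen
print(model.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```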
If we can computerize art by massively generating it automatically, don’t we create an inflation? What is the value of artificial art? Isn’t art supposed to be rare and unique?
AK) This question was posed 80 years ago in a very famous essay titled “The Work of Art in the Age of Mechanical Reproduction” by Walter Benjamin, wherein the art-historical properties of prints of famous works of art are examined. At issue was the value of art in an age where it was, all of a sudden, possible to print a poster of Picasso’s Les Demoiselles d’Avignon and sell it to anyone trying to find something to put up on the walls of their bedroom, dorm room, apartment, home, office, restaurant… you name it. The mechanical reproduction of art doesn’t devalue art itself simply by virtue of increasing access to and awareness of rare and unique art. However, it does shift the typical experience of art from active engagement to passive consumption.
Another famous take on this question is Clement Greenberg’s essay “Avant-Garde and Kitsch,” wherein two diametrically opposed categories of art, avant-garde art and kitsch art, are described and contrasted to highlight the purposes and properties of each.
Greenberg paraphrases Aristotle in suggesting that if all art and literature are imitation (of reality), then avant-garde art is the imitation of imitation: it is art concerned with the process of creating art for art itself, independent of external meaning. He then contrasts this with kitsch by saying that kitsch is concerned not with the process of art but with the effect of art (on a consumer). At first glance, it would seem as though generating art with artificial intelligence is sort of both avant-garde and kitsch. However, this process is very mechanistic and literally formulaic. We’re not at a point where ML models are suddenly generating art that is as original or as emblematic of artistic genius as, e.g., the first Pollock. Sure, generative machine learning models trained on art are mimetic w.r.t. the process of creating art (e.g., DRAW: A Recurrent Neural Network For Image Generation), but these models create art that has the effect of looking like art we already know about. I.e., it’s definitely kitschy to generate a Van Gogh-styled photo of a landscape someone took on their cell phone. Or a Warhol or Lichtenstein selfie. At this point, it’s kitschy to generate a work of art end-to-end from a model.
Let me be clear: kitschy art generated by a model is not artificial art. It’s art. Just because it is possible to massively generate art in an automatic way does not mean that all other art is thereby devalued. If anything, I would argue that creating huge quantities of kitschy art helps highlight the uniqueness, rarity, and value of art that is not automatically generated.
We’re beginning to see the power of artists using artificial intelligence as part of the overall creative/artistic process. As we saw with Sunspring, it’s possible to use AI as a tool in this process, not to replace and automate the process itself. You can use AI as part of the process of creating art and still be avant-garde. AI is adding value to the creation of new art and is highlighting the importance of art in a society that otherwise seems increasingly and singularly obsessed with STEM.
This is an exciting time where new tools are being developed and artists are trying them out. Can’t wait to see what comes next.
Alex Korbonits is a Data Scientist at Remitly, Inc., where he works extensively on feature extraction and putting machine learning models into production. Outside of work, he loves Kaggle competitions, is diving deep into topological data analysis, and is exploring machine learning on GPUs. Alex is a graduate of the University of Chicago with degrees in Mathematics and Economics.