Emily is a staff research scientist at Google AI in NYC. Her research focus is natural language understanding. She leads a team of researchers and engineers (spread across NYC, Mountain View, and San Francisco) working to make language-understanding models more robust. She received her PhD in Computer and Information Science from the University of Pennsylvania in 2013 and her BS in Computer Science from Yale in 2007.
Upcoming Abstract Summary
Representations from natural language data: successes and challenges
Text representations learned via language modeling have been incredibly effective, surpassing the prior state of the art across a variety of downstream tasks. I'll give an overview of one such model: BERT (Bidirectional Encoder Representations from Transformers; Devlin et al., 2018), developed by the Google AI Language group. BERT learns pre-trained representations that can be fine-tuned, allowing it to be applied effectively to a wide range of tasks with minimal architecture modifications.
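The pre-training objective behind BERT is masked language modeling: a fraction of input tokens is selected as prediction targets, and most of those are replaced by a [MASK] symbol. The sketch below illustrates the token-masking step only, following the proportions reported by Devlin et al., 2018 (15% of tokens selected; of those, 80% masked, 10% replaced by a random token, 10% left unchanged). It uses a toy whitespace tokenizer rather than BERT's WordPiece vocabulary, and the function name `mask_tokens` is illustrative, not from any library.

```python
import random

MASK_TOKEN = "[MASK]"

def mask_tokens(tokens, vocab, select_prob=0.15, seed=0):
    """BERT-style masking sketch: select ~select_prob of tokens as
    prediction targets; of those, 80% become [MASK], 10% become a
    random vocabulary token, and 10% stay unchanged."""
    rng = random.Random(seed)
    masked = list(tokens)
    targets = {}  # position -> original token the model must predict
    for i, tok in enumerate(tokens):
        if rng.random() < select_prob:
            targets[i] = tok
            r = rng.random()
            if r < 0.8:
                masked[i] = MASK_TOKEN       # 80%: replace with [MASK]
            elif r < 0.9:
                masked[i] = rng.choice(vocab)  # 10%: random token
            # else 10%: leave the token as-is, but still predict it
    return masked, targets

tokens = "the model learns deep bidirectional representations from unlabeled text".split()
masked, targets = mask_tokens(tokens, vocab=tokens)
```

During pre-training, the model is trained to recover each original token in `targets` from the corrupted sequence `masked`; fine-tuning then reuses the learned encoder with a small task-specific head.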
However, despite large gains on standard benchmarks, neural network models still make mistakes due to mismatches between training and test inputs and the inputs an NLP system would be asked to handle "in the wild". I'll discuss two case studies from our group that demonstrate how making model training sensitive to the linguistic properties of expected real-world inputs can greatly improve accuracy.
The first (Zhang et al., 2018) addresses text containing multiple languages, which is prevalent online. A feed-forward network with a simple globally constrained decoder outperforms previously published multilingual approaches in both accuracy and speed, yielding an 800x speed-up and a 19.5% averaged absolute gain on three code-mixed datasets.
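One way to picture a globally constrained decoder for code-mixed text: the model scores each token independently per language, but the decoder enforces a sentence-level constraint, here that at most two distinct languages appear. The sketch below is an assumption-laden illustration of that idea (the enumerate-all-subsets decoder, the `max_langs=2` constraint, and the per-token scores are all made up for this example, not taken from Zhang et al., 2018).

```python
from itertools import combinations

def constrained_decode(token_scores, max_langs=2):
    """Assign a language to each token, maximizing the summed per-token
    scores subject to a global constraint: at most `max_langs` distinct
    languages per sentence. `token_scores` is a list of dicts, one per
    token, mapping language code -> score."""
    langs = sorted({l for scores in token_scores for l in scores})
    best_total, best_labels = float("-inf"), None
    # Enumerate every allowed language subset (singletons up to pairs).
    for k in range(1, max_langs + 1):
        for subset in combinations(langs, k):
            # Within a fixed subset, each token picks its best language.
            labels = [max(subset, key=lambda l: scores.get(l, float("-inf")))
                      for scores in token_scores]
            total = sum(scores.get(l, float("-inf"))
                        for scores, l in zip(token_scores, labels))
            if total > best_total:
                best_total, best_labels = total, labels
    return best_labels

# Toy scores for a Spanish-English code-mixed sentence "dame ese book".
# Unconstrained per-token argmax would pick 'fr' for the middle token;
# the two-language constraint forces the coherent es/es/en reading.
token_scores = [
    {"es": 0.9, "en": 0.1},
    {"fr": 0.95, "es": 0.8, "en": 0.2},
    {"en": 0.9, "es": 0.05},
]
labels = constrained_decode(token_scores)  # -> ['es', 'es', 'en']
```

The global constraint is what lets a fast, simple per-token model avoid implausible predictions such as a third language appearing for a single token.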
The second (Elkahky et al., 2018) addresses noun-verb ambiguity in English part-of-speech tagging, a frequent source of egregious errors. We create a new dataset of over 30,000 naturally occurring, non-trivial examples of noun-verb ambiguity. Enhancing the strongest existing tagger with contextual word embeddings and targeted training data gives a 52% relative improvement. Downstream, using just this enhanced tagger yields a 28% reduction in error over the prior state of the art.