Extracting key fields from a variety of document types remains a challenging problem. Services from AWS and Google Cloud, as well as open-source alternatives, provide text extraction that “digitizes” images or PDFs and returns phrases, words and characters. Processing these outputs does not scale and is error-prone: varied documents require different heuristics, rules or models, and new document types are uploaded daily. In addition, a performance ceiling exists because downstream models depend on good yet imperfect OCR algorithms. We propose an end-to-end solution that uses computer-vision-based deep learning to automatically extract important text fields from documents of various templates and sources. The resulting models achieve state-of-the-art classification accuracy and generalizability through training on millions of images. We compare our in-house model's accuracy, processing time and cost with third-party services and find favorable results for automatically extracting important fields from documents.

Bill.com is working to build a paperless future. We process millions of documents a year, including invoices, contracts, receipts and a variety of others. Understanding those documents is critical to building intelligent products for our users.
Session Summary
Using Deep Learning to Understand Documents
MLconf 2022 San Francisco
Eitan Anzenberg
Bill.com
Chief Data Scientist