RS/ Q1) Tell us briefly about yourself and your work.
AG) I am the Resilience Engineering Lead at One Concern, a Palo Alto-based benevolent AI company, with a mission to save lives and livelihoods before, during and after disasters. I have been involved with One Concern for almost 3 years, essentially since the company’s inception. I developed the resiliency framework that lays out the conceptual architecture for us to meet our vision to achieve planetary-scale resilience, quantify it at a community level, and constantly improve resilience over time. As part of this effort, I have created specific metrics for measuring how access to critical resources is impacted in disasters and to identify those communities that are most at risk before, during and after disasters. Besides working on resilience, I have also contributed to development of our Seismic product to estimate near real time impacts in an earthquake, created evaluation metrics for our machine learning models and set up the simulation pipeline to generate earthquakes for preparedness trainings and mitigation policy design.
Prior to One Concern, I completed my PhD at Stanford, where my thesis created a framework to estimate community-level seismic risk from man-made, or induced, earthquakes, with a case study implementation in Oklahoma. As part of this framework, I developed and implemented novel statistical methods to quantify how seismic rates change over time with changes in the oil and gas extraction processes responsible for causing these earthquakes. Prior to my PhD, I worked as a structural design engineer at Magnusson Klemencic Associates in Seattle and was responsible for the design of an iconic building in Boise – Jack’s Urban Meeting Place – along with helping out with the design of buildings in San Francisco, Delhi and Kuala Lumpur.
RS/ Q2) I had not previously heard of “resilience engineering.” Tell us a little bit about that field. Is it new? How many people worldwide graduate with a degree in resilience engineering?
AG) As far as I am aware, resilience engineering is not a degree offered by universities. Through our work at One Concern, we realized that most of the work around disaster impacts occurs in silos by hazard or by field of study, while resilience is an inherent characteristic of communities that spans across both expected disasters and unexpected events and across many fields like risk engineering, economics, data science, behavioral analysis, policy and education, urban design and social sciences. As a result, we decided to create this position and team at One Concern to focus on holistic resilience analysis. At One Concern, we have defined Resilience as the ability of communities to maintain their normal functionality and to adapt and grow in the face of acute shocks and long term stresses. We are also starting to see universities like Stanford, Harvard and MIT create multi-disciplinary programs to bring together experts from different fields and companies to work on resilience.
RS/ Q3) What is the story behind the founding of One Concern?
AG) One Concern spun off from a machine learning project in Andrew Ng’s class at Stanford. In that project, the co-founders, Ahmad Wani, Nicole Hu and Timothy Frank, along with Yawar Aziz, demonstrated that rapid damage assessment after earthquakes was possible using available data and machine learning. Ahmad was also motivated to help people during disasters by his recent experience of being stuck on his roof while floods in Kashmir inundated his town. That motivation, together with the success of the project and encouragement from Andrew Ng, led to the founding of One Concern to rapidly measure impacts after disasters, prepare for and mitigate them, and thus save lives and livelihoods.
RS/ Q4) What sort of open data do you use for your models? And what type of data do you (One Concern) collect on your own?
AG) We use a combination of open data, proprietary data and our self-collected data. Our open data sources include the US census data that we use for understanding demographics and building inventory, weather forecast data to estimate flood inundation and USGS earthquake data to simulate earthquakes.
One Concern has also created a Digital Anthropology team, which is tasked with collecting disaster-relevant data from across the world. Some of the regions where our team has been deployed include Bogota, to collect data about buildings, demographics and socio-cultural parameters, and Mexico City and Lombok, to collect post-earthquake damage data. We use a variety of data collection tools like apps, cameras and remote sensing imagery. A critical challenge in data processing for us is that individual sources do not accurately capture all the features our models require. As a result, our data processing team uses several statistical and data mining techniques to augment data from different sources to ensure that the final datasets represent the ground truth.
RS/ Q5) Are you able to share an example of how your data was used for a natural disaster? Do you collaborate with government agencies?
AG) One Concern’s primary clients are government emergency management departments, so we absolutely collaborate with government agencies. It is critical for us to understand their needs and gaps in the tools that they are using so we can provide them the necessary information during high-stress and high-stakes disaster situations.
Since we have been deployed, our client jurisdictions have not experienced a major disaster. However, for all the small earthquakes that have occurred recently, users were able to go to the One Concern Seismic platform, quickly observe the estimated impacts and decide whether or not to activate their Emergency Operations Centers. In the meantime, users have been regularly using our Preparedness Module to better prepare for future earthquakes. Before we launched our product, it was extremely expensive and time-consuming for emergency managers to generate realistic earthquake simulations for trainings, and hence they were training with the same scenarios over and over. Our Preparedness Module has significantly improved their ability to request different scenarios and set up more frequent training exercises with very different impacts.
RS/ Q6) Are you able to share which of the deep learning architectures (RNN, GAN, LSTM, etc.) you use in your work?
AG) We use both deep learning and traditional machine learning algorithms. However, given our IP policy, I cannot share the exact models. I would add, though, that we are solving a lot of different problems, each of which requires its own approach. A few examples of the problems that we are tackling are augmenting data from different sources as I mentioned above, estimating real-time impacts from earthquakes, forecasting flood water inundation levels days in advance, using remote sensing to identify disaster extents, using real-time data to rapidly update impact estimates and capturing building characteristics from images.
RS/ Q7) I imagine you work on global data with global companies. What challenges do you face when working in a field that is so globally interdependent?
AG) You are absolutely right – disasters are local but their impacts can be felt globally. We observe this commonly in supply chain disruptions and economic ripple effects, for example, disruptions to the auto industry after Japan quakes and reductions in consumer spending after US hurricanes. I want to highlight two challenges when working with machine learning models at a global scale – 1) data availability and 2) model applicability.
Data availability changes drastically from one country to another, even from one state or region to another. This is also the case with data resolution: for example, seismic sensor data is more densely available in the western US than elsewhere in the country. To develop scalable products, we have had to develop multi-layer models where we can add layers and improve model accuracy as more data components become available, and to create data processing techniques to ingest variable-resolution data for model consumption.
Disaster impacts also vary from region to region. If local data is not available, then the models will not capture local effects. To resolve this, we modify the features to make them transferable across geographies. For example, earthquake magnitude is not transferable because the same magnitude can cause very different damage based on the type of fault rupture and wave propagation through the soil. This wouldn’t be a concern if we had seismic data from a lot of places but big earthquakes are very rare. Instead, we use multiple shaking parameters at local sites that take these factors into account and are thus transferable across geographies.
RS/ Q8) Here are a few of the trending topics in data science. What are your high-level thoughts, in 1-2 sentences, on each topic? (Feel free to pick a couple of topics if you prefer, rather than all 5.)
- The first step should be to identify the objectives of your problem so you can choose or design the right metric to evaluate your models. Check whether the metric is convex or quasi-convex if you are also using it as the loss function. It is also crucial that the assumptions and outputs of the models are understood and clearly stated.
- If possible, getting peer review for your models can help in ensuring that you did not miss something inadvertently. I also like to ensure that when we start working with a new dataset, we always separate out test sets that are never available during training, and that test sets evolve as the datasets and the models expand.
- Diversity is key to ensuring a creative environment that fosters new ideas, perspectives and research. Working with a diverse group of people has also helped me in my personal growth, like learning to create a space for everyone to contribute.
- My PhD research taught me that methods created for one field can be highly beneficial to other fields. Open source is fundamental to collaboration and to bringing together people from around the world, not only to improve what has already been done but also to find novel ways to use what has already been created.
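As an illustration of the point above about separating out test sets that are never available during training, here is a minimal Python sketch. It is not One Concern's method: the function names, the `building-N` record IDs and the 20% test fraction are all hypothetical, and it assumes every record has a stable unique ID. Hashing the ID decides membership, so a record placed in the test set stays there as the dataset grows, and new records are allocated by the same rule.

```python
import hashlib


def assign_split(record_id: str, test_fraction: float = 0.2) -> str:
    """Deterministically assign a record to 'train' or 'test' from its ID."""
    digest = hashlib.sha256(record_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if bucket < test_fraction else "train"


def split_records(record_ids, test_fraction: float = 0.2):
    """Split an iterable of record IDs into (train, test) lists."""
    train, test = [], []
    for rid in record_ids:
        if assign_split(rid, test_fraction) == "test":
            test.append(rid)
        else:
            train.append(rid)
    return train, test


# Hypothetical record IDs for an initial dataset.
initial = [f"building-{i}" for i in range(1000)]
train_v1, test_v1 = split_records(initial)

# The dataset later expands; earlier records keep their assignment.
expanded = initial + [f"building-{i}" for i in range(1000, 1500)]
train_v2, test_v2 = split_records(expanded)

# No record that was ever held out for testing leaks into training.
leaked = set(test_v1) & set(train_v2)
print(f"test v1: {len(test_v1)}, test v2: {len(test_v2)}, leaked: {len(leaked)}")
```

The design choice here is that the split depends only on the record ID, not on a random seed or the order of the data, so the test set can grow with the dataset while earlier held-out records are never reshuffled into training.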
Opportunities for entry level data scientists
- We are looking for mission driven natural phenomena scientists and machine learning experts, who want to come help us make the world more resilient!
RS) Thank you for participating in this interview.
Bio: Abhineet Gupta
Abhineet Gupta is the Resilience Engineering Lead at One Concern. He works on quantifying resilience through cities’ abilities to withstand stresses and shocks from natural and manmade disasters, and to provide the technological solutions to increase resilience. Abhineet graduated from Stanford in 2017 with a Master’s and PhD in earthquake engineering and PhD minor in Computer Science with a focus on artificial intelligence. For his PhD, he developed tools to evaluate seismic risk from man-made earthquakes in regions like Oklahoma. Before his PhD, he worked as a structural design engineer at Magnusson Klemencic Associates in Seattle and designed buildings in Boise, India and Malaysia.
- LinkedIn: https://www.linkedin.com/in/guptaabhineet/
- One Concern: https://www.oneconcern.com/product
Bio: Reshama Shaikh
Reshama Shaikh is a data scientist/statistician and MBA with skills in Python, R and SAS. She worked for over 10 years as a biostatistician in the pharmaceutical industry. She is also an organizer of the meetup groups NYC Women in Machine Learning & Data Science and PyLadies. She received her M.S. in statistics from Rutgers University and her M.B.A. from NYU Stern School of Business.
- Twitter: @reshamas