What is topological behavior analysis?
Topological Behavior Analysis (TBA) is the real-time algorithmic analysis of data that originates from complex virtualization and cloud environments. It derives from Topological Data Analysis and leverages K-means clustering as its foundation.
Computer environments have many layers that generate a large volume of statistical data – from the user experience layer (e.g., the press of a button) to the data on the storage system, with many layers in between (cell phone towers, providers, networks, servers, etc.). All of that data needs to be ingested and modeled (trained) so that it can answer, in an automated fashion, the variety of questions IT/DevOps may have, such as:
- Is there a problem?
- What is the root cause?
- What should I do about it?
K-means provides the ability to abstract and define the behavior of workloads and their impact on the infrastructure in the form of clusters (versus individual time series, which would not scale). It also captures seasonal behaviors, which is essential because behavior can be very specific to the industry in which the computer environment is used (e.g., sales fluctuations in retail).
Combining K-means with Topological Data Analysis makes it possible to detect anomalies based on multi-dimensional models that learn the interplay between the features of the statistical data representing the behavior.
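To make the idea concrete, here is a minimal sketch of cluster-based anomaly detection: behavior is learned as K-means clusters over multi-dimensional metrics, and a new observation is scored by its distance to the nearest learned cluster. The metric names, values, and threshold are illustrative assumptions, not figures from any real environment or product.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical workload metrics: rows are samples, columns are features
# (e.g. CPU %, disk IOPS, network Mb/s) - illustrative numbers only.
normal = rng.normal(loc=[50.0, 200.0, 30.0], scale=[5.0, 20.0, 3.0], size=(300, 3))

def kmeans(X, k, iters=50, seed=0):
    """Minimal K-means: returns learned cluster centroids."""
    r = np.random.default_rng(seed)
    centroids = X[r.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return centroids

centroids = kmeans(normal, k=3)

def anomaly_score(x, centroids):
    """Distance from a new observation to the nearest learned cluster."""
    return np.linalg.norm(centroids - x, axis=1).min()

# Threshold derived from the learned behavior itself (99th percentile).
threshold = np.percentile([anomaly_score(x, centroids) for x in normal], 99)

spike = np.array([95.0, 900.0, 120.0])  # behavior far from anything learned
print(anomaly_score(spike, centroids) > threshold)  # → True
```

The point of the abstraction is scale: the model keeps a handful of centroids per workload rather than every individual time series point.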
How do you combine K-means with Monte Carlo Simulation for TBA?
While developing a product feature that predicts performance issues within a computer environment (virtualization, cloud, etc.), we have developed an algorithm that applies Monte Carlo Simulation on top of K-means based models.
Once again, this approach leverages K-means as the foundation for modeling the behaviors of the workload and its impact on the computer environment. From the learned behavior encapsulated in clusters, which also represent the seasonal behaviors of the data, we derive a prediction of the behavior by:
- Deriving the predicted expected behavior of the workload and its impact on the components of the infrastructure (such as compute, network, storage) by applying Monte Carlo Simulation.
- Performing a “stacking” function once the prediction is derived for each individual workload: the predicted expected behaviors are stacked to determine whether they will reach the capacity of the infrastructure (whether at the compute, network, or storage layer).
Leveraging K-means and Monte Carlo Simulation, we can accurately predict performance issues within the compute environment.
What are the challenges in predicting workloads in servers?
I have mentioned a couple of issues in my prior responses, but let me summarize:
- (a) The amount of data (Big Data),
- (b) The interplay as well as dependency (statistical dependency) between the features,
- (c) The dimensionality of the features,
- (d) The real-time nature of the problem, which demands real-time decisions to avoid failures of critical applications,
- (e) The seasonality of the behavior,
- (f) The dynamic nature of the environment, which moves workloads within the environment and across geographies, as well as the dynamic nature of the workloads themselves, which depend on user interaction and application changes.
While (a) – (e) can be addressed through the algorithms mentioned earlier, (f) requires almost weather-forecast-like analysis.
First, there is a prediction of the future based on learned behaviors. This is analogous to a 7-day weather forecast. However, like a weather forecast, severe storms (or issues in a computer environment) can start and move rapidly, affecting both the forecast and the recommendations that may be made as a result.
That is why in addition to forecasting the future, it is important to identify issues and provide recommendations (automated) in real time on how to address such issues without affecting parts of the system that were not affected by the storm and therefore should continue with the previously forecasted recommendations.
That’s where forecasting based on Monte Carlo Simulations needs to work in unison with Topological Behavior Analysis and causality algorithms (mentioned later) in real time to track all the dynamic changes in the environment.
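A minimal sketch of that unison might look like the following: offline forecasts per infrastructure component are compared against live observations, and only the components whose behavior deviates (the “storm”) are flagged for re-planning, while the rest keep their previously forecast recommendations. The component names, forecast numbers, and 3-sigma rule are illustrative assumptions.

```python
# Hypothetical per-component forecasts (mean, std) produced offline,
# e.g. by Monte Carlo simulation over learned clusters.
forecasts = {
    "compute": (60.0, 5.0),
    "network": (40.0, 4.0),
    "storage": (70.0, 6.0),
}

def check_realtime(observations, forecasts, z_limit=3.0):
    """Flag components whose live metric deviates from the forecast band.

    Only flagged components need new recommendations; the others keep
    the previously forecast plan.
    """
    flagged = {}
    for component, value in observations.items():
        mean, std = forecasts[component]
        flagged[component] = abs(value - mean) / std > z_limit
    return flagged

live = {"compute": 62.0, "network": 95.0, "storage": 68.0}
print(check_realtime(live, forecasts))
# network deviates sharply -> re-plan it; compute/storage keep the forecast
```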
Why can’t you use time series modeling?
Unfortunately, time series modeling is the state of the art for most tools in the IT space. This is the case because most IT tools were built with a Computer Science approach rather than a Data Science approach. Before virtualization and cloud computing became popular, understanding and optimizing computing environments was seen as an infrastructure problem instead of a data problem. Expertise in data and statistical modeling was not a requirement, or even considered. As a result, most IT tools were built with solid knowledge of Computer Science and the IT space (i.e., architecture, design patterns, etc.). Time series analysis became the apogee of Machine Learning implemented in IT tools simply because it is easy to implement and understand. However, time series analysis cannot address challenges (a) – (f) in my response to the previous question:
- The amount of data that radiates from all the layers of the IT operations environment makes it impossible to work with individual data points; a higher level of abstraction capable of representing the behavior (such as clusters) is required, which relates directly to challenges (a) and (c) mentioned earlier.
- Time series modeling cannot capture the multi-dimensionality, interplay, and uncertainty within the features of the data (especially at scale) that is required to accurately identify the meaningful anomalies within the IT operations environment.
- Finally, some important data is not time series data but may include other features (such as data related to changes in the infrastructure, configuration, and code).
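The multi-dimensionality point is worth illustrating. In the sketch below (with illustrative, assumed numbers), a point is unremarkable on each metric individually, so per-series thresholds would miss it, but it breaks the learned relationship between the metrics, which a multivariate score such as the Mahalanobis distance catches.

```python
import numpy as np

rng = np.random.default_rng(3)

# Two strongly correlated metrics, e.g. request rate and CPU load.
n = 2000
x = rng.normal(100, 10, n)
y = 2 * x + rng.normal(0, 2, n)       # CPU tracks request rate
data = np.column_stack([x, y])

mean = data.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(data, rowvar=False))

def mahalanobis(p):
    """Multivariate distance that accounts for feature correlations."""
    d = p - mean
    return float(np.sqrt(d @ cov_inv @ d))

# Within range on each metric alone, but high CPU at a low request rate
# breaks the learned interplay between the two features.
point = np.array([85.0, 230.0])
z_per_metric = np.abs(point - mean) / data.std(axis=0)
print(z_per_metric)        # each well below 3: univariate checks pass
print(mahalanobis(point))  # very large: the multivariate model flags it
```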
As a result, I have identified the gap and an opportunity to develop a new solution that addresses all of the challenges mentioned earlier and will ultimately deliver my vision of a self-driving datacenter that is based on data and data science, eliminating the human guesswork used today.
Why isn’t deep learning an option?
Deep learning is an option.
Today we are just scratching the surface of applying statistical modeling to IT operations data (which is not limited to metrics, but can also include code changes in the application, etc.). Our causality algorithm is already a network (a Bayesian-like network) driven by posterior and conditional probabilities (still a fairly “shallow” model). However, we are experimenting with TensorFlow to introduce “deep”-er networks into our analysis, which will enable us to address larger-scale and more complex use cases (especially relevant to change management, networking, and security, where there are a lot of features to be explored).
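As a toy illustration of what “driven by posterior and conditional probabilities” means (this is a generic Bayes-rule sketch, not the actual product algorithm, and all numbers are invented): given assumed priors over root causes and assumed conditional probabilities of observing a symptom, the most probable cause is the one with the highest posterior.

```python
# Assumed priors over possible root causes of an observed latency spike.
priors = {"bad_code_change": 0.02, "network_fault": 0.01, "none": 0.97}

# Assumed conditionals: P(latency_spike | cause).
likelihood = {"bad_code_change": 0.80, "network_fault": 0.60, "none": 0.01}

# Bayes' rule: posterior(cause) = prior * likelihood / evidence.
evidence = sum(priors[c] * likelihood[c] for c in priors)
posterior = {c: priors[c] * likelihood[c] / evidence for c in priors}

ranked = sorted(posterior.items(), key=lambda kv: kv[1], reverse=True)
print(ranked[0])   # most probable root cause given the observed symptom
```

A deeper network would learn these conditionals from data across many features instead of tabulating them by hand, which is where the TensorFlow experiments come in.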
In addition, our current platform operates on-premises, and our goal is to push it into the “cloud,” which would expose it to more compute capacity (for compute-intensive operations, including GPU) and more data – both essential for “deeper” models.
For example, one of the complex use cases (applied in performance and security analysis) is identifying bad code changes that cause a problem and predicting whether the bad code can cause security, reliability, and performance problems. As use cases grow in scale and complexity, deep learning models will allow us to determine the right features and to more dynamically and accurately discover issues as they arise, along with their root cause(s).

As CTO, Sergey is responsible for driving product strategy and innovation at SIOS Technology Corp. A noted authority in advanced analytics and machine learning, Sergey pioneered the application of these technologies in the areas of IT security, media, and speech recognition. He is currently leading the development of innovative solutions based on these technologies that enable simple, intelligent management of applications in complex virtual and cloud environments.
Prior to joining SIOS, Sergey was an architect for EMC storage products and EMC CTO office where he drove initiatives in areas of network protocols, cloud and storage management, metrics, and analytics. Sergey has also served as Principal Investigator (PI), leader in research, development and architecture in areas of big data analytics, speech recognition, telephony, and networking.
Sergey holds a PhD in computer science from the Moscow State Scientific Center of Informatics. He also holds a BS in computer science from the University of South Carolina.
