Data By the Bay is the first Data Grid conference matrix with 6 vertical application areas spanned by multiple horizontal data pipelines, platforms, and algorithms. We are unifying data science and data engineering, showing what really works to run businesses at scale.
Tuesday, May 17 • 2:10pm - 2:50pm
Sparse data alternatives with neural network embeddings


The advent of continuous word representation techniques such as Word2Vec and GloVe has transformed how data scientists and machine learning experts work with natural language data. One reason these algorithms are so successful is that they offer an efficient, information-preserving way to compress sparse native features (word frequencies) into the much lower dimension of the embedding vector space. This compression is particularly effective for sparse data such as word count frequencies. Recently, word embedding algorithms have been generalized to generic graph contexts. In this talk we review the results of applying this generalization to other sparse-data contexts, such as user-based and item-based recommender algorithms.
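As a rough illustration of how a word embedding algorithm can be carried over to a graph, the following minimal Python sketch builds an item co-purchase graph, samples DeepWalk-style random walks over it, and turns the walks into skip-gram (center, context) training pairs. It is not the speakers' implementation: the function names, the toy baskets, and the parameters (`walks_per_node`, `walk_length`, `window`) are all assumptions made for this example, and the final step of training a skip-gram model such as Word2Vec on the pairs is left out.

```python
import random
from collections import defaultdict
from itertools import combinations


def build_copurchase_graph(baskets):
    """Link two items whenever they appear in the same basket (undirected)."""
    graph = defaultdict(set)
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def random_walks(graph, walks_per_node=10, walk_length=8, seed=0):
    """Sample uniform random walks; each walk plays the role of a sentence."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = sorted(graph[walk[-1]])
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def skipgram_pairs(walks, window=2):
    """Emit (center, context) pairs, as a skip-gram model would consume them."""
    pairs = []
    for walk in walks:
        for i, center in enumerate(walk):
            lo = max(0, i - window)
            hi = min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((center, walk[j]))
    return pairs


# Hypothetical toy data: two unrelated groups of co-purchased items.
baskets = [
    ["milk", "bread", "eggs"],
    ["bread", "butter"],
    ["phone", "case"],
    ["phone", "charger", "case"],
]

graph = build_copurchase_graph(baskets)
walks = random_walks(graph)
pairs = skipgram_pairs(walks)
```

Because walks never leave a connected component, grocery items only ever appear as context for other grocery items, which is the signal an embedding model would use to place them close together. The same construction applies to a user graph (users linked by shared items) for the user-based case.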


Marvin Bertin

Machine Learning Scientist, Skymind
I build intelligent, large-scale applications with machine learning and deep learning. Developed like2vec, a recommender system that combines a product co-purchase graph with DeepWalk.

David Ott

Student, Galvanize

Mike Tamir

Chief Data Science Officer, Uber ATG

Tuesday May 17, 2016 2:10pm - 2:50pm PDT