Data By the Bay is the first Data Grid conference matrix with 6 vertical application areas spanned by multiple horizontal data pipelines, platforms, and algorithms. We are unifying data science and data engineering, showing what really works to run businesses at scale.
Tuesday, May 17 • 2:10pm - 2:50pm
Sparse data alternatives with neural network embeddings

The advent of continuous word representation technologies such as Word2Vec and GloVe has transformed how data scientists and machine learning experts work with natural language data. One reason these algorithms are so successful is that they offer an efficient, information-preserving way to compress native features (word frequencies) down to the dimensions of the embedding vector space. This is particularly effective in the sparse data context of word count frequencies. Recently, word embedding algorithms have been generalized to generic graph network contexts. In this talk we review the results of applying this generalization to alternative sparse data contexts, such as user-based and item-based recommender algorithms.
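
As a rough illustration of how a word embedding algorithm can be pointed at graph data, here is a minimal sketch in the DeepWalk style: sparse user-item interactions become an item co-purchase graph, random walks over that graph stand in for sentences, and the walks yield the (center, context) pairs a skip-gram model trains on. The toy baskets, function names, and parameter values are all hypothetical, and this is not necessarily the pipeline presented in the talk.

```python
import random
from collections import defaultdict
from itertools import combinations


def build_item_graph(baskets):
    """Link two items whenever they appear in the same user's basket."""
    graph = defaultdict(set)
    for items in baskets.values():
        for a, b in combinations(sorted(set(items)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def random_walks(graph, walks_per_node=10, walk_length=5, seed=0):
    """Generate uniform random walks; each walk plays the role of a sentence."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                # Every node in the graph has at least one neighbor,
                # because nodes are only created when an edge is added.
                neighbors = sorted(graph[walk[-1]])
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def skipgram_pairs(walks, window=2):
    """Emit (center, context) pairs, the training examples for skip-gram."""
    pairs = []
    for walk in walks:
        for i, center in enumerate(walk):
            lo = max(0, i - window)
            hi = min(len(walk), i + window + 1)
            pairs.extend((center, walk[j]) for j in range(lo, hi) if j != i)
    return pairs


# Hypothetical toy data: user -> purchased items.
baskets = {
    "u1": ["book", "lamp", "desk"],
    "u2": ["lamp", "desk"],
    "u3": ["book", "pen"],
}

graph = build_item_graph(baskets)
walks = random_walks(graph)
pairs = skipgram_pairs(walks)
```

In practice the walks would be handed to an existing skip-gram implementation (for example, as the `sentences` argument of gensim's `Word2Vec`) rather than expanded into pairs by hand; the pair expansion is shown only to make the analogy with word contexts explicit.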

Marvin Bertin

Machine Learning Scientist, Skymind
I build intelligent, large-scale applications with machine learning and deep learning. Developed like2vec = product co-purchase graph + DeepWalk + recommender system.
David Ott

Student, Galvanize
Mike Tamir

Chief Data Scientist, InterTrust

Tuesday May 17, 2016 2:10pm - 2:50pm

