Data By the Bay is the first Data Grid conference matrix with six vertical application areas spanned by multiple horizontal data pipelines, platforms, and algorithms. We are unifying data science and data engineering, showing what really works to run businesses at scale.
Monday, May 16 • 2:10pm - 2:50pm
Real-time, Streaming Advanced Analytics, Approximations, and Recommendations using Apache Spark ML/GraphX, Kafka, Stanford CoreNLP, and Twitter Algebird. BONUS: Netflix Recommendations: Then and Now


Agenda
  Intro
  Live, Interactive Recommendations Demo
    Spark ML, GraphX, Streaming, Kafka, Cassandra, Docker
  Types of Similarity
    Euclidean vs. Non-Euclidean Similarity
    User-to-User Similarity
    Content-based, Item-to-Item Similarity (Amazon)
    Collaborative-based, User-to-Item Similarity (Netflix)
    Graph-based, Item-to-Item Similarity Pathway (Spotify)
  Similarity Approximations at Scale
    Twitter Algebird
    MinHash and Bucketing
    Locality Sensitive Hashing (LSH)
  BONUS: Netflix Recommendation Algorithms: From Ratings to Real-Time
    DVD-Ratings-based $1M Netflix Prize (2009)
    Streaming-based "Trending Now" (2016)
  Wrap Up
  Q & A

Speakers

Chris Fregly

Research Scientist, PipelineIO
Chris Fregly is Founder and Research Scientist at PipelineIO, a Streaming Machine Learning and Artificial Intelligence startup in San Francisco. Chris is a regular speaker at conferences and Meetups throughout the world. He’s also an Apache Spark Contributor, Netflix Open Source Committer, Founder of the Global Advanced Spark and TensorFlow Meetup, and Author of the upcoming O'Reilly Video Series on Deploying and...


Monday May 16, 2016 2:10pm - 2:50pm
Markov
