Data By the Bay is the first Data Grid conference matrix with 6 vertical application areas spanned by multiple horizontal data pipelines, platforms, and algorithms. We are unifying data science and data engineering, showing what really works to run businesses at scale.
Tuesday, May 17 • 9:05am - 9:40am
Data and Algorithmic Bias in the Web

The Web is the largest public big data repository that humankind has created. In this overwhelming data ocean, we need to be aware of the quality of the data and, in particular, of the biases that exist in it. On the Web, biases also come from redundancy and spam, as well as from the algorithms that we design to improve the user experience. The problem is further exacerbated by biases that these algorithms themselves add, especially in the context of search and recommendation systems. These include selection and presentation bias in many forms, interaction bias, and others. We give several examples and discuss their relation to sparsity, novelty, and privacy, stressing the importance of the user context in avoiding these biases.

Ricardo Baeza-Yates

Ricardo Baeza-Yates's areas of expertise are information retrieval, web search and data mining, as well as data science and algorithms in general. He was VP of Research at Yahoo Labs, based in Sunnyvale, California, from August 2014 to March 2016. Before that, he founded and led from 2006...

Tuesday May 17, 2016 9:05am - 9:40am PDT