Data By the Bay is the first Data Grid conference matrix, with 6 vertical application areas spanned by multiple horizontal data pipelines, platforms, and algorithms. We are unifying data science and data engineering, showing what really works to run businesses at scale.
Tuesday, May 17 • 9:50am - 10:30am
The practice of acquiring good labels


Engineers and researchers use human computation as a mechanism to produce labeled data sets for product development, research, and experimentation. In a data-driven world, good labels are key. To gather useful results, a successful labeling task relies on many different elements, from clear instructions and user interface design to algorithms for quality control. In this talk, I will present a perspective on collecting high-quality labels, with an emphasis on practical implementations and scalability. I will focus on three main topics: programming crowds, debugging tasks with low agreement, and algorithms for quality control. I plan to show many examples and code along the way.
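
As a rough illustration of what quality control over crowd labels can look like, here is a minimal sketch of majority-vote aggregation that also flags items with low worker agreement. It is not taken from the talk: the function name `aggregate_labels`, the `(item_id, worker_id, label)` tuple layout, the sample labels, and the 0.7 agreement threshold are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def aggregate_labels(judgments, min_agreement=0.7):
    """Aggregate crowd judgments by majority vote.

    judgments: iterable of (item_id, worker_id, label) tuples (assumed layout).
    min_agreement: hypothetical threshold below which an item is flagged.
    """
    # Group all labels submitted for each item.
    by_item = defaultdict(list)
    for item_id, _worker_id, label in judgments:
        by_item[item_id].append(label)

    results = {}
    for item_id, labels in by_item.items():
        counts = Counter(labels)
        # Most frequent label; ties resolve to the label seen first.
        top_label, top_count = counts.most_common(1)[0]
        # Fraction of workers who chose the winning label.
        agreement = top_count / len(labels)
        results[item_id] = {
            "label": top_label,
            "agreement": agreement,
            "needs_review": agreement < min_agreement,
        }
    return results


# Made-up judgments: three workers label two items.
judgments = [
    ("q1", "w1", "relevant"),
    ("q1", "w2", "relevant"),
    ("q1", "w3", "relevant"),
    ("q2", "w1", "relevant"),
    ("q2", "w2", "not_relevant"),
    ("q2", "w3", "not_relevant"),
]
results = aggregate_labels(judgments)
for item_id, info in sorted(results.items()):
    print(item_id, info)
```

Items flagged with `needs_review` are natural candidates for debugging: low agreement may point to unclear instructions or a genuinely ambiguous item rather than to careless workers. Majority vote is only the simplest aggregation choice; weighting workers by estimated reliability is a common alternative.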

Speakers

Omar Alonso

Principal Data Scientist, Microsoft
Omar is a Principal Data Scientist Lead at Microsoft in Silicon Valley, where he works at the intersection of social media, temporal information, knowledge graphs, and human computation for the Bing search engine. He holds a PhD from the University of California at Davis. @elunca



Tuesday May 17, 2016 9:50am - 10:30am
Gardner
