Data By the Bay is the first Data Grid conference matrix with 6 vertical application areas spanned by multiple horizontal data pipelines, platforms, and algorithms. We are unifying data science and data engineering, showing what really works to run businesses at scale.
Tuesday, May 17 • 1:10pm - 1:30pm
An Innovative Approach to Labeling Ground Truth in Speech

Supervised machine learning algorithms require accurate and consistent data labels. However, complicated datasets may introduce ambiguity, resulting in irregular ground truths and making machine learning algorithm development more challenging. Consider the following truthing tasks for natural household speech:

- *Labeling what was said* -- Think about how often people mispronounce words, talk over others, or simply mumble.
- *Segmenting when a given utterance or thought begins and ends* -- How many complete thoughts are in a spoken segment? What happens if speech is fragmented? How close to the start and end points of speech can we segment without cutting out essential data?
- *Labeling sounds* -- Often there are non-human sounds in the background that we may or may not recognize. Additionally, people often make non-speech sounds that need to be considered.

If that wasn't hard enough, now consider audio collected from households containing babies. Babies not only introduce more chaotic speech, but they also have a language all their own that requires truth labels.

Although many of the aforementioned categories don't have a right or wrong way of being labeled, they do have the potential to introduce inconsistencies. To decrease the number of ground truth discrepancies, we created data tagging software called VersaTag. VersaTag is a GUI-based labeling system that can be distributed to volunteers to tag large quantities of audio. We are developing this software through an iterative process, decreasing truthing inconsistencies with each new improvement. VersaTag has already dramatically reduced the irregularities in our audio labels, and through the iterative development process, we are excited to continue improving!

Jill Desmond

Senior Data Scientist, VersaMe
Jill is the Senior Data Scientist at VersaMe. She is currently collecting data and developing algorithms to provide feedback to parents regarding the audio environment that their child is exposed to. Jill has a Ph.D. in Electrical Engineering from Duke University, where she researched...

Tuesday May 17, 2016 1:10pm - 1:30pm
