Data By the Bay is the first Data Grid conference matrix with 6 vertical application areas spanned by multiple horizontal data pipelines, platforms, and algorithms. We are unifying data science and data engineering, showing what really works to run businesses at scale.
Tuesday, May 17 • 4:00pm - 4:40pm
Mining Noisy Transaction Data with Neural Nets

Extracting relevant information from unstructured transaction data is a challenge for anyone who wants to use that data for business decisions such as underwriting loans or monitoring creditworthiness. Most of our transaction data takes the form of short text describing the transaction, often using abbreviations or unknown proper nouns. A common approach for text documents is to encode the words or documents as vectors using one or more neural net layers. These features can then be used in a classification algorithm or other model to predict an outcome. To this end, we encoded transaction data as small 'sentences', often only a few words long, using skip-gram word2vec models along with restricted Boltzmann machines (RBMs) and deep belief nets, utilizing other features such as the credit or debit value of the transaction and institution information. The goal of this discussion is to describe the performance of the model and the considerations for training a neural network in a large-data distributed framework like Spark. Tools used are Deeplearning4j, Spark, and Scala.
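
As a rough illustration of the skip-gram idea mentioned in the abstract, the sketch below turns short transaction descriptions into (center, context) training pairs, which is the input a skip-gram word2vec model learns from. It is not the speaker's pipeline (the talk names Deeplearning4j, Spark, and Scala); it is plain Python with no dependencies, and the sample descriptions, the tokenizer rule, the function names, and the window size of 2 are all assumptions made for the example.

```python
import re


def tokenize(description):
    """Lowercase a transaction description and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", description.lower())


def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical transaction descriptions, invented for illustration only.
transactions = [
    "POS DEBIT SQ *COFFEE HOUSE SF CA",
    "ACH CREDIT PAYROLL ACME CORP",
]

tokenized = [tokenize(t) for t in transactions]
pairs = [p for toks in tokenized for p in skipgram_pairs(toks, window=2)]
```

Because transaction 'sentences' are only a few words long, a small window already covers most of each description; a real model would then train embeddings on these pairs and could append numeric features such as the transaction amount.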

Speakers
Frank Taylor

Data Scientist, Earnest, Inc.
I have a background in Physics specializing in statistical modeling of particle decays and later in optical signal processing. I am passionate about Big Data and its potential to gather insight into so many facets of humanity. As our tools get better and more scalable, we have the ability to answer greater questions and build more meaningful products that enrich our lives. Recently I have focused on deep learning and neural nets for the purpose...


Tuesday May 17, 2016 4:00pm - 4:40pm
Ada