I had a blast at the Southern Data Science Conference yesterday in Atlanta, GA, where I presented a talk titled “Intent Algorithms: The Data Science of Smart Information Retrieval Systems”. This was the first year the conference was held, and it’s already clear that it is going to hold the title as the preeminent Data Science conference in the Southeast United States. Top speakers, authors, and industry and academic practitioners were represented from the likes of Google, Lucidworks, NASA, Microsoft, Allen Institute for AI, Skymind, CareerBuilder, Glassdoor, Distil Networks, Takt, Elephant Scale, AT&T, Macy’s Technology, Los Alamos National Laboratory, Georgia Tech, The University of Georgia, and the South Big Data Hub. I had a lot to cover on the topic of “intent algorithms”, so the talk went at quite a rapid pace (due to the 30-minute time limit) to be sure everyone walked away with a solid understanding of the topic. There’s a lot of good material and demos in the presentation, though, so it’s definitely worth checking out the video or slides below!
Slides:
https://www.slideshare.net/treygrainger/intent-algorithms
Video:
Talk Abstract:
Search engines, recommendation systems, advertising networks, and even data analytics tools all share the same end goal – to deliver the most relevant information possible to meet a given information need (usually in real-time). Perfecting these systems requires algorithms which can build a deep understanding of the domains represented by the underlying data, understand the nuanced ways in which words and phrases should be parsed and interpreted within different contexts, score the relationships between arbitrary phrases and concepts, continually learn from users’ context and interactions to make the system smarter, and generate custom models of personalized tastes for each user of the system.
In this talk, we’ll dive into the philosophical questions associated with such systems (“How do you accurately represent and interpret the meaning of words?”, “How do you prevent filter bubbles?”, etc.), and look at practical examples of how these systems have been successfully implemented in production by combining a variety of available commercial and open source components (inverted indexes, entity extraction, similarity scoring and machine-learned ranking, auto-generated knowledge graphs, phrase interpretation and concept expansion, etc.).