I was excited to be selected again this year to present at Lucene/Solr Revolution 2015 in Austin, Texas. My talk today focused on one of my main areas of work over the last year – building out a highly relevant and intelligent semantic search system. While I described and provided demos of the capabilities of the entire system (and many of the technical details of how someone could implement a similar system), I spent the majority of the time on the core Knowledge Graph we’ve built using Apache Solr to dynamically understand the meaning of any query or document that is provided as search input. This Solr-based Knowledge Graph – combined with a probabilistic, entity-based query parser, a sophisticated type-ahead prediction mechanism, spell checking, and a query-augmentation stage – is core to the Intent Engine we’ve built to be able to search on “things, not strings”, and to truly understand and match based upon the intent behind the user’s search.
Video:
Slides:
http://www.slideshare.net/treygrainger/leveraging-lucenesolr-as-a-knowledge-graph-and-intent-engine
Talk Summary:
Search engines frequently miss the mark when it comes to understanding user intent. This talk will describe how to overcome this by leveraging Lucene/Solr to power a knowledge graph that can extract phrases, understand and weight the semantic relationships between those phrases and known entities, and expand the query to include those additional conceptual relationships. For example, if a user types in (Senior Java Developer Portland, OR Hadoop), you or I know that the term “senior” designates an experience level, that “java developer” is a job title related to “software engineering”, that “portland, or” is a city with a specific geographical boundary, and that “hadoop” is a technology related to terms like “hbase”, “hive”, and “map/reduce”. Out of the box, however, most search engines just parse this query as text:((senior AND java AND developer AND portland) OR (hadoop)), which is not at all what the user intended. We will discuss how to train the search engine to parse the query into this intended understanding, and how to reflect this understanding to the end user to provide an insightful, augmented search experience.
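To make the contrast in that example concrete, here is a minimal Python sketch. It is not the implementation from the talk: the real system relies on Solr, finite state transducers, and probabilistic parsing, while this sketch swaps in a small hard-coded dictionary and greedy longest-match tagging. The names `KNOWN_ENTITIES`, `naive_parse`, and `entity_parse` are hypothetical and exist only for illustration.

```python
import re

# Hypothetical entity dictionary (an assumption for this sketch); in the
# talk this knowledge comes from the Solr-based knowledge graph.
KNOWN_ENTITIES = {
    "senior": "experience_level",
    "java developer": "job_title",
    "portland, or": "city",
    "hadoop": "technology",
}

# Longest known phrase, measured in whitespace-separated tokens.
MAX_PHRASE_TOKENS = max(len(key.split()) for key in KNOWN_ENTITIES)


def naive_parse(query):
    """Mimic a keyword parser that treats a bare OR as a boolean operator."""
    clauses = []
    for part in re.split(r"\s+OR\s+", query):
        terms = [token.strip(",").lower() for token in part.split()]
        clauses.append("(" + " AND ".join(terms) + ")")
    return "text:(" + " OR ".join(clauses) + ")"


def entity_parse(query):
    """Tag known phrases by greedy longest match; the rest become keywords."""
    tokens = query.lower().split()
    tagged = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shrink.
        for size in range(min(MAX_PHRASE_TOKENS, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + size])
            if phrase in KNOWN_ENTITIES:
                tagged.append((phrase, KNOWN_ENTITIES[phrase]))
                i += size
                break
        else:
            # No known phrase starts here, so keep the token as plain text.
            tagged.append((tokens[i], "keyword"))
            i += 1
    return tagged


query = "Senior Java Developer Portland, OR Hadoop"
print(naive_parse(query))
print(entity_parse(query))
```

The first call reproduces the unintended boolean parse quoted above, while the second keeps “java developer” and “portland, or” together as single typed entities. A greedy dictionary lookup like this cannot resolve ambiguous phrases, which is where a probabilistic parser would presumably come in.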
Topics: Semantic Search, Finite State Transducers, Probabilistic Parsing, Bayes Theorem, Augmented Search, Recommendations, NLP, Knowledge Graphs
No intent engine required to realize "#LuceneSolrRev session best speaker" yields @treygrainger as top result- best/most "relevant" talk IMO
— Chad Tomas (@ChadTomas) October 15, 2015
@treygrainger 's talk at #LuceneSolrRev today was awesome. Extremely relevant (haha, terrible pun).
— Jerrrry (@jerrrry17) October 15, 2015
@treygrainger never ceases to amaze! Nuggets on every slide #LuceneSolrRev semantic search with #solr #FTW pic.twitter.com/KLcTyHS7pX
— Timothy Potter (@thelabdude) October 15, 2015
Posting lists are edges between terms which are entities. Inverted index as a knowledge graph #epiphany thanks @treygrainger #LuceneSolrRev
— Aditya Varun Chadha (@adichad) October 15, 2015
Thoughtful talk from @treygrainger good to see some interesting problems. We have built many similar pieces at THD pic.twitter.com/uDCWfoM3Ir
— Sada Kshirsagar (@manask) October 15, 2015
Awesome talk by @treygrainger on user intent understanding with knowledge graphs in #solr. stumped my chump-stumper. #LuceneSolrRev
— Aditya Varun Chadha (@adichad) October 15, 2015
@treygrainger thank you for teaching us how to determine the relationships of a list of "dirty" words #LuceneSolrRev #nsfw
— Kenn (@kennnotsouth) October 15, 2015
Interesting use of inverted index as knowledge graph to connect "intent" and "query" @treygrainger #LuceneSolrRev #solr
— Pradeep Bhattiprolu (@_pradeepb) October 15, 2015
Search for things not strings @treygrainger #LuceneSolrRev #solr
— Pradeep Bhattiprolu (@_pradeepb) October 15, 2015
@treygrainger I wanted to listen but couldn't see past the cameras taking pictures of slides… #theslidesareposted #LuceneSolrRev
— Kenn (@kennnotsouth) October 15, 2015