There is currently much hype and discussion in the professions surrounding artificial intelligence (AI) and natural language processing (NLP). Clearly, these are themes that librarians and information professionals should be aware of, but what do we really mean when we talk about AI and NLP? And how do they fit within your library and information service?
Simply put, natural language processing is the ability of a computer programme to understand human speech [1], i.e. to process natural language. As such, natural language processing is just one of the many branches of artificial intelligence.
Instead of requiring you to communicate with computer programmes in a technical command language, NLP enables a more natural form of communication. Well-known everyday examples of NLP can be found in Apple’s Siri or Amazon’s Alexa - both tools can understand and respond to natural human speech.
What use does NLP have in my library?
There are many ways to incorporate NLP into your library, especially when it comes to supporting your search function. In this article, we shall explore four of them: sentiment analysis, entity extraction, keyword searching and concept extraction.
NLP adds a whole new dynamic to traditional Boolean searching. Rather than matching each word literally, NLP takes the wider context and intent of the word or term into account. This means that you can surface a far more specific set of search results, containing only the most relevant information, and your search becomes more efficient. What’s more, there is now scope to understand the overarching meaning of the articles being searched, as opposed to solely the meaning of the words within them.
“A computer can know the definition of a word, but it doesn’t understand the meaning of words within a larger context” [2]
In this regard, natural language processing is able to interpret the overarching mood of an article. Without NLP this could only be done manually; with it, the system is able to go beyond viewing the article as a compilation of individual words and can instead consider the article as a whole. Such a process is known as sentiment analysis. Using sentiment analysis enables you to assess whether articles written about your organisation, clients or competitors are positive or negative in their coverage, and so to track the overall perception of your organisation or entity of interest in the news. This could be of particular use to the marketing department, which needs a clear understanding of the organisation’s image. It could also be used to monitor mentions of your clients: these can be fed back to your fee-earners who can, in turn, alert clients if their public image looks to be turning sour.
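To make the output of sentiment analysis concrete, here is a minimal sketch in Python. It is deliberately simplistic: it only counts words from two small, hand-picked lists (`POSITIVE` and `NEGATIVE`, both hypothetical), whereas a real NLP service would use a trained model that weighs context. The organisation name and sample articles are invented for illustration.

```python
import re

# Hypothetical, hand-picked word lists; a real service would use a trained model.
POSITIVE = {"growth", "praised", "award", "success", "strong", "record"}
NEGATIVE = {"lawsuit", "scandal", "losses", "criticised", "decline"}


def sentiment_score(text: str) -> float:
    """Return a score in [-1, 1]; below zero leans negative, above zero positive."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def label(text: str) -> str:
    """Turn the numeric score into a coverage label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented sample articles about a fictional organisation.
articles = {
    "a1": "Acme praised for record growth and strong results.",
    "a2": "Acme faces lawsuit after scandal and heavy losses.",
    "a3": "Acme opens a new office in Leeds.",
}

for article_id, text in articles.items():
    print(article_id, label(text))
```

Run over a news feed, labels like these are what would let you chart whether coverage of a client is trending positive or negative over time.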
A second manifestation of natural language processing in your library can be found in what is known as entity extraction [3]. Since NLP interprets the article and its context as a whole, it is able to tag each piece with specific entities, such as a geographic location, e.g. United Kingdom or London. This is far more advanced than simply using such a term in your Boolean search criteria: instead of searching for mentions of “United Kingdom”, the system searches only for articles that are actually about the United Kingdom, and so a narrower set of much more relevant results will be surfaced.
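The difference between an article that mentions a place and one that is about it can be sketched as follows. This is an illustrative assumption rather than how any particular product works: the `GAZETTEER` lookup, the `tag_locations` function and the `min_share` threshold are all hypothetical, and real entity extraction relies on trained models rather than a fixed word list.

```python
import re
from collections import Counter

# Hypothetical gazetteer mapping surface forms to a canonical location entity.
GAZETTEER = {
    "united kingdom": "United Kingdom",
    "uk": "United Kingdom",
    "britain": "United Kingdom",
    "london": "London",
    "france": "France",
}


def tag_locations(text: str, min_share: float = 0.5) -> list[str]:
    """Tag an article with the locations that dominate its location mentions."""
    lowered = text.lower()
    counts = Counter()
    for surface, entity in GAZETTEER.items():
        pattern = rf"\b{re.escape(surface)}\b"
        counts[entity] += len(re.findall(pattern, lowered))
    total = sum(counts.values())
    if total == 0:
        return []
    # Keep only entities that account for at least min_share of all mentions.
    return sorted(e for e, n in counts.items() if n / total >= min_share)


about_uk = (
    "The UK economy grew last quarter. Britain's exporters led the rise, "
    "and United Kingdom ministers welcomed it."
)
passing_mention = (
    "French wine sales rose in France. Growers in France now ship to "
    "France's neighbours, including the United Kingdom."
)

print(tag_locations(about_uk))         # ['United Kingdom']
print(tag_locations(passing_mention))  # ['France']
```

A plain Boolean search for “United Kingdom” would return both articles; filtering on the tag returns only the first.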
This ties in nicely with keyword tagging and searching, whereby NLP is able to go beyond mere mentions of a term in an article and instead consider whether that article substantially focusses on, let’s say, Apple, or only mentions it in passing. Such keyword searching also enables you to filter out a great deal of the noise and avoid bringing up results that mention, in this case, apple the fruit rather than Apple the company.
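As a rough sketch of how context can separate Apple the company from apple the fruit, the Python below looks at the words surrounding the term. The cue lists (`COMPANY_CUES`, `FRUIT_CUES`) and the `apple_sense` function are hypothetical and far cruder than the statistical models a real NLP tool would use, but they show the principle of deciding meaning from context rather than from the keyword alone.

```python
import re

# Hypothetical context cues; a production system would learn these from data.
COMPANY_CUES = {"iphone", "shares", "ceo", "software", "investors", "cupertino"}
FRUIT_CUES = {"orchard", "pie", "harvest", "cider", "fruit", "tree"}


def apple_sense(text: str) -> str:
    """Guess which sense of 'apple' an article uses from its surrounding words."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    if "apple" not in words and "apples" not in words:
        return "not mentioned"
    company = len(words & COMPANY_CUES)
    fruit = len(words & FRUIT_CUES)
    if company > fruit:
        return "company"
    if fruit > company:
        return "fruit"
    return "unclear"


print(apple_sense("Apple shares rose after the CEO unveiled a new iPhone."))
print(apple_sense("The orchard reported a record apple harvest for cider makers."))
```

The first sentence is classed as the company and the second as the fruit, so a search for Apple the company could discard the second result.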
Fourthly, natural language processing can categorise concepts into themes for you, even if they are not mentioned directly in the text itself. An example of such concept extraction can be found in searching for articles relating to a specific industry. Using NLP, the system is able to recognise brands and company names contained in the article text and assign these to the appropriate industry. You are then able to search for that specific industry, and all of the relevant articles will come up in your results even if the industry itself isn’t directly mentioned.
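The industry example can be sketched in a few lines of Python. The `BRAND_INDUSTRY` lookup table and both function names are assumptions made for illustration; a real system would draw on a much larger, maintained knowledge base and would recognise company names with a trained model rather than exact word matches.

```python
import re

# Hypothetical brand-to-industry lookup; illustrative only.
BRAND_INDUSTRY = {
    "toyota": "Automotive",
    "volkswagen": "Automotive",
    "barclays": "Banking",
    "hsbc": "Banking",
}


def industries(text: str) -> set[str]:
    """Infer industries from the brand names that appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {ind for brand, ind in BRAND_INDUSTRY.items() if brand in words}


def search_by_industry(articles: dict[str, str], industry: str) -> list[str]:
    """Return the ids of articles whose inferred industries include the query."""
    return sorted(
        article_id
        for article_id, text in articles.items()
        if industry in industries(text)
    )


# Invented sample articles; none of them names an industry directly.
articles = {
    "a1": "Toyota and Volkswagen both reported higher quarterly sales.",
    "a2": "HSBC announced a new chief executive.",
    "a3": "The council approved plans for a new park.",
}

print(search_by_industry(articles, "Automotive"))  # ['a1']
```

Note that the word “automotive” never appears in article `a1`; it is found because the brands it names are mapped to that industry.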
- [1] Search Content Management (2011) Definition - natural language processing http://searchcontentmanagement.techtarget.com/definition/natural-language-processing-NLP
- [2] Vasco Pedro (2016) Artificial intelligence and language, TechCrunch https://techcrunch.com/2016/03/12/artificial-intelligence-and-language/?ncid=rss
- [3] AlchemyLanguage (2017) Overview, Watson Developer Cloud https://www.ibm.com/watson/developercloud/doc/alchemylanguage/