The news production and security industries face a growing threat from fake news across all media, as do all organisations for which accurate, timely news about persons, entities and events in specific locations is critical to their business and security.

This threat is two-fold. It stems partly from the ubiquity of digital news sites and an avid readership whose built-in biases are fed by advertising that can in turn fund the fake news producers, and partly from digital tools that make it easy to produce such fake content automatically, on a massive scale, over the internet and especially social media.

Most of the content involved in fake news is textual, whether addressed to readers or listeners. Fake images attached to information sources also play a large role in purveying false information. But text is hugely variegated in form (owing to the range of words and linguistic structures that can be used) and also comes through different channels (speech/text) and in different languages. This adds further complexity to the task of fake news identification.


Typical fake news tactics include:

  • The misuse of text, images and videos to illustrate facts they have nothing to do with;
  • The creation and use of false accounts to slander someone’s reputation;
  • The creation and feeding of false websites that visually resemble real sites;
  • The creation and spreading of false documents (false evidence);
  • The use of bots to boost the viral nature of messages.
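The last tactic above lends itself to simple quantitative signals. As a toy illustration (not any tool used by the workshop organisers), the sketch below flags bot-like message boosting by measuring what fraction of a set of posts are verbatim duplicates of an earlier post; the function name and threshold idea are invented for this example, and real bot-detection systems rely on far richer behavioural features.

```python
from collections import Counter

def amplification_ratio(posts: list[str]) -> float:
    """Fraction of posts that verbatim-repeat an earlier post.

    A high ratio is a crude signal of coordinated, bot-driven
    boosting of a message rather than organic sharing.
    """
    if not posts:
        return 0.0
    counts = Counter(posts)
    duplicates = sum(n - 1 for n in counts.values())
    return duplicates / len(posts)

# Example: three identical reposts among four messages.
stream = [
    "BREAKING: shocking scandal revealed!",
    "BREAKING: shocking scandal revealed!",
    "BREAKING: shocking scandal revealed!",
    "Here is my own take on today's news.",
]
print(f"{amplification_ratio(stream):.2f}")  # prints "0.50"
```

A real pipeline would of course normalise near-duplicates (retweet prefixes, URL shorteners) and look at posting cadence per account, but the principle of counting repeated payloads is the same.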


The International Federation of Library Associations (IFLA) has produced a useful infographic summarising how to spot fake news.


Is there a reliable, automatable decision procedure for identifying the truth or falsity of news items, and can it be integrated into a standard technology stack?

The following automated Language Intelligence (LI) processes are central to the identification of fake news:

  • Text & Voice Search: searching online for news and facts about news entities; searching other journalists’ stories and sources to understand news stories; searching voice streams for all of the above (voice search); identifying entities and actions in large text corpora; fact-checking.
  • Sentiment Analysis: evaluating sentiment in social media streams around news topics; evaluating the reliability of spoken content from voice quality and psychographics.
  • Voice Transcription: transcribing spoken stories from recordings, podcasts or radio into written form for fake news evaluation.
  • Captioning: when publishing spoken news stories online, either to be audience-inclusive or to support foreign-language understanding.
  • Translation: translating content from a foreign language for use in a news story (gisting), or translating news content to evaluate the reliability of the material.
  • Text Generation: creating readable texts automatically from a numerical data source such as financial data or sports results; creating texts that describe the contents of videos or images.
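To make the sentiment analysis item above concrete, here is a deliberately naive, lexicon-based sketch of the kind of scoring a fake-news pipeline might apply to social media posts. The word lists and scoring scheme are invented for this illustration; production systems use trained statistical or neural sentiment models rather than hand-picked lexicons.

```python
# Naive lexicon-based sentiment scoring (illustrative only).
# The word sets below are invented for this sketch; real systems
# learn such signals from labelled data.

POSITIVE = {"good", "great", "reliable", "confirmed", "verified"}
NEGATIVE = {"fake", "hoax", "scandal", "outrage", "shocking"}

def sentiment_score(text: str) -> float:
    """Return a score in [-1, 1] over the charged words found.

    Strongly negative, sensationalist wording is one (weak) signal
    often associated with fabricated stories.
    """
    words = [w.strip(".,!?;:\"'").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

posts = [
    "Shocking hoax exposed, total outrage!",
    "The report was confirmed and verified by two sources.",
]
for post in posts:
    print(f"{sentiment_score(post):+.2f}  {post}")
```

Running this prints -1.00 for the first post and +1.00 for the second; a real system would feed such scores, alongside many other features, into a downstream reliability classifier.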

At a workshop on 29-30 November 2018, LT-Innovate, EUROSINT and SAIL LABS will present various use cases that address the fake news problem using Language Intelligence, and will open a broader discussion of how LI can extend its applicability within the news and intelligence industries in their effort to track down fake information.