The European Parliament's Science and Technology Options Assessment office (STOA) recently published a study concerning the role of technology in achieving European “language equality in the digital age.”  It claims that more than twenty European languages are in danger of “digital extinction,” and that the language technology (LT) programme that might save them is “not properly represented in the agenda of European policy makers.” While the study deserves to be welcomed in principle, it makes a number of proposals, two of which we wish to comment on here.

The report's alerts on the linguistic state of the Union have often been heard before. This time, the vehicle will be a new Human Language Project (HLP), but the basic arguments are familiar: in the EU, pure and applied research, technology development, and industrial and market innovation should be richly funded by the public purse to drive a boulevard for LT through the desolate landscape of European linguistic inequality, so that all European language speakers can participate “equally” in the digital single market.

We want to draw attention to two major problems with this vision of a policy pathway that somehow leads seamlessly from research through development to innovation. The first is how a European research agenda should be conceived; the second is the need for a proper innovation methodology that can deliver a dynamic ecosystem to ensure future growth in this sector.

Supporting Competitive Research

The next couple of decades in European LT research are likely to feature some hard challenges from outside.

In addition to existing universities, large US web companies have built up thriving research groups in AI and LT (covering many language-related disciplines) and are regularly publishing alongside academics. These labs draw on a large and eager population of globally-sourced, highly motivated young scientists, who are helping determine our digital futures for language applications of all kinds.

At the same time, countries such as India and China are encouraging more and more of their young people to study and set up businesses in the attractive field of “near-market” artificial intelligence, which includes LT.

If Europe is to remain competitive in global language/technology research it will therefore have to design a much more ambitious agenda than simply the “new paradigm” suggested in the HLP paper.

In particular, the need for rapid progress where possible may make it necessary to embrace a more explicitly competitive approach to research, with diverse groups experimenting in parallel with alternative programmes to solve a given problem. Concentrating EU academic efforts on the same basic disciplines of machine learning and “deep” linguistics may not be sufficient.

What is needed is a research roadmap that brings cognate fields such as neuroscience, symbolic reasoning, psychology, quantum computing and even other hardware disciplines within its purview. We also need to encourage genuine blue-skies thinking (rather than just hammering away inside an engineering mindset), as breakthroughs can come from unexpected angles. The EU deserves an LT research agenda that is autonomous, wide-ranging, long-term, and competitively organised.

Putting Innovation to Work

Above all, the longer research cycle must in practice be delinked from the shorter innovation cycle, which is best understood as an applied industrial process. Of course, research results should be leveraged at every opportunity to design successful technology, but only within the scope of an effective innovation methodology. There is no silver bullet that automatically transforms results from a new research programme into an innovative application fit for the marketplace.

At LT-Innovate, we have looked at multiple cases of LT innovation, and have concluded that there is no well-defined “market” for LT as such, but there is a market for evolving solutions to legacy and emerging digital needs among the major verticals of the economy. These needs cover the three overlapping fields of

  • automated knowledge production/management/deployment;
  • augmented human-machine communication;
  • and the horizontal layer of language variability across marketplaces (e.g. Dutch, Greek, Basque) for which the HLP paper seeks equality.

The innovation methodology best suited to this market dynamic should focus on serving vertical domains – structured, complex industries such as transport & logistics, banking & insurance, travel & tourism, chemicals & pharmaceuticals, trade & commerce, entertainment & publishing, defence & security, public administration & services, etc. – that have the critical mass to draw on research as needed, and which share common technology problems that call for advanced LT solutions.

Innovation projects should be built around vertical LT platforms, starting with 4 or 5 industries, preferably funded under the H2020 programme to ensure sufficient early momentum. Coordinated exclusively by industrial players, these platforms will act as demonstrators that serve as launch pads for ecosystems of multilingual applications and services, tailored to the verticals in question in well-defined value chains. In certain cases, further input may well be needed from the research community to meet specific needs.

These vertical platforms will also benefit from a pan-European LT infrastructure funded via procurement to provide a non-competitive cloud of standardised tools and data resources available via APIs for the entire European LT development community. But addressing real-world business and engineering challenges must remain the key raison d'être of the proposed innovation platforms.

We believe that a “programme” informed by at least these two pillars could provide a more realistic approach to attaining a multilingual Europe capable of engaging the real economy, with specific types of support from EC funding.

Philippe Wacker & Andrew Joscelyne
Secretary General & Senior Advisor, LT-Innovate