Natural Language Generation (NLG) is a type of language technology, along with machine translation, natural language understanding, question answering, speech recognition, text-to-speech, and so on. Indeed, NLG is often combined with other types of language technologies, for example Arria Answers is a dialogue system that seamlessly integrates NLG with natural language understanding, speech recognition, and speech synthesis. However, NLG works very differently “under the hood” from most other language technologies.
I think one of the biggest reasons for this is that the input to NLG systems is data, whereas the input to most other language technology systems is language. Human language is messy and variable, so systems that take language as input spend a lot of effort trying to cope with this diversity. For example, a question-answering system should be able to respond to all of the following:
- Who is the prime minister of Britain?
- Who is the PM of UK?
- UK PM?
- Who PM of Britian? (spelling errors are common in human language)
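To make this concrete, here is a toy sketch of how a QA front end might collapse such surface variation before lookup. Everything here (the synonym table, the function name) is invented for illustration; real systems use large statistical models, and handling misspellings like "Britian" would additionally need fuzzy matching, which this sketch omits.

```python
import re

# Tiny invented synonym table -- a real system would learn these
# equivalences from data rather than hand-code them.
SYNONYMS = {"pm": "prime minister", "uk": "britain"}

def normalize(query: str) -> str:
    """Reduce surface variation: lowercase, strip punctuation, expand abbreviations."""
    words = re.findall(r"[a-z]+", query.lower())
    words = [SYNONYMS.get(w, w) for w in words]
    return " ".join(words)

# The first two variants normalize to the identical string; the telegraphic
# "UK PM?" still shares the same content words.
print(normalize("Who is the prime minister of Britain?"))
print(normalize("Who is the PM of UK?"))
print(normalize("UK PM?"))
```

Even this crude normalization shows why language-input systems invest so much effort in robustness: the work is all about mapping many messy surface forms onto one underlying meaning.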
Likewise, a speech-recognition system has to cope with all sorts of different accents and a huge variety of background noise (e.g. rain, baby crying, phone ringing, etc.).
Despite the messiness and diversity, however, human language changes slowly. The English I use in 2020 is almost the same as the English I used in 2019. The language has changed a bit since 1920 because of new words such as “Internet” and “antibiotic”, but its grammar has only changed in small ways, and I don’t have any problems reading a book written in 1920 or even 1820.
Because human language is incredibly diverse but changes slowly, the best way to build systems that work with language input is to use machine learning to analyze huge “corpora” of written or spoken language (often collected over years or decades, which is okay because of the slow change), and then build massive models that can cope with the variability and messiness of human language.
In contrast to most language technologies, data-to-text NLG systems (which is what Arria builds) produce language as an output but take structured data as an input. For example, Arria systems are used to generate news articles about election results from election data, and analyses of product sales in different regions from sales data. The data is usually a mix of numbers (e.g., sales figures) and symbols or names (e.g., countries). Of course, structured data can also be messy and variable, but usually much less so than human language. On the other hand, structured datasets can change quickly; for example, the contents of a sales spreadsheet will change a bit every time a new product is launched, and will change radically if the company reorganizes its sales division or buys a new sales IT platform. The insights that a business wants from its data can also change very rapidly.
In other words, the input to an NLG system (structured data) is simpler and cleaner than the input to other language tech systems (language), but it changes much more quickly. This means that the best way to build NLG systems is usually with “lightweight” models that aren’t as sophisticated as the ones used in machine translation or speech recognition (for example), but are easily configured and updated as the world changes with new input data, different business needs, and so on.
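A "lightweight" data-to-text model of this kind can be sketched, very roughly, as a configurable rule-and-template layer over the structured data. The example below is purely illustrative (the region names, figures, and wording are invented, and this is not how any particular Arria system works), but it shows why such systems are easy to reconfigure when the data or the business needs change: you edit rules and templates rather than retrain a massive model.

```python
# Toy data-to-text sketch: a rule decides what to say, a template decides
# how to say it. All data and wording are invented for illustration.
def describe_sales(region: str, sales: float, prev_sales: float) -> str:
    change = sales - prev_sales
    pct = 100.0 * change / prev_sales
    # Rule layer: pick the message based on the data.
    if change > 0:
        direction = "rose"
    elif change < 0:
        direction = "fell"
    else:
        return f"Sales in {region} were flat at ${sales:,.0f}."
    # Template layer: realize the message as text.
    return (f"Sales in {region} {direction} {abs(pct):.1f}% "
            f"to ${sales:,.0f} from ${prev_sales:,.0f}.")

print(describe_sales("Scotland", 120_000, 100_000))
# -> Sales in Scotland rose 20.0% to $120,000 from $100,000.
```

If the sales spreadsheet gains a new column tomorrow, or the business wants a different insight highlighted, the fix is a small edit to rules like these — much cheaper than rebuilding a corpus-trained model.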
About the author: Arria Chief Scientist, Prof. Ehud Reiter, is a pioneer in the science of Natural Language Generation (NLG) and one of the world’s foremost authorities in the field. He is responsible for the overall direction of Arria’s core technology development as well as supervision of specific NLG projects. He is Professor of Computing Science in the School of Natural and Computing Sciences at the University of Aberdeen.