What is NLG?

What is Natural Language Generation?

Natural Language Generation (NLG) systems take the information contained in raw data and automatically express it in natural language, either in text or in voice. The input may be anything from sales performance data to log files from technical monitoring devices. The output may be provided in any natural language, such as English, French, Chinese or Tagalog, and may be combined with graphical elements to provide a multimodal presentation. For example, the log files of technical monitoring devices can be analysed for unexpected events and transformed into alert-driven messages; or numerical time-series data from hospital patient monitors can be rendered as hand-over reports describing trends and events for medical staff starting a new shift.

Who needs Natural Language Generation?

The three key advantages of NLG-produced text are scalability, tailorability and consistency. NLG can produce personalised text at a rate that is simply unattainable using human authors, and the quality of NLG texts does not vary in the way that a human author’s might. NLG is ideal for anyone who:

  • has data that needs to be interpreted in order to make it actionable
  • wants to explain the ‘why’, not just present the ‘what’
  • needs to make fast decisions based on data
  • has too much data for a human expert to understand
  • requires consistency in their text output
  • wants to make their data accessible to non-technical people.

What is the difference between data analytics and NLG?

Data analytics interprets and finds patterns in large data sets; NLG delivers the information found in these patterns in plain English (or any other natural language). The Arria NLG Platform comprises both these processes, combining the skills of an analyst with those of a communicator.

What is Articulate Analytics™?

Articulate Analytics™ is the name Arria gives to its methodology, whereby we take into account the intended application of the output when we design the analysis and interpretation processes to be applied to the raw data. In finding patterns and trends, Articulate Analytics™ considers what kind of information is needed to generate a text that communicates what is important. This is a critical addition to purely data-driven analytics, which is focused on finding trends and events in the data independently of how they might be communicated to the target audience.

How Arria NLG Helps You

Do I have to give my data to Arria NLG or can the NLG Platform come to me?

The Arria NLG Platform can be deployed on your site, so that there’s no need to surrender sensitive data. Alternatively, you may find it more convenient to use our cloud-based NLG service.

Can I get a demo of the NLG Platform at work?

Of course! Get in contact with our sales team: sales@arria.com

What does the output of the NLG Platform look like?

The NLG Platform will adhere to your formatting conventions and include your branding, making the style of its output indistinguishable from that of your other publications.

Is the NLG Platform’s output grammatically correct?

Yes. The linguistic foundations of the NLG Platform ensure that this is the case.

What kind of industries can benefit from NLG?

Applications for NLG technology are everywhere. Virtually any application where data needs to be presented to professional experts, decision-makers or clients will gain an advantage from using Natural Language Generation to describe that data. NLG shines in safety-critical industries requiring consistent and reliable output, such as healthcare, the energy and resources sector, and engineering. The fine-grained personalisation that NLG makes possible means that the technology is also highly valuable in customer-oriented sectors such as finance and marketing.

Why is data easier to understand in textual form?

Humans come with built-in language understanding skills, while the skills needed to interpret tables, numbers and other data formats have to be acquired and are often only accessible to specialised experts. Anyone can understand a message such as ‘Danger! The engine temperature is exceeding the safety threshold’; whereas gaining the same insight from a table of numerical temperature readings often requires expert knowledge.
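
As a minimal illustration of this contrast, the Python sketch below turns a small table of temperature readings into the kind of one-line alert quoted above. The threshold value, the sample readings and the function name describe_engine_temperature are all hypothetical, chosen only for this example; they do not describe how the Arria NLG Engine is implemented.

```python
from typing import Optional, Sequence

SAFETY_THRESHOLD_C = 110.0  # hypothetical threshold, for illustration only


def describe_engine_temperature(
    readings_c: Sequence[float],
    threshold_c: float = SAFETY_THRESHOLD_C,
) -> Optional[str]:
    """Return an alert sentence if the latest reading exceeds the threshold."""
    if not readings_c:
        return None  # no data, nothing to report
    latest = readings_c[-1]
    if latest > threshold_c:
        return (
            "Danger! The engine temperature is exceeding the safety "
            f"threshold ({latest:.1f} C against a limit of {threshold_c:.1f} C)."
        )
    return None  # within limits, so no alert is needed


readings = [92.4, 97.8, 104.1, 109.6, 113.2]  # made-up sample data
print(describe_engine_temperature(readings))
```

A reader of the raw list has to know the limit and spot that the last value crosses it; the generated sentence states that conclusion directly.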

Why is Arria NLG’s scientific approach to NLG important?

Arria offers the only NLG system on the market that is based on scientific insights from extensive research in Artificial Intelligence, Computational Linguistics and Cognitive Science. The Arria approach to NLG has been developed by scientific leaders in the field, and takes into account what we know about how people both process information and produce language.

What kind of data can the Arria NLG Engine deal with?

Arria’s NLG Engine can deal with data from virtually any domain. The Engine incorporates a comprehensive set of general data analysis and NLG techniques, and Arria’s domain experts and engineers are constantly working to expand the NLG Engine’s range of expertise. This is done by adding industry-specific rules and techniques for data analysis as well as knowledge about specific writing practices. This ensures that each text produced by the Arria NLG Engine reads as if it were written by a human expert in your particular industry.

Does the input data have to be in a certain format?

No. The NLG Engine is designed to be as flexible as possible in terms of the input data it expects. The system currently generates natural language based on numerical or symbolic data held in large SQL databases, spreadsheets, XML files, text-based log files and many other formats. The NLG Engine uses a proprietary internal data representation; raw input data is transformed into this internal representation by customisable input modules. Arria engineers will work with you in order to tailor the input routines to automatically transform your data into the NLG Engine’s internal representation.
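
The idea of customisable input modules feeding one internal representation can be sketched roughly as follows. Because the Engine's actual internal representation is proprietary, everything here is an assumption made for illustration: the Observation record, the from_csv and from_json modules, the INPUT_MODULES registry and the load function are hypothetical names, not part of any Arria API.

```python
import csv
import io
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Observation:
    """Hypothetical common record that every input module produces."""
    source: str
    metric: str
    value: float


def from_csv(text: str) -> List[Observation]:
    """Input module for CSV text with 'metric' and 'value' columns."""
    rows = csv.DictReader(io.StringIO(text))
    return [Observation("csv", row["metric"], float(row["value"])) for row in rows]


def from_json(text: str) -> List[Observation]:
    """Input module for a JSON array of {"metric": ..., "value": ...} objects."""
    return [
        Observation("json", item["metric"], float(item["value"]))
        for item in json.loads(text)
    ]


# Registry mapping a format name to the module that handles it.
INPUT_MODULES: Dict[str, Callable[[str], List[Observation]]] = {
    "csv": from_csv,
    "json": from_json,
}


def load(fmt: str, text: str) -> List[Observation]:
    """Convert raw input in the given format into the common representation."""
    try:
        module = INPUT_MODULES[fmt]
    except KeyError:
        raise ValueError(f"no input module registered for format {fmt!r}") from None
    return module(text)


csv_data = "metric,value\nsales,1200\nreturns,35\n"
json_data = '[{"metric": "sales", "value": 1350}]'
print(load("csv", csv_data))
print(load("json", json_data))
```

In a design like this, supporting a new data source means writing one more input module and registering it; the later processing stages only ever see the common record type.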

About Arria NLG

Is Arria NLG hiring?

Always. If you are interested in joining Arria, check out the positions available on the careers page.

What kind of people work at Arria NLG?

Building sophisticated applications of linguistic technology requires both right-brain and left-brain skills, so you’ll find a fascinating mix of people at Arria. Our employees include professors and PhDs in Computer Science, linguists, software engineers, proofreaders, and entrepreneurial people who thrive in a start-up environment and are passionate about, and committed to, the potential NLG offers.

Where is Arria NLG located?

Arria’s headquarters are in London, its science and engineering teams are based in Aberdeen and Sydney, and it has additional sales offices in New York and Auckland.

How long does the deployment process take?

We specialise in making complex data easy to understand and use. Whether your data comes from one source or several, we can generate natural language in 60-90 seconds; the length of the deployment process, however, depends on your individual use case. Get in touch to find out what the implementation process will be for your project and how quickly Arria NLG could be transforming your data.

What are the steps from first contact to a deployed NLG system?

After an initial meeting where we introduce the technology and how it works to your key stakeholders, we spend time with you to flesh out the specifics of one or more use cases that have the potential to bring significant value to your organisation. This then leads to a scope of work and high-level design for the deployment of an NLG application within your organisation. The initial deployment may be a Proof of Concept or pilot application, which is then followed by a full deployment; or, in appropriate circumstances, we may deliver a full production deployment of the Engine in the first instance.

What format does the output of the NLG Platform come in?

The output can be provided in any format required, including, for example, HTML, PDF, Word, XML and marked-up text for speech synthesis. During the scoping process, Arria engineers will determine which output formats are ideal for your application.

How fast can the NLG Platform produce texts?

The scalability of the NLG Platform is one of its main advantages. In many situations, it can produce thousands of texts per minute. This makes it possible to generate extremely personalised or localised reports that would take too long and would be too expensive if produced by human writers.

Is the NLG Platform’s output as good as human-written text?

Yes. The language generated by the NLG Platform is virtually indistinguishable from language produced by human writers. In fact, some studies show that users prefer these automatically generated texts over human-written texts.

Do all texts produced by the NLG Platform sound the same?

Not unless you want them to. If you require the output to be engaging for a human audience, the NLG Platform can produce the same variety as human writers. This is useful for regular publications to a standard client base, such as newsletters or company updates. But in many industries linguistic consistency is crucial in order to avoid misunderstandings. This is especially the case for textual output in safety-critical industries, such as equipment monitoring alerts and weather reports in oil drilling operations. For these kinds of applications, the language generated by the Arria NLG Platform can provide a level of consistency that is extremely hard for human authors to achieve.
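
The trade-off between variety and consistency can be shown with a toy example. The sketch below is a simplification that assumes plain sentence templates; the TEMPLATES list and the render function are hypothetical and are not a description of how the Arria NLG Platform produces variation.

```python
import random
from typing import Optional

# Hypothetical alternative wordings for the same fact.
TEMPLATES = [
    "Sales rose by {pct}% in {month}.",
    "{month} saw sales climb {pct}%.",
    "In {month}, sales were up {pct}%.",
]


def render(month: str, pct: int, rng: Optional[random.Random] = None) -> str:
    """Render one fact; vary the wording only if a random source is supplied."""
    template = rng.choice(TEMPLATES) if rng is not None else TEMPLATES[0]
    return template.format(month=month, pct=pct)


# Consistent mode: the same fact always yields the same sentence.
print(render("March", 4))

# Varied mode: the wording is drawn from the template pool.
rng = random.Random(7)  # seeded only so that repeated runs match each other
for _ in range(3):
    print(render("March", 4, rng))
```

Leaving out the random source gives the fixed wording that safety-critical alerts call for, while supplying one gives the variety that suits newsletters and similar publications.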

Can documents produced by the NLG Platform include graphics?

Yes. The NLG Platform can also automatically generate captions and annotations for the graphics contained in its output. Our philosophy is to use the best information delivery device for the task at hand.

How does the Arria NLG Engine work?

The Arria NLG Engine consists of two main processing components: the first carries out Analysis and Interpretation, and the second is responsible for Information Delivery. The Analysis and Interpretation stage derives facts and insights from the raw input data and turns them into basic information chunks called messages. The Information Delivery stage works out how to best communicate the information in these messages as a coherent text, possibly combined with graphics or provided as voice output.
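
The two-stage flow described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the Engine's implementation: the Message record, the functions analyse_and_interpret and deliver_information, and the heart-rate sample are all hypothetical, and a real system would perform far richer analysis and linguistic processing.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Message:
    """Hypothetical 'information chunk' passed between the two stages."""
    kind: str      # "trend" or "peak"
    metric: str
    detail: str
    value: float


def analyse_and_interpret(metric: str, series: Sequence[float]) -> List[Message]:
    """Stage 1: derive facts from raw numbers and package them as messages."""
    if len(series) < 2:
        return []  # too little data to describe a trend
    change = series[-1] - series[0]
    if change > 0:
        direction = "rose"
    elif change < 0:
        direction = "fell"
    else:
        direction = "was unchanged"
    return [
        Message("trend", metric, direction, abs(change)),
        Message("peak", metric, "peaked", max(series)),
    ]


def deliver_information(messages: Sequence[Message]) -> str:
    """Stage 2: turn the messages into a short text."""
    sentences = []
    for m in messages:
        if m.kind == "trend" and m.detail == "was unchanged":
            sentences.append(f"The {m.metric} was unchanged over the period.")
        elif m.kind == "trend":
            sentences.append(f"The {m.metric} {m.detail} by {m.value:g} over the period.")
        elif m.kind == "peak":
            sentences.append(f"It peaked at {m.value:g}.")
    return " ".join(sentences)


heart_rate = [72, 75, 81, 96, 88]  # made-up sample data
print(deliver_information(analyse_and_interpret("heart rate", heart_rate)))
# prints: The heart rate rose by 16 over the period. It peaked at 96.
```

Keeping the messages as a separate intermediate step means the same analysis could, in principle, be delivered as text, as an annotated graphic or as speech without being recomputed.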