Artificial intelligence has a problem with grammar

The hitch illuminates the nature of language

Feb 25th 2021

IF YOU FREQUENTLY Google language-related questions, whether out of interest or need, you’ve probably seen an advertisement for Grammarly, an automated grammar-checker. In ubiquitous YouTube spots Grammarly touts its ability not only to fix mistakes, but to improve style and polish too. Over more than a decade it has sprawled into many applications: it can check emails, phone messages or longer texts composed in Microsoft Word and Google Docs, among other formats.

Does it achieve what it purports to? Sometimes. But sometimes Grammarly doesn’t do what it should, and sometimes it even does what it shouldn’t. These strengths and failings hint at the essence of language and the peculiarity of human intelligence, as opposed to the artificial sort as it stands today.

Begin with the strengths. In a rough piece of student writing, Johnson counted 14 errors. Grammarly flagged five. For example, it sensibly suggested inserting a hyphen in “post cold war [world]”. It spotted a missing “the” in the phrase “with [the] European economy”. And it noticed an absent “about” in “wondering [about] the state of Europe”. By using Grammarly, the author of this essay could have avoided some red ink.

On the other hand, Grammarly has a problem with false positives: flagging mistakes that are not mistakes. The other two of its five suggestions were not disastrous, but neither did they address “critical errors”, as Grammarly maintains. In the assertion that enlargement had “created a fatigue” within the European Union, Grammarly needlessly suggested deleting the “a”. In another error-ridden sentence it recommended removing a comma, which fixed none of the problems. This false-positive tendency is not a deal-breaker for reasonably skilled writers who just want a second pair of eyes; you can dismiss any suggestion you like. But truly struggling scribblers might not know when Grammarly’s ideas would make their prose worse rather than better.

Then there are the false negatives, or the mistakes Grammarly fails to notice. Depending on the text, Grammarly can seem to miss more errors than it marks. The company’s chief executive, Brad Hoover, describes it as a “coach, not a crutch”—which sets expectations more appropriately than some of the ads do.
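The tally from the student essay can be put in the standard terms used to score such tools, precision and recall. The sketch below is only illustrative arithmetic: the function name is hypothetical, and it assumes that the three sensible suggestions each match one of the 14 errors Johnson counted, while the other two flags count as false positives.

```python
def precision_recall(flagged, correct_flags, total_errors):
    """Score a checker on one text.

    precision: share of the checker's flags that were real errors
    recall:    share of the real errors that the checker flagged
    """
    precision = correct_flags / flagged
    recall = correct_flags / total_errors
    return precision, recall


# Figures from the essay: 14 errors counted, 5 flags raised,
# of which 3 are assumed to be genuine catches.
precision, recall = precision_recall(flagged=5, correct_flags=3, total_errors=14)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.60 recall=0.21
```

On those assumptions, three flags in five were worth having, but roughly four errors in five went unmarked.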

Artificial-intelligence systems like Grammarly are trained with data; for instance, translation software is fed sentences translated by humans. Grammarly’s training data involve a large number of standard error-free sentences (so it knows what good English should look like) and human-corrected sentences (so the software can find the patterns of fixes that human editors might make). Developers also manually add certain rules to the patterns Grammarly has taught itself. The software then looks at a user’s prose: if a string of words seems ungrammatical, it tries to spot how the putative mistake most closely resembles one from its training inputs.
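The last step, matching a suspect string against known mistakes, can be caricatured in a few lines. This is a toy sketch and not Grammarly’s actual method, which is not public in this detail: the `CORRECTIONS` table and the `suggest` function are hypothetical, the two entries are taken from the essay examples above, and plain string similarity from Python’s standard `difflib` module stands in for whatever statistical model the real software uses.

```python
import difflib

# Hypothetical stand-in for human-corrected training data:
# each known faulty phrase maps to the fix an editor made.
CORRECTIONS = {
    "with European economy": "with the European economy",
    "wondering the state of": "wondering about the state of",
}


def suggest(phrase, cutoff=0.6):
    """Return the fix attached to the most similar known error, or None.

    The similarity score runs from 0 to 1; phrases whose best match
    scores below `cutoff` get no suggestion at all.
    """
    matches = difflib.get_close_matches(phrase, list(CORRECTIONS), n=1, cutoff=cutoff)
    return CORRECTIONS[matches[0]] if matches else None


print(suggest("wondering the state of Europe"))
print(suggest("created a fatigue"))
```

The first call is close enough to a known error to trigger its fix; the second resembles nothing in the table and returns `None`. Raising or lowering `cutoff` trades false positives against false negatives, though nothing in such a scheme knows what the writer meant.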

All this shows how far artificial “intelligence” is from the human kind (which Grammarly wants to correct to “humankind”). Computers outpace humans at problems that can be cracked with pure maths, such as chess. Advances in language technology have been impressive in, for example, speech recognition, which involves another sort of statistical guess—whether or not a stretch of sound matches a certain string of words. One Grammarly feature that works fairly well is sentiment analysis. It can rate the tone of an email before you send it, after being trained on texts that have been assessed by humans, for example as “admiring” or “confident”.

But grammar is the real magic of language, binding words into structures, binding those structures into sentences, and doing so in a way that maps onto meaning. And at this crucial structure-meaning interface, machines are no match for humans. Computers can parse (grammatical) sentences fairly well, labelling things like nouns and verb phrases. But they struggle with sentences that are difficult to analyse, precisely because they are ungrammatical—in other words, written by the kind of person who needs Grammarly.

To correct such prose requires knowing what the writer intended. But computers don’t work in meaning or intention; they work in formulae. Humans, by contrast, can usually understand even rather mangled syntax, because of the ability to guess the contents of other minds. Grammar-checking computers illustrate not how bad humans are with language, but just how good.

This article appeared in the Culture section of the print edition under the headline "The human touch"