We recently announced that Writefull’s language models offer a proofreading service of nearly-human quality, having reached 88% of average human performance. Hard to believe? In this post, we show you how it’s possible by comparing Writefull with human proofreaders on three sample texts.
For the comparison, we used three short sample texts available from proofreading service websites. As these sample texts are meant to advertise the proofreading services, they should be of the highest quality - a best-case scenario.
We ran the original (pre-proofreading) version of the texts through Writefull’s Full Edit. Next, two applied linguists from the team classified all edits as:
- Correct: necessary to fix a language error, or
- Incorrect: introducing an error into the original text, or
- Optional: not necessary to fix an error, but acceptable
All edits and their classifications are shared below.
The visualization below shows the three sample texts with all edits categorized. In two of the three texts, Writefull approaches or even outperforms the human proofreader: Writefull scores better in the first, worse in the second, and about the same in the third.
It’s important to emphasize that we only counted language edits. In some cases, the human-proofread samples offered comments beyond language, such as on text structure or content, but as Writefull revises language only, these comments were ignored in the analysis. The same goes for style changes, such as in-text citations, which are not checked by Writefull.
While we shouldn’t take these results as conclusive - we looked at only three snippets from a random set of proofreading services - they do show the level that Writefull’s AI has achieved. Writefull offers a nearly-human proofreading service, but completes a document within minutes and at a fraction of the cost.
Or... at no cost. We're temporarily offering Writefull's proofreading service for free! Read more here.
About the author
Hilde is Chief Applied Linguist at Writefull.