To see how authors start their sentences, we analyzed 30 million sentences from a data set of Hindawi papers. We extracted all one-, two-, three-, four-, and five-word sentence starts, ranked these by frequency, extracted the connectors, and then grouped these according to their purpose.
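The extraction and ranking step can be sketched roughly as follows. This is a minimal illustration, not the pipeline actually used: the function names `sentence_starts` and `rank_starts` are hypothetical, the sample sentences are made up, and the sketch assumes sentences are already split and that words are separated by whitespace (so punctuation stays attached to the word before it).

```python
from collections import Counter


def sentence_starts(sentence, max_words=5):
    """Return the one- to max_words-word starts of a sentence.

    Assumption: words are whitespace-separated; punctuation is left
    attached to the preceding word.
    """
    words = sentence.split()
    longest = min(max_words, len(words))
    return [" ".join(words[:n]) for n in range(1, longest + 1)]


def rank_starts(sentences, max_words=5):
    """Count every sentence start and return (start, count) pairs,
    most frequent first."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence_starts(sentence, max_words))
    return counts.most_common()


# Made-up sample sentences, for illustration only.
sample = [
    "However, the results differ.",
    "However, the model fails.",
    "In this paper, we propose a method.",
]
ranked = rank_starts(sample)
```

On a real corpus, the ranked list would still need cleaning (for example, normalizing case and stripping trailing punctuation) before the connectors could be picked out.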


Most frequent sentence starts

The sheet below shows the 20 most frequent sentence starts. Most of them are transition words that link sentences (However, In addition, etc.), while some are pointers to the paper or to sections within it (In this paper, In this study, etc.).

Connectors grouped by purpose

Next, we extracted all connectors. We manually grouped these by type and subtype: for example, were they used to contrast, add, or summarize? We removed the long tail of infrequent phrases, such as misspellings and very creative wordings, so that only the most frequently used transition words remained.
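As a rough sketch of the filtering and grouping step, the snippet below drops phrases under a frequency cutoff and buckets the rest by a hand-made label. Everything here is an assumption for illustration: the function name `group_frequent`, the cutoff value, the counts, and the type and subtype labels are all invented, and the actual grouping was done manually.

```python
from collections import defaultdict

# Hypothetical hand-assigned labels: connector -> (type, subtype).
# These names are illustrative, not the categories used in the analysis.
LABELS = {
    "However": ("contrast", "opposition"),
    "In addition": ("addition", "extension"),
    "In summary": ("summary", "conclusion"),
}

# Made-up frequency counts, including one misspelling in the long tail.
COUNTS = {
    "However": 900,
    "In addition": 400,
    "In summary": 150,
    "Howver": 2,
}


def group_frequent(counts, labels, min_count):
    """Group labeled connectors by (type, subtype), skipping any phrase
    that is unlabeled or occurs fewer than min_count times."""
    grouped = defaultdict(dict)
    for phrase, count in counts.items():
        if count < min_count or phrase not in labels:
            continue  # long-tail or unlabeled phrase: drop it
        grouped[labels[phrase]][phrase] = count
    return dict(grouped)


groups = group_frequent(COUNTS, LABELS, min_count=10)
```

With these sample values, the misspelled "Howver" falls below the cutoff and is dropped, while the three labeled connectors each land in their own group.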

The tree diagram below shows the landscape of academic writing connectors. The inner circle shows the types, the second circle the subtypes, and the outer circle the connectors. The color and size of each node indicate that connector's relative frequency within its subtype.

