Types of Errors Used in Medical Editing Tests

Ryan K. Boettger · January 1, 2012 · Journal of the American Medical Writers Association, 27(3)

Researcher’s note (June 2026). I wrote this in 2012 to bring evidence to a question editors usually settled by anecdote: which errors actually populate the editing tests that screen prospective medical editors, and how those errors cluster by frequency and dispersion across a real workplace sample. The finding that mattered most was that style and grammatical/mechanical errors carried a higher-than-expected frequency, and that style-intensive testing may be a convention distinguishing medical editing from other technical editing. That early interest in how we assess writing skill is the same thread I now pull through my work on AI in writing and assessment — building empirical, error-grounded ways to measure what writers and now writing tools actually get right and wrong.

Abstract

For medical communication to mature, more research that investigates the core knowledge and skills required to enter and succeed in the profession is needed. In this article, I report the types of errors found in 13 editing tests administered to prospective medical editors. These data will help prospective medical communicators prepare to take an editing test and help hiring managers evaluate how well their own test assesses their applicants. A contingency table analysis identified how evenly the errors were distributed across six broad categories, and a weighted index identified the errors that were most frequent and most dispersed within the sample. The weighted index includes 21 errors that were dispersed in at least 50% of the sample. The results indicate that grammatical/mechanical and style errors had a higher than expected frequency, suggesting that the sample’s hiring managers were more concerned with candidates’ understanding of these error types than errors classified in other categories. The most predominant error was “Unnecessary or missing capitalization,” and its occurrence was primarily related to the capitalization conventions outlined in the AMA Manual of Style. Finally, six style errors ranked in this study’s list of predominant errors. This result suggests that style-intensive editing tests may be a convention that differentiates medical editing from other technical editing.

Introduction

Editing in the health sciences encompasses skills and practices that differ from those in other types of technical editing. The manuscript types represent a variety of forms and styles, and the targeted audiences (practitioners, researchers, and patients, as well as national and international readerships) are diverse. In a 2009 issue of the AMWA Journal, Lang called for more research within several areas of medical communication, including studies that investigate the core knowledge and skills required to enter and succeed in the profession [1]. Usage error is a popular topic among technical editors, particularly the errors they find most bothersome or attribute to personal preference rather than grammatical correctness. With the exception of a recent study [2], anecdotal discussions among editors remain the best source of information on error.

In this article, I report the types, frequencies, and dispersions of errors found in 13 editing tests administered to prospective medical editors. Editing tests evaluate an applicant’s ability to spot obvious typographic errors as well as to fix errors rather than introduce new ones [3]. The editing test remains a unique workplace document because it purposely contains errors and therefore serves as a first step in identifying how medical communicators prioritize specific errors. Results respond to Lang’s call for research on the professionalization of medical communication by providing a list of errors that were derived from an actual workplace document. These data will help prospective medical communicators prepare to take an editing test and help hiring managers evaluate how well their own test assesses their applicants.

The results from relevant error studies have primarily reported how distracting specific errors are to practitioners instead of examining the actual errors found in workplace writing [4]–[7]. Several empirical studies have identified the most frequent errors in college writing [8], [9], but these results do not necessarily reflect how practitioners, in general, or medical communicators, in particular, prioritize errors. Hairston’s study was the first to determine how business practitioners responded to specific usage errors [7]. These practitioners, who represented 63 occupations, reported being overwhelmingly bothered by errors classified as status markers; eg, “When Mitchell moved, he brung his secretary with him.” [7] The next tier of bothersome errors was grouped by mechanical mistakes—sentence fragments, fused sentences, and faulty parallelism. Two follow-up studies yielded results similar to those of Hairston’s original study [4], [5]; fused sentences, faulty parallel structure, sentence fragments, and danglers ranked as some of the most distracting errors. Although these studies generated important findings, the usefulness of the data is somewhat limited by methodologic design. In all of these studies, data were solicited via a questionnaire, which included errors that the researchers believed could be the most bothersome to practitioners. These results may not accurately reflect the errors that practitioners pay attention to. Similarly, data collected from questionnaires depend on self-reporting, which can motivate participants to respond in ways they think are appropriate to the research [10].

Error taxonomies of college writing compensated for these methodologic limitations by assessing the errors in actual student samples, but the results cannot necessarily be generalized to professional communication practices. Connors and Lunsford’s study produced a list of more than 50 formal and mechanical errors that college students made in their writing [8]. Misspellings outnumbered all other errors by 300% and were removed from the formal study for independent analysis. The researchers ranked the remaining errors by frequency, selecting the top 20 for further inquiry. The list began with “missing comma after an introductory element” (occurring 11.5% of the time) and ended with the “its/it’s error” (occurring 1.0% of the time).

Twenty years later, Lunsford and Lunsford extended the original study [9]. The results reflected how a broader range of academic text types and the expansion of technology changed the error patterns in college student writing. Due to an increase in argumentative essays, the new list included errors related to using sources, quotations, and attributions. Technology also played a role in the rank of specific errors. “Misspelling” now ranked fifth, and “wrong word” emerged as the top error. The researchers attributed these shifts to electronic spellcheckers. Technology helped students remedy misspellings, but a reliance on the automated spelling suggestions likely caused an increase in wrong words.

Both of these studies present comprehensive error taxonomies, but they do not specifically relate to professional communication in which context might yield a different list of prominent errors. Lunsford and Lunsford’s results represent the errors commonly seen among developing academic writers. Expert professional writers will likely make different errors in their writing, and organizations’ use of style manuals might also create different contexts for errors related to capitalization and number format as well as introduce new error types related to tone, word choice, or consistency. For example, in a recent study of the errors found in 41 editing tests from multiple industries (two of which were related to the health sciences), I found a high frequency of spelling and capitalization errors; however, many instances of these errors were related to the spelling of proper nouns, such as company or product names as well as the capitalization standards outlined in style manuals [2]. The study also showed a high frequency of eight different style errors, such as language consistency, unnecessary passive construction, and faulty parallelism, but half of the errors disappeared from the prominent errors list when the dispersion was calculated (ie, how many tests a particular error appeared in). I reported that although style errors appeared frequently in specific tests, the error types were not representative of the sample. These findings suggest that examining various technical editing disciplines (eg, medical, engineering, computer science, education) results in qualitatively and quantitatively different error distributions.

Medical editing has much in common with other technical editing, but it also exhibits distinct features that reflect how its communicators work with subject-matter experts and convey technical information. The writing common to medical communication spans a variety of document types, such as journal articles, regulatory documents, grant proposals, educational resources, and marketing materials. These texts often include original research, which involves a synthesis of statistical data and a discussion of human subjects. These ideas must be presented with clarity, concision, ethicalness, and sensitivity, and the writing must adhere to discipline-specific nomenclature and stylistic conventions. Specialized style manuals like the 1,000+ page AMA Manual of Style [11] help medical communicators create consistency, and the AMWA Journal has published original research on relevant ethical topics such as working with statistics [12], the language patterns used to humanize patients [13], and the use of passive voice [14]. However, no study has examined these issues in total and in an actual workplace document such as an editing test, which can distinguish a good editor from a great one. The present study was designed to explore the types of errors associated with medical communication and to determine how these errors align with or differ from those in other types of technical editing.

Methods

Because of the privatization of editing tests, the study’s sample proved difficult to collect. I obtained 13 editing tests through personal requests to 13 different medical communication companies. I signed a nondisclosure agreement with most companies to further protect the integrity of the tests. Participating companies represented a variety of subfields within the health sciences, including health care management, pharmaceutical research, and medical and scientific communication editing services. Every error in the tests was classified according to the latest edition of the designated style manual; 10 tests required the AMA Manual of Style, two required the Publication Manual of the American Psychological Association, and one required the Chicago Manual of Style.

Raters

Four raters were involved in the classification process. All had formal education and professional experience in technical and/or medical editing. Two raters independently classified the errors in nine of the 13 editing tests by using the assessment keys provided by the companies’ hiring managers. Using the assessment keys ensured that errors were classified from the organization’s perspective. Whenever possible, errors were classified by the error or error pattern names listed in the Connors and Lunsford and the Lunsford and Lunsford studies; however, multiple new errors related to style were identified in this sample. Every error was then classified into one of six broad error categories: grammar and mechanics, punctuation, spelling, style, content, and design.

A third rater with more than a decade of medical editing experience helped classify the four tests that did not have assessment keys. This rater completed each test as if she were an applicant for the targeted position. The original two raters then classified the errors from these four tests just as they had the rest of the sample. Percent agreement between these raters identified an 81.0% consensus level, an acceptable level of agreement [10], [15]. A fourth rater made the final decision in instances when the two raters disagreed on how an error should be classified.
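The percent agreement reported above can be illustrated with a minimal sketch. The function name and the five example labels are hypothetical and are not data from the study; the sketch only assumes that agreement means the share of errors that both raters placed in the same category.

```python
from typing import Sequence


def percent_agreement(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Share of items (0 to 1) that two raters placed in the same category."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must classify the same set of errors.")
    if not rater_a:
        raise ValueError("At least one classified error is required.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Hypothetical labels for five errors (illustration only, not study data).
rater_1 = ["style", "spelling", "grammar and mechanics", "content", "style"]
rater_2 = ["style", "spelling", "punctuation", "content", "style"]

# The raters agree on 4 of the 5 errors.
print(f"{percent_agreement(rater_1, rater_2):.1%}")
```

Simple percent agreement does not correct for agreement that would occur by chance, which is one reason an acceptability threshold is usually cited alongside it.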

Measures

I used two measures to explore the sample. The contingency table analysis identified how evenly the errors were distributed across six broad categories: grammar and mechanics, punctuation, spelling, style, content, and design. Determining this distribution can help prospective applicants prepare for an editing test and can show hiring managers how the content of their test compares with that of other organizations. To my knowledge, no other study has grouped errors into broad categories to measure their distribution across the sample; therefore, the null hypothesis assumed that if the errors were evenly distributed, then each category would contain 144 errors. This number was determined by dividing the total number of errors (864) by the number of categories (6).

To gain a better understanding of the distribution of errors in the data sample, I also present the results of what I will subsequently refer to as an error’s “weighted index.” The weighted index factored the frequency and the dispersion of each error into a single numerical value. Although a lone frequency list provides useful information on the frequency (or popularity) of errors, it cannot account for errors that cluster in a small number of the sample (ie, weakly dispersed errors) compared with errors that appear consistently throughout the sample (ie, highly dispersed errors). This study’s weighted index thus provides a means for considering both how frequent and how dispersed an error is within the 13-test sample. This index weights each error’s frequency and dispersion 50/50 because I could not identify an existing model that would have suggested a different weighting. I welcome suggestions on alternative weightings.
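Written out, the weighted index is the simple average of two proportions. The symbols below are notation introduced here for illustration; the definitions themselves follow the footnotes to Table 2.

```latex
% f_i : number of times error type i occurs in the sample
% N   : total number of errors in the sample (864)
% t_i : number of editing tests in which error type i appears
% T   : total number of editing tests (13)
F_i = \frac{f_i}{N}, \qquad
D_i = \frac{t_i}{T}, \qquad
W_i = \frac{F_i + D_i}{2}
```

For example, an error with a frequency index of .063 that appears in 12 of the 13 tests has a dispersion index of about .923 and a weighted index of about .493.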

Results

Eight hundred sixty-four errors and 60 error types were identified within the sample. Each test contained an average of 66.46 errors (median=69.5, SD=37.48) and an average of 23.38 error types (median=26.0, SD=9.90).

Contingency Table Analysis of Errors by Category

The contingency table analysis determined if the errors were evenly distributed across the six broad categories (Table 1).

Style errors and grammatical/mechanical errors had a higher than expected frequency. The frequency of content errors did not differ significantly from the expected frequency of 144; that is, content errors were distributed as would be expected if each category had equal representation. Punctuation, spelling, and design errors had a lower than expected frequency.

Table 1. Errors Organized by Broad Error Category, Frequency, Significance Level of their Contingency Table Analysis, and Confidence Intervals

Category	Frequency^a	p_binomial^b	95% CI
Style	281	.000	.294–.358
Grammar and mechanics	234	.000	.241–.302
Content	136	.494	.134–.183
Punctuation	115	.008	.111–.158
Spelling	57	.000	.050–.085
Design	41	.000	.034–.064

^a Every error in the sample (864 total errors) was classified into one of six broad error types. ^b The null hypothesis assumed that if the errors were evenly distributed, each category would contain 144 errors. This number was determined by dividing the total number of errors (864) by the number of categories (6).
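The comparison in Table 1 can be sketched as a two-sided exact binomial test of each category’s frequency against an equal share of one-sixth. The tail rule used here (summing the probabilities of all outcomes no more likely than the observed one) is an assumption; the article does not state which exact method or software produced the published values, so the printed results may differ slightly from Table 1.

```python
import math

TOTAL_ERRORS = 864           # all errors in the sample
NUM_CATEGORIES = 6           # broad error categories
P_NULL = 1 / NUM_CATEGORIES  # equal share per category under the null hypothesis

# Observed frequencies from Table 1.
OBSERVED = {
    "Style": 281,
    "Grammar and mechanics": 234,
    "Content": 136,
    "Punctuation": 115,
    "Spelling": 57,
    "Design": 41,
}


def binom_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability of exactly k successes, computed in log space."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
    )
    return math.exp(log_pmf)


def two_sided_binomial_p(k: int, n: int, p: float) -> float:
    """Sum the probabilities of every outcome no more likely than k (assumed rule)."""
    threshold = binom_pmf(k, n, p) * (1 + 1e-7)  # small tolerance for float ties
    total = 0.0
    for i in range(n + 1):
        pmf = binom_pmf(i, n, p)
        if pmf <= threshold:
            total += pmf
    return min(1.0, total)


expected = TOTAL_ERRORS / NUM_CATEGORIES  # 864 / 6 = 144
for category, count in OBSERVED.items():
    p_value = two_sided_binomial_p(count, TOTAL_ERRORS, P_NULL)
    direction = "above" if count > expected else "below"
    print(f"{category}: {count} ({direction} expected {expected:.0f}), p = {p_value:.3f}")
```

Testing each category against one-sixth treats the categories one at a time, so the six results are not independent of one another.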

Weighted Index (Frequency and Dispersion) of Errors by Type

The weighted index factored how frequent and how dispersed each error was within the sample. The sample’s top 21 errors were identified, which were dispersed in at least seven of the 13 tests studied (Table 2). Thirty-nine additional errors were identified in the sample (Appendix A, online exclusive).

More than 60% of the top 21 errors related to grammar and mechanics or style. The five grammatical and mechanical errors included the most prevailing error in the sample: “unnecessary or missing capitalization,” which was the second most frequent error but dispersed through 92% of the sample. The four additional grammatical and mechanical errors all appeared infrequently but were dispersed throughout 69% or more of the sample, accounting for their strong rankings in the top half of Table 2: “missing or wrong article” and “unnecessary shift in verb tense” (both ranked as the fourth most prominent error), “misplaced/dangling modifier” (ranked fifth), and “incorrect singular/plural application” (ranked eighth).

Six style errors ranked in this study’s list of predominant errors: “redundant, expendable, or incomparable language” (ranked second); “vague or missing language” (ranked third); “faulty parallel structure” (ranked 10th); “inconsistent terminology” (ranked 12th); “informal or discriminatory language” (ranked 17th); and “unnecessary passive construction” (ranked 18th). Five of these errors appeared infrequently (ie, less than 50% of the tests) but were strongly dispersed in the sample. Misspellings, the most prevailing error in many of the earlier-discussed taxonomies, ranked sixth in this study’s weighted index.

Table 2. The Most Predominant Errors, Ranked by Their Weighted Index

Rank	Error	Broad Error Category^a	Frequency Index^b	Dispersion Index^c	Weighted Index^d
1	Unnecessary or missing capitalization	Grammar	.063	.923	.493
2	Redundant, expendable, or incomparable language	Style	.049	.846	.447
3	Vague or missing language	Content	.035	.846	.440
4	Missing or wrong article	Grammar	.028	.769	.398
4	Unnecessary shift in verb tense	Grammar	.028	.769	.398
5	Misplaced/dangling modifier	Grammar	.016	.769	.393
6	Misspelling	Spelling	.066	.692	.379
7	Incorrect number format	Style	.051	.692	.372
8	Incorrect singular/plural application	Grammar	.035	.692	.364
9	Hyphen, en-, or em-dash error	Punctuation	.034	.692	.363
10	Faulty parallel structure	Style	.025	.692	.359
11	Missing comma with a nonrestrictive element	Punctuation	.020	.692	.356
12	Inconsistent terminology	Style	.017	.692	.355
13	Wrong word	Content	.036	.615	.326
14	Lack of subject-verb agreement	Grammar	.028	.615	.321
15	Equation error (eg, incorrect calculation or symbol)	Content	.027	.615	.321
16	Incorrect or missing preposition	Grammar	.024	.615	.320
17	Informal or discriminatory language	Style	.049	.538	.293
18	Unnecessary passive construction	Style	.015	.538	.277
19	Space missing or needed	Design	.014	.538	.276
20	Missing period	Punctuation	.012	.538	.275

^a Each error was classified into one of six broad categories; these classifications were used in the contingency table analysis. ^b The frequency index was calculated by dividing each specific error’s frequency by the total number of errors (864) found across all tests in the sample. ^c The dispersion index was calculated by dividing the number of tests each specific error was found in by the total number of editing tests (13). ^d The weighted index was determined by relativizing the frequency index against the dispersion index (ie, adding the frequency and the dispersion indices and dividing by two).
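The calculation described in the footnotes can be checked with a short sketch. The function name is hypothetical, the frequency indices are the rounded values printed in Table 2, and the test counts (9, 12, and 10 of 13) are inferred from the dispersion indices, so results for other rows may differ from the table in the third decimal place.

```python
TOTAL_TESTS = 13  # editing tests in the sample


def weighted_index(frequency_index: float, tests_with_error: int,
                   total_tests: int = TOTAL_TESTS) -> float:
    """Average an error's frequency index and its dispersion index (50/50)."""
    dispersion_index = tests_with_error / total_tests
    return (frequency_index + dispersion_index) / 2


# (frequency index from Table 2, number of tests inferred from the dispersion index)
rows = {
    "Misspelling": (0.066, 9),
    "Unnecessary or missing capitalization": (0.063, 12),
    "Misplaced/dangling modifier": (0.016, 10),
}

# Rank from highest to lowest weighted index.
ranked = sorted(rows, key=lambda name: weighted_index(*rows[name]), reverse=True)
for name in ranked:
    print(f"{name}: {weighted_index(*rows[name]):.3f}")
```

The ordering shows the effect of dispersion: “misspelling” has the highest frequency index of the three but the lowest weighted index, because it appears in fewer tests.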

Discussion

Results from the contingency table analysis determined if the errors were evenly distributed across six broad categories (Table 1). Style errors and grammatical/mechanical errors had a higher than expected frequency, which could suggest that the sample’s hiring managers found these errors easier to assess. However, this explanation does not account for the lower than expected frequency of punctuation and spelling errors, which are arguably the easiest error types to include and assess in an editing test. Similarly, style errors could be considered the most subjective error type and therefore the most difficult to assess. For example, different evaluators could have various opinions on what constituted redundant language, the second most prominent error in the weighted index (Table 2). The overall results from the contingency table analysis seem to suggest that hiring managers are more concerned with prospective employees’ mastery of grammar, mechanics, and style and less concerned with their punctuation, spelling, and design abilities. Results from the weighted index further support this claim.

More than 60% of the top 21 errors were related to grammar/mechanics or style. The most prevailing error was “unnecessary or missing capitalization” (Table 2). Recent error taxonomies have noted a higher-than-expected frequency of capitalization errors but offered different explanations for their increased presence [2], [9]. Lunsford and Lunsford attributed the high frequency of capitalization errors in college writing to technology and the development of the students; Microsoft Word automatically capitalized words that follow a period (eg, a period used in an abbreviation) and students often capitalized terms to suggest significance (eg, “High School Diploma”). However, the high frequency and dispersion of capitalization errors in my earlier study of editing tests from a variety of industries was the result of style manual guidelines [2]. The types of capitalization errors identified in the current study align with that finding: the majority of capitalization issues related to the AMA Manual of Style guidelines, including the capitalization of organisms and pathogens, viruses, tests, and sociocultural designations as well as the decapitalization of common words derived from proper nouns (eg, parkinsonism). The finding extends the exploration of how context often shapes why specific writing errors are made and stresses the importance of knowing the stylistic standards and practices that govern a specific discipline.

In addition to errors related to capitalization, the weighted index identified “missing or wrong article,” “unnecessary shift in verb tense,” “misplaced/dangling modifier,” and “incorrect singular/plural application” as the most predominant grammatical errors in the sample. Prospective medical editors should note that these error types are the ones that this sample’s hiring managers were most concerned with their applicants recognizing and fixing.

Six style errors ranked highly in this study’s weighted index: “redundant, expendable, or incomparable language”; “vague or missing language”; “faulty parallel structure”; “inconsistent terminology”; “informal or discriminatory language”; and “unnecessary passive construction.” The combination of the types of style errors in this sample emphasizes the importance of consistency and concision in medical editing as much as the respect that an editor must demonstrate toward people. Ninety-two percent of the sample (12 editing tests) included passages that discussed the safety or treatment of people, including patients, research participants, or health care workers. In particular, the presence of “informal or discriminatory language” as a predominant error reflects the AMA Manual of Style’s human-centered guidelines that caution against labeling people with their disabilities or using terms that suggest helplessness.

The instances classified as “unnecessary passive construction” also suggest an awareness of human subjects and highlight the impressive detail that hiring managers afforded to designing these tests. Modified examples from the sample (“The patient recovered well and was discharged on the same day of surgery” and “The primary tumor was seen in the nasal cavity in 7 patients”) further illustrate how the context surrounding each sentence played an integral role in determining which subject (the person or the object) to emphasize. These examples reflect the AMA Manual of Style’s encompassing definition of voice: active voice should be used to clarify ideas or focus on the subject performing the tasks, and passive voice should be used to shift focus to mechanisms or processes [11].

The higher-than-expected frequency of style errors in this sample (Table 2) suggests how the screening of medical editors might differ from that of other technical editors. My error taxonomy of editing tests from multiple industries demonstrated that the majority of style errors were concentrated frequently in a small number of the sample, which included only two tests from the health sciences [2]. The results of the present study, however, suggest that medical editors should be more mindful of style errors on their screening tests and that style-intensive tests may be a fixed convention in the field.

Of the remaining prominent errors in the sample, “misspelling” merits some discussion. The contingency table analysis showed a lower-than-expected frequency for this error, and the weighted index ranked this error sixth. These results are somewhat surprising, given that previous studies have shown a high frequency of misspellings [2], [6], [8], [9], most notably, Connors and Lunsford’s study, in which this error outnumbered all others by 300% [8]. Several explanations for the present study’s results exist, all of which require further research.

First, the prevailing errors in this study were derived by relativizing each error’s frequency against its dispersion. To my knowledge, no other study has provided a weighted index of formal errors. In fact, “misspelling” was the most frequent error in this sample but dispersed through 69% of the tests, accounting for its weighted ranking of sixth. The available published data are not sufficiently fine-grained for deeper analysis, so it is not possible to determine if spelling errors in earlier studies ranked high simply because of frequency or because of a combination of variables such as dispersion.

Next, the context of the spelling errors might explain their lower priority for the sample’s hiring managers. Whereas “misspelling” outnumbered all others by 3:1 in the Connors and Lunsford study [8], it ranked fifth in the later follow-up study and “wrong word” errors claimed the top spot [9]. The researchers attributed this shift to how students used electronic spellcheckers; the technology helped students remedy misspellings, but a reliance on the automated spelling suggestions likely resulted in the increased presence of wrong words. A breakdown of the spelling errors in the current sample showed that 62% were words that an electronic spellchecker could not detect, including compound words, homonyms, and proper nouns. The remaining errors could be detected by a spellchecker, including general misspellings (26%) and British or alternative spellings of words (12%). Hiring managers may be less concerned with candidates’ understanding of misspellings than expected, but the strong presence of errors undetectable by a spellchecker certainly suggests a means for assessing applicants’ attention to detail. However, “wrong word” ranked lower (13th) than “misspelling,” so it is not yet possible to conclude if these results are an anomaly or an illustration of how developing student writers and expert hiring managers in the health sciences perceive error differently.

Conclusions

The results provide prospective medical editors and hiring managers with new insights into preparing for or creating an editing test. For medical editors, the list of predominant errors should alleviate some of the anxiety associated with taking an editing test. In addition to just understanding error types, however, future test takers must consider how context dictates the presence of specific errors. Understanding the contexts associated with capitalization, spelling, and passive voice (as examples) also offers hiring managers information on creating new or refining existing assessment tools. All of these conclusions must be heavily hedged, however, because of the sample size of the study.

Additional research with larger samples will extend the results as well as indicate if the privatization of editing tests (and therefore the lack of publicly available examples) has produced disparity in what editing tests actually assess. The measures used to analyze this study’s sample provide a first step to this deeper research. The contingency table analysis provides a means for measuring error distributions in previous studies, and these results could suggest how different populations as well as different industries prioritize error. Results from this study indicate that medical editing tests might be more style-intensive than technical editing tests. The weighted index, to my knowledge, is the first to consider both the frequency and the dispersion of specific errors. Future error research must consider error dispersion, particularly those errors that appear infrequently but are strongly dispersed in a sample. The ranking of “misplaced/dangling modifier” makes a strong case for using a weighted index in future research; it appeared infrequently but was found in 10 of the 13 tests, accounting for its weighted rank as the fifth most predominant error. These deeper explorations of error will better educate medical communicators on the skills they need to succeed in their profession and enhance the value associated with medical writing and editing.

Acknowledgment

I thank Stefanie Beaubien, Carrie Klein, Wanda J. Reese, and Larissa True for their assistance with classifying the sample. I also thank Stefanie Wulff and Katharine O’Moore-Klopf for their feedback on earlier drafts.

Author disclosure: The author notes that he has no commercial interests that may pose a conflict of interest with this article.

Author contact: ryan.boettger@unt.edu

References

1. Lang T. Just who are we and what are we doing, anyway? Needed research in medical writing. AMWA J. 2009;24(3):106-112.
2. Boettger RK. Examining error in the technical communication editing test. Presented at annual meeting of the Society for Technical Communication; May 18, 2011; Sacramento, CA.
3. Hart GJS. Editing tests for writers. Intercom. 2003:12-15.
4. Gilsdorf JW, Leonard DJ. Big stuff, little stuff: a decennial measurement of executives’ and academics’ reactions to questionable usage errors. J Bus Comm. 2001;38(4):439-475.
5. Leonard DJ, Gilsdorf JW. Language in change: academics’ and executives’ perceptions of usage errors. J Bus Comm. 1990;27(2):137-158.
6. Beason L. Ethos and error: how business people react to errors. Coll Compos Comm. 2001;53(1):33-64.
7. Hairston M. Not all errors are created equal: nonacademic readers in the professions respond to lapses in usage error. Coll Engl. 1981;43(8):794-806.
8. Connors RJ, Lunsford AA. Frequency of formal errors in current college writing, or Ma and Pa Kettle do research. Coll Compos Comm. 1988;39(4):395-409.
9. Lunsford AA, Lunsford KJ. “Mistakes are a fact of life”: a national comparative study. Coll Compos Comm. 2008;59(4):781-806.
10. Frey LR, Botan CH, Kreps GL. Investigating Communication: An Introduction to Research Methods. 2nd ed. Boston, MA: Allyn and Bacon; 2000.
11. Iverson C, Christiansen S, Flanagin A, et al. AMA Manual of Style: A Guide for Authors and Editors. 10th ed. New York, NY: Oxford University Press; 2007.
12. Lang T. Common statistical errors even YOU can find. Part 1: Errors in descriptive statistics and in interpreting probability values. AMWA J. 2003;18(3):67-71.
13. Knatterud ME. With respect to patients and readers: deadly terms to excise. AMWA J. 2008;23(3):113-117.
14. Amdur RJ, Kirwan J, Morris CG. Use of the passive voice in medical journal articles. AMWA J. 2010;25(3):98-104.
15. Watt J, van den Berg S. Research Methods for Communication Science. Boston, MA: Allyn and Bacon; 1995.