Conference Contributions

Progress of the HeidelGram project is regularly presented at various academic conferences. Below, you will find information about previous conference contributions.

2025

ICAME46 in Vilnius, Lithuania (June 2025)

Onomastic Referencing in 18th-Century British Grammar Writing

Beatrix Busse, Nina Dumrukcic, Sophie Du Bois

Keywords: historical corpus, network analysis, English grammar writing, reference analysis, eighteenth century

The grammarians of the 18th century were well known for their proclamations of what constitutes ‘correct’ language use, also known as the ‘doctrine of correctness’ (Leonard 1929/1962). For these, they appealed to the authority of analogy with Latin, reason, or logic. In English grammar writing, Robert Lowth’s A Short Introduction to English Grammar from 1762 and Lindley Murray’s The English Grammar from 1795 are the most influential works, and their authors are probably the major prescriptivists (Tieken-Boon van Ostade 2011, p. 4) of the 18th century. Chapman (2008, p. 36) claims that the 18th-century grammarians were not solely prescriptivists, but that some investigated and published on language, attempting to address linguistic questions such as vernacular grammar or universals.

Within the HeidelGram project (https://heidelgram.de), investigations of British grammar writing from the 16th, 17th, and 19th centuries have indicated shifts not only in who is being referenced in these grammar texts, but also in which strategies are employed to do so. This paper aims to continue the diachronic analyses of what we call onomastic, that is, name-based, referencing by investigating who is referenced in 18th-century British grammars of English and in what way.

The HeidelGram corpus (Busse et al. 2015–) is a carefully designed corpus that comprises a representative selection of 16th- to 19th-century grammars of English. The 18th-century data comprises a total of about 1.5 million tokens taken from 24 grammar books. A citation network (see White 2011) of grammars and grammarians is created to illustrate which texts from the 18th-century sub-corpus of the HeidelGram corpus refer to whom and who the most influential figures of the period are. Each onomastic reference is extracted automatically from corpus annotations using a custom tool written in Python and R in order to generate a network. The references are categorized into the seven person types that were established during the network analysis of the 16th-century grammar books (see Busse et al. 2021, 2024), such as grammar author and ancient scholar. Frequency and concordance analyses provide further quantitative and qualitative information, which allows for a better understanding of these onomastic references.
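The step from extracted references to a citation network can be illustrated with a minimal Python sketch. It is not the project's tool: the record layout (citing text, referenced person, person type), the function names, and all sample values are assumptions made for illustration only.

```python
from collections import Counter, defaultdict

# Hypothetical annotation records: (citing grammar, referenced person, person type).
# All names below are placeholders, not data from the HeidelGram corpus.
references = [
    ("Grammar A", "Person X", "grammar author"),
    ("Grammar A", "Person Y", "ancient scholar"),
    ("Grammar B", "Person X", "grammar author"),
    ("Grammar B", "Person X", "grammar author"),
    ("Grammar C", "Person Y", "ancient scholar"),
]


def build_network(records):
    """Return weighted directed edges: edges[source][target] = number of references."""
    edges = defaultdict(Counter)
    for source, target, _person_type in records:
        edges[source][target] += 1
    return edges


def in_degree(edges):
    """Count how many distinct texts refer to each person."""
    counts = Counter()
    for targets in edges.values():
        for target in targets:
            counts[target] += 1
    return counts


def type_frequencies(records):
    """Count references per person type (e.g. grammar author, ancient scholar)."""
    return Counter(person_type for _, _, person_type in records)


network = build_network(references)
most_cited = in_degree(network).most_common()
```

In this sketch, the in-degree serves as one simple proxy for how influential a referenced person is; other centrality measures could be substituted.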

It has been observed that the 18th-century “grammarians rely heavily on each other’s work” (Locher 2008, p. 131), which suggests that there is significant referencing among the authors. We therefore expect to find higher proportions of references to grammar authors as compared to other person categories. References to works that are commonly considered to be prescriptive, such as Lowth’s and Murray’s, indicate how the authors in the corpus position themselves along the prescriptivism–descriptivism continuum. An ego network of references to Lowth will indicate his influence on the later 18th-century grammar authors.
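An ego network keeps one focal node (the ego), the nodes directly linked to it, and the ties among those nodes. The following Python sketch shows that idea under stated assumptions: edges are plain (source, target) pairs, and every edge in the sample list is an invented placeholder rather than a finding from the corpus.

```python
def ego_network(edges, ego):
    """Keep the ego, every node directly linked to it, and all ties among those nodes."""
    alters = {s for s, t in edges if t == ego} | {t for s, t in edges if s == ego}
    nodes = alters | {ego}
    return [(s, t) for s, t in edges if s in nodes and t in nodes]


# Placeholder edges: (citing text, referenced person or text). Illustrative only.
sample_edges = [
    ("Grammar A", "Lowth"),
    ("Grammar B", "Lowth"),
    ("Grammar B", "Grammar A"),
    ("Grammar C", "Person Z"),
]

lowth_ego = ego_network(sample_edges, "Lowth")
```

Here the edge from Grammar C to Person Z is dropped because neither node is directly connected to the ego.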

References:

Busse, B., Gather, K., & Kleiber, I. (2015–). HeidelGram. A Corpus of English Grammar Books between 1550 and 1900. https://heidelgram.de

Busse, B., Dumrukcic, N., & Du Bois, S. (2024). Onomastic Referencing Strategies in a Corpus of 17th-Century Grammars of English. ICAME45, Vigo, Spain.

Busse, B., Kleiber, I., Dumrukcic, N., & Du Bois, S. (2021). A Corpus-Based Network Analysis of 16th-Century British Grammar Writing. CL2021, Limerick, Ireland.

Busse, B., Gather, K., & Kleiber, I. (2020). A Corpus-Based Analysis of Grammarians’ References in 19th-Century British Grammars. In A. Cermakova & M. Malá (Eds.), Diskursmuster - Discourse Patterns: Vol. 20. Variation in Time and Space: Observing the World Through Corpora. De Gruyter.

Busse, B., Gather, K., & Kleiber, I. (2019). Paradigm Shifts in 19th-Century British Grammar Writing: A Network of Texts and Authors. In B. Bös & C. Claridge (Eds.), Norms and Conventions in the History of English. John Benjamins.

Busse, B., Gather, K., & Kleiber, I. (2018). Assessing the Connections between English Grammarians of the Nineteenth Century: A Corpus-Based Network Analysis. In Eric Fuß, Marek Konopka, Beata Trawiński, & Ulrich H. Waßner (Eds.), Grammar and Corpora 2016 (pp. 435–442). Heidelberg University Publishing.

Chapman, D. (2008). The eighteenth-century grammarians as language experts. In I. Tieken-Boon van Ostade (Ed.), Grammars, Grammarians and Grammar-Writing in Eighteenth-Century England (pp. 21-36). Berlin, New York: De Gruyter Mouton.

Locher, M. A. (2008). Chapter 7: The Rise of Prescriptive Grammars on English in the 18th Century. In J. A. Fishman, M. A. Locher, & J. Strässler (Eds.), Contributions to the Sociology of Language. Standards and Norms in the English Language (Vol. 95, pp. 127–148). Mouton de Gruyter. https://doi.org/10.1515/9783110206982.1.127

Leonard, S. A. (1962). The Doctrine of Correctness in English Usage 1700-1800. Russell & Russell Inc. (Original work published 1929)

Tieken-Boon van Ostade, I. (2008). Grammars, grammarians and grammar writing: An introduction. In I. Tieken-Boon van Ostade (Ed.), Topics in English Linguistics: Vol. 59. Grammars, Grammarians and Grammar-Writing in Eighteenth-Century England (pp. 1–14). Mouton de Gruyter.

Tieken-Boon van Ostade, I. (2011). The bishop's grammar: Robert Lowth and the rise of prescriptivism in English. Oxford: Oxford University Press.

White, H. D. (2011). Scientific and scholarly networks. The SAGE handbook of social network analysis, 271-285.

2024

ICAME45 in Vigo, Spain (June 2024)

Onomastic Referencing Strategies in a Corpus of 17th-Century Grammars of English

Beatrix Busse, Nina Dumrukcic, Sophie Du Bois

Keywords: historical corpus, network analysis, English grammar writing, reference analysis, seventeenth century

The 17th century represents an eventful time with regard to language development and instruction. In the late 17th century, the English language expanded to areas of language use where the classical languages had previously predominated, which led to increased standardization in language usage (Nevalainen 2006, 42). This and other sociopolitical developments sparked a shift in favor of English being recognized as a separate academic discipline (Beal 2004, 102). The examination of onomastic references, i.e. name-based references, in the grammatical literature of the 17th century offers a unique perspective on language usage, sociolinguistics, and the cultural nuances embedded in linguistic artifacts of this period.

Such examinations of English grammar writing lie at the core of the HeidelGram project. Previous studies within the project have investigated linguistic means employed by 16th- and 19th-century grammarians when referring to other persons within their works, ultimately aiming for a full diachronic perspective. Most recently, the types of persons referenced by grammarians of the 17th century have been investigated. The present study further quantitatively and qualitatively analyzes the types of references made by 17th-century grammar authors. This allows us to identify where the authors position themselves in relation to others as well as changing or stable trends in referencing strategies.

The HeidelGram corpus, carefully curated to encompass a representative selection of grammatical works from the 16th to 19th centuries, serves as a valuable source for understanding how name-based references were employed in linguistic instruction at the time. For this study, onomastic references within the 17th-century component of the HeidelGram corpus were systematically extracted and visualized in a citation network (see White 2012) using a custom-built tool based on Python and R. The 17th-century component of the corpus encompasses 17 texts, which add up to about 590,000 tokens. From these texts, a total of 2,586 onomastic references were extracted. Each reference to a person was manually analyzed and assigned a reference category. There are six reference categories, which were originally established for the 19th-century grammar data (see Busse et al. 2018, 2019, 2020), such as opinion and quotation. The applicability of this categorization to the 17th-century data will be evaluated via inter-rater reliability measures.
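The abstract does not name a specific inter-rater reliability measure. As one common option for two raters, Cohen's kappa can be sketched in Python as follows; the choice of measure, the function name, and the sample labels are assumptions for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's label distribution.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both raters used a single identical label throughout.
    return (observed - expected) / (1 - expected)


# Invented labels using two of the reference categories named in the text.
labels_a = ["opinion", "quotation", "opinion", "quotation"]
labels_b = ["opinion", "quotation", "quotation", "quotation"]
kappa = cohens_kappa(labels_a, labels_b)
```

For more than two raters, a measure such as Fleiss' kappa or Krippendorff's alpha would be needed instead.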

Our previous work on the 16th- and 19th-century grammar books (Busse et al. 2018, 2019, 2020) has demonstrated the potential of network analysis, a methodological tool for mapping relationships and patterns, as also shown in the pilot network text analysis study on a sample of 17th-century letters compiled from Early Modern Letters Online (EMLO) (McGillivray and Sangati 2018). The application of network analysis allows us to construct and visualize the intricate connections between onomastic references within the grammar books. By mapping these linguistic networks, we aim to uncover patterns, clusters, and semantic relationships that contribute to a deeper understanding of the language norms and perceptions in 17th-century English. We predict that the 17th-century component of the corpus will include even more direct quotations than the 16th-century component due to the increased availability of printed books, which allowed authors to cite another person’s work directly.

The reference strategies employed by grammarians to reference other authors show us how they position themselves with regard to certain beliefs and paradigms. The study elucidates the sociolinguistic dynamics of the time by revealing patterns in the selection and representation of names within the grammatical discourse. The categorization of onomastic references allows for an exploration of the social, cultural, and historical dimensions embedded in the linguistic fabric of the 17th century.

References

Beal, Joan C. 2004. English in Modern Times. London: Arnold.

Busse, Beatrix, Ingo Kleiber, Nina Dumrukcic, and Sophie Du Bois. 2021. “A corpus-based network analysis of 16th-century British grammar writing.” CL2021, Limerick, Ireland.

Busse, Beatrix, Kirsten Gather, and Ingo Kleiber. 2020. “A Corpus-Based Analysis of Grammarians’ References in 19th-Century British Grammars.” In Variation in Time and Space: Observing the World Through Corpora, edited by Anna Cermakova and Markéta Malá. Diskursmuster - Discourse Patterns 20. Berlin: De Gruyter.

Busse, Beatrix, Kirsten Gather, and Ingo Kleiber. 2019. “Paradigm Shifts in 19th-Century British Grammar Writing: A Network of Texts and Authors.” In Norms and Conventions in the History of English, edited by Birte Bös and Claudia Claridge. Current Issues in Linguistic Theory 347. Amsterdam: John Benjamins.

Busse, Beatrix, Kirsten Gather, and Ingo Kleiber. 2018. “Assessing the Connections Between English Grammarians of the Nineteenth Century: A Corpus-Based Network Analysis.” In Grammar and Corpora 2016, edited by Eric Fuß, Marek Konopka, Beata Trawiński, and Ulrich H. Waßner, 435–42. Heidelberg: Heidelberg University Publishing.

McGillivray, Barbara, and Federico Sangati. 2018. Pilot study for the COST Action “Reassembling the Republic of Letters”: Language-driven network analysis of letters from the Hartlib's Papers.

Nevalainen, Terttu. 2006. An Introduction to Early Modern English. Edinburgh: Edinburgh University Press.

2023

CL2023 in Lancaster, UK (July 2023)

Network of Grammar Lexemes in 16th- and 17th-Century English Grammar Writing

Beatrix Busse, Nina Dumrukcic, Sophie Du Bois, Ingo Kleiber

Keywords: historical corpus, English grammar writing, grammatical phenomena, network analysis, seventeenth century

Most terms that originally referred to fields of study “have also come to denote the particular field of language itself” (Fenn 2022, 23), blurring the lines between grammar and linguistics. Before the introduction of approaches such as generative grammar (Chomsky 1957) and systemic functional grammar (Halliday 1994), the earliest stages of English grammar writing relied on the categorization of Latin grammar (Algeo 1985). However, in the late 17th century, the English language became more standardized and extended to domains of language use where the classical languages had previously reigned (Nevalainen 2006, 42). This and other socio-political events prompted a turn towards English being acknowledged as a subject of study in itself (Beal 2004, 102). The scope of grammar has changed from including orthography and prosody to focusing on morphology and syntax (Walmsley 1999, 2495). This study investigates the diachronic shift in terminology employed for grammatical fields of study and their related concepts. A list of terms known to be used for fields of study that fall under ‘grammar’ is extracted from the Historical Thesaurus of English (Kay et al. 2023). For example, Morphology only emerges as a term in 1869–, and was previously referred to as Etymology c1475– and Wordlore 1840–.

The corpus of 16th- and 17th-century English grammars associated with the HeidelGram project and the list of search terms allow us to explore semantic fields and shifts by means of concordance and collocational analysis. We will generate networks of the different structures that occur in grammars during this period enabling us to compare whether there have been notable shifts in what grammatical categories the authors discuss. Our aim is to detect traces of modern theoretical approaches such as Lexical-Functional Grammar (see Bresnan et al. 2016) in the earliest grammars and how they evolved over time.

References:

Algeo, John. 1985. “The Earliest English Grammars” in Historical and Editorial Studies in Medieval and Early Modern English: For Johan Gerritsen. Edited by Mary-Jo Arn and Hanneke Wirtjes, 191–207. Groningen: Wolters-Noordhoff.

Beal, Joan C. 2004. English in Modern Times. London: Arnold.

Bresnan, Joan, Ash Asudeh, Ida Toivonen, and Stephen Wechsler. 2016. Lexical-Functional Syntax. Second edition. Chichester: Wiley Blackwell.

Chomsky, Noam. 1957. Syntactic Structures. The Hague/Paris: Mouton.

Fenn, Peter. 2022. A Student's Advanced Grammar of English (SAGE). 2., revised edition. Stuttgart: utb GmbH.

Halliday, M.A.K. 1994. Introduction to Functional Grammar, Second edition. London: Edward Arnold. 

Kay, Christian, Marc Alexander, Fraser Dallachy, Jane Roberts, Michael Samuels, and Irené Wotherspoon (eds.). 2023. The Historical Thesaurus of English (2nd edn., version 5.0). University of Glasgow. ht.ac.uk.

Michael, Ian. 1970. English Grammatical Categories: And the Tradition to 1800. Cambridge: Cambridge University Press.

Nevalainen, Terttu. 2006. An Introduction to Early Modern English. Edinburgh: Edinburgh University Press.

Walmsley, John. 1999. “English grammatical terminology from the 16th century to the present” in 2. Halbband: Ein internationales Handbuch zur Fachsprachenforschung und Terminologiewissenschaft. Edited by Lothar Hoffmann, Hartwig Kalverkämper, Herbert Ernst Wiegand, Christian Galinski, and Werner Hüllen, 2494-2502. Berlin/New York: Mouton De Gruyter. 

ISLE7 in Brisbane, Australia (June 2023)

Corpus-Based Network Analysis of Onomastic References in 17th-Century Grammar Writing

Beatrix Busse, Nina Dumrukcic, Sophie Du Bois, Ingo Kleiber

Keywords: historical corpus, network analysis, English grammar writing, reference analysis, seventeenth century

Quantitative studies on historical grammar writing are still sparse. Previous studies within the HeidelGram project (heidelgram.de) have approached this issue by investigating linguistic means employed by 16th- and 19th-century grammarians when referring to other persons within their works. The present study aims to further fill this gap by providing a corpus-based analysis of reference strategies in 17th-century British grammar writing.

Onomastic references within the 17th-century component of the HeidelGram corpus will be automatically extracted and visualized in a citation network (see White 2012) using a custom-built tool based on Python and R. Based on concordance lines for each item which was annotated as referencing a person, reference and author categories will be assigned. There are six reference categories, which were originally established for the 19th-century grammar data (see Busse et al. 2018, 2019, 2020), such as opinion and quotation. Similarly, there are seven author categories, which were devised when analyzing the 16th-century author references (see Busse et al. 2021), such as grammar author or political figure. The applicability of both categorizations to the 17th-century data will be evaluated via inter-rater reliability measures.
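How concordance lines for annotated items might be produced can be sketched in Python as a simple keyword-in-context routine. The data layout (a token list with a parallel list of Boolean annotation flags), the function name, and the sample sentence with its NAME placeholder are assumptions; the project's actual annotation format is not described here.

```python
def concordance(tokens, is_reference, window=3):
    """Return (left context, node, right context) for every annotated token."""
    lines = []
    for i, token in enumerate(tokens):
        if is_reference[i]:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, token, right))
    return lines


# Invented example: NAME stands in for a token annotated as a person reference.
tokens = ["as", "NAME", "teaches", "in", "his", "book"]
flags = [False, True, False, False, False, False]
kwic = concordance(tokens, flags, window=2)
```

Each resulting line can then be inspected manually to assign a reference category and an author category.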

English grammars written in the 17th century were still said to be “heavily influenced by their models – grammars of Latin” (Algeo 1986, 309). Similarly, the ancient Greek and Roman literati were still considered authoritative in terms of linguistic understanding (McCarthy 2020, 24). By means of the citation network as well as the author categorization it will be possible to trace the lasting impact of ancient Greece and Rome as well as the Latinate tradition on English grammar writing, whilst also depicting the diachronic shift towards the vernacular. 

At the same time, the increasing prevalence of printing presses in Britain yields an increasing output of English grammars (McCarthy 2020, 23), so that an increase in references to other English grammars and grammarians may be expected.

References

Algeo, John. 1986. “A Grammatical Dialectic.” In The English Reference Grammar: Language and Linguistics, Writers and Readers, edited by Gerhard Leitner, 307–33. Linguistische Arbeiten 172. Tübingen: Niemeyer.

Busse, Beatrix, Ingo Kleiber, Nina Dumrukcic, and Sophie Du Bois. 2021. “A corpus-based network analysis of 16th-century British grammar writing.” CL2021, Limerick, Ireland.

Busse, Beatrix, Kirsten Gather, and Ingo Kleiber. 2020. “A Corpus-Based Analysis of Grammarians’ References in 19th-Century British Grammars.” In Variation in Time and Space: Observing the World Through Corpora, edited by Anna Cermakova and Markéta Malá. Diskursmuster - Discourse Patterns 20. Berlin: De Gruyter.

Busse, Beatrix, Kirsten Gather, and Ingo Kleiber. 2019. “Paradigm Shifts in 19th-Century British Grammar Writing: A Network of Texts and Authors.” In Norms and Conventions in the History of English, edited by Birte Bös and Claudia Claridge. Current Issues in Linguistic Theory 347. Amsterdam: John Benjamins.

Busse, Beatrix, Kirsten Gather, and Ingo Kleiber. 2018. “Assessing the Connections Between English Grammarians of the Nineteenth Century: A Corpus-Based Network Analysis.” In Grammar and Corpora 2016, edited by Eric Fuß, Marek Konopka, Beata Trawiński, and Ulrich H. Waßner, 435–42. Heidelberg: Heidelberg University Publishing.

McCarthy, Michael. 2020. Innovations and Challenges in Grammar. Innovations and Challenges in Applied Linguistics. New York: Routledge.

ICAME44 in Vanderbijlpark, South Africa (May 2023)

A Corpus-Based Analysis of 18th-Century American Grammars

Sophie Du Bois

Keywords: historical corpus, network analysis, American grammar writing, reference analysis, eighteenth century

The practice of writing grammars of English commenced much later in the United States of America than in Britain. If a grammar is considered American when it was originally published in the United States, the first American grammar is Samuel Johnson’s First Easy Rudiments of Grammar from 1765 (Lyman 1922, Nietz 1961, Alston 1965).

This study of 18th-century American grammars therefore aims to trace the origins of the practice of English grammar writing in the United States and the relations of American grammarians to their British ancestry. For this purpose, a corpus of five American grammars from the 18th century (ca. 87,000 tokens) was compiled.

Making use of the project-specific annotations in the corpus, onomastic references made to other persons are automatically extracted using a custom Python-based script. The resulting concordances are manually investigated and split into actual references and mentions within example sentences. Both categories are then further analyzed for the types of persons referenced using a system of person categories originally established for an analysis of 16th-century references in British grammars (Busse et al. 2021). The non-example references are further assigned categories initially developed for references in 19th-century British grammars (Busse et al. 2020). The results are displayed in a citation network (see White 2012), enabling a qualitative evaluation of references made. The observations are further enhanced by frequency analyses.

The investigations show that the majority of name-based references occur within the context of examples (59%), demonstrating the high value that is placed on demonstration by means of example sentences. Within these, religious and political figures are frequently mentioned, implying that not only religious texts (cf. Baron 1982, 126), but also political speeches were considered to be valuable moral guidance for the students.

Furthermore, although a sentiment of patriotism and independence passes through the country in the late 18th century, this sentiment has not yet reached early American grammarians in terms of their language ideologies. They predominantly reference British authors, implying a British dominance at the time in terms of authority on language matters, and confirming that the association of language and the American nation only culminated in the 19th century (Andresen 1990, 29-41).

References

Andresen, Julie Tetel. 1990. Linguistics in America 1769-1924: A Critical History. London: Routledge.

Alston, R. C. 1965. English Grammar Written in English and English Grammar Written in Latin by Native Speakers: Vol. 1. A Bibliography of the English Language From the Invention of Printing to the Year 1800. A Systematic Record of Writings on English, based on the Collections of the Principal Libraries of the World. Leeds: Arnold & Son.

Baron, Dennis E. 1982. Grammar and Good Taste: Reforming the American Language. New Haven, London: Yale University Press.

Busse, Beatrix, Ingo Kleiber, Nina Dumrukcic, and Sophie Du Bois. 2021. “A corpus-based network analysis of 16th-century British grammar writing.” CL2021, Limerick, Ireland.

Busse, Beatrix, Kirsten Gather, and Ingo Kleiber. 2020. “A Corpus-Based Analysis of Grammarians’ References in 19th-Century British Grammars.” In Variation in Time and Space: Observing the World Through Corpora, edited by Anna Cermakova and Markéta Malá. Diskursmuster - Discourse Patterns 20. Berlin: De Gruyter.

Johnson, Samuel. 1765. First Easy Rudiments of Grammar, Applied to the English Tongue. New York: J. Holt.

Lyman, R. L. V. 1922. English Grammar in American Schools Before 1850. Chicago, Illinois: University of Chicago.

Nietz, John Alfred. 1961. Old Textbooks: Spelling, Grammar, Reading, Arithmetic, Geography, American History, Civil Government, Physiology, Penmanship, Art, Music, as Taught in the Common Schools from Colonial Days to 1900. Pittsburgh: American Book-Stratford Press, Inc.

2022

ICAME43 in Cambridge, UK (July 2022)

Diachronic Analysis of Grammatical Forms and Functions in a Corpus of 16th- to 19th-Century English Grammar Books

Beatrix Busse, Nina Dumrukcic, Sophie Du Bois, Ingo Kleiber

Keywords: forms and functions, historical corpus, English grammar, text analysis, cohesive devices

The contents of contemporary English grammar books, such as A Comprehensive Grammar of the English Language (Quirk et al. 1985), the Longman Student Grammar of Spoken and Written English (Biber et al. 2002), or the more recently published Doing English Grammar (Berry 2021), are usually organized according to parts of speech and structural elements such as phrases or clauses. Various sub-headings typically further outline, for example, different types of word class (e.g., 4. Pronouns; 4.1. Personal Pronouns; 4.2. Reflexive Pronouns). On the one hand, this modern structuring principle enhances cohesive orientation. On the other hand, the structural outline is also in line with the theoretical approach taken (functional, corpus-based, etc.).

This paper is a pilot study analyzing whether and how Early Modern English grammarians signposted the content of their grammars through headings as cohesive devices which tie text segments together (Halliday and Hasan 1976; Fakeuade and Sharndama 2012) to “create unity of meaning” (Jambak and Gurning 2014: 61). For this purpose, a sub-corpus of these headings, which will be part of the HeidelGram corpus – a representative compilation of English grammar books from the 16th until the 19th century (see e.g., Busse et al. 2020) – is compiled. Due to irregularities in typesetting, the extraction is a two-step process which relies on quantitative and qualitative methods. First, a sample of visible signposts in the form of section headings, which indicate to the reader what the subsequent section will be about, is identified, extracted, and quantitatively evaluated. Based on this evaluation, a larger sample is extracted in a second step for further analysis. Other types of extratextual elements such as boilerplates and notes in margins are not considered. Intratextual cohesive markers, such as topic sentences, and historiated initials are also excluded.

Based on this sample data, a diachronic analysis of the terminology used to describe grammatical categories and phenomena is performed using standard corpus linguistic tools such as WordHoard (2004-2020) which is used to track changes and salience of word-forms over time, and WMatrix (Rayson 2009) to determine key references to grammatical categories. Using modern grammatical terminology from the most commonly consulted books on English grammar (i.e., Quirk et al. 1985, Biber et al. 2002) as a baseline, we shall describe the lexico-grammatical strategies of signposting in Early Modern English grammars, thus reconstructing the development of fields of study such as morphology or syntax, and study genre conventions of English grammars in long-term diachrony. Based on this dataset of forms and functions of headings in this particular genre, we determine what grammatical categories and phenomena were most salient from the grammarians’ perspective at the time, and how their centrality and representation changed diachronically.

Ultimately, this pilot study will help us in operationalizing grammatical terminology throughout time. In a follow-up study, the full grammar texts will be analyzed for their references to grammatical categories and phenomena, which will further expand the diachronic form to function mapping.

In keeping with the conference theme of whether corpus linguistics is a new normal, we show how corpus linguistic tools enable us to search for forms and functions in historical texts over long periods of time efficiently and rapidly, rather than relying on time-consuming manual close reading.

References:

Berry, Roger. 2021. Doing English Grammar: Theory, Description and Practice. Cambridge: Cambridge University Press.

Biber, Douglas, Susan Conrad, and Geoffrey Leech. 2002. Longman Student Grammar of Spoken and Written English. Harlow: Pearson Education.

Busse, Beatrix, Kirsten Gather, and Ingo Kleiber. 2020. “A Corpus-Based Analysis of Grammarians’ References in 19th-Century British Grammars.” In Variation in Time and Space: Observing the World Through Corpora, edited by Anna Cermakova and Markéta Malá. Diskursmuster - Discourse Patterns 20. Berlin: De Gruyter.

Fakeuade, Gbenga and Emmanuel C. Sharndama. 2012. “A Comparative Analysis of Variations in Cohesive Devices in Professional and Popularized Legal Text.” British Journal of Arts and Social Sciences 4(2): 300-318.

Halliday, Michael A. K., and Ruqaiya Hasan. 1976. Cohesion in English. London and New York: Longman.

Jambak, Vany T., and Busmin Gurning. 2014. “Cohesive Devices Used in the Headline News of the Jakarta Post.” Linguistica 3(1): 58-71.

Rayson, Paul. 2009. Wmatrix: a web-based corpus processing environment, Computing Department, Lancaster University. Available at ucrel.lancs.ac.uk/wmatrix/.

Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik. 1985. A Comprehensive Grammar of the English Language. Harlow: Longman.

WordHoard. 2004–2020. WordHoard: An Application for the close reading and scholarly analysis of deeply tagged text. Available at wordhoard.northwestern.edu/userman/index.html.

2021

ICAME42 in Dortmund, Germany (August 2021)

Crossing the Boundary of Time: Retraining Modern NLP Models for Specialized Historical Corpus Data

Beatrix Busse, Ingo Kleiber, Sophie Du Bois, Nina Dumrukcic

Keywords: Deep Learning, NLP, Relational Database, Historical Corpus, English grammar writing

The application of deep learning (DL) within NLP has yielded promising results for a variety of tasks, and the field has seen a ‘neural turn’. While DL approaches have become the standard for contemporary English, historical data has not received the same amount of attention since state-of-the-art models are almost exclusively trained on contemporary language data.

In the spirit of this conference’s theme, “crossing boundaries,” this paper serves as a case study in how adapting current DL language models to the (historical) corpus domain can improve next-word prediction and additional downstream tasks for working with historical data. 

Therefore, the baseline performance of state-of-the-art language models, e.g., BERT (see Devlin et al. 2018) and GPT-2 (see Radford et al. 2019), is compared to that of models fine-tuned on both our own corpus of 16th-century English grammars and external historical data such as EEBO TCP (Early English Books Online (EEBO) TCP).
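The abstract does not state which metric is used for the comparison. One common way to compare next-word prediction quality is perplexity, sketched below in Python. The choice of metric, the function name, and the probability values are assumptions for illustration and are not results from the study.

```python
import math


def perplexity(token_probs):
    """Perplexity from the probabilities a model assigned to each actual next token."""
    if not token_probs:
        raise ValueError("At least one token probability is required.")
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(-log_sum / len(token_probs))


# Invented probabilities for the same held-out passage under two hypothetical models.
baseline_probs = [0.05, 0.10, 0.02, 0.08]
finetuned_probs = [0.20, 0.25, 0.10, 0.30]

baseline_ppl = perplexity(baseline_probs)
finetuned_ppl = perplexity(finetuned_probs)
```

Lower perplexity means the model found the held-out historical text less surprising; in practice the probabilities would come from the language model itself rather than from a hand-written list.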

The corpus introduced and utilized in this paper, which is part of the larger HeidelGram project (see e.g. Busse et al. 2020), represents what we label to be British grammars of English from the 16th century. For instance, William Bullokar's Brief Grammar for English, defining amongst other things parts of speech and grammatical cases, constitutes such a grammar. 

By applying fine-tuned language models, we are approaching multiple problems identified within the HeidelGram research project, such as mitigating bad scans and OCR as well as diverse classification tasks (e.g., categorizing reference types), from a computational perspective. 

In addition, we will propose and demonstrate a relational database approach powering both our corpora as well as our processing and analysis pipelines. While relying on text files and XML annotations (e.g., TEI-based) has been the standard in corpus construction, approaches using relational databases are gaining more attention (e.g., Davies 2005) due to their wide range of advantages (ibid.). 
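What a relational layout for a corpus could look like is sketched below with Python's built-in sqlite3 module. The table names, columns, annotation layer, and sample tokens are hypothetical and are not the project's schema; only the title and year of Bullokar's grammar are taken from the reference list.

```python
import sqlite3


def create_corpus_db():
    """Create an in-memory database with texts, tokens, and stand-off annotations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE texts (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            year INTEGER
        );
        CREATE TABLE tokens (
            id INTEGER PRIMARY KEY,
            text_id INTEGER NOT NULL REFERENCES texts(id),
            position INTEGER NOT NULL,
            form TEXT NOT NULL
        );
        CREATE TABLE annotations (
            id INTEGER PRIMARY KEY,
            token_id INTEGER NOT NULL REFERENCES tokens(id),
            layer TEXT NOT NULL,
            value TEXT NOT NULL
        );
        """
    )
    return conn


conn = create_corpus_db()
conn.execute(
    "INSERT INTO texts (id, title, year) VALUES (1, 'Brief Grammar for English', 1586)"
)
# Invented tokens; NAME stands in for a token annotated as a person reference.
forms = ["as", "NAME", "teaches"]
conn.executemany(
    "INSERT INTO tokens (text_id, position, form) VALUES (1, ?, ?)",
    list(enumerate(forms)),
)
conn.execute(
    "INSERT INTO annotations (token_id, layer, value) VALUES (2, 'reference_type', 'opinion')"
)

# Query all annotated references together with the text they occur in.
rows = conn.execute(
    """
    SELECT texts.title, tokens.form, annotations.value
    FROM annotations
    JOIN tokens ON tokens.id = annotations.token_id
    JOIN texts ON texts.id = tokens.text_id
    WHERE annotations.layer = 'reference_type'
    """
).fetchall()
```

Keeping annotations in their own table means new annotation layers can be added without changing the token table, which is one of the advantages usually attributed to this approach.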

While the focus will be on our specialized grammar corpus, we believe that insights from our experiments, both in terms of fine-tuning existing models as well as using relational databases, will be of interest to everyone working on historical corpora wanting to practically apply current methods from NLP.

References

Bullokar, William. 1586. Brief Grammar for English. London: Edmund Bollifant.

Busse, Beatrix, Kirsten Gather, and Ingo Kleiber. 2020. “A Corpus-Based Analysis of Grammarians’ References in 19th-Century British Grammars.” In Variation in Time and Space: Observing the World Through Corpora, edited by Anna Cermakova and Markéta Malá. Diskursmuster - Discourse Patterns 20. Berlin: De Gruyter.

Davies, Mark. 2005. “The Advantage of Using Relational Databases for Large Corpora: Speed, Advanced Queries, and Unlimited Annotation.” International Journal of Corpus Linguistics 10 (3): 307–34.

Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.” Unpublished manuscript. arxiv.org/pdf/1810.04805.

Early English Books Online (EEBO) TCP. (n.d.). Retrieved November 24, 2020, from https://textcreationpartnership.org/tcp-texts/eebo-tcp-early-english-books-online/

Radford, Alec, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. “Language Models are Unsupervised Multitask Learners.”

CL2021 in Limerick, Ireland (July 2021)

A corpus-based network analysis of 16th-century British grammar writing

Beatrix Busse, Ingo Kleiber, Nina Dumrukcic, Sophie Du Bois

Keywords: historical corpus, network analysis, English grammar writing, reference analysis, sixteenth century

Large-scale research of English grammar writing, in the sense of considering an extended period of time as well as not restricting the research to certain authors, grammar books or linguistic phenomena, is still sparse. Approaching this task, previous studies within the larger HeidelGram project have investigated the linguistic means employed by 19th-century grammarians when referring to other grammar authors (Busse et al., 2019, 2020; Busse & Gather, 2017).

The aim of this study is to extend these findings to the newly compiled 16th-century part of the HeidelGram corpus. Considering that the concept of grammar has changed throughout the centuries (Downey, 1986, p. 337; McCarthy, 2020, p. 25), the data selection for the HeidelGram corpus, spanning the 16th to 19th centuries, is based on what we label to be an English grammar. For instance, William Bullokar's Brief Grammar for English, defining amongst other things parts of speech and grammatical cases, constitutes such a grammar.

Seeing as the 16th century marks the beginning of English grammar writing (McCarthy, 2020, pp. 19–20), it is to be expected that there will be few, if any, references to other authors of English grammars. Rather, references to grammar writers of the so-called learned languages, especially Latin, are more likely. Further, the references found in the 16th-century grammars will be sorted along the six semantic categories suggested in previous work for 19th-century grammars (Busse et al., 2020, pp. 11–12). To account for a possible shift in reference strategies over time, these semantic categories will be reevaluated in the 16th-century context.

The above hypotheses will be investigated by tracing onomastic references in the 16th-century grammars using citation networks (see White, 2012). While the number of references to other authors will provide us with quantitative insights, the types of references will offer a qualitative perspective.

References:

Busse, B., & Gather, K. (2017, July 24). HeidelGram: Network Analysis of Grammarians' References in 19th-Century British Grammars: A Corpus-Based Study, Birmingham. www.birmingham.ac.uk/Documents/college-artslaw/corpus/conference-archives/2017/general/paper211.pdf

Busse, B., Gather, K., & Kleiber, I. (2019). Paradigm Shifts in 19th-Century British Grammar Writing: A Network of Texts and Authors. In B. Bös & C. Claridge (Eds.), Norms and Conventions in the History of English. John Benjamins.

Busse, B., Gather, K., & Kleiber, I. (2020). A Corpus-Based Analysis of Grammarians’ References in 19th-Century British Grammars. In A. Cermakova & M. Malá (Eds.), Diskursmuster - Discourse Patterns: Vol. 20. Variation in Time and Space: Observing the World Through Corpora. De Gruyter.

Downey, C. (1986). The Constants and Variables Which Guided the Development of American Grammar Writing in the 18th and 19th Centuries. In G. Leitner (Ed.), Linguistische Arbeiten: Vol. 172. The English reference grammar: Language and linguistics, writers and readers (pp. 334–350). Niemeyer.

McCarthy, M. (2020). Innovations and Challenges in Grammar. Innovations and Challenges in Applied Linguistics. doi.org/10.4324/9780429243561

White, H. D. (2012). Scientific and Scholarly Networks. In J. Scott & P. Carrington (Eds.), The SAGE Handbook of Social Network Analysis (pp. 271–285). SAGE. doi.org/10.4135/9781446294413.n19

2019

CL2019 in Cardiff, UK (June 2019)

The Case for Custom Software Development: HGSimpleCorpusNetwork – A Network Analysis Toolbox for (Historical) Corpora

Beatrix Busse, Ingo Kleiber

The discipline of corpus linguistics has always been closely linked to technology, and some even claim that "[i]t was not the linguistic climate, but the technological one that stimulated the development of corpora" (Tognini-Bonelli 2010: 15).

Whenever we are conducting corpus linguistic research, we are heavily relying on our data (the corpora), but also on the software and tools that we are using. Therefore, "[t]he functionality offered by software tools largely dictates what corpus linguistics research methods are available to a researcher" (Anthony 2013: 141). Furthermore, it has to be recognized that the choice of tools almost always has some impact on the results of the analysis due to the decisions made by the developers.

This has led to the development of hundreds of applications, toolboxes, and frameworks, which arguably cover most use cases imaginable. Nevertheless, new tools are being developed despite the availability of de facto standard tools, which are widely used, well documented, and well tested.

The primary reason for this seems to be the need to process and analyze ever more complex and specialized data based on a rapidly growing set of methodologies. In addition, custom software has the advantage of allowing researchers to "tailor the output of the analysis to fit [their] research needs" (Biber et al. 2006: 255) and to stay "in the driver's seat" (Gries 2009: 1236).

We will be carefully making a case for custom software development within the field of corpus linguistics. In order to do this, we will be presenting HGSimpleCorpusNetwork, a custom toolbox for the analysis of historical, diachronic corpora using network analytical approaches. This toolbox, currently under development within the HeidelGram project (investigating discourses of English grammar writing between 1550 and 1900), is tailored specifically towards the data, methodology, and research questions at hand. For example, the tool is especially useful when analyzing (potentially unreliable) data generated via unsupervised OCR. Following an agile approach to development, the software is developed alongside the research project and continuously adapted and improved based on the current research aims.

Looking back at some older case studies using the toolbox, advantages, issues, and lessons learned regarding the development of custom, project-specific, corpus analysis software will be presented. Based on these insights, we are also going to provide some generalized guidelines and best practices for those who need to decide between using existing tools and developing project-specific software.

References:

Anthony, L. 2013. A critical look at software tools in corpus linguistics. Linguistic Research 30 (2), 141–61.

Biber, D., Conrad, S., & Reppen, R. (1998) 2006. Corpus linguistics: Investigating language structure and use. Cambridge University Press.

Gries, S. T. 2009. What is corpus linguistics? Language and Linguistics Compass 3 (5), 1225–41.

Tognini-Bonelli, E. 2010. Theoretical overview of the evolution of corpus linguistics. In A. O’Keeffe & M. McCarthy (Eds.), The Routledge handbook of corpus linguistics, 14-27. London: Routledge.

ICAME40 in Neuchâtel, Switzerland (June 2019)

The Case for Custom Software Development: The Example of HGSimpleCorpusNetwork

Beatrix Busse, Ingo Kleiber

The discipline of corpus linguistics has always been closely linked to technology, and some even claim that "[i]t was not the linguistic climate, but the technological one that stimulated the development of corpora" (Tognini-Bonelli 2010: 15).

Whenever we are conducting corpus linguistic research, we are heavily relying on our data (the corpora), but also on the software and tools that we are using. Therefore, "[t]he functionality offered by software tools largely dictates what corpus linguistics research methods are available to a researcher" (Anthony 2013: 141). Furthermore, it has to be recognized that the choice of tools almost always has some impact on the results of the analysis due to the decisions made by the developers.

This has led to the development of hundreds of applications, toolboxes, and frameworks, which arguably cover most use cases imaginable. Nevertheless, new tools are being developed despite the availability of de facto standard tools, which are widely used, well documented, and well tested.

The primary reason for this seems to be the need to process and analyze ever more complex and specialized data based on a rapidly growing set of methodologies. In addition, custom software has the advantage of allowing researchers to "tailor the output of the analysis to fit [their] research needs" (Biber et al. 2006: 255) and to stay "in the driver's seat" (Gries 2009: 1236).

In this talk, we will be carefully making a case for custom software development within the field of corpus linguistics. In order to do this, we will be reflecting on the development of HGSimpleCorpusNetwork, a custom toolbox for the analysis of historical, diachronic corpora using network analytical approaches. This toolbox, currently under development within the HeidelGram project (investigating discourses of English grammar writing between 1550 and 1900), is tailored specifically towards the data, methodology, and research questions at hand. Following an agile approach to development, the software is developed alongside the research project and continuously adapted and improved based on the current research aims.

Looking back at some older case studies using the toolbox, advantages, issues, and lessons learned regarding the development of custom, project-specific, corpus analysis software will be presented. Based on these insights, we are going to provide some generalized guidelines and best practices for those who need to decide between using existing tools and developing project-specific software.

References:

Anthony, Laurence. 2013. “A Critical Look at Software Tools in Corpus Linguistics.” Linguistic Research 30 (2): 141–61.

Biber, Douglas, Susan Conrad, and Randi Reppen. (1998) 2006. Corpus Linguistics: Investigating Language Structure and Use. Cambridge University Press.

Gries, Stefan T. 2009. “What is Corpus Linguistics?” Language and Linguistics Compass 3 (5): 1225–41.

Tognini-Bonelli, Elena. 2010. “Theoretical Overview of the Evolution of Corpus Linguistics.” In The Routledge Handbook of Corpus Linguistics, edited by Anne O’Keeffe and Michael McCarthy, 14–27. Routledge Handbooks in Applied Linguistics. London: Routledge.

2018

ICAME39 in Tampere, Finland (May 2018)

HeidelGram: A network of evaluative terms in 19th-century British grammars – Methodological challenges and practical solutions

Beatrix Busse and Ingo Kleiber

The HeidelGram project has a twofold aim. Firstly, it makes an essential contribution to historical grammar studies by compiling, analysing, and giving open access to a representative 10-million-word corpus of historical English grammar books from the 16th to the 19th centuries. Secondly, it introduces state-of-the-art network analysis into diachronic corpus linguistics, thus considerably extending the set of concepts and methods applied in historical linguistics. Our overall aim is to examine discourses in English grammar writing by implementing and analysing three exemplary networks – a network of grammars and grammarians, a network of evaluative terms associated with verbal hygiene (Cameron 2012 [1995]), and a network of lexemes referring to grammatical phenomena.

While network analytical methods have been applied to historical textual material (e.g. Bergs 2005; Sairio 2009; Fitzmaurice 2010) and fictional texts (e.g. Agarwal et al. 2012; Moretti 2013), the combination of corpus-based diachronic linguistics and network analysis is rather uncharted territory. This new approach poses significant methodological challenges and requires us to come up with new forms of extracting, annotating, and analysing historical linguistic data. A series of exploratory studies (Busse et al. 2016a and 2016b; Busse and Gather 2017), based on a systematically compiled and representative corpus of 19th-century British grammar books (40 texts, approx. 2.6 mio. words), has already shown the potential of this approach towards conducting historical grammar studies. In the present paper we want to present initial findings regarding the network of evaluative terms and discuss some of the major methodological and technical challenges associated with this approach. Evaluative terms include expressions like “greatly erred” in Crombie’s 1802 grammar: “Priestley, in defending the other phraseology, appears to me to have greatly erred” (Crombie 1802: 302).

This second network will not only help us to critically reflect upon the concepts of prescriptivism and descriptivism, but also to uncover linguistic practices and patterns that may have led to these discursive turns. Based on an extended and optimized version of our pilot corpus, containing the most well-known and widely distributed grammars of the 19th century (cf. Leitner 1986, 1991; Linn 2006; Michael 1987; Görlach 1998), we will begin to quantitatively investigate terms associated with verbal hygiene (Cameron 2012 [1995]), i.e. active practices of filtering, evaluating, and modifying normative language usage, and their relationships.

Furthermore, informed by this initial analysis, we will discuss three major challenges associated with historical corpus-based network analysis and potential strategies for mitigating them. We will discuss typical issues with optical character recognition (OCR) and state-of-the-art workflows, procedures, and tools, both automatic and manual, to reduce misreadings. Also, we will look at problems and solutions associated with automatically generating meaningful graphs (i.e. networks) out of unstructured and unannotated linguistic data. Finally, we will present an early approach to visualizing such graphs in a way that allows for visual diachronic analysis.
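
One automatic way of tolerating OCR misreadings when searching for grammarians' names can be sketched as follows. This is only an illustration under stated assumptions: it uses approximate string matching from Python's standard difflib module, the function name `match_name`, the cutoff of 0.8, and the misread form "Prieftley" are invented for this example, and nothing here is meant to describe the workflow actually used in the project.

```python
import difflib


def match_name(token, known_names, cutoff=0.8):
    """Return the known name most similar to `token`, or None.

    `cutoff` is the minimum similarity ratio (0 to 1) that difflib must
    report; 0.8 is an arbitrary illustrative threshold, not a tuned value.
    """
    matches = difflib.get_close_matches(token, known_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    names = ["Priestley", "Crombie", "Murray"]
    # "Prieftley" is an invented misreading used only for demonstration.
    for token in ["Prieftley", "Crombie", "the"]:
        print(token, "->", match_name(token, names))
```

A lower cutoff catches more misreadings but also produces more wrong hits, so any threshold of this kind would need to be checked manually against a sample of the data.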

References:

Agarwal, A., Corvalan, A., Jensen, J., & Rambow, O. (2012). Social Network Analysis of Alice in Wonderland. In D. Elson, A. Kazantseva, R. Mihalcea, & S. Szpakowicz (Eds.), Proceedings of the NAACL-HLT 2012 Workshop on Computational Linguistics for Literature. Montréal: Association for Computational Linguistics, 88–96. Retrieved from www.aclweb.org/anthology/W12-2513.

Bergs, A. (2005). Social Networks and Historical Sociolinguistics: Studies in Morphosyntactic Variation in the Paston Letters (1421–1503). Berlin/New York: Mouton de Gruyter.

Busse, B., & Gather, K. (2017, July). HeidelGram: Network Analysis of Grammarians’ References in 19th-Century British Grammars: A Corpus-Based Study. Corpus Linguistics Conference, Birmingham.

Busse, B., Gather, K., & Kleiber, I. (2016, August). Paradigm Shifts in 19th-Century British Grammar Writing: A Network of Texts and Authors. Proceedings of the 19th International Conference on English Historical Linguistics, Duisburg/Essen.

Busse, B., Gather, K., & Kleiber, I. (2016, November). Assessing the Connections between English Grammarians of the 19th Century: A Corpus-Based Network Analysis. Proceedings of the 6th Grammar and Corpora Conference, Mannheim.

Cameron, D. (2012). Verbal Hygiene: The Politics of Language. London: Routledge (Original work published 1995).

Crombie, A. (1802). The Etymology and Syntax of the English Language, Explained and Illustrated. London: J. Johnson.

Fitzmaurice, S. (2010). Coalitions, Networks, and Discourse Communities in Augustan England: The Spectator and the Early Eighteenth-Century Essay. In R. Hickey (Ed.), Studies in English Language. Eighteenth-Century English: Ideology and Change. Cambridge: Cambridge University Press, 106–132.

Görlach, M. (1998). An Annotated Bibliography of Nineteenth-Century Grammars of English. Amsterdam/Philadelphia: John Benjamins.

Leitner, G. (1986). English Traditional Grammars in the Nineteenth Century. In D. Kastovsky & A. Szwedek (Eds.), Trends in Linguistics. Studies and Monographs: Vol. 32. Linguistics Across Historical and Geographical Boundaries: Vol 2: Descriptive, Contrastive, and Applied Linguistics. In Honour of Jacek Fisiak on the Occasion of His Fiftieth Birthday. Berlin: De Gruyter, 1333–1356.

Leitner, G. (1991). English Traditional Grammars: An International Perspective. Amsterdam Studies in the Theory and History of Linguistic Science: Vol. 62. Amsterdam/Philadelphia: John Benjamins.

Linn, A. (2006). English Grammar Writing. In B. Aarts & A. M. S. McMahon (Eds.), Blackwell Handbooks in Linguistics. The Handbook of English Linguistics. Malden: Blackwell, 72–92.

Michael, I. (1987). The Teaching of English. Cambridge: Cambridge University Press.

Moretti, F. (2013). Distant Reading. London: Verso.

Sairio, A. (2009). Language and Letters of the Bluestocking Network: Sociolinguistic Issues in Eighteenth-Century Epistolary English. Mémoires de la Société Néophilologique de Helsinki: Vol. 75. Helsinki: Société Néophilologique.

2017

ICAME38 in Prague, Czech Republic (May 2017)

HeidelGram: A corpus-based network analysis of grammarians‘ references in 19th-century British grammars

Beatrix Busse, Kirsten Gather, Ingo Kleiber

The HeidelGram project, based at the English Department of Heidelberg University, has a twofold aim. First, it makes an essential contribution to historical grammar studies by compiling, investigating, and making available a representative 10-million-word corpus of historical English grammar books from the 16th to the 19th centuries. Second, it introduces state-of-the-art network analysis into diachronic corpus linguistics in order to considerably extend the set of concepts and methods applied in historical linguistics and corpus linguistics. To this end, it implements and analyses, by way of example, various kinds of networks in long-term diachrony, such as a network of grammarians and a network of manifestations of language purism, such as verbal hygiene (Cameron 1995).

In contrast to social network analyses of historical material (e.g. Bergs 2005, Sairio 2009, Fitzmaurice 2010) and to network studies based on fictional texts (e.g. Agarwal et al. 2012, Moretti 2013), the combination of corpus-based diachronic linguistics and network analysis is rather uncharted territory. This pilot project constitutes the first part of a series of diachronic network analyses of historical English grammar books. The present study investigates the relationships between 19th-century grammarians by examining references authors make to other grammars and grammarians. Based on White's notion of 'scholarly networks', references are understood as "record[s] of who has cited whom within a fixed set of authors" (White 2011: 275) irrespective of their personal acquaintance.

A pilot corpus of 19th-century British grammar books (40 texts, ca. 2.6 mio. words) forms the basis for this kind of network analysis. It contains the most well-known and widely distributed grammars of the 19th century (cf. Leitner 1986, 1991, Linn 2006, Michael 1987, Görlach 1998) as full texts in digitised form. Main criteria for text selection are numbers of editions, distribution, and common use of grammars, as found in book catalogues and secondary literature on grammar writing.

A list of English and foreign grammarians from the 16th to 19th centuries who are nowadays usually considered the most famous and influential authors of their time (cf. Dons 2004, Finegan 1998, Görlach 1998, Linn 2006, Schmitter 1996, Tieken-Boon van Ostade 2008, Wolf 2011) provides the search terms which, applied to the pilot corpus, yield all references made to other grammarians.

The ties between authors will be examined quantitatively, i.e. in terms of the number of references, and qualitatively, i.e. by classifying different kinds of references, e.g. quotation, approval of approaches to grammar, the citing of authorities, and various forms of criticism. An example of approval is "I concur with Baker in considering …" (Crombie (1802) on Baker (1724)), whereas an example of criticism is "Mr. Cobbett has mistaken the real causes of defective arrangement" (Doherty (1841) on Cobbett (1818)). We will show different and changing attitudes towards other grammarians, and discuss substantial implications for the development of the genre.

The network of references will further reveal paradigm shifts in grammar writing, indicating particularly the rise of descriptive grammars after the predominance of prescriptivism and critically reflecting on what is often called 'prescriptive' and 'descriptive'.

References:

Agarwal, A., Corvalan, A., Jensen, J. & Rambow, O. (2012). Social network analysis of Alice in Wonderland. Proceedings of the NAACL-HLT 2012 Workshop on Computational Linguistics for Literature (Association for Computational Linguistics, Montreal, Canada, 2012), 88–96, URL: http://www.aclweb.org/anthology/W12-2513.

Baker, W. (1724). Rules for True Spelling and Writing English. 2nd Edition. Bristol: Joseph Penn.

Bergs, A. (2005). Social Networks and Historical Sociolinguistics. Studies in Morphosyntactic Variation in the Paston Letters (1421-1503). Berlin/New York: Mouton de Gruyter.

Cameron, D. (1995). Verbal Hygiene. The Politics of Language. London: Routledge.

Cobbett, W. (1818). Grammar of the English Language, in a Series of Letters. New York: Clayton and Kingsland.

Crombie, A. (1802). The Etymology and Syntax of the English Language, Explained and Illustrated. London: J. Johnson.

Doherty, H. (1841). An Introduction to English Grammar on Universal Principles. London: Marshall & Co.

Dons, U. (2004). Descriptive Adequacy of Early Modern English Grammars. Berlin/New York: Mouton de Gruyter.

Finegan, E. (1998). English Grammar and Usage. In S. Romaine (Ed.), The Cambridge History of the English Language Vol. IV: 1776-1997. Cambridge: Cambridge University Press, 536-588.

Fitzmaurice, S. (2010). Coalitions, networks, and discourse communities in Augustan England: The Spectator and the early eighteenth-century essay. In R. Hickey (Ed.), Eighteenth-Century English. Cambridge: Cambridge University Press, 106-132.

Görlach, M. (1998). An Annotated Bibliography of Nineteenth-Century Grammars of English. Amsterdam/Philadelphia: John Benjamins.

Leitner, G. (1986). English Traditional Grammars in the Nineteenth Century. In D. Kastovsky & A. Szwedek (Eds.), Linguistics Across Historical and Geographical Boundaries; 2: Descriptive, Contrastive, and Applied Linguistics. Trends in Linguistics. Studies and Monographs [TiLSM] 32. Berlin et al.: Mouton de Gruyter, 1333-55.

Leitner, G. (1991). English Traditional Grammars: An International Perspective. Studies in the History of the Language Sciences 62. Amsterdam: John Benjamins, 1991.

Linn, A. (2006). English Grammar Writing. In B. Aarts & A. McMahon (Eds.), Handbook of English Linguistics. Malden, Mass.: Blackwell, 72-92.

Michael, I. (1987). The Teaching of English. Cambridge: Cambridge University Press.

Moretti, F. (2013). Distant Reading. London: Verso.

Sairio, A. (2009). Language and Letters of the Bluestocking Network. Sociolinguistic Issues in Eighteenth-century Epistolary English. Helsinki: Société Néophilologique (Mémoires de la Société Néophilologique de Helsinki).

Schmitter, P. (Ed.) (1996). Sprachtheorien der Neuzeit. 2 Vols. Tübingen: Gunter Narr Verlag.

Tieken-Boon van Ostade, I. (Ed.) (2008). Grammars, Grammarians and Grammar Writing in Eighteenth-century England. Berlin/New York: Mouton de Gruyter.

White, H. D. (2011). Scientific and Scholarly Networks. In J. Scott & P.J. Carrington (Eds.), The Sage Handbook of Social Network Analysis, 271-285.

Wolf, G. (2011). Englische Grammatikschreibung 1600-1900 – der Wandel einer Diskurstradition. Arbeiten zur Sprachanalyse 54. Frankfurt a. M.: Peter Lang.

ars grammatica in Mannheim, Germany (June 2017)

HeidelGram: Network analysis of grammarians' references in 19th-century British grammars – a corpus-based pilot study

Beatrix Busse, Kirsten Gather, Ingo Kleiber

This pilot study brings together historical corpus linguistics and applied network theory. It investigates a network of grammarians' references in a corpus of representative 19th century British grammar books, examining and categorising connections between grammarians. Based on this network, we scrutinise established assumptions on the history of grammar writing, showing that particularly the turn from prescriptive to descriptive grammar writing is a discursively much more complex process than usually acknowledged, requiring a reassessment of the common notions of 'prescriptive' and 'descriptive' grammars and their alleged representative authorities.

2016

Grammar and Corpora in Mannheim, Germany (November 2016)

Assessing the Connections between English Grammarians of the 19th Century - A Corpus-Based Network Analysis

Beatrix Busse, Kirsten Gather, Ingo Kleiber

English grammar books are the genre that mirrors the development of language norms and language use most clearly. By stating the defects of language use and by criticising the seemingly inadequate works of predecessors and contemporaries, many grammarians authorise themselves to publish their own, better grammars. The high number of references to other grammars and grammarians shows that grammar books are usually not produced out of thin air, but that they emerge on the basis of the works of others, whether they are cited as authorities or criticised.

Our presentation aims at visualising such a network of grammars and grammarians by performing a network analysis on an XML-annotated corpus of 19th-century English grammars. The ties between authors will be examined quantitatively, i.e. in terms of the number of references, and qualitatively, i.e. by classifying and comparing different kinds of references, e.g. quotation, approval, the citing of authorities, and forms of criticism.

In so doing, a network of grammars and grammarians will become visible, which mirrors complex relations between authors, clusters of authors, isolated grammarians, and – viewed diachronically – the development of a grammarian's standing in the linguistic community.

The corpus contains the 50 most well-known and widely distributed grammars of the 19th century (cf. Leitner 1986, 1991, Linn 2006, Michael 1987, Görlach 1998) as full texts in digitised form. References to other grammarians are annotated, including the referenced author and work, and the kind of reference, as attributes.
A list of grammarians' names, i.e. predecessors and contemporaries, is used as search terms to produce an adjacency matrix on the basis of the corpus. For visualisation and analysis, this matrix is then transformed into a directed weighted graph.
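
The step from name list to adjacency matrix to directed weighted graph can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the function names are hypothetical, matching is done by simple whole-word search, and the two toy "texts" consist only of the example sentences quoted in the ICAME38 abstract on this page.

```python
import re


def adjacency_matrix(texts, names):
    """Build a matrix of name mentions.

    texts: dict mapping an author's name to the full text of that grammar.
    names: list of grammarian names used as search terms; every author in
           `texts` must also appear in `names`.
    matrix[i][j] is the number of times names[i]'s text mentions names[j].
    """
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in names]
    for author, text in texts.items():
        for target in names:
            if target == author:
                continue  # self-mentions are ignored in this sketch
            pattern = r"\b" + re.escape(target) + r"\b"
            matrix[index[author]][index[target]] = len(re.findall(pattern, text))
    return matrix


def weighted_edges(matrix, names):
    """Turn the matrix into (source, target, weight) edges of a directed graph."""
    return [
        (names[i], names[j], weight)
        for i, row in enumerate(matrix)
        for j, weight in enumerate(row)
        if weight > 0
    ]


if __name__ == "__main__":
    names = ["Baker", "Cobbett", "Crombie", "Doherty"]
    texts = {
        "Crombie": "I concur with Baker in considering …",
        "Doherty": "Mr. Cobbett has mistaken the real causes of defective arrangement",
    }
    for edge in weighted_edges(adjacency_matrix(texts, names), names):
        print(edge)
```

The edge list can be passed to any graph library or visualisation tool. Plain string search of this kind will also count wrong hits, such as a surname used in another sense, which is why error tolerance matters for this approach.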

This combination of corpus-based historical linguistics (see Jucker and Taavitsainen 2014: 4) and network analysis (see, for instance, Freeman 2004, White 2011) is rather uncharted territory. In recent years, network analyses have been conducted using social media data, literature, or drama (e.g. Elson et al. 2010), but so far there have not been any similar studies on historical non-fictional texts.

Apart from the presentation of results, we would like to discuss how well such a network approach can mirror connections between grammarians and by which other means results can be refined. Moreover, we are interested in discussing practical aspects of network analysis, such as levels of error tolerance and means of (semi-automated) error avoidance with respect to wrong hits in the data.

References:

Elson, David K., Nicholas Dames, and Kathleen R. McKeown. 2010. "Extracting Social Networks from Literary Fiction". In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010), Uppsala, Sweden. URL: www1.cs.columbia.edu/~delson/pubs/ACL2010-ElsonDamesMcKeown.pdf.

Freeman, Linton C. 2004. The Development of Social Network Analysis. A Study in the Sociology of Science. Vancouver: Empirical Press.

Görlach, Manfred. 1998. An Annotated Bibliography of Nineteenth-Century Grammars of English. Amsterdam/Philadelphia: John Benjamins.

Jucker, Andreas, and Irma Taavitsainen. 2014. "Diachronic Corpus Pragmatics. Intersections and Interactions". In: Taavitsainen, Irma, Andreas Jucker, and Jukka Tuominen (eds.). Diachronic Corpus Pragmatics. Amsterdam/Philadelphia: John Benjamins, 3-26.

Leitner, Gerhard. 1986. "English Traditional Grammars in the Nineteenth Century". In: Kastovsky, Dieter and Aleksander Szwedek (eds.). Linguistics Across Historical and Geographical Boundaries; 2: Descriptive, Contrastive, and Applied Linguistics. Trends in Linguistics. Studies and Monographs [TiLSM] 32. Berlin [et al.]: Mouton de Gruyter, 1333-55.

Leitner, Gerhard. 1991. English Traditional Grammars: An International Perspective. Studies in the History of the Language Sciences 62. Amsterdam: John Benjamins, 1991.

Linn, Andrew. 2006. "English Grammar Writing". In: Aarts, Bas, and April McMahon (eds.). Handbook of English Linguistics. Malden, Mass.: Blackwell, 72-92.

Michael, Ian. 1987. The Teaching of English. Cambridge: Cambridge University Press.

White, Howard D. 2011. "Scientific and Scholarly Networks". In: Scott, John and Peter J. Carrington (eds.). The Sage Handbook of Social Network Analysis, 271-285.

ICEHL19 in Duisburg/Essen, Germany (August 2016)

Paradigm Shifts in 19th-Century British Grammar Writing - A Network of Texts and Authors

Beatrix Busse, Kirsten Gather, Ingo Kleiber

In contrast to grammar books published in other centuries, 19th-century grammars have received little attention so far. Given that the vast majority of them are school grammars, this comes as no surprise. For several reasons, however, the 19th century can be seen as a turning point in English grammar writing. While moral and social aspects become more and more relevant in teaching grammars, grammar books in general also illustrate the rather late introduction of comparative historical linguistics around 1830 (Linn 2006: 79) and the emergence of phonetics/phonology as a separate topic towards the end of the century (e.g. Sweet 1892/1898). Furthermore, new movements within linguistics, such as the works of the New Philological Society and the Early English Text Society, lead to a paradigm shift in grammar writing from highly prescriptive works to predominantly descriptive grammars (Finegan 1998: 559ff).

But how do the major changes in 19th-century grammars happen? Do they occur all of a sudden? If so, how do other grammar writers react to bold and innovative ideas of contemporaries? If new developments build up by and by, do authors address and discuss them in their grammars?

The aim of our study is to make connections between grammar books visible so that mechanisms behind changing approaches to grammar become apparent. With the help of an annotated corpus of British grammars, which is currently being compiled at Heidelberg University, we will show that developments in 19th-century grammar writing can be visualised as a network of grammars and grammar authors. XML markup of the corpus texts includes all kinds of references and judgemental statements addressing other authors and grammars. The network, together with a comparison of the grammar books' wordlists, reveals the authors' attitudes towards language use and language change and gives evidence of the innovative or conservative character of their grammar books.
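
How annotated references might be read out of such XML markup can be illustrated with a short sketch. The element name `ref` and the attributes `author` and `type` are assumptions made for this example and are not the project's actual annotation scheme; the two annotated sentences are the Crombie quotations given in other abstracts on this page.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical markup: element and attribute names are illustrative only.
SAMPLE = """<text author="Crombie" year="1802">
  <p><ref author="Priestley" type="criticism">Priestley, in defending the other
  phraseology, appears to me to have greatly erred</ref>.</p>
  <p>I concur with <ref author="Baker" type="approval">Baker</ref> in considering …</p>
</text>"""


def count_reference_types(xml_string):
    """Count (referenced author, reference type) pairs in one annotated text."""
    root = ET.fromstring(xml_string)
    return Counter(
        (ref.get("author"), ref.get("type")) for ref in root.iter("ref")
    )


if __name__ == "__main__":
    for (author, kind), n in count_reference_types(SAMPLE).items():
        print(author, kind, n)
```

Counts of this kind give the number of references per author pair, and the `type` attribute carries the qualitative classification, so both can feed into the same network.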

References:

Finegan, Edward. 1998. "English Grammar and Usage". In: Romaine, Suzanne (ed.). The Cambridge History of the English Language Vol. IV: 1776-1997. Cambridge: Cambridge University Press.

Linn, Andrew. 2006. "English Grammar Writing". In: Aarts, Bas; McMahon, April (eds.). Handbook of English Linguistics. Malden, Mass.: Blackwell, 72–92.

Sweet, Henry. 1892/1898. A New English Grammar: Logical and Historical. Oxford.