The Digital Literary Stylistics Workshop
DH2016 - Krakow, Poland 12th July 2016
|9:00-9:30||Opening||J. Berenike Herrmann, Francesca Frontini, Marissa Gemma (Intro & Anticipation of the SIG proposal) - SLIDES||30 Min|
|9:30-11:00||Panel 1||Natalie Houston (“Towards a Computational Poetics: Some Features of Nineteenth-Century Poetic Style”) - SLIDES |
Anne Bandry-Scubbi (“Women’s Novels 1750s-1830s and the Company They Keep: A Computational Stylistic Approach”) - SLIDES
Jan Rybicki (“Authorial chronology by most frequent words: do writers’ stylometric thumbprints evolve with age?”)
Fotis Jannidis (“Period Styles”)
|1 h 30 Min|
|11:00-11:30||Coffee break||30 Min|
|11:30-13:00||Panel 2||Mark Algee-Hewitt (“The Author: Between Style and Substance”) - SLIDES |
Mike Kestemont (“The Matter of Art: Authenticity Criticism in the Humanities”) - SLIDES
Sarah Allison (“A Proxy for Style”)
Hugh Craig (“Beyond Authorship”) - SLIDES
|1 h 30 Min|
|13:00-14:00||Lunch on site||1 h|
|14:00-15:00||Panel 3||Tomoji Tabata (“Experimental Stylistics: A Meta-analysis to Evaluate Rolling Stylometry”)
Jean-Gabriel Ganascia (“Towards a computational and pattern-based stylistics”) - SLIDES
Christof Schöch (“Spitzer on Racine, digitally revisited”) - SLIDES
|15:00-16:35||Future SIG panel
|Define a roadmap for the proposal of an ADHO SIG on Digital Literary Stylistics||1 h 35 Min|
Stylometry has largely been focused on identifying the distinctive linguistic patterns used by particular authors; I’m interested instead in identifying those features that mark a poem as belonging to a particular historical period. Comparing these features across the works of canonical and noncanonical poets can help us understand literary influence and taste as cultural processes. Because the sentences of poetry are arranged in poetic lines, we need to examine additional features beyond the syntactic and semantic dimensions of language to understand poetic style. This paper presents my current work on analyzing repetition, enjambment, and rhyme in Victorian poetic style.
The question « What is style? » will be addressed using two large corpora of novels from 1748 to 1834 and computer-aided textual analysis: what, if anything, distinguishes texts by female and male authors? How do texts centring on female or on male characters differ in their use of style? Should these questions be distinguished? Are texts which made it into the canon written differently from those which did not? Combining distant and close reading, this paper will focus on novels that are “banal” or “eccentric” with respect to the corpus they are set in.
This paper will present a series of empirical studies on how most frequent word usage changes with time in the oeuvre of authors of various languages and literary periods. A majority of such authorial thumbprints do indeed evolve over time, to the extent that diagrams obtained by multivariate analysis of most frequent word frequencies might help correct faulty dating. Exceptions to this rule do not seem to be limited to a specific culture, language, or genre.
It has been shown repeatedly (for example Jannidis 2014) that texts from different literary periods can be distinguished using stylometric methods, but it is as yet unclear whether this is an effect of continuous change in language or whether the change is disruptive, creating more or less clear boundaries. The "time signal" in general seems to be rather weak (cf. Jockers 2013: 81), but can this be generalized, or is there a greater homogeneity of style within a period? In other words, is the term "period" useful as a stylometric concept?
The aim of this presentation is to evaluate a series of Eder's (2015) "Rolling Stylometry" techniques by conducting a meta-analysis based on validation experiments that use a bootstrap sampling method to automatically generate randomized combinations of sliced texts of mixed authorship. Eder's rolling classification techniques are designed to pinpoint stylistic shifts or authorial takeovers in co-authored texts or texts of mixed authorship. The statistical methods incorporated in the set of rolling techniques are Burrows's Delta (Burrows, 2002), Nearest Shrunken Centroid (Tibshirani et al., 2003), and Support Vector Machine (Cortes and Vapnik, 1995). The rolling stylometric model runs, in essence, sequential analyses on text parts of a specified length and helps us infer which of the candidates is the most likely author of the text block in question. This paper will report emerging results of a meta-analysis carried out to evaluate different rolling techniques and provide a robust benchmark for rolling stylometry.
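The rolling idea described in this abstract can be illustrated with a minimal Python sketch. It is not Eder's implementation and not the setup used in the paper: the function names, the one-reference-text-per-author layout, whitespace tokenization, and the use of the population standard deviation are all assumptions made for illustration. Each overlapping window of the target text is scored with a Burrows-style Delta (mean absolute difference of z-scored most-frequent-word frequencies) and assigned to the closest candidate.

```python
from collections import Counter
from statistics import mean, pstdev


def rel_freqs(tokens, vocab):
    """Relative frequency of each vocabulary word in a token list."""
    counts = Counter(tokens)
    return [counts[w] / len(tokens) for w in vocab]


def rolling_delta(target, references, n_mfw=100, window=5000, step=1000):
    """Assign each window of `target` to the reference author with the lowest Delta.

    references: dict mapping author name -> one reference text (assumed layout).
    Returns a list of (window_start, best_author, delta) tuples.
    """
    ref_tokens = {a: t.lower().split() for a, t in references.items()}
    # Most frequent words over the pooled reference corpus.
    pooled = Counter(tok for toks in ref_tokens.values() for tok in toks)
    vocab = [w for w, _ in pooled.most_common(n_mfw)]

    ref_freqs = {a: rel_freqs(toks, vocab) for a, toks in ref_tokens.items()}
    columns = list(zip(*ref_freqs.values()))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]
    # Words with no variation across the references cannot be z-scored; skip them.
    keep = [i for i, sd in enumerate(sds) if sd > 0]
    if not keep:
        raise ValueError("reference texts do not differ on any frequent word")

    def zscores(freqs):
        return [(freqs[i] - means[i]) / sds[i] for i in keep]

    ref_z = {a: zscores(f) for a, f in ref_freqs.items()}

    tokens = target.lower().split()
    results = []
    # Slide a fixed-length window over the target text.
    for start in range(0, len(tokens) - window + 1, step):
        z = zscores(rel_freqs(tokens[start:start + window], vocab))
        deltas = {a: mean(abs(x - y) for x, y in zip(z, rz))
                  for a, rz in ref_z.items()}
        best = min(deltas, key=deltas.get)
        results.append((start, best, deltas[best]))
    return results
```

Reading the returned list in order of window start shows where the most likely author changes, which is the kind of "takeover" point the rolling techniques aim to locate. The window and step sizes here are arbitrary defaults, not values taken from the paper.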
In the past, there have been many attempts to base a computational stylistics on the distribution of the lexicon used, and mainly on the statistical distribution of stop words that characterizes the syntax of a given author. Even if they discriminate between writers with statistical efficiency, these approaches fail to elucidate the features that make up the style. Our aim now is to go further and to build tools that extract syntactic and semantic structures, i.e. patterns, typifying the style of individual authors, genres, epochs or theater characters. We sketch a picture of different approaches, some of which have been more or less successfully tested while others are still tentative.
Using today's digital corpora and computational methods of text analysis, this paper revisits Léo Spitzer's famous stylistic reading of the tragedies of French seventeenth-century author Jean Racine. Spitzer's analysis was first published in 1931 and richly illustrates the manifestations of a "dampening effect" ("effet de sourdine") which Spitzer claims is characteristic of Racine's poetic style. The present attempt to reimplement Spitzer's study reveals new insights not only into Racine's use of language, but also into the respective strengths and limitations of both approaches to stylistic analysis and to the contrasting notions of style which underpin them.
Theories of stylometry, particularly those that focus on authorship attribution, speak of the utility of ‘content free words’ that are ‘unconsciously’ used by authors. How do we understand the juxtaposition between this treatment of authors as individuals with identifiable mentalities and the pragmatics of authorship attribution that reduce these individuals to probabilities of word frequencies? Combining theories of authorship (e.g. from Barthes and Foucault) with liminal cases of attribution (using strings of characters), this workshop contribution offers a discussion of the new place and being of the author within a computational representation of authorial style.
Across the Arts and Humanities, scholars often resort to analyzing a work’s intrinsic properties (‘style’) to verify its authenticity. Stylometry offers a common framework for the authentication of literary texts, which assumes that we can separate the style and content of works. I will survey the traditions in literary criticism which have focused on this issue (e.g. the “particle method” (Minio Paluello, 1947)). I will compare these to attribution techniques in the visual arts, where similar, yet independent views exist. Apart from interesting advances in computational approaches to style-content separation, I will address interesting divergences between authenticity criticism (Echtheitskritik) in the literary and visual arts (e.g. Giovanni Morelli).
I propose to approach the complex phenomenon of style through a disciplinary loan-word from statistics: the proxy. What measurable thing stands for (signals, illuminates, marks) the larger phenomenon of style we wish to trace? The careful selection and defense of a proxy can mark a path from the world of maps, graphs, trees and tables to traditional literary argument. I consider approaches to Charles Dickens’s style through my own work on speech tag placement and two book-length studies of style across his corpus by Masahiro Hori (Palgrave, 2004) and Michaela Mahlberg (Routledge, 2012).
Authorship attribution through the quantitative analysis of style can call on an intuitive conceptual basis -- the familiar view that authors have idiosyncratic styles -- and on rigorous validation procedures, with securely attributed texts as controls. Descriptive quantitative stylistics has neither of these supports and is often challenged as tautologous, unconstrained in its selection of features and methods, and remote from the experience of writers and readers. This contribution will present some possible remedies, based on a collaborative project with Brett D Hirsch on the style of early modern English drama, funded by the Australian Research Council and almost complete.