A New Algorithm for Extracting Formulas from Poetic Texts and Formulaic Density of Russian Bylinas


The problem of identifying formulas in poetic texts has played an important role in folkloristics and several other fields, such as medieval literary studies, for more than fifty years. Currently, there is no consensus as to what constitutes a formula or how the formulaic density of a given text should be computed. Nikolayev's essay responds to these questions with a strict formal definition of a formula and a fast algorithm for computing formulaic density in any poetic text in any language (provided it is separated into lines and words). A case study of formulaic density in a corpus of Russian bylinas is marshaled to illustrate the methodology.



The mean formulaic density for the corpus computed with 4-symbol keys and 5-symbol keys.

Chart: by the author.

An arc diagram of formulaic connections between the lines of the bylina "Dobrynya and the Dragon."

Diagram: by the author. It was made in R using the package "arcdiagram" by Gaston Sanchez, available at https://github.com/gastonstat/arcdiagram.

A histogram of formulaic densities of the texts of bylinas in the corpus.

Histogram: by the author.

