New ways to search old books open up history for researchers
Curatr is a new online platform that hosts digitised versions of all English-language books from the British Library corpus, corresponding to over thirty-five thousand unique titles, both fiction and non-fiction, from 1700 to 1899.
The project, led by Professor Gerardine Meaney and Dr. Derek Greene of Insight at UCD, is a major step forward for humanities research, enabling much deeper and wider text analysis by individual researchers.
The platform also incorporates the first digitised version of the topical classification index of books used by the British Library from 1823 to 1985.
Developed by the ERC-funded VICTEUR project, in collaboration with researchers at the Insight SFI Research Centre, Curatr is part of Insight’s Cultural Analytics research initiative.
The system includes a searchable index of the equivalent of over 12 million individual pages of text, which can be searched and sorted by author, title, year, and the full text of the volumes themselves. This allows researchers to identify content relating to specific themes within little-known or very long, unwieldy texts. Additional functionality based on modern natural language processing techniques supports this work, including content-based recommendation methods and the visualisation of relationships between concepts in the corpus through semantic networks.
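For readers curious about what content-based recommendation can look like in practice, here is a minimal sketch. It is not Curatr's implementation, which is not described here; it assumes a common textbook approach (TF-IDF weighting with cosine similarity), and the function names and the three toy documents are invented purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict of term -> weight) per document."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    # Terms that occur in every document get weight log(1) = 0.
    return {
        doc_id: {
            term: tf * math.log(n_docs / doc_freq[term])
            for term, tf in term_counts.items()
        }
        for doc_id, term_counts in counts.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(doc_id, docs, k=2):
    """Rank the other documents by similarity to doc_id and return the top k."""
    vectors = tfidf_vectors(docs)
    scores = [
        (other, cosine(vectors[doc_id], vectors[other]))
        for other in docs
        if other != doc_id
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Toy documents, invented for illustration only.
docs = {
    "famine_report": "famine relief in ireland and the failure of the potato crop",
    "crop_manual": "a manual on the potato crop and the rotation of crops",
    "sea_voyage": "a voyage by sea to the southern islands",
}

for other, score in recommend("famine_report", docs):
    print(other, round(score, 3))
```

In this toy example the crop manual ranks above the sea voyage for the famine report, because the two share distinctive terms such as "potato" and "crop". A production system working across millions of pages would need a far more scalable index and probably richer text representations.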
Curatr also supports the creation and export of smaller sub-corpora, defined thematically, chronologically and by classification. This addresses the common requirement for humanities scholars to engage in online document curation, without the need for extensive technical training.
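As a rough illustration of what defining and exporting a sub-corpus involves, the sketch below filters a small list of records by date range, classification and keyword, then writes the result as CSV. It is an assumption-laden example, not Curatr's interface: the `Volume` fields, the function names and the sample records are all hypothetical.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical record for one digitised volume."""
    title: str
    author: str
    year: int
    classification: str
    text: str


def select_subcorpus(volumes, start_year, end_year, classification=None, keyword=None):
    """Keep volumes within the year range that match the optional filters."""
    selected = []
    for volume in volumes:
        if not (start_year <= volume.year <= end_year):
            continue
        if classification is not None and volume.classification != classification:
            continue
        if keyword is not None and keyword.lower() not in volume.text.lower():
            continue
        selected.append(volume)
    # Chronological order, with title as a tie-breaker.
    return sorted(selected, key=lambda v: (v.year, v.title))


def export_csv(volumes):
    """Return the metadata of the selected volumes as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["title", "author", "year", "classification"])
    for volume in volumes:
        writer.writerow([volume.title, volume.author, volume.year, volume.classification])
    return buffer.getvalue()


# Placeholder records, invented for illustration only.
sample = [
    Volume("Example Treatise A", "Author A", 1832, "Medicine",
           "An account of the cholera outbreak."),
    Volume("Example Novel B", "Author B", 1847, "Fiction",
           "A tale in which cholera strikes a village."),
    Volume("Example Treatise C", "Author C", 1871, "Medicine",
           "Notes on cholera and sanitation."),
]

subset = select_subcorpus(sample, 1800, 1850, classification="Medicine", keyword="cholera")
print(export_csv(subset))
```

Here the three kinds of filter mirror the thematic, chronological and classification-based criteria mentioned above; a simple substring match stands in for whatever thematic search a real platform would provide.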
A key use of Curatr is to help researchers identify original texts relevant to their work for consultation in the library. The next phase of the project will seek to integrate Curatr with other relevant online cultural resources, such as records originating from popular lending libraries in the nineteenth century.