Index Catalog // British Library

2023

Presentation

Repositories to facilitate open scholarship

Basford, Jenny

GLAM and research repository

2023

Presentation

Scholarly publishing dynamics in the GLAM environment

Holt, Ilkay

open access, GLAM, copyright, open access policy, and open scholarship

2023

Dataset

Datasets for toponym recognition and disambiguation for nineteenth-century English newspapers

We present two datasets, one for the task of toponym recognition and one for the task of toponym disambiguation. The datasets are derived from the "Dataset for Toponym Resolution in Nineteenth-Century English Newspapers" (DOI: https://doi.org/10.23636/r7d4-kw08). The toponym recognition dataset consists of two JSON files (ner_fine_train.json and ner_fine_dev.json), whereas the toponym...

Coll Ardanuy, Mariona ; Nanni, Federico

toponym disambiguation, nineteenth-century newspapers, named entity recognition, entity linking, toponym resolution, toponym recognition, and dataset

2023

Dataset

DeezyMatch training set for OCR

Optical character recognition (OCR) is the process of automatically transcribing text from images. The presence of OCR-induced errors in digitised text is a common problem in the digital humanities. OCR errors are usually due to the misrecognition of characters, such as "h" recognised as "b", or "c" recognised as "o"....

Coll Ardanuy, Mariona ; Nanni, Federico ; Pedrazzini, Nilo

OCR, fuzzy string matching, string variation, newspapers, digital humanities, natural language processing, DeezyMatch, and Living with Machines

British Library PhD Placements 2023/24

User Collection

PhD placements

British Library PhD Placements 2022/23

User Collection

PhD placements

British Library PhD Placements 2021/22

User Collection

PhD placements

British Library PhD Placements 2019/20

User Collection

PhD placements

2023

Conference paper (unpublished)

(Re)investing in a national repository infrastructure for cultural heritage

Since 2018, the British Library (BL) has invested considerable resource in establishing the necessary infrastructure for a national repository service for cultural heritage organisations, using Samvera Hyku. This has entailed working closely with all known Hyku suppliers and developers, as well as collaborating with the University of Virginia on an...

Basford, Jenny ; Holt, Ilkay ; Jevon, Graham ; Ramsey, Nora

open access, OR2023, and repository

2022

Dataset

Diachronic word embeddings from 19th-century newspapers digitised by the British Library (1800-1919)

Word vectors related to the paper "Machines in the media: semantic change in the lexicon of mechanization in 19th-century British newspapers" by Nilo Pedrazzini and Barbara McGillivray (2022). The embeddings were trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and specific parameters. The embeddings are divided into...

Pedrazzini, Nilo ; McGillivray, Barbara

historical semantics, word-vectors, late-modern-english, newspapers, diachronic-embeddings, and word2vec

2023

Dataset

Decade-level Word2Vec models from automatically transcribed 19th-century newspapers digitised by the British Library (1800-1919)

Word embeddings trained on a 4.2-billion-word corpus of 19th-century British newspapers using Word2Vec and specific parameters. The embeddings are divided into periods of ten years each. Unlike those in this repository, these were not aligned and OCR errors skimmed from the vocabulary. See related GitHub repository for the full documentation:...

Pedrazzini, Nilo

historical semantics, British newspapers, word embeddings, word vectors, word2vec, and Late Modern English

2023

Book chapter

The sociability of scientific knowledge exchange in British Farming, 1950-90

This is a single chapter from an edited collection that has the following abstract: In the late nineteenth and early twentieth centuries, agricultural practices and rural livelihoods were challenged by changes such as commercialization, intensified global trade, and rapid urbanization. Planting Seeds of Knowledge studies the relationship between these agricultural...

Horrocks, Sally ; Martin, John ; Merchant, Paul

agriculture, food, and farming

British Library PhD Placements 2018/19

User Collection

PhD placements

British Library PhD Placements 2017/18

User Collection

PhD placements

2023

Research report

Collecting complex digital publications: testing an enhanced curation method

Smith Nicholls, Florence

contextual collecting, play through videos, digital preservation, British Library PhD placement, web archiving, digital storytelling, and emerging formats

Collecting complex digital publications: Testing an enhanced curation method (PhD Placement)

User Collection

PhD placements

2023

Dataset

Incunabula Printed Catalogue Dataset: Volumes 1-10 copy of github repository

This dataset includes the GitHub repository used to derive catalogue entries from volumes 1-10 of the "Catalogue of books printed in the 15th century now at the British Museum" (known as BMC). The BMC was published between 1908 and 2007 and comprises detailed descriptions of the incunabula collection at the British Library....

British Library

book history, metadata, catalogues, datasets, incunabula, early printed books, and early printing

Catalogue descriptions of the Incunabula Collection at the British Library

User Collection

metadata, early printing, early printed books, book history, incunabula, datasets, and catalogues

2023

Dataset

Incunabula Printed Catalogue Dataset: Volumes 1-10

This dataset includes the catalogue entries derived from volumes 1-10 of the "Catalogue of books printed in the 15th century now at the British Museum" (known as BMC). The BMC was published between 1908 and 2007 and comprises detailed descriptions of the incunabula collection at the British Library. The dataset was created...

British Library

datasets, catalogues, early printing, book history, early printed books, metadata, and incunabula

2023

Dataset

Incunabula Printed Catalogue Dataset Metadata: Volumes 1-10

This dataset includes the combined catalogue entries derived from volumes 1-10 of the "Catalogue of books printed in the 15th century now at the British Museum" (known as BMC). The BMC was published between 1908 and 2007 and comprises detailed descriptions of the incunabula collection at the British Library. The dataset was...

British Library

datasets, catalogues, early printing, incunabula, early printed books, metadata, and book history

Research Repository
