This report describes the use of identifiers and persistent identifiers (PIDs) at the Natural History Museum (NHM), London. The NHM is a visitor attraction and international science centre for natural history collections. It has an extensive research programme and employs approximately 300 research scientists. It is in the midst of an extensive collection digitisation programme to make all of the specimens in its collections available online; almost 4.8 million of the 80 million specimens are available so far.
The NHM's main internal identifier for collection objects is a registration number. The Museum also uses barcodes in some areas of its collection, and these encode the registration number. The registration number is recorded in the Museum's collections management system (CMS), EMu, which is in the process of being replaced. Registration numbers were historically assigned independently by the five departments of the Museum, so they are not always unique and do not follow a standard format. As part of the programme of work to replace the CMS, the NHM is creating a new data model to document complex digital objects more effectively, of which identifiers will form a core part.
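The uniqueness problem described above can be illustrated with a minimal Python sketch. It is not the NHM's data model: the department labels, the registration numbers and the normalisation rule (collapse whitespace, upper-case) are all hypothetical, chosen only to show how independently assigned numbers can collide and how qualifying them by department removes the ambiguity.

```python
from collections import defaultdict


def normalise(registration_number):
    """Collapse runs of whitespace and upper-case the value (assumed rule)."""
    return " ".join(registration_number.split()).upper()


def find_collisions(records):
    """Return registration numbers that occur in more than one department.

    `records` is an iterable of (department, registration_number) pairs.
    """
    departments_by_number = defaultdict(set)
    for department, registration_number in records:
        departments_by_number[normalise(registration_number)].add(department)
    return {
        number: sorted(departments)
        for number, departments in departments_by_number.items()
        if len(departments) > 1
    }


def qualified_identifier(department, registration_number):
    """Prefix the number with its department so the pair is unambiguous."""
    return f"{department}:{normalise(registration_number)}"


# Hypothetical records; none of these values come from the NHM.
records = [
    ("dept_a", "1905.3.2.1"),
    ("dept_b", "1905.3.2.1"),
    ("dept_b", "BM 000123"),
    ("dept_c", "bm  000123"),
    ("dept_c", "R 4567"),
]

collisions = find_collisions(records)
```

In this sketch a registration number alone is not a safe key, whereas the department-qualified form is unique as long as each department avoids internal duplicates.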
The NHM's Data Portal forms the main external point of access for the NHM's research and specimen collections. The digitised specimen collections, currently numbering 4.8 million, are assigned Globally Unique Identifiers (GUIDs), which form citable, versioned links to records. These are not true PIDs, as there is no governance of their persistence and they can occasionally become inaccessible, but they do comply with linked data standards and the CETAF Stable Identifiers initiative.
The NHM mints Digital Object Identifiers (DOIs), registered with DataCite, for datasets created by staff and by researchers affiliated with the Museum. These datasets do not have to be based on the specimen collections, but in practice often are. As much of the data is tabular, the Data Portal allows a DOI to be minted for an individual query when a user needs one, so that the retrieved data can be cited and resolved.
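For a query-level DOI to resolve to the same data later, the query and the state of the data at the time need to be pinned down together. The Python sketch below shows one way this could be done, by fingerprinting a canonical form of the query plus a dataset version. It is an illustration under stated assumptions, not a description of the Data Portal's implementation: the function name `query_fingerprint`, the filter fields and the integer version are all hypothetical.

```python
import hashlib
import json


def query_fingerprint(query, version):
    """Return a stable hash for a (query, dataset version) pair.

    The query is serialised as canonical JSON (sorted keys, fixed
    separators), so logically identical queries give the same fingerprint
    whatever the key order. This is an assumed scheme for illustration.
    """
    canonical = json.dumps(
        {"query": query, "version": version},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical filters; the field names are not taken from the Data Portal.
first = query_fingerprint({"genus": "Apis", "country": "United Kingdom"}, 1)
reordered = query_fingerprint({"country": "United Kingdom", "genus": "Apis"}, 1)
later_version = query_fingerprint({"genus": "Apis", "country": "United Kingdom"}, 2)
```

Under this scheme, repeating the same query against the same version would map to an existing identifier, while the same query against a later version would be treated as a distinct citable result.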