Good standards, many connections
When we travel abroad we may have trouble charging our phone. Without the correct adapter, it is not possible to connect our devices and we will quickly run out of power. With a universal adapter, anyone can travel anywhere in the world. The same goes for metadata: no matter the type of data or the place it is stored, metadata standards allow anyone to find, access and use data for their research studies.
WHAT IS METADATA? WHY IS IT IMPORTANT TO HAVE STANDARDS?
Metadata is "data about data". However, not all metadata is useful: standards need to be agreed by the research community and should ideally follow a set of guidelines called the FAIR principles. Metadata standards assist with the creation of metadata catalogues to make data findable and accessible, and can also serve as adapters to make data interoperable and reusable.
To give a simple example, when you search for a film on a streaming platform, you will find information such as the year of release, the genre, the director and the duration. This information describes the film and makes it easier to find a film you are interested in and to decide whether you want to watch it.
It is the same with research data: if the information about the data is precise and detailed, it will be much easier for a researcher to discover that the data exists and to use it for their analysis. If a researcher finds two potentially useful datasets, but one refers to locations by name and the other by postal code, a metadata standard is needed to relate the two and combine the data.
We all use metadata standards every day: when we are using our GPS to drive to the Eiffel Tower, we should end up in the same place, regardless of the app we use. Good metadata standards allow the app developers to direct people to where they want to go, whether they entered ‘Eiffel Tower’, ‘Tour Eiffel’ or ‘75007’.
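The idea behind both examples can be sketched in a few lines of Python. This is a minimal illustration, not how any real navigation app or data portal works: the CROSSWALK table, the identifier "paris-07", the functions normalise and combine, and all values are hypothetical and made up for this example.

```python
# Hypothetical crosswalk: every way of referring to a place maps to one
# shared identifier. A real metadata standard would define this vocabulary.
CROSSWALK = {
    "eiffel tower": "paris-07",
    "tour eiffel": "paris-07",
    "75007": "paris-07",
}


def normalise(location: str) -> str:
    """Return the shared identifier for a location, however it was written."""
    key = location.strip().lower()
    if key not in CROSSWALK:
        raise KeyError(f"No standard identifier known for {location!r}")
    return CROSSWALK[key]


def combine(by_name: dict, by_postcode: dict) -> dict:
    """Merge two datasets that label the same places in different ways."""
    merged: dict = {}
    for dataset in (by_name, by_postcode):
        for location, values in dataset.items():
            # Records that resolve to the same identifier end up together.
            merged.setdefault(normalise(location), {}).update(values)
    return merged


# Made-up values, for illustration only.
visitors = {"Tour Eiffel": {"visitors": 100}}
air_quality = {"75007": {"air_quality_index": 42}}
combined = combine(visitors, air_quality)
```

Without the shared identifier, the two records would look unrelated; with it, they can be joined automatically, which is the role a metadata standard plays for research data.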
DATA NEEDS TO BE ACCOMPANIED BY METADATA BY DESIGN
The best book in the library will not be found if it is not indexed in a standard way. Different items in the library have different ways of being found. For example, a magazine has an issue number, a book can be a special edition, a comic book can be part of a larger series. The same is true for data: social data, medical data and biological data all require different metadata to describe them appropriately.
Metadata standards are necessary to find, link and use these data for research across different fields. Standards for data and their associated contextual and experimental metadata are also known as data standards, metadata standards or content standards, and can be classified into four subtypes: reporting guidelines or checklists, models/formats or syntax, terminology artefacts, and identifier schemata.
It is essential that metadata capture and standardisation are built into plans at the beginning of any research project, before the data are collected. This ensures that the data will find their correct place in a global ecosystem of information. Good metadata also improves the quality and reliability of data and confidence in the findings of the research.
GOOD STANDARDS, MANY CONNECTIONS
In the BY-COVID project there are many data sources (for example, databases, repositories and knowledge bases) from different research disciplines, including bioscience, clinical and epidemiological research, and social sciences and humanities. These data sources are being described in a FAIRsharing collection (in progress), along with the data and metadata standards used by each data source. A common metadata model has been developed to represent the metadata in each source, and to make it findable in one place: the COVID-19 Data Portal.
Developing a common metadata model is a major challenge, as the project involves a large number of researchers from different scientific fields, and each partner’s data source uses different metadata standards. The approach is to map the key inter-relationships between the metadata in a way that makes sense and is practical to implement. This then opens up exciting possibilities for discovering more about how infectious diseases affect people and for informing evidence-based policymaking.
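As a rough sketch of what such a mapping can look like in practice, the Python below translates records from two data sources into one shared set of fields. It does not reproduce the actual BY-COVID metadata model: the source names, the field names and the function to_common_model are all hypothetical, chosen only to show the principle.

```python
# Hypothetical mappings, one per data source:
# field name used by the source -> field name in the common model.
FIELD_MAPPINGS = {
    "biobank": {
        "name": "title",
        "desc": "description",
        "made": "date_created",
    },
    "survey_archive": {
        "study_title": "title",
        "abstract": "description",
        "year": "date_created",
    },
}


def to_common_model(source: str, record: dict) -> dict:
    """Translate one source record into the common model.

    Fields without a mapping are left out, so only agreed fields
    reach the shared catalogue.
    """
    mapping = FIELD_MAPPINGS[source]
    common = {"source": source}
    for field, value in record.items():
        if field in mapping:
            common[mapping[field]] = value
    return common


# Made-up records, for illustration only.
records = [
    to_common_model("biobank", {"name": "Sample set A", "made": "2021"}),
    to_common_model("survey_archive", {"study_title": "Survey B", "year": "2022"}),
]
```

Once every source is expressed in the same fields, a single search over "title" or "date_created" covers all of them, which is what makes discovery in one place possible.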
IF YOU WANT TO KNOW MORE…
FAIRsharing Educational: learn about standards for data and metadata, and how the FAIRsharing registry helps you, whether you are a consumer or a producer of data, metadata standards, databases and data policies.
More details about metadata: Introduction to metadata management
Find out how indexing is used to link data in the BY-COVID project: Release of indexing system to link COVID-19 data across research disciplines
Learn more about the importance of having metadata standards (in general): 5 Minute Metadata - What is a standard?
Learn more about the importance of metadata standards during the COVID-19 pandemic: COVID-19 pandemic reveals the peril of ignoring metadata standards | Scientific Data
Find tools and guidelines to help you access, analyse and share infectious disease data, and respond quickly to disease outbreaks: Infectious Diseases Toolkit
BY-COVID - D3.1 - Metadata standards. Documentation on metadata standards for inclusion of resources in data portal | Zenodo
BY-COVID D2.1: Initial data and metadata harmonisation at domain level to enable fast responses to COVID-19 https://doi.org/10.5281/zenodo.7017728
Learn more about recipes that help you make data FAIR in the FAIR Cookbook, an online resource of hands-on recipes for "FAIR doers" in the Life Sciences. The FAIR Cookbook pre-print, “The essential resource for and by FAIR doers”, provides more information about its creation and content.
FAIR, ethical, and coordinated data sharing for COVID-19 response: a review of COVID-19 data sharing platforms and registries | Zenodo
Packaging research artefacts with RO-Crate - IOS Press
Lightweight Distributed Provenance Model for Complex Real–world Environments | Scientific Data
[2205.12098] COVID-19: An exploration of consecutive systemic barriers to pathogen-related data sharing during a pandemic