
Meet DSxHE’s new Theme: Data Diversity


“Building a world where data improves everyone’s health”


Those are the words you read every time you visit DSxHE’s homepage—which, of course, you do several times a day.


But there are many reasons why the world we live in today doesn’t yet reflect that vision. Data—and data science—are used across the entire healthcare landscape, from basic lab research that deepens our understanding of the human body to the apps on our phones that assess whether that new mole is worth worrying about.

Whether these scientific insights or technological innovations actually improve your health, though, is far from guaranteed. A crucial factor is whether the data that underpinned these insights or tools came from people ‘like’ you. In other words: were people ‘like’ you represented in the dataset? And if you were, what about your family? Your neighbours? People in other parts of the country—or the world?


Is this dataset diverse enough?


At a population level, this boils down to a fundamental question: Is this dataset diverse enough?


This is a hard question to answer. Diversity can be defined along many axes—ethnicity, sex, and gender are often the first that come to mind. But other dimensions matter, too: age, socioeconomic status, sexual orientation, disability… The list goes on.


All too often, the answer is ‘No’. Take genomics: around 90% of participants in genomic studies to date are of European ancestry, even though people of European ancestry make up only about 15% of the global population. In Invisible Women, Caroline Criado Perez lays out in painful detail how half the population has historically been overlooked in medical research and beyond.


This lack of representativeness has real consequences. It limits the generalisability of findings and medical advice, contributes to poorer clinical decision-making, and reduces the effectiveness of data-driven tools and interventions for underrepresented groups. Ultimately, it worsens health outcomes for those missing from today’s datasets.


There is, fortunately, a growing recognition of this problem. Many health organisations are now explicitly calling for more diverse data; the FDA, for example, recently issued a recommendation to “improve representativeness in clinical studies”.

What’s still missing, though, is practical guidance. How do we move from recognising the issue to designing studies and systems that proactively embed diversity?


Enter: DSxHE’s Data Diversity Theme.


In partnership with Cancer Research UK (CRUK), DSxHE is launching a new Theme to help bridge this gap. The goal is to create tools and solutions, with and for researchers, clinicians, and patient advocates, that make data diversity a standard part of the research ecosystem. These could include inclusive data collection protocols, training resources, and diversity frameworks for funders and regulators.


Of course, these tools will ultimately affect the people who take part in health studies, so it’s vital their voices are heard too. That’s why we’ll be convening a diverse panel of patient and public representatives to support the entire DSxHE community. This panel will help define the Theme’s priorities, shape our tools, and guide how we engage with others.


Get involved!


Over the coming months, we’ll be running a series of activities, including webinars, virtual coffee drop-ins, and workshops. We’ll also be reaching out to the wider community to learn which tools already exist and which are still missing. We want to develop new ones with clinicians and researchers, listening to what they need most and building practical, usable resources in response. We’d love as many people as possible to take part, so keep an eye on our newsletter for updates.

To make this a success, we’re putting together a volunteer team to shape and run these activities. If you’re interested in data diversity and can spare 2–3 hours a week, we’d love to hear from you.



Find out more about how to get involved:




