
What We Learned About Data Diversity at the DSxHE Workshop




Recently, we hosted a virtual Drop-in session for our new DSxHE Theme - Data Diversity. The purpose of the session was both to introduce the new Theme to the world and to kick-start a conversation about diversity in biomedical datasets. We were fortunate enough to have a wonderfully diverse (geddit?) group of attendees, who provided a broad range of perspectives from academic research to patient engagement.

The session really highlighted both the complexity and the urgency of making data practices more inclusive. Here’s what we learned together.


🌍 What Does Data Diversity Mean to Us?

Our first prompt asked participants: What does Data Diversity mean to you?

The responses were powerful and consistent. Data diversity is about more than just race and ethnicity. It’s about making sure the data that informs healthcare decisions truly reflects the breadth of human experience - including geography, disability, socioeconomic background, and more.

Participants also emphasized the importance of thinking beyond traditional data sources. Holistic data means not limiting ourselves to what’s collected in hospitals or clinics. It includes insights from wearables, home monitoring tools, and other everyday technologies that offer a richer, more continuous picture of people’s health and lives. These sources can help fill gaps left by institutional data and bring in voices that are often missing.

At its heart, data diversity means making sure no one is left out of the research that shapes health policies, products, and care.


🚧 What Barriers Are We Facing?

Moving on to the challenges facing data diversity, attendees highlighted a number of barriers that many of us face in our day-to-day work:

  • Gatekeeping and access to data: Whether due to institutional silos or restrictive policies, this is still a major hurdle.

  • Outdated data collection practices: Many tools and methods don’t reflect the ways people live, connect, or access healthcare today.

  • Loss of trust: Some communities have experienced extractive or one-sided research, leading to disengagement and skepticism.

  • Resource limitations: Time, funding, and staffing can all limit how deeply we’re able to engage with inclusive practices.

  • Complexities with ethics: These bite hardest in fields where ethics approvals aren’t the norm, so the problem only surfaces when you want to publish and find yourself stuck.


🌱 What Do We Want to See from the Theme?

Participants shared what they’d like to see grow from the data diversity conversation. Here are some of the suggestions:

  • Clear guidelines for inclusive data collection

  • Accessible resources for navigating ethics processes, especially in interdisciplinary work

  • Engagement strategies that work both online and offline, recognizing that not everyone interacts digitally

  • Support for community-led research approaches - moving beyond top-down methods

  • More visibility for good practices - let’s spotlight what’s working so others can learn and build on it

  • Extensions of work like the Data Hazards taxonomy, which calls out harmful practices and pushes for better


🌟 Where We Go From Here

The session was a strong reminder that achieving data diversity is a shared responsibility. It’s not just about ticking boxes - it’s about actively shifting how we design, collect, and use data so that everyone counts.

We’re grateful to everyone who contributed their voice. This is just the beginning - and we’re excited to keep building together.

💬 If you’ve got ideas or feedback, or want to be part of the next step, we’d love to hear from you!



