The Poverty You Can’t See: Scarcity, Stress, and the Health Data That Miss Them (Part 2)
- vanessarodriguez22
- 8 hours ago
- 8 min read
Measuring Socioeconomic Status for Equity-Driven Modeling
Socioeconomic status (SES) is part of the default demographic toolkit in health research — right up there with age and sex. And much like those, it’s often included reflexively, adjusted for automatically, and measured using whatever data happen to be available.
In Part 1, I laid out how SES can be mischaracterized when reduced to income and education alone, and explored ways to measure material deprivation, both at the individual and neighborhood level. This second part turns to the next layer: the subjective strain of navigating scarcity, the stress it causes, and the coping it necessitates. It concludes with recommendations for data scientists for capturing SES nuances in ways actionable for designing interventions.
Part 2: From Scarcity to Stress
There is a gap between objective scarcity — the kind two observers might agree on — and the subjective strain of living with constraint. Health status and outcomes, as well as health-related quality of life and behaviors, correlate not only with “objective” scarcity (e.g., low levels of material consumption), but with the experience of deprivation and precarity [1, 2] as well.
A promising effort to bridge this gap is the European Union’s Material and Social Deprivation Index, captured through the EU-SILC (statistics on income and living conditions) framework. Unlike neighborhood-level deprivation indices, it collects individual-level data that reflect the lived experience of constraint. It has been used in longitudinal studies (e.g., Growing Up in Ireland) and can be aggregated to produce area-level indicators of deprivation that inform policy.
The index includes items such as:
Ability to keep home adequately warm
Ability to afford a meal with meat, chicken, or fish every second day
Capacity to face unexpected expenses
Ability to afford celebrations or annual holidays
These questions reveal more than economic standing: they capture whether someone is forced to make trade-offs that impact health, wellbeing, and social belonging. In other words, the index surfaces how secure people feel about the future and which small pleasures they lose to poverty.
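To make the idea of an item-based index concrete, here is a minimal sketch that counts "enforced lacks" across the four example items listed above. The item names, the four-item scope, and the cut-off are assumptions made for illustration; the actual EU-SILC instrument has more items and its own official scoring rules.

```python
from typing import Mapping, Optional

# Item names paraphrase the four example questions above (hypothetical names,
# not the official EU-SILC variable names).
ITEMS = (
    "keep_home_warm",
    "protein_meal_every_second_day",
    "face_unexpected_expenses",
    "celebrations_or_annual_holiday",
)


def deprivation_count(responses: Mapping[str, Optional[bool]]) -> Optional[int]:
    """Count the items a respondent cannot afford (False = cannot afford).

    Returns None when any item is unanswered, so that missing data is not
    silently read as "not deprived".
    """
    values = [responses.get(item) for item in ITEMS]
    if any(v is None for v in values):
        return None
    return sum(1 for v in values if v is False)


def is_deprived(responses: Mapping[str, Optional[bool]], threshold: int = 2) -> Optional[bool]:
    """Flag deprivation using a hypothetical cut-off for this toy four-item version."""
    count = deprivation_count(responses)
    return None if count is None else count >= threshold


# Toy respondent, invented for illustration only.
respondent = {
    "keep_home_warm": True,
    "protein_meal_every_second_day": False,
    "face_unexpected_expenses": False,
    "celebrations_or_annual_holiday": True,
}
print(deprivation_count(respondent))  # 2
print(is_deprived(respondent))        # True
```

Keeping the raw count alongside the binary flag preserves the gradient: two respondents who are both "deprived" may lack very different numbers of items.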
Can Feeling Poor Damage Your Health?
Indicators like “difficulty making ends meet” or “feeling poor” have been shown to mediate the relationship between income and health outcomes: even for individuals with similar incomes, health outcomes may be worse for the person whose subjective experience of strain is greater.
For example, this pattern has been observed for:
Mental health
Chronic stress biomarkers
Healthcare avoidance and treatment adherence
Stress response frameworks inspired by Hans Selye's General Adaptation Syndrome (GAS) indicate that perceived scarcity and strain, even when not matched by "objective" income, produce compounding physiological and behavioral effects [3, 4]. Thus, SES isn’t just a covariate to adjust for. Like smoking or environmental toxins, it may operate as a layered exposure with dose-response effects on health. Thinking of social standing as a gradient risk rather than a static label requires capturing the nuances of consumption, resource insecurity, felt deprivation, as well as their interactions at individual and higher-order levels. Intensity and duration of exposure will need to be a part of our model building as well [5].
Integrating subjective aspects of deprivation not only reflects people’s social worlds more accurately (making conclusions generalizable) but also translates more easily to interventions.
Lived experience of a constraint will likely be more predictive of actual trade-offs shaping health behaviors than any income band would be: “Do I fill my prescription, or buy a winter coat?” “Do I plant in my garden vegetables that I will be able to eat fresh, or do I plant what is storable and tradeable?”
What Are the Implications for Health Data Science?
How can we operationalize this layered understanding of SES in data science and health equity work? It depends on the project’s purpose, scope, and data demands.
It may help to ask what causal pathway is being assumed when including a given SES measure: is SES serving as a proxy for access to health-promoting resources (e.g., medical care, nutritious food, diagnostics)? Or is it capturing exposure to chronic stressors (e.g., housing instability, erratic income, harmful coping behaviors)? Including behavioral and environmental mediators can clarify both direct and indirect pathways between SES and health.
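As a rough illustration of direct versus indirect pathways, consider the simplest linear case with a single mediator. The notation is introduced here for illustration only: X is an SES measure, M a mediator such as financial strain, Y a health outcome, and a, b, c' are regression coefficients. The decomposition in the last line holds only under linearity, no exposure-mediator interaction, and no unmeasured confounding.

```latex
\begin{align}
M &= i_M + a\,X + \varepsilon_M \\
Y &= i_Y + c'\,X + b\,M + \varepsilon_Y \\
\text{total effect of } X \text{ on } Y &= \underbrace{c'}_{\text{direct}} + \underbrace{a\,b}_{\text{indirect (via } M)}
\end{align}
```

If the SES measure in a model is meant to stand for access to resources, the mediator should be something like care use or diet; if it is meant to stand for chronic stress, a strain or coping measure belongs in the place of M.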
[For a more in-depth discussion of causality, take a look at Why Causality Matters for Health Equity: Understanding Path-Specific Fairness in this series.]
The recommendations below reflect my experience and practice: from experimental linguistics to applied work in pharmaceutical commercialization (focused on product uptake, decision dynamics, and patient experience). I also follow how public health and health equity fields push beyond standard measurement toward more context-aware modeling.
In localized, short-term settings:
Incorporate qualitative data into variable design. Use focus groups, interviews, or ethnographic methods to define locally relevant forms of deprivation and constraint. This can inform meaningful proxies or survey items.
Validate findings established using traditional measures (e.g., income) with a subset of respondents drawn from the population of interest. Gather evidence of any blind spots and limitations of the findings to justify investment in collecting richer data.
In large-scale, system-level, longitudinal efforts:
Create hybrid data models that integrate surveys and electronic health records (EHRs). For example, in the US, the Distress Thermometer (developed by the National Comprehensive Cancer Network (NCCN) and implemented by the US Oncology Network) offers a model for capturing subjective experience alongside clinical data during medical appointments. Similarly, Z codes within the ICD-10-CM disease classification system allow providers to document social risk factors directly in patient records.
Track and address gaps in SES data collection. Monitor completeness and granularity of SES indicators in electronic health records to surface blind spots in care delivery and equity evaluation [6].
Connect signals of material deprivation or stress to actions front-line providers can take to direct someone to resources. Avoid placing a data collection burden on staff if the effort will be perceived as fruitless or academic.
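One way to start tracking gaps in SES data collection is a simple completeness report by group. The sketch below assumes records exported as dictionaries; the field names (`income_band`, `housing_status`, `financial_strain`) and the grouping key (`clinic`) are hypothetical and would need to match whatever the local EHR extract actually contains.

```python
from collections import defaultdict

# Hypothetical SES field names; replace with the fields in your own extract.
SES_FIELDS = ("income_band", "housing_status", "financial_strain")


def completeness_by_group(records, group_key, fields=SES_FIELDS):
    """Return {group: {field: share of records with a non-missing value}}."""
    totals = defaultdict(int)
    filled = defaultdict(lambda: defaultdict(int))
    for rec in records:
        group = rec.get(group_key, "unknown")
        totals[group] += 1
        for field in fields:
            # Treat None and empty strings as missing.
            if rec.get(field) not in (None, ""):
                filled[group][field] += 1
    return {
        group: {field: filled[group][field] / n for field in fields}
        for group, n in totals.items()
    }


# Toy records, invented for illustration only.
records = [
    {"clinic": "A", "income_band": "low", "housing_status": "renting", "financial_strain": None},
    {"clinic": "A", "income_band": None, "housing_status": "renting", "financial_strain": None},
    {"clinic": "B", "income_band": "mid", "housing_status": "owner", "financial_strain": "some"},
]
report = completeness_by_group(records, "clinic")
for group, shares in sorted(report.items()):
    print(group, shares)
```

Replacing `clinic` with a demographic grouping shows whether information richness differs across patient groups, which is the kind of discrepancy worth surfacing before any model is fit.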
In academic contexts focused on research:
Model constructs like economic vulnerability or perceived scarcity as latent variables. Include multiple economic, social, and geographic indicators to estimate the latent variables.
Use multilevel modeling approaches. Combine individual-level SES data with area-level deprivation indices, and model cross-level interactions.
Link contextual data sources to patient-level records. Use tools like the UK Index of Multiple Deprivation, or the US Congressional District Health Dashboard.
Disseminate basic research findings among professionals working in private industry, public health, and clinical settings. Academic findings can motivate efforts to tailor or broaden data collection within under-resourced organizations (if their mission does not include producing generalizable knowledge).
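The linkage step behind multilevel work can be sketched in a few lines. The example below attaches an area-level deprivation score to individual records and builds a product term of the kind a cross-level interaction would use. All names (`area_code`, `financial_strain`, the area codes, and the scores) are hypothetical, and this covers only data preparation: the hierarchical model itself would be fit with a dedicated mixed-model library, where the interaction is usually specified in the model formula.

```python
def link_area_deprivation(patients, area_index):
    """Attach an area-level deprivation score to each patient record.

    patients:   list of dicts with hypothetical keys 'area_code' and
                'financial_strain' (an individual-level score).
    area_index: dict mapping area_code -> area deprivation score.

    Patients whose area is missing from the index are kept with None values,
    so that linkage gaps stay visible instead of being dropped.
    """
    linked = []
    for p in patients:
        score = area_index.get(p["area_code"])
        strain = p.get("financial_strain")
        if score is not None and strain is not None:
            interaction = strain * score  # individual strain x area deprivation
        else:
            interaction = None
        linked.append({**p, "area_deprivation": score, "strain_x_area": interaction})
    return linked


# Toy inputs, invented for illustration only.
area_index = {"T001": 0.8, "T002": 0.25}
patients = [
    {"id": 1, "area_code": "T001", "financial_strain": 0.5},
    {"id": 2, "area_code": "T002", "financial_strain": 1.0},
    {"id": 3, "area_code": "T999", "financial_strain": 0.5},  # area not in index
]
linked = link_area_deprivation(patients, area_index)
unlinked = [p["id"] for p in linked if p["area_deprivation"] is None]
print(unlinked)  # [3]
```

Reporting the share of unlinked records by group is worthwhile in its own right, since linkage failures are rarely random with respect to deprivation.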
Does your model pass the “grandmother test”? - Recommendations for Data Science Professionals
While there can be no recipes, the list below offers a few strategic prompts to help data scientists engage with the data, modeling assumptions, and the equity-relevant questions at stake. Some of the recommendations are easier to apply in traditional model building than in AI systems, where human control shifts from model specification to data curation. That makes diverse, context-rich datasets, especially those reflecting lived experience, even more critical in the age of AI.
Brainstorm edge cases and patient personas that current modeling approaches and variables are likely to miss, misclassify, or collapse
First decide what would be meaningful to know about the population, then look for appropriate data sources, not the other way round (relying on data simply because they are “already there”)
Monitor degree of data completeness in electronic health records to surface discrepancies in information richness for different groups
Use multidimensional SES measures (wealth, debt, occupation, deprivation) in addition to income and education
When choosing SES-related variables, ask: What causal model is this implying? Is SES being treated as an enabler of health-promoting behaviors (e.g., care access, diet, preventive services), or as a marker of accumulated stress and forced tradeoffs (e.g., skipped appointments, financial strain, unhealthy coping)? Select and test variables that capture these mediating pathways directly
Operationalize income, consumption, deprivation, and similar measures as continuous, rather than categorical, variables to preserve gradient effects
Employ hierarchical or mixed models combining individual-level SES data with area-based indices to capture context and interactions
Favor smaller geographic units (e.g., census tracts in the US over ZIP codes) for more precise risk mapping
Collect subjective and experiential measures (such as “difficulty making ends meet”) alongside objective markers
Incorporate approaches that capture aspects of SES that matter locally, such as property- or housing-based indices in areas where home ownership is widespread, or access to heating and fuel
Regularly validate SES measures for population and context specificity, adapting them as local realities change
Collect SES data from a combination of surveys, administrative records, and geospatial sources for robust statistical modeling
Thank you for reading this blog post! What did you find interesting? What questions do you have? Leave your comments down below!
Statistical Methods at DSxHE: Blog Series on Methods for Health Equity
The Statistical Methods section of Data Science for Health Equity (DSxHE) is launching a public-facing blog series that demystifies methods and showcases their real-world impact on health equity. Each post will blend clear, accessible explanation with concrete applications, from study design and causal inference to measurement, modelling, and evaluation, highlighting how methods can drive fairer health outcomes. We invite authors to bring not only their technical expertise but also their personal insights, and encourage anyone interested to volunteer their perspective. Submissions can be tutorials, case studies, opinion articles, or "what we learned" stories.
Want to contribute? Send a brief pitch or draft to info@datascienceforhealthequity.com.
Glossary
Independent effect
The effect of a variable on an outcome that is not influenced by a potential confounding variable.
Confounding variable
A variable that is correlated with both the independent and dependent (outcome) variables and can distort the relationship between them.
Exposure
In epidemiology and public health research, an exposure refers to a contact or interaction between a person and a potential risk factor or health hazard.
Latent variable
A variable of research interest that denotes a concept that cannot be measured directly (e.g., intelligence) and is estimated from other (indicator) variables, which are observed directly in the data (e.g., a series of question items in a test). In social science and behavioral research, latent variables are often estimated using approaches such as factor analysis or structural equation modeling.
Mediator, mediating pathway
A variable that acts as an intermediary, transmitting some or all of the effect of one variable on another.
Indirect effect/pathway
The effect of a variable on an outcome that operates through one or more intermediary variables. For instance, socioeconomic status may indirectly affect health outcomes through how much one can spend on fresh fruit and vegetables, the gym, or cutting-edge medical treatments.
Hierarchical (multilevel) model
A model that incorporates effects operating both at the level of individuals and at the level of higher-order units (such as geographic areas, school districts, or hospitals), and that can allow for cross-level interactions.
References
1. Tøge, A. G., & Bell, R. (2016). Material deprivation and health: a longitudinal study. BMC Public Health, 16(1), 747.
2. Costa, C., Freitas, A., Almendra, R., & Santana, P. (2020). The association between material deprivation and avoidable mortality in Lisbon, Portugal. International Journal of Environmental Research and Public Health, 17(22), 8517.
3. Matthews, K. A., & Gallo, L. C. (2011). Psychological perspectives on pathways linking socioeconomic status and physical health. Annual Review of Psychology, 62(1), 501-530.
4. McEwen, B. S., & Seeman, T. (1999). Protective and damaging effects of mediators of stress: elaborating and testing the concepts of allostasis and allostatic load. Annals of the New York Academy of Sciences, 896(1), 30-47.
5. Barakat, C., & Konstantinidis, T. (2023). A review of the relationship between socioeconomic status change and health. International Journal of Environmental Research and Public Health, 20(13), 6249.
6. Juhn, Y. J., Ryu, E., Wi, C. I., King, K. S., Malik, M., Romero-Brufau, S., Weng, C., Sohn, S., Sharp, R. R., & Halamka, J. D. (2022). Assessing socioeconomic bias in machine learning algorithms in health care: A case study of the HOUSES index. Journal of the American Medical Informatics Association, 29(7), 1142-1151. https://doi.org/10.1093/jamia/ocac052