How a new U.S. health study is fixing bias in wearable data research

By giving participants wearables and internet access, the American Life in Realtime study is closing the gap in who digital health data truly represents, showing that inclusive recruitment and rigorous design can make AI-driven healthcare fairer for all.

Study: American Life in Realtime: Benchmark, publicly available person-generated health data for equity in precision health. Image credit: Lomb/Shutterstock.com

In a recent article in PNAS Nexus, researchers developed a longitudinal and nationally representative health study called American Life in Realtime (ALiR) to collect person-generated health data (PGHD) through study-provided wearable and internet-connected devices.

Their approach addresses the limitations of existing PGHD studies that depend on personal devices and often exclude disadvantaged populations. ALiR can thus serve as a benchmark for fair and generalizable digital health research.

Addressing historical underrepresentation

Precision health aims to improve disease prevention and treatment by tailoring strategies to individuals’ unique biological, social, and environmental contexts. A key component of this approach is PGHD, which is collected through everyday digital tools such as smartphones and wearable devices.

These data provide continuous insights into behaviors and exposures responsible for most modifiable health risks, making them vital for identifying health inequities and improving outcomes among marginalized groups.

However, the field lacks benchmark PGHD datasets: standardized, representative, and validated data resources that enable fair and reproducible development of artificial intelligence (AI) models. The authors note that an ideal PGHD benchmark should represent population diversity, include repeatedly validated measures, be longitudinal, contain data of sufficient quality and quantity, and be widely accessible; ALiR was designed to meet all of these criteria.

Current datasets, such as the National Institutes of Health’s All of Us and the UK Biobank, underrepresent Black, Indigenous, older, and lower-income populations, often relying on irregular or unstructured data. This limits model generalizability and risks worsening disparities through biased predictions.

The coronavirus disease 2019 (COVID-19) pandemic underscored these challenges, revealing how social inequities amplify disease burdens. Many PGHD-based COVID-19 detection studies relied on convenience samples that excluded disadvantaged individuals, partly due to recruitment barriers such as limited technology access or mistrust.

To overcome these biases, the ALiR study was established. It uses probability-based sampling and study-provided hardware to promote inclusion and create a benchmark for equitable precision health research.

Designing the study

The ALiR study was designed as a longitudinal and nationally representative digital health cohort using best practices in probability sampling, benchmarking, and FAIR (Findable, Accessible, Interoperable, Reusable) data standards.

Participants were randomly selected from the Understanding America Study (UAS), a large address-based panel of U.S. adults. Individuals consenting to participate received a wearable device and access to a custom mobile app for continuous biometric tracking and short, frequent surveys.

These surveys, conducted every one to three days, gathered information on physical and mental health, behaviors, demographics, environmental and social exposures, and structural determinants such as income, housing, and discrimination.

Data were linked to contextual datasets, including healthcare records, weather, air quality, and crime, to enrich environmental and health information. The study also provided electronic tablets to participants lacking internet access to minimize selection bias and ensure the inclusion of underrepresented groups.

Between August 2021 and March 2022, 2,468 UAS members were invited, with oversampling of racial/ethnic minorities and lower-education groups. Of those, 1,386 consented (64%), and 1,038 enrolled (75%).

Logistic regression and random forest analyses indicated that nonconsent was most strongly associated with older age, while nonenrollment was linked to lower education.
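As a rough illustration of this kind of attrition analysis, the sketch below fits a logistic regression and a random forest to simulated invitation data. The sample size matches the 2,468 invitations reported in the study, but the covariates, variable names, and effect sizes are hypothetical stand-ins, not the actual ALiR records.

```python
# Hypothetical sketch of an attrition analysis: modeling nonconsent from
# demographics. Data are simulated so that nonconsent rises with age, as
# the study reports; "education_years" is an illustrative covariate.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 2468  # invited sample size reported in the study
age = rng.integers(18, 90, size=n).astype(float)
education_years = rng.integers(8, 21, size=n).astype(float)
X = np.column_stack([age, education_years])

# Simulated mechanism: probability of nonconsent increases with age.
p_nonconsent = 1.0 / (1.0 + np.exp(-0.05 * (age - 55)))
nonconsent = rng.binomial(1, p_nonconsent)

logit = LogisticRegression(max_iter=1000).fit(X, nonconsent)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, nonconsent)

print("logistic coefficients (age, education):", logit.coef_[0])
print("forest importances (age, education):", forest.feature_importances_)
```

In this simulation, both methods should flag age as the dominant predictor: the logistic coefficient on age is positive and the forest assigns it most of the feature importance, mirroring the pattern the authors describe for nonconsent.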

ALiR’s performance

ALiR achieved broad representativeness across U.S. population characteristics, including personality traits, health, demographics, and socioeconomic status.

Racial and ethnic minorities were overrepresented (54% vs. 38% in the population), while White individuals were underrepresented (46% vs. 62%), aligning with deliberate oversampling to improve inclusivity.

Participants with low income or limited digital access were well represented, with 77% having no prior wearable device, and 2% having no internet access before study-provided hardware. Weighted adjustments corrected most minor demographic imbalances, though retirees and those with hypertension remained slightly underrepresented.
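The weighted adjustments mentioned above follow standard post-stratification logic: each group's weight is the ratio of its population share to its sample share, so the weighted sample matches known population margins. A minimal sketch using the shares reported in the article (54% racial/ethnic minority in the sample vs. 38% in the population), with simulated group labels:

```python
# Minimal post-stratification sketch: rescale an oversampled group so the
# weighted sample matches the population share. Group labels are simulated;
# the 0.54 / 0.38 shares come from the article.
import numpy as np

rng = np.random.default_rng(1)
n = 1038  # enrolled participants reported in the study
minority = rng.random(n) < 0.54  # oversampled minority share in ALiR

pop_share = {True: 0.38, False: 0.62}                       # population targets
samp_share = {True: minority.mean(), False: 1 - minority.mean()}

# Weight for each person = population share / sample share of their stratum.
weights = np.where(minority,
                   pop_share[True] / samp_share[True],
                   pop_share[False] / samp_share[False])

weighted_share = np.average(minority, weights=weights)
print(f"weighted minority share: {weighted_share:.2f}")  # ≈ 0.38 by construction
```

By construction the weighted minority share equals the 0.38 population target exactly; in practice, survey weighting adjusts many margins at once (e.g., via raking), which is why some residual imbalances, such as retirees, can remain.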

Compared to convenience-based wearable studies, such as the All of Us Fitbit “bring-your-own-device” (BYOD) dataset, ALiR demonstrated far superior population alignment and diversity. When used to train a COVID-19 infection classification model, ALiR-based models achieved robust performance both in-sample and out-of-sample, indicating strong generalizability across all demographic subgroups.

Specifically, ALiR’s model achieved an area under the curve (AUC) of 0.84 when tested both in-sample and out-of-sample, maintaining consistent performance across all subgroups.

In contrast, an identically trained model based on All of Us data achieved an AUC of 0.93 in-sample but dropped to 0.68 out-of-sample, a 35% relative loss in performance, with the sharpest declines (22 to 40%) among older females and non-White participants.
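The in-sample vs. out-of-sample gap described above is the classic signature of a model that fits its own (unrepresentative) training data better than new data. The sketch below reproduces the evaluation pattern, not the study's results: the features are synthetic noise plus one weak signal, standing in for the actual wearable data, which is not reproduced here.

```python
# Hedged illustration of in-sample vs. out-of-sample AUC evaluation.
# Synthetic data; the actual ALiR / All of Us features are not used.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = rng.normal(size=(2000, 5))
# Noisy label: only the first feature carries signal.
y = (X[:, 0] + rng.normal(scale=2.0, size=2000) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

auc_in = roc_auc_score(y_tr, model.predict_proba(X_tr)[:, 1])
auc_out = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"in-sample AUC: {auc_in:.2f}, out-of-sample AUC: {auc_out:.2f}")
```

As expected, the in-sample AUC is inflated relative to the held-out AUC. The study's point is sharper still: for the All of Us model the out-of-sample drop was concentrated in demographic subgroups missing from the training data, whereas the representative ALiR sample kept the two numbers aligned.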

Conclusions

ALiR is the first longitudinal population-based study to integrate wearable device data with repeatedly validated health and behavioral measures, offering a benchmark for equitable precision health research.

Its probability-based sampling, hardware provision, and oversampling strategies effectively minimized bias, achieving broad U.S. demographic and socioeconomic representation and improving on convenience-sample and “bring-your-own-device” studies such as All of Us.

ALiR’s COVID-19 model performed robustly across diverse groups, showing that smaller, high-quality, representative samples can yield more generalizable results than larger, biased datasets.

However, some biases persisted, particularly underrepresentation of older adults despite device provision, suggesting that barriers beyond technology access, such as mistrust or disinterest, affect participation. The present analyses focused on consent and enrollment, with ongoing work addressing long-term engagement. The authors emphasize that the ALiR dataset and accompanying study app code will be publicly available in late 2025, providing an open resource for developing and validating equitable AI models.

In summary, ALiR not only sets a public benchmark for inclusive digital health research but also demonstrates that thoughtful study design can overcome long-standing barriers to representation. By providing a methodologically sound framework, ALiR supports the development of more generalizable AI models and contributes to improving equity in digital and precision health research.


Journal reference:
  • Chaturvedi, R.R., Angrisani, M., Troxel, W.M., Jain, M., Gutsche, T., Ortega, E., Boch, A., Liang, C., Sima, S., Mezlini, A., Daza, E.J., Boodaghidizaji, M., Suen, S., Chaturvedi, A.R., Ghasemkhani, H., Ardekani, A.M., Kapteyn, A. (2025). American Life in Realtime: Benchmark, publicly available person-generated health data for equity in precision health. PNAS Nexus 4(10). DOI: 10.1093/pnasnexus/pgaf295. https://academic.oup.com/pnasnexus/article/4/10/pgaf295/8275735 

Written by

Priyanjana Pramanik

Priyanjana Pramanik is a writer based in Kolkata, India, with an academic background in wildlife biology and economics. She has experience in teaching, science writing, and mangrove ecology. Priyanjana holds master's degrees in Wildlife Biology and Conservation (National Centre for Biological Sciences, 2022) and Economics (Tufts University, 2018). In between master's degrees, she was a researcher in the field of public health policy, focusing on improving maternal and child health outcomes in South Asia. She is passionate about science communication and enabling biodiversity to thrive alongside people. The fieldwork for her second master's was in the mangrove forests of Eastern India, where she studied the complex relationships between humans, mangrove fauna, and seedling growth.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Pramanik, Priyanjana. (2025, October 09). How a new U.S. health study is fixing bias in wearable data research. News-Medical. Retrieved on October 09, 2025 from https://www.news-medical.net/news/20251009/How-a-new-US-health-study-is-fixing-bias-in-wearable-data-research.aspx.

