The Sheffield Press

Health

All of Us unveils world’s largest integrated genomic health dataset

By Sarah Mitchell
The National Institutes of Health’s All of Us Research Program unveiled what it called the world’s largest integrated dataset, pairing genomes, electronic health records and wearable data from more than 747,000 participants across all 50 states and U.S. territories. The release links nearly 482,000 electronic health records to 535,000 whole genome sequences and adds more than 1.3 billion total genetic variants. That scale could sharpen research on disease risk, treatment response and long-term health, but it also concentrates unusually intimate medical information inside one national research platform.

All of Us said the new release moves the project into a multiomics era, with more than 8,000 participants carrying overlapping data across long-read sequencing, proteomics and RNA-seq. The program also said its analytical tools now include relatedness, phasing, pharmacogenomics, and newly added HLA and mtDNA analysis, widening what researchers can do with the data once they are inside the system. Josh Denny, the program’s chief executive officer, said the effort fulfills a long-term dream of building something that did not exist anywhere before. Geoffrey S. Ginsburg, the chief medical and scientific officer and acting chief data officer, said the resource makes “entirely new science possible.”

The latest release builds on a rapid expansion that has been unfolding for more than two years. In April 2023, NIH said the program's dataset had reached nearly 250,000 whole genome sequences among more than 413,450 participants, and said about 45% of the genomic data came from people who self-identified with racial or ethnic groups historically underrepresented in medical research. By February 2025, All of Us said its research dataset had grown to more than 633,000 participants, with whole genome sequences from more than 414,000 participants and nearly 60,000 participants contributing Fitbit wearable data, which NIH described as the largest public dataset of wearable-device information.

Image: All of Us Research Program. Credit: US Government via Wikimedia Commons (public domain)

The program has cast that diversity as central to its mission, not incidental to it. By linking genes, clinical care, lifestyle and wearable-device data over time, All of Us is trying to build a research base that can support studies of how biology and daily life interact in real patients, rather than in isolated lab samples. Access remains controlled: registered researchers must complete training and meet access requirements before using the secure Researcher Workbench.

Chart: Participant Growth

The program’s announcements page showed the infrastructure continuing to grow in 2026, with new biospecimen access, the All By All browser and an updated Researcher Workbench. The message is clear: All of Us is not just storing data, but building a national research platform whose value will depend on how carefully it balances scientific reach with privacy, consent and oversight.