Plain-language explainers for the datasets trove indexes.
Three of the most widely cited public-domain datasets in U.S. healthcare are also three of the least approachable. CMS Medicare Cost Reports are shipped as 100,000-row long-skinny CSVs. IRS Form 990 Schedule H is buried in XML bulk ZIPs. FDA approval packages are scattered across hundreds of PDF directories on accessdata.fda.gov. The data is public; learning what's in it is a project.
These pages explain what each dataset is, what's in it, and how to use it — in plain language, with citations.
What is HCRIS?
The Healthcare Cost Report Information System, CMS's public database of Medicare Cost Reports: the financial filing every Medicare-participating hospital submits annually. Covers beds, staffing, revenue, costs, charity care, and uncompensated care, as reported on the hospital form CMS-2552-10.
What is IRS Form 990 Schedule H?
The community-benefit schedule that nonprofit hospitals attach to their annual Form 990 tax return. Covers financial assistance, Medicaid shortfall, research, education, and other categories that justify tax-exempt status.
What is an FDA novel drug approval?
The annual curated list of first-time drug approvals (new molecular entities, or NMEs, and novel biologics license applications, or BLAs), and the approval package each one ships with: medical, statistical, pharmacology, and chemistry reviews.
The docs explain the data. The tools let you query it.