HCRIS stands for the Healthcare Cost Report Information System. It's the public CMS database of annual Medicare Cost Reports — the financial, operational, and utilization filings that every Medicare-participating hospital is required to submit. HCRIS is one of the most widely cited datasets in U.S. hospital research, and one of the hardest to use in its raw form.
Why HCRIS exists
Medicare originally reimbursed hospitals on the basis of their reported costs, and cost data still feeds into how CMS sets and audits payment rates. To support this, CMS requires every participating provider to file an annual cost report listing what it spent, what it earned, and whom it served. The reports also feed Medicare's disproportionate-share hospital (DSH) payments, which compensate hospitals that serve a high share of low-income patients. The data is collected for regulatory purposes, but because the underlying filings are public, HCRIS has become the default source for almost any quantitative question about U.S. hospitals.
What gets filed
Hospitals file CMS Form 2552-10, a multi-worksheet document with several thousand line items. Its worksheets are grouped into lettered series: S (general information and statistics), A through E (expenses, cost allocation, apportionment, and settlement), G (balance sheet and financial statements), and further series covering hospital-based components such as home health agencies, renal dialysis, and hospice. Other CMS provider types (skilled nursing facilities, home health agencies, hospices, federally qualified health centers) file their own variants. Trove focuses on Hospital 2552-10.
Among the many worksheets, a handful are disproportionately important for hospital research:
- Worksheet S-3 — bed counts, occupancy, discharges, employee FTEs.
- Worksheet S-10 — charity care, bad debt, uncompensated care. Line 23 column 3 ("charity care cost") and line 30 ("total uncompensated care cost") are the most-quoted numbers in HCRIS.
- Worksheet G — balance sheet (assets, liabilities, fund balances).
- Worksheet G-3 — net income, total revenue, total operating expenses.
Why HCRIS is hard to use
CMS distributes HCRIS as bulk-extract ZIP files, where each ZIP contains a long-skinny CSV with one row per (report, worksheet, line, column, value) tuple. A single hospital's annual filing decomposes into tens of thousands of rows. To do anything analytical you have to pivot the data wide — joining the right (worksheet, line, column) triples to get something like "charity care cost in dollars" as a column.
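The pivot described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not trove's parser: the column headers, the simplified worksheet/line/column labels, the sample values, and the `FIELD_MAP` name are all assumptions made for the example, and the column used for line 30 is a placeholder. CMS's actual files encode these codes differently.

```python
import csv
import io
from collections import defaultdict

# Illustrative long-format rows: one (report, worksheet, line, column, value) per row.
# Headers, labels, and values are placeholders, not CMS's actual encoding.
LONG_CSV = """report_id,worksheet,line,column,value
1001,S-10,23,3,1500000
1001,S-10,30,1,4200000
1001,G-3,29,1,98000000
1002,S-10,23,3,250000
"""

# Hypothetical mapping from (worksheet, line, column) to a friendly variable name.
# The column for line 30 is assumed for illustration only.
FIELD_MAP = {
    ("S-10", "23", "3"): "charity_care_cost",
    ("S-10", "30", "1"): "total_uncompensated_care_cost",
}


def pivot_wide(long_csv_text, field_map):
    """Turn long-skinny rows into one dict of named values per report."""
    wide = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(long_csv_text)):
        key = (row["worksheet"], row["line"], row["column"])
        name = field_map.get(key)
        if name is None:
            continue  # skip cells that are not in the mapping
        wide[row["report_id"]][name] = float(row["value"])
    return dict(wide)


wide = pivot_wide(LONG_CSV, FIELD_MAP)
for report_id, fields in wide.items():
    print(report_id, fields)
```

Rows whose triple is not in the mapping are dropped, and a report that lacks a mapped cell simply has no entry for that variable, which mirrors how sparse real filings tend to be.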
The line and column codes are documented in CMS's Provider Reimbursement Manual (Pub. 15-2, Chapter 40), but the documentation is dense, and CMS occasionally renumbers lines across form revisions. Trove ships a 44-variable dictionary that names the most commonly used fields and resolves them to the right (worksheet, line, column) source.
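One way to cope with renumbered lines is to let each variable carry several source locations, each tagged with the first year it applies to. The sketch below is an assumption about how such a dictionary could be organized, not a description of trove's actual dictionary; the `FieldSource` type, the `resolve` helper, the variable name, the worksheet label, and the years are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSource:
    worksheet: str
    line: str
    column: str
    first_year: int  # first reporting year this location applies to (illustrative)


# Hypothetical entry: one variable whose line moved in an imagined form revision.
# Worksheet "X-1", both line numbers, and both years are placeholders.
DICTIONARY = {
    "example_variable": [
        FieldSource("X-1", "10", "1", first_year=2010),
        FieldSource("X-1", "12", "1", first_year=2015),
    ],
}


def resolve(variable: str, year: int, dictionary=DICTIONARY) -> Optional[FieldSource]:
    """Return the most recent source location in effect for the given year."""
    candidates = [s for s in dictionary.get(variable, []) if s.first_year <= year]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s.first_year)


print(resolve("example_variable", 2012))
print(resolve("example_variable", 2020))
```

Keeping the effective year next to each location means analysis code asks for a variable by name and never hard-codes a line number.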
The "FY2023" labeling gotcha
CMS labels HCRIS bulk files by federal fiscal-year reporting cycle — the federal fiscal year in which the report was filed — not by the period the report covers. A hospital with a December 31 fiscal-year end will file its calendar-2022 cost report in 2023, and that report appears in CMS's "FY2023" bulk. When comparing HCRIS to other datasets like IRS 990 Schedule H, this is the single biggest source of confusion: the same hospital's "FY2023 HCRIS" and "TY2022 990" are about the same fiscal year, but the labels don't tell you that. The trove community-benefit dataset addresses this by carrying both the HCRIS fiscal-year end and the 990 tax-period end on every row.
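A practical consequence is that records are safer to match on the period-end date than on the year in the label. The sketch below assumes each dataset carries a hospital identifier and a period-end date, as described above; the field names, identifiers, and rows are hypothetical and exist only to mirror the December 31 example.

```python
from datetime import date

# Hypothetical rows; identifiers and field names are invented for illustration.
hcris_rows = [
    {"hospital_id": "H1", "bulk_label": "FY2023", "fiscal_year_end": date(2022, 12, 31)},
]
form990_rows = [
    {"hospital_id": "H1", "tax_year_label": "TY2021", "tax_period_end": date(2021, 12, 31)},
    {"hospital_id": "H1", "tax_year_label": "TY2022", "tax_period_end": date(2022, 12, 31)},
]


def match_by_period_end(hcris, f990):
    """Pair HCRIS and 990 rows that share a hospital and a period-end date."""
    index = {(r["hospital_id"], r["tax_period_end"]): r for r in f990}
    matches = []
    for h in hcris:
        f = index.get((h["hospital_id"], h["fiscal_year_end"]))
        if f is not None:
            matches.append((h["bulk_label"], f["tax_year_label"], h["fiscal_year_end"]))
    return matches


matches = match_by_period_end(hcris_rows, form990_rows)
print(matches)
```

Matching on the year embedded in the labels would find no "2023" row on the 990 side here, while the date-based join pairs the two filings that cover the same period.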
What HCRIS doesn't include
- Quality and outcomes metrics. Readmissions, mortality, complication rates — those live in CMS Care Compare / Hospital Compare, not HCRIS.
- Patient-level data. HCRIS is hospital-level aggregates, not encounters.
- Hospitals that don't participate in Medicare. Veterans Affairs hospitals, military hospitals, some children's hospitals, and a handful of specialty centers are either absent from HCRIS or only partially covered.
- Nonprofit-specific community benefit detail. Charity care is in Worksheet S-10, but the broader "community benefit" framing — research, education, Medicaid shortfall, in-kind contributions — is reported separately on IRS Form 990 Schedule H for tax-exempt hospitals.
How to use HCRIS data
Three reasonable paths, in order of effort:
- Use trove's published Parquet bundle. Query directly with DuckDB over HTTPS:
SELECT * FROM read_parquet('https://troveproject.com/data/hcris_2023_wide.parquet')
Browse 1,295 nonprofit hospital systems in the lookup tool.
- Install the hcris-analyst Claude Code skill to ask questions in natural language: "Show me a profile of New York-Presbyterian" or "Which hospitals provided the most uncompensated care in FY2023?"
- Download from CMS directly and pivot it yourself. The trove repo (github.com/cbetz/trove) has the parser if you want to reproduce or extend the work.
Citations
- CMS, Hospital Cost Report and Provider Cost Report.
- CMS Provider Reimbursement Manual, Pub. 15-2, Chapter 40 — Hospital and Hospital Health Care Complex Cost Report (Form CMS-2552-10).
- HCRIS data is U.S. government work, public domain.