What is HCRIS?

CMS Medicare Cost Reports — the financial filing every Medicare-participating hospital submits annually.

HCRIS stands for the Healthcare Cost Report Information System. It's the public CMS database of annual Medicare Cost Reports — the financial, operational, and utilization filings that every Medicare-participating hospital is required to submit. HCRIS is one of the most widely cited datasets in U.S. hospital research, and one of the hardest to use in its raw form.

Why HCRIS exists

Medicare originally reimbursed hospitals on the basis of their reported costs, and cost data still inform how CMS sets and audits payment rates. For that reason, CMS requires every participating provider to file an annual cost report listing what it spent, what it earned, and which patients it served. The reports also feed Medicare's disproportionate-share hospital (DSH) payments, which compensate hospitals that serve a high share of low-income patients. The data is collected for regulatory purposes, but because the underlying filings are public, HCRIS has become the default source for almost any quantitative question about U.S. hospitals.

What gets filed

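Hospitals file CMS Form 2552-10, a multi-worksheet document with several thousand line items. The worksheets are labeled S (provider identification, statistics, and settlement summary), A through E (expenses, cost allocation, charges, apportionment, and settlement calculation), G (financial statements, including the balance sheet and income statement), L (capital payments), and a set of series for hospital-based components such as home health agencies (H), renal dialysis (I), hospice (K), and rural health clinics and federally qualified health centers (M). Other CMS provider types (skilled nursing facilities, home health agencies, hospices, federally qualified health centers) file their own variants. Trove focuses on the hospital form, 2552-10.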

Among the many worksheets, a handful are disproportionately important for hospital research:

Why HCRIS is hard to use

CMS distributes HCRIS as bulk-extract ZIP files, where each ZIP contains a long, narrow CSV with one row per (report, worksheet, line, column, value) tuple. A single hospital's annual filing decomposes into tens of thousands of rows. To do anything analytical you have to pivot the data wide, selecting the right (worksheet, line, column) triples so that something like "charity care cost in dollars" becomes a single column with one row per report.
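The long-to-wide pivot described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not Trove's implementation: the sample rows and values are invented, the column headers are assumed names, and the two (worksheet, line, column) triples in FIELD_MAP are illustrative placeholders that should be checked against the CMS documentation before use.

```python
import csv
import io
from collections import defaultdict

# Invented sample in the long layout: one row per
# (report, worksheet, line, column, value). Header names are assumptions.
LONG_CSV = """\
rpt_rec_num,wksht_cd,line_num,clmn_num,value
1001,S100000,02300,00300,1250000
1001,S300001,01400,00200,210
1001,A000000,00100,00100,99
1002,S100000,02300,00300,480000
1002,S300001,01400,00200,95
"""

# Hypothetical mapping from a readable variable name to its source triple.
FIELD_MAP = {
    "charity_care_cost": ("S100000", "02300", "00300"),
    "beds": ("S300001", "01400", "00200"),
}


def pivot_wide(long_rows, field_map):
    """Return {report_id: {variable_name: value}} for mapped triples only."""
    triple_to_name = {triple: name for name, triple in field_map.items()}
    wide = defaultdict(dict)
    for row in long_rows:
        # Codes stay as strings so leading zeros are preserved.
        triple = (row["wksht_cd"], row["line_num"], row["clmn_num"])
        name = triple_to_name.get(triple)
        if name is not None:  # rows outside the mapping are skipped
            wide[row["rpt_rec_num"]][name] = float(row["value"])
    return dict(wide)


wide = pivot_wide(csv.DictReader(io.StringIO(LONG_CSV)), FIELD_MAP)
for report_id, fields in sorted(wide.items()):
    print(report_id, fields)
```

Each report collapses to one record with named fields, and every row whose triple is not in the mapping is dropped. On real extracts the same idea applies, only with far more rows per report.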

The line and column codes are documented in CMS's Provider Reimbursement Manual (Pub. 15-2, Chapter 40), but the documentation is dense, and CMS occasionally renumbers lines across form revisions. Trove ships a 44-variable dictionary that names the most commonly used fields and resolves them to the right (worksheet, line, column) source.
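Because lines can be renumbered between form revisions, a variable dictionary generally has to know which location applies to which reporting period. The sketch below shows one way such a lookup could be structured. It is a hypothetical design, not the format Trove ships: the Source class, the variable name, the codes, and the date ranges are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Source:
    """One (worksheet, line, column) location and the period it applies to."""
    worksheet: str
    line: str
    column: str
    valid_from: date                  # earliest fiscal-year end covered
    valid_to: Optional[date] = None   # None means still current


# Hypothetical entry: the codes and cutover dates below are made up.
DICTIONARY = {
    "example_variable": [
        Source("S100000", "02000", "00100", date(2010, 5, 1), date(2016, 9, 30)),
        Source("S100000", "02300", "00300", date(2016, 10, 1)),
    ],
}


def resolve(name, fiscal_year_end):
    """Return the triple that applies to a report ending on fiscal_year_end."""
    for src in DICTIONARY[name]:
        starts_ok = src.valid_from <= fiscal_year_end
        ends_ok = src.valid_to is None or fiscal_year_end <= src.valid_to
        if starts_ok and ends_ok:
            return (src.worksheet, src.line, src.column)
    return None  # no location defined for that period


print(resolve("example_variable", date(2015, 12, 31)))
print(resolve("example_variable", date(2022, 12, 31)))
```

Keeping the validity window next to each location means a renumbering is handled by adding a new entry rather than by editing the pivot logic.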

The "FY2023" labeling gotcha

CMS labels HCRIS bulk files by federal fiscal-year reporting cycle (the federal fiscal year in which the report was filed), not by the period the report covers. A hospital with a December 31 fiscal-year end will file its calendar-2022 cost report in 2023, and that report appears in CMS's "FY2023" bulk. When comparing HCRIS to other datasets like IRS 990 Schedule H, this is the single biggest source of confusion: the same hospital's "FY2023 HCRIS" and "TY2022 990" cover the same fiscal year, but the labels don't tell you that. The Trove community-benefit dataset addresses this by carrying both the HCRIS fiscal-year end and the 990 tax-period end on every row.
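In practice, the safest way around the labeling problem is to match records on the period-end date and ignore the labels entirely. The sketch below assumes both sources have already been reduced to a shared hospital key, which in real data requires a crosswalk between identifiers. All rows, labels, and field names are invented for illustration.

```python
from datetime import date

# Invented sample rows. "hospital" is a hypothetical shared key.
hcris_rows = [
    {"hospital": "A", "bulk_label": "FY2023", "fy_end": date(2022, 12, 31)},
    {"hospital": "B", "bulk_label": "FY2023", "fy_end": date(2022, 6, 30)},
]
irs990_rows = [
    {"hospital": "A", "tax_label": "TY2022", "tax_period_end": date(2022, 12, 31)},
    {"hospital": "B", "tax_label": "TY2021", "tax_period_end": date(2022, 6, 30)},
    {"hospital": "B", "tax_label": "TY2022", "tax_period_end": date(2023, 6, 30)},
]


def match_by_period_end(hcris, irs990):
    """Pair rows that share a hospital and a period-end date, ignoring labels."""
    index = {(r["hospital"], r["tax_period_end"]): r for r in irs990}
    pairs = []
    for h in hcris:
        m = index.get((h["hospital"], h["fy_end"]))
        if m is not None:
            pairs.append((h["hospital"], h["bulk_label"], m["tax_label"], h["fy_end"]))
    return pairs


pairs = match_by_period_end(hcris_rows, irs990_rows)
for hospital, bulk_label, tax_label, period_end in pairs:
    print(hospital, bulk_label, tax_label, period_end.isoformat())
```

In this sample, hospital B's matched rows carry labels two years apart even though they describe the same twelve months, which is why a join on label year alone would pair the wrong filings.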

What HCRIS doesn't include

How to use HCRIS data

Three reasonable paths, in order of effort:

Citations