Database Library

Data sets available through Yale Surgery
American College of Surgeons National Surgical Quality Improvement Program (NSQIP)

The NSQIP contains a large amount of clinical data on a systematic sample of major surgical procedures performed on adult patients at participating hospitals around the U.S. The NSQIP does not include trauma surgery or transplants. The database contains demographic data, comorbidities, lab values, CPT codes of procedures performed, and 30-day complications, mortality, re-hospitalizations, and re-operations.

Yale Surgery owns NSQIP data from 2006 through 2019. In addition, Yale Surgery has the Pediatric NSQIP, the Geriatric NSQIP, and Procedure Targeted files for Colectomy, Proctectomy, Esophagectomy, Pancreatectomy, Hepatectomy, Cystectomy, Nephrectomy, Prostatectomy, Appendectomy, Thyroidectomy, Vascular Surgery, Hip Fracture, Hysterectomy, and Gynecology Reconstruction. The Procedure Targeted files contain additional variables specific to those procedures and can be linked to the main NSQIP.
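As a rough illustration of linking a Procedure Targeted file to the main NSQIP file, the sketch below joins two small sets of records on a shared case identifier. The key name `CASEID`, the column `COL_APPROACH`, and all sample values are assumptions made for illustration; check the NSQIP user guide for the actual variable names in a given year.

```python
# Hypothetical miniature records; real NSQIP files have many more columns.
main_rows = [
    {"CASEID": 101, "CPT": "44140", "AGE": 63},
    {"CASEID": 102, "CPT": "44950", "AGE": 35},
    {"CASEID": 103, "CPT": "44145", "AGE": 71},
]
targeted_rows = [
    {"CASEID": 101, "COL_APPROACH": "Open"},
    {"CASEID": 103, "COL_APPROACH": "Laparoscopic"},
]


def link_targeted(main, targeted, key="CASEID"):
    """Inner-join targeted rows to main rows on a shared case identifier."""
    main_by_key = {row[key]: row for row in main}
    linked = []
    for extra in targeted:
        base = main_by_key.get(extra[key])
        if base is not None:
            # Combine the main-file columns with the procedure-specific ones.
            linked.append({**base, **extra})
    return linked


linked_rows = link_targeted(main_rows, targeted_rows)
```

Only cases present in both files are kept, so the result holds the targeted procedures with their main-file variables attached.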

Procedure codes required: CPT

Diagnosis codes required: ICD-9/ICD-10 (only one discharge diagnosis recorded)

Healthcare Cost and Utilization Project Data Sets

National (Nationwide) Inpatient Sample (HCUP-NIS)

The NIS is an all-payor database of U.S. inpatient hospital stays. Starting in 2012 the NIS contains data on a 20% sample of all hospitalizations in the U.S.

Prior to 2012 it contained data on all hospitalizations from a 20% sample of hospitals. Based on the sampling methodology, the NIS can be used to obtain national estimates of inpatient care. The NIS database includes basic demographic information, insurance coverage, length of stay, discharge disposition, ICD-10 diagnosis and procedure codes, charges, and hospital characteristics (see the NIS database for a complete list). Yale Surgery owns NIS data for 2016 through 2019.
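To illustrate how a sampled file yields national estimates, the sketch below sums discharge-level weights to estimate a national count and a weighted proportion. The weight column name `DISCWT`, the flag `is_case`, and the sample values are assumptions for illustration. This covers point estimates only; valid standard errors require survey methods that account for the sampling design.

```python
# Hypothetical sampled discharges; each weight is the number of national
# discharges that the sampled record is taken to represent.
sample_discharges = [
    {"DISCWT": 5.0, "is_case": True},
    {"DISCWT": 5.0, "is_case": False},
    {"DISCWT": 4.8, "is_case": True},
]


def weighted_count(rows, flag="is_case", weight="DISCWT"):
    """Estimated national number of discharges where the flag is true."""
    return sum(row[weight] for row in rows if row[flag])


def weighted_proportion(rows, flag="is_case", weight="DISCWT"):
    """Estimated national share of discharges where the flag is true."""
    total = sum(row[weight] for row in rows)
    if total == 0:
        return float("nan")
    return weighted_count(rows, flag, weight) / total


national_cases = weighted_count(sample_discharges)
national_share = weighted_proportion(sample_discharges)
```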

Procedure codes required: ICD-10-PCS

Diagnosis codes required: ICD-10

Nationwide Emergency Department Sample (HCUP-NEDS)

The NEDS is an all-payor database of U.S. emergency department visits. The NEDS contains data on all ED visits from a 20% sample of hospitals with emergency departments. Based on the sampling methodology, the NEDS can be used to obtain national estimates of emergency department care.

The NEDS database includes basic demographic information (except for race), insurance coverage, discharge disposition, ICD-10 diagnosis codes, CPT procedure codes, charges, and hospital characteristics (see the database for a complete list). If the patient was admitted to the same hospital upon discharge from the ED, the NEDS includes information on their hospital stay. Yale Surgery owns NEDS data for 2016 through 2019.

Procedure codes required: CPT (ED procedures), ICD-10-PCS (inpatient procedures)

Diagnosis codes required: ICD-10

Nationwide Readmissions Database (HCUP-NRD)

The NRD is an all-payor database of state inpatient data from about 22 states that provide identifiers allowing linkage of patients over time. The NRD can be used to obtain national estimates of readmissions. The NRD database includes data points similar to those found in the NIS: basic demographic information, insurance coverage, discharge disposition, ICD-10 diagnosis and procedure codes, charges, and hospital characteristics (see the database for a complete list). Yale Surgery owns NRD data for 2016 through 2019.
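As a minimal sketch of how patient linkage supports readmission analysis, the code below groups stays by a patient linkage identifier and flags a stay when the same patient's next admission begins within 30 days of discharge. The column names `NRD_VisitLink`, `NRD_DaysToEvent`, and `LOS`, the 30-day window, and the sample values are assumptions for illustration; a real analysis would also apply index-event exclusions and the survey weights.

```python
from collections import defaultdict

# Hypothetical stays: a patient link, a day offset marking admission timing,
# and the length of stay in days.
stays = [
    {"NRD_VisitLink": "a", "NRD_DaysToEvent": 100, "LOS": 3},
    {"NRD_VisitLink": "a", "NRD_DaysToEvent": 120, "LOS": 2},
    {"NRD_VisitLink": "b", "NRD_DaysToEvent": 50, "LOS": 1},
    {"NRD_VisitLink": "b", "NRD_DaysToEvent": 200, "LOS": 4},
]


def flag_readmissions(rows, window=30):
    """Return (patient, admit_day, readmitted_within_window) for each stay."""
    by_patient = defaultdict(list)
    for row in rows:
        by_patient[row["NRD_VisitLink"]].append(row)

    results = []
    for patient, visits in by_patient.items():
        visits.sort(key=lambda r: r["NRD_DaysToEvent"])
        for i, stay in enumerate(visits):
            discharge_day = stay["NRD_DaysToEvent"] + stay["LOS"]
            readmitted = False
            if i + 1 < len(visits):
                # Days from this discharge to the patient's next admission.
                gap = visits[i + 1]["NRD_DaysToEvent"] - discharge_day
                readmitted = 0 <= gap <= window
            results.append((patient, stay["NRD_DaysToEvent"], readmitted))
    return results


readmission_flags = flag_readmissions(stays)
```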

Procedure codes required: ICD-10-PCS

Diagnosis codes required: ICD-10

Nationwide Ambulatory Surgery Sample (HCUP-NASS)

The NASS is an all-payor database of state ambulatory surgery data from about 34 states, representing about 68% of all ambulatory surgeries nationwide. The NASS can be used to obtain national estimates of ambulatory surgery. The NASS database includes data points similar to those found in the NIS: basic demographic information, insurance coverage, discharge disposition, ICD-10 diagnosis codes, CPT procedure codes, charges, and hospital characteristics (see the database for a complete list). Yale Surgery owns NASS data for 2018.

Procedure codes required: CPT

Diagnosis codes required: ICD-10

MarketScan Commercial Claims and Encounters Database and Medicare Supplemental Database, from IBM Watson Health

These data comprise a nationally representative (but not random) sample of individuals in the U.S. with employer-sponsored private health insurance, as well as their spouses and dependents. This claims database captures the full continuum of care in all settings: physician office visits; hospital stays; retail, mail-order, and specialty pharmacies; and carve-out care services.

The Commercial Claims and Encounters Database is limited to patients under the age of 65. The Medicare Supplemental portion of the database includes Medicare-eligible patients aged 65 and older who also have coverage through their employers or former employers. Yale Surgery owns data from 2017 through 2019, during which time there were an average of about 27 million enrollees under age 65 and an average of 1.2 million enrollees aged 65 and over.

Procedure codes required: ICD-10-PCS (Inpatient), CPT/HCPCS (Inpatient, Outpatient)

Diagnosis codes required: ICD-10

Other codes possibly required: NDC (Pharmacy)

Other obtainable datasets

In addition to the above, Yale Surgery can easily obtain the following data resources:

  • Clinical Classifications Software (CCS) from AHRQ
  • Area Health Resources File (AHRF)
  • CDC Natality database
  • US Census population estimates
  • US Census median income by zip code and census tract
  • US Census ADI, education, age, sex, race by zip code
  • Zip code geographical location (latitude and longitude)
  • CDC/NCHS National Hospital Discharge Survey (NHDS)
  • CDC/NCHS National Hospital Ambulatory Medical Care Survey (NHAMCS)
  • CDC/NCHS National Ambulatory Medical Care Survey (NAMCS)
  • National Health and Nutrition Examination Survey (NHANES)
  • CDC National Survey of Family Growth (NSFG)

Other experience

For those who can obtain the data or permission to use the data, Yale Surgery has experience with the following data sets:

  • Premier
  • USRDS (End-stage renal disease)
  • SEER-Medicare
  • SEER-MHOS (Medicare Health Outcome Survey)
  • Medicare
  • National Cancer Database (NCDB)
  • National Trauma Data Bank (NTDB) / Trauma Quality Improvement Program (TQIP)
  • Surveillance, Epidemiology, and End Results (SEER) Program Database
  • MIMIC-III (detailed ICU data from Beth Israel Deaconess Medical Center)