CPRD Aurum database: Assessment of data quality and completeness of three important comorbidities

Journal: Pharmacoepidemiology and Drug Safety

Authors: Rebecca Persson, Catherine Vasilakis-Scaramozza, Katrina Wilcox Hagberg, Todd Sponholtz, Tim Williams, Puja Myles, Susan S Jick

NLM Citation: Persson R, Vasilakis-Scaramozza C, Hagberg KW, Sponholtz T, Williams T, Myles P, Jick SS. CPRD Aurum database: Assessment of data quality and completeness of three important comorbidities. Pharmacoepidemiol Drug Saf. 2020 Nov;29(11):1456-1464. doi: 10.1002/pds.5135. Epub 2020 Sep 28. PMID: 32986901.

Abstract

Purpose: The Clinical Practice Research Datalink (CPRD) now provides a new medical record database, CPRD Aurum. This is the second of several studies being undertaken to assess the quality of CPRD Aurum data for research.

Methods: From a random sample of 50 000 patients in CPRD Aurum, we included those aged 20 years or older with at least one lab test result of any type. We assessed whether diagnosis codes for type 2 diabetes, hyperlipidemia, and iron deficiency or unspecified anemia were accompanied by supporting codes, including lab results and treatments (correctness), and whether lab results, treatments, or other codes indicated a missing diagnosis record (completeness).
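
The two measures can be illustrated with a minimal sketch. It assumes correctness is the share of patients with a diagnosis code who also have supporting evidence, and completeness is the share of patients with supporting evidence who also have a diagnosis code. The abstract does not give the exact formulas or code lists, so the `Patient` fields, the function names, and the toy cohort are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Patient:
    # Hypothetical flags; the study derived these from CPRD Aurum code lists.
    has_diagnosis_code: bool       # a diagnosis code for the condition is recorded
    has_supporting_evidence: bool  # a lab result, treatment, or other supporting code


def correctness(patients: List[Patient]) -> Optional[float]:
    """Assumed definition: share of diagnosed patients with supporting evidence."""
    diagnosed = [p for p in patients if p.has_diagnosis_code]
    if not diagnosed:
        return None  # undefined when nobody has a diagnosis code
    return sum(p.has_supporting_evidence for p in diagnosed) / len(diagnosed)


def completeness(patients: List[Patient]) -> Optional[float]:
    """Assumed definition: share of patients with evidence who have a diagnosis code."""
    evidenced = [p for p in patients if p.has_supporting_evidence]
    if not evidenced:
        return None  # undefined when nobody has supporting evidence
    return sum(p.has_diagnosis_code for p in evidenced) / len(evidenced)


# Toy cohort (invented numbers, for illustration only).
toy_cohort = (
    [Patient(True, True)] * 3      # diagnosed and supported
    + [Patient(True, False)] * 1   # diagnosed, no supporting evidence
    + [Patient(False, True)] * 3   # evidence present, diagnosis code missing
    + [Patient(False, False)] * 1  # neither
)

print(correctness(toy_cohort))   # 0.75
print(completeness(toy_cohort))  # 0.5
```

In this toy cohort, three of the four diagnosed patients have supporting evidence, while only three of the six patients with evidence carry a diagnosis code. This is the same pattern of high correctness and lower completeness that motivates supplementing diagnosis codes with treatment and lab data.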

Results: Among 37 502 patients in CPRD Aurum, correctness of type 2 diabetes, hyperlipidemia, and anemia diagnoses was high (99%, 93%, and 97%, respectively). Completeness was only high for type 2 diabetes (94%-98%); completeness for hyperlipidemia and anemia diagnoses was modest even when the presence of treatments and lab results indicated the conditions were likely present (51%-59% and 58%-70%, respectively).

Conclusions: Our findings indicate that for studies of type 2 diabetes, hyperlipidemia, and iron deficiency or unspecified anemia, the diagnosis code is likely to be correct where present. However, a significant proportion of cases of hyperlipidemia or anemia will be missed if only diagnosis codes are used to select patients with these conditions. Researchers should consider using treatments, supporting codes, and, when available, lab data to supplement diagnosis codes and enhance case capture when including these conditions in studies using CPRD Aurum.

Keywords: CPRD Aurum; Clinical Practice Research Datalink; data completeness; data correctness; data validation; pharmacoepidemiology.