PURPOSE: Strategies to identify and validate acute myocardial infarction (AMI) and stroke cases might impact effect measures. This has not been well characterized in primary care electronic records. Additionally, the validity of cardiovascular risk factors that could act as confounders in studies on those endpoints has not been thoroughly assessed in the United Kingdom Clinical Practice Research Datalink (CPRD).
METHODS: We identified AMI, stroke, smoking, obesity, and menopausal status in a patient cohort treated for overactive bladder by applying electronic algorithms to primary care medical records (CPRD, 2004-2012). We validated these cardiovascular outcomes and risk factors with physician questionnaires (the gold standard for validation in this analysis). Then, we estimated incidence rate ratios (IRRs) for AMI and stroke using three case-identification strategies: algorithm-identified cases, questionnaire-confirmed cases, and cases identified through linkage with hospitalization and mortality data (the gold standard for the IRR comparison in this analysis).
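As an illustration of the rate comparison described above, the following minimal sketch computes an incidence rate ratio with a Wald confidence interval on the log scale. The function name, the choice of interval, and all counts are hypothetical. They are not taken from the study, which does not specify its estimation method here.

```python
import math


def incidence_rate_ratio(events_exposed, pt_exposed, events_ref, pt_ref, z=1.96):
    """Return (IRR, lower, upper) for two groups.

    events_* are case counts and pt_* are person-time at risk (same units).
    The interval is a Wald interval on the log scale, an assumption made
    for this sketch only.
    """
    rate_exposed = events_exposed / pt_exposed
    rate_ref = events_ref / pt_ref
    irr = rate_exposed / rate_ref
    # Standard error of log(IRR) under a Poisson assumption for the counts.
    se_log = math.sqrt(1 / events_exposed + 1 / events_ref)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Hypothetical counts: 30 vs 20 cases over 10,000 person-years each.
irr, lower, upper = incidence_rate_ratio(30, 10_000, 20, 10_000)
print(f"IRR = {irr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Running the same function once per case-identification strategy (algorithm-identified, questionnaire-confirmed, linked data) would give the IRRs being compared.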
RESULTS: For AMI, the positive predictive value (PPV) of the electronic algorithm was >90%. Initial electronic algorithms for stroke performed less well because they included codes for prevalent stroke; algorithm refinement increased the PPV to 80% but decreased sensitivity by 20%. Algorithms for smoking and obesity were considered valid in electronic primary care data. IRRs based on questionnaire-confirmed cases were closer to IRRs estimated from hospitalization and mortality data than IRRs based on algorithm-identified cases.
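The trade-off between PPV and sensitivity reported above can be sketched as follows. The helper names and every count are hypothetical and chosen only to show the direction of the effect; they do not reproduce the study data.

```python
def ppv(true_positives, false_positives):
    """Share of algorithm-identified cases confirmed by the reference standard."""
    return true_positives / (true_positives + false_positives)


def sensitivity(true_positives, false_negatives):
    """Share of true cases that the algorithm identifies."""
    return true_positives / (true_positives + false_negatives)


# Hypothetical counts for a broad code list and a refined one.
# Dropping codes removes false positives but also loses some true cases.
broad = {"tp": 90, "fp": 60, "fn": 10}
refined = {"tp": 72, "fp": 18, "fn": 28}

for name, counts in (("broad", broad), ("refined", refined)):
    print(
        name,
        f"PPV={ppv(counts['tp'], counts['fp']):.2f}",
        f"sensitivity={sensitivity(counts['tp'], counts['fn']):.2f}",
    )
```

In this made-up example the refined list has a higher PPV and a lower sensitivity than the broad list, which is the pattern described for the stroke algorithm.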
CONCLUSIONS: AMI, stroke, smoking, obesity, and postmenopausal status can be accurately identified in the CPRD. Questionnaire-validated AMI and stroke cases yield IRRs that are closest to those from linked hospitalization and mortality data, the gold standard for this comparison.