Familial hypercholesterolemia (FH) is an inherited disorder caused by a genetic mutation in one of three genes that influences the normal functioning of the low-density lipoprotein receptor (LDL-R). The dysfunctional LDL-R results in decreased LDL-cholesterol (LDL-C) clearance from circulation and premature atherosclerotic cardiovascular disease. FH is one of the most common monogenic disorders with an estimated worldwide prevalence of 1 in 311. (1)

Although FH is common, the associated morbidity and mortality can be reduced if treatment is initiated early. As FH is a silent disease, population-based screening via a lipid profile is recommended, and individuals with an LDL-C above a specified threshold should be evaluated further. This entails history taking and clinical examination to enable diagnosis based on a validated FH diagnostic tool, such as the commonly used Dutch Lipid Clinic Network criteria. Genetic confirmation of an FH-related mutation is regarded as the gold standard, although it is more expensive and therefore not suited to population screening.

Unfortunately, the current approach to identifying individuals with FH is suboptimal: it is estimated that more than 95% of individuals with FH worldwide remain undiagnosed and untreated. (2) The advent of artificial intelligence (AI), and especially machine learning (ML), offers new modalities to help solve old problems.

Although the concept of ML might seem daunting, in essence it is computer-powered statistical analysis that identifies patterns in large datasets. Supervised ML, a branch of ML, uses data with known outcomes to learn associations between characteristic variables and a specific outcome. When new data with an unknown outcome are presented, the learned pattern, or algorithm, can be used to predict the result based on what was learned before. Types of supervised ML include regression, neural networks, decision trees and support vector machines.
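The supervised workflow described above (learn from labelled data, then predict for unseen data) can be illustrated with a minimal sketch of logistic regression, one of the regression methods mentioned. Everything in it is an assumption made for illustration: the data are synthetic and in arbitrary standardised units, the function names are hypothetical, and it is not the method used in any of the cited studies.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Learn weights and bias from labelled data by batch gradient descent."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log-loss with respect to the logit
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability of the positive outcome for one new record."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Synthetic training set (NOT patient data): feature 0 is informative,
# feature 1 is pure noise. Labels play the role of the "known outcome".
random.seed(0)
X_train, y_train = [], []
for _ in range(200):
    positive = random.random() < 0.5
    centre = 1.5 if positive else -1.5
    X_train.append([random.gauss(centre, 1.0), random.gauss(0.0, 1.0)])
    y_train.append(1 if positive else 0)

w, b = fit_logistic(X_train, y_train)

# New records with unknown outcome: the learned pattern yields a prediction.
high = predict_proba(w, b, [3.0, 0.0])
low = predict_proba(w, b, [-3.0, 0.0])
print(f"weights={w}, bias={b:.3f}, p(high)={high:.3f}, p(low)={low:.3f}")
```

In practice the same fit-then-predict pattern applies whichever supervised method is chosen; only the model that links the variables to the outcome changes.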

In terms of medical application, ML is a relatively new tool but has shown great promise, especially when applied to classification problems. Laboratory medicine is particularly well placed to benefit from ML, as laboratory information systems (LIS) offer large digital repositories of categorical and numerical data, often tied to gold standard diagnostic information. This type of data is well suited to ML, and it therefore comes as no surprise that ML has already shown encouraging results in this setting.

As a potential tool to identify undiagnosed, or so-called “missing” FH individuals, ML has also been attempted with promising results. Previously, ML was applied to electronic health record data to identify potential FH individuals based on a combination of diagnostic codes, pharmacy records, clinical notes and laboratory results. (3)(4)

Recently, the application of ML to even basic laboratory information, such as age and the components of the basic lipid profile, has been shown to meet, or even surpass, the performance of clinical diagnostic tools in the identification of FH when measured against the gold standard of FH-mutation testing. (5)(6) In these investigations, higher concentrations of age-adjusted total cholesterol and LDL-C, and lower concentrations of high-density lipoprotein cholesterol and triglycerides, were good predictors of mutation-proven FH.

Of course, the addition of clinical information, such as tendon xanthomata presence, and more extensive biochemical investigations, such as lipoprotein (a), may further improve the accuracy of these tools, but the improvement in performance that these additions bring must be weighed against the availability of the extra variables. This is especially important when the intent of the ML tool is for population-wide identification of previously unidentified FH individuals on a typical LIS.

Regardless of the approach used in developing the ML algorithm, it is clear that the application of machine learning can improve on our current approaches in the screening for, and diagnosis of, familial hypercholesterolemia.


  1. Hu P, Dharmayat KI, Stevens CAT, Sharabiani MTA, Jones RS, Watts GF, et al. Prevalence of Familial Hypercholesterolemia Among the General Population and Patients With Atherosclerotic Cardiovascular Disease. Circulation [Internet]. 2020 Jun 2 [cited 2020 Oct 26];141(22):1742–59. Available from: https://www.ahajournals.org/doi/10.1161/CIRCULATIONAHA.119.044795
  2. Nordestgaard BG, Chapman MJ, Humphries SE, Ginsberg HN, Masana L, Descamps OS, et al. Familial hypercholesterolaemia is underdiagnosed and undertreated in the general population: Guidance for clinicians to prevent coronary heart disease. Eur Heart J. 2013 Dec 1;34(45).
  3. Banda JM, Sarraju A, Abbasi F, Parizo J, Pariani M, Ison H, et al. Finding missed cases of familial hypercholesterolemia in health systems using machine learning. NPJ Digit Med [Internet]. 2019 [cited 2019 Sep 5];2:23. Available from: http://www.ncbi.nlm.nih.gov/pubmed/31304370
  4. Myers KD, Knowles JW, Staszak D, Shapiro MD, Howard W, Yadava M, et al. Precision screening for familial hypercholesterolaemia: a machine learning study applied to electronic health encounter data. Lancet Digit Heal. 2019;
  5. Pina A, Helgadottir S, Mancina RM, Pavanello C, Pirazzi C, Montalcini T, et al. Virtual genetic diagnosis for familial hypercholesterolemia powered by machine learning. Eur J Prev Cardiol [Internet]. 2020 Oct 4 [cited 2020 Oct 11];27(15):1639–46. Available from: http://journals.sagepub.com/doi/10.1177/2047487319898951
  6. Hesse R, Raal FJ, George JA. Machine learning outperforms traditional screening and diagnostic tools for the detection of familial hypercholesterolemia. In: AACC Annual Scientific Meeting. 2020.