Mitigating Harm Related to Racial Bias in Healthcare Algorithms



Artificial intelligence is pervasive in our everyday lives, whether we recognize it or not. In the simplest terms, artificial intelligence (AI) is the science of making computers mimic human thoughts and actions in real-life environments. It is a branch of computer science focused on creating machines that can learn, make decisions and perform tasks at a human-like level. AI works by processing different types of data with algorithms, or series of rules, to learn or discover patterns in the data and then respond in a specific way. There are complex mathematical and scientific concepts behind these simple definitions, but at the end of the day, we live, work and play in a world in which AI, large volumes of data and algorithms do “smart” things in our environment. If you unlocked your smartphone through facial recognition, asked “Siri” to start your morning brew or used Google Maps to get to your destination, you have interacted with AI today.

The future of healthcare is promising because of the almost limitless capability of AI. AI encompasses a range of subfields, including robotics, machine learning (ML), deep learning (DL), neural networks, computer vision and natural language processing. The da Vinci surgical robot is an example of robotics: it allows physicians to perform complex surgical procedures by operating the arms of the robot remotely. ML, in the simplest terms, gives computers the ability to learn from data, using algorithms, in order to make decisions. Continuous glucose monitors are an example of ML. These wearable devices make sense of data and turn it into useful information about blood glucose levels that medical providers can use to manage diabetes treatment. DL is a complex subset of ML through which computers learn from vast quantities of data such as images, text or sounds and then respond as humans would, often faster and, on some tasks, with comparable accuracy. The most common DL techniques used in healthcare include computer vision, natural language processing and reinforcement learning. For example, deep-learning computer vision models in healthcare can detect skin cancer, differentiating moles from melanomas with physician-level accuracy.

While AI in healthcare holds tremendous promise, it has great potential to harm individuals, families, communities and society when it is used unethically and without regulation. Harms associated with unethical AI use include:

  • bias and discrimination
  • violation of privacy
  • violation of human rights
  • unintended harm
  • lack of access to equitable care

Systemic and structural racism continues to cause health disparities, unequal access and poor outcomes in minority populations. Unfortunately, bias in the development and use of AI and algorithms for diagnosis, treatment, prognosis, risk stratification and resource allocation leads to worse outcomes for racial and ethnic minorities and other historically marginalized populations. Although there is reliable evidence that race is not a dependable proxy for genetic difference, race remains embedded within medical practice. A primary example of race-based medicine involving AI is the insertion of race into diagnostic algorithms and practice guidelines that adjust outputs based on race or ethnicity. By embedding race into data and healthcare decisions, these algorithms propagate race-based medicine and perpetuate health inequities and disparities among racial and ethnic minorities. Race-adjusted algorithms guide medical decisions that direct more time, attention and resources to white patients than to racial and ethnic minorities. The following are examples of AI and race correction used in various healthcare specialty practices:

Obstetrics: The Vaginal Birth after Cesarean (VBAC) Risk Calculator estimates the probability of a successful vaginal birth after a prior cesarean section. Race correction produces a lower predicted chance of success for Black and Hispanic patients, which may dissuade providers from offering trials of labor to people of color.

Urology: The STONE Score predicts the risk of a ureteral stone in patients presenting with flank pain. Race correction adds three points for non-Black patients, which raises their predicted risk and, by comparison, lowers the predicted risk for Black patients with identical findings. This may steer providers away from aggressive assessment and treatment of Black patients.

Pediatric Urology: The Urinary Tract Infection (UTI) Calculator estimates the risk of UTI in children 2 to 23 months of age and guides decisions about further testing for a definitive diagnosis. Race correction assigns a lower score if a child is Black; put another way, the score, or risk, of UTI is approximately 2.5 times higher for a non-Black child. This may result in more missed diagnoses among Black children presenting with signs of UTI.

Pulmonology: The Pulmonary Function Test uses spirometry to diagnose and monitor pulmonary disease. Spirometers apply race correction factors of 10-15% for Black patients and 4-6% for Asian patients, which results in inaccurate estimates of lung function and possible misclassification of the severity of lung disease in racial and ethnic minorities.

Endocrinology: The Fracture Risk Assessment Tool (FRAX) estimates the 10-year risk of a major osteoporotic fracture based on patient demographics and a risk-factor profile. The United States calculator returns a lower risk if a female patient is Black (by a factor of 0.43), Asian (0.50) or Hispanic (0.53). Thus, the estimated 10-year risk for a Black woman is less than half that of a white woman with identical risk factors. A lower estimated risk for non-white women may delay osteoporosis therapy.
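
To make the arithmetic concrete, the following is a minimal Python sketch that treats the race correction as a simple multiplier on a baseline risk. The multipliers are the ones quoted in this article; the reference value of 1.00 for white patients, the 24% baseline risk and the 20% treatment threshold are illustrative assumptions, and the function names are hypothetical. It does not reproduce the actual FRAX model.

```python
# Multipliers quoted in the article for female patients in the U.S. calculator.
# The 1.00 reference value for white patients is an assumption for this sketch.
RACE_FACTORS = {"White": 1.00, "Hispanic": 0.53, "Asian": 0.50, "Black": 0.43}

# Hypothetical treatment threshold (percent), used only to show a decision cutoff.
TREATMENT_THRESHOLD_PCT = 20.0


def adjusted_risk(baseline_risk_pct: float, race: str) -> float:
    """Apply the race multiplier to a baseline 10-year fracture risk (percent)."""
    return round(baseline_risk_pct * RACE_FACTORS[race], 1)


def crosses_threshold(baseline_risk_pct: float, race: str) -> bool:
    """Return True if the adjusted risk meets the hypothetical threshold."""
    return adjusted_risk(baseline_risk_pct, race) >= TREATMENT_THRESHOLD_PCT


if __name__ == "__main__":
    baseline = 24.0  # hypothetical risk derived from identical clinical factors
    for race in RACE_FACTORS:
        print(race, adjusted_risk(baseline, race), crosses_threshold(baseline, race))
```

Under these assumptions, four women with identical clinical risk factors receive four different estimates, and only the estimate for the white patient reaches the cutoff.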

Oncology: The Rectal Cancer Survival Calculator estimates conditional 1- to 5-year survival after a diagnosis of rectal cancer. Race correction assigns a regression coefficient of 1 to white patients and higher coefficients (1.18-1.72, depending on cancer stage) to Black patients. The resulting lower predicted survival rate for Black patients may reduce their access to interventions.

Clinicians may be more familiar with the following examples of AI-related bias, which demonstrate the extent to which AI and algorithms have harmed, and continue to harm, racial and ethnic minorities.

Pulse Oximeters: These devices measure blood oxygen saturation, a critical vital sign monitored by first responders and by clinicians in hospital emergency departments and intensive care units. Research demonstrates that pulse oximetry systematically overestimates oxygen saturation in patients with darker skin pigmentation compared with patients with lighter pigmentation. Occult hypoxemia, meaning low arterial oxygen saturation that the pulse oximeter fails to detect, is up to three times more common in individuals self-reporting as Black than in those self-reporting as white. Equity concerns related to this medical device include:

  • higher risk of occult hypoxemia among Asian and Black patients compared to white patients
  • higher inpatient mortality for Black patients compared to white patients
  • delayed supplemental treatment with oxygen for darker-pigmented minorities.
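
Occult hypoxemia can be counted from paired pulse oximeter and arterial blood gas readings. The sketch below assumes one definition used in research (arterial saturation below 88% despite a pulse oximeter reading of 92% to 96%); the thresholds, function names and sample readings are assumptions for illustration only and are not data from any study.

```python
from typing import Iterable, Tuple

# Assumed thresholds for this sketch (percent saturation).
SPO2_LOW, SPO2_HIGH = 92.0, 96.0   # pulse oximeter reading looks acceptable
SAO2_CUTOFF = 88.0                 # arterial saturation below this is hypoxemia


def is_occult_hypoxemia(spo2: float, sao2: float) -> bool:
    """True when the oximeter reads 92-96% but arterial saturation is below 88%."""
    return SPO2_LOW <= spo2 <= SPO2_HIGH and sao2 < SAO2_CUTOFF


def occult_rate(pairs: Iterable[Tuple[float, float]]) -> float:
    """Share of readings in the 92-96% oximeter range that hide hypoxemia."""
    eligible = [(spo2, sao2) for spo2, sao2 in pairs if SPO2_LOW <= spo2 <= SPO2_HIGH]
    if not eligible:
        return 0.0
    hidden = sum(is_occult_hypoxemia(spo2, sao2) for spo2, sao2 in eligible)
    return hidden / len(eligible)


# Hypothetical (pulse oximeter, arterial) pairs for one patient group.
sample = [(94, 93), (95, 87), (93, 90), (96, 86), (98, 97)]
print(occult_rate(sample))
```

Computing this rate separately for each racial or ethnic group is one way to express the disparity described above as a single comparable number.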

Epic Early Detection of Sepsis Model: This AI software is intended to help clinicians detect and treat sepsis sooner, which can mean the difference between life and death for millions of patients annually. The tool did not undergo Food and Drug Administration review before use, nor was there a system in place to monitor its safety or performance. Researchers at the University of Michigan in Ann Arbor evaluated the tool’s performance after it was implemented in their health system. One of their most significant findings was that the model failed to detect sepsis in 67% of patients who developed it. The implications of the untested model include:

  • increased mortality related to sepsis
  • higher inpatient healthcare costs related to sepsis
  • ongoing hospital use of a model that underperforms.
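
The 67% figure corresponds to what evaluators call sensitivity: the share of true cases a model catches. A minimal sketch of that calculation follows; the count of 1,000 sepsis cases is a hypothetical number chosen to make the percentages easy to read, and only the 67% miss rate comes from the article.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of actual cases that the model flagged."""
    total_cases = true_positives + false_negatives
    if total_cases == 0:
        raise ValueError("no cases to evaluate")
    return true_positives / total_cases


sepsis_cases = 1000            # hypothetical cohort size
missed = 670                   # 67% of cases not detected, as reported above
detected = sepsis_cases - missed

print(f"Sensitivity: {sensitivity(detected, missed):.0%}")  # prints "Sensitivity: 33%"
```

A model that misses two of every three cases has a sensitivity of about 33%, which is why monitoring performance after deployment matters.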

Population Health Management AI Tool: A widely used commercial algorithm that identifies patients with complex health needs exhibits significant racial bias because it predicts healthcare costs rather than illness. Large health systems and payers use this algorithm to target “high-risk” patients for care management programs. These programs seek to improve care for complex, high-need patients by providing additional resources, including specialty-trained nurses and providers, additional primary care appointment slots and care coordination services. Racial bias occurs because the algorithm uses healthcare costs as a proxy for health needs. Less money is spent on Black patients than on white patients with the same level of need, so the algorithm scores Black patients as lower risk than equally sick white patients. The implications of bias in this example include:

  • failed provision of post-hospital resources to patients with greater need
  • unequal access to care.
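
The effect of choosing cost as the proxy can be shown with a small, fully hypothetical example. In the sketch below, two groups have identical health needs, but less is spent on group B at every level of need; the patient records, group labels and program size are all invented for illustration and do not come from the algorithm discussed above.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    group: str
    chronic_conditions: int   # stand-in for health need
    annual_cost: float        # stand-in for the cost-based risk score


# Hypothetical data: same needs in both groups, lower spending in group B.
patients = [
    Patient("A", 2, 6000), Patient("A", 3, 9000),
    Patient("A", 4, 12000), Patient("A", 5, 15000),
    Patient("B", 2, 4000), Patient("B", 3, 6000),
    Patient("B", 4, 8000), Patient("B", 5, 10000),
]


def select_top(patients, key, k):
    """Pick the k highest-ranked patients for a care management program."""
    return sorted(patients, key=key, reverse=True)[:k]


def count_group(selected, group):
    return sum(1 for p in selected if p.group == group)


by_cost = select_top(patients, lambda p: p.annual_cost, 4)
by_need = select_top(patients, lambda p: p.chronic_conditions, 4)

print("Group B selected by cost:", count_group(by_cost, "B"))
print("Group B selected by need:", count_group(by_need, "B"))
```

Ranking by cost admits fewer group B patients than ranking by need, even though both groups are equally sick in this toy data set.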

These examples highlight how healthcare algorithms and AI bias contribute to population health disparities based on race, ethnicity, gender, age, socioeconomic status and other demographic factors. The Agency for Healthcare Research and Quality and the National Institute on Minority Health and Health Disparities convened a diverse group of experts to listen to stakeholders, receive community feedback and review evidence. The primary goal of the workgroup was to promote health and healthcare equity for patients and communities within the context of structural racism and discrimination. A conceptual framework and a set of guiding principles to mitigate and prevent bias in healthcare organizations resulted from their work.

The five sequential phases in an algorithm’s life cycle include:

Phase 1: Problem formulation, including defining the problem, related factors and priority outcomes

Phase 2: Selection and management of data used

Phase 3: Algorithm development, training and validation

Phase 4: Algorithm deployment and integration in the defined setting

Phase 5: Algorithm monitoring, with maintenance, updating or retirement of the algorithm based on performance and outcomes.

The five guiding principles to apply at each of the phases of an algorithm’s life cycle include:

Principle 1: promote health and healthcare equity

Principle 2: ensure algorithms’ intended use is transparent and understandable

Principle 3: engage patients and communities to earn trustworthiness

Principle 4: identify healthcare fairness issues and trade-offs

Principle 5: establish accountability for equity and fairness in outcomes.

In addition to applying these guiding principles, scientists and developers should ensure that a racially and ethnically diverse workforce engages in the entire life cycle of algorithm development and deployment.
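
One practical way to act on Principles 4 and 5 is to audit an algorithm's errors separately for each patient group. The sketch below computes a false-negative rate per group; the record format, the function name and the sample records are assumptions made for illustration, and a real audit would use an organization's own outcome data and fairness measures.

```python
from collections import defaultdict


def false_negative_rate_by_group(records):
    """records: iterable of (group, needs_care, flagged) tuples.

    Returns, for each group, the share of patients who needed care
    but were not flagged by the algorithm.
    """
    needed = defaultdict(int)
    missed = defaultdict(int)
    for group, needs_care, flagged in records:
        if needs_care:
            needed[group] += 1
            if not flagged:
                missed[group] += 1
    return {group: missed[group] / needed[group] for group in needed}


# Hypothetical audit records: (group, truly needed care, algorithm flagged).
audit = [
    ("A", True, True), ("A", True, True), ("A", True, True), ("A", True, False),
    ("B", True, True), ("B", True, True), ("B", True, False), ("B", True, False),
    ("A", False, False), ("B", False, False),
]

print(false_negative_rate_by_group(audit))
```

A gap between groups, such as the one in this invented sample, is the kind of signal that should trigger review under the accountability principle.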

Systemic racism, discrimination and oppression are well-established social constructs that have resulted in health inequities, health disparities and physical and mental harm to Black, Indigenous, and other people of color for centuries. Whether introduced consciously, through explicit preconceived ideas, or unconsciously, through ingrained thoughts based on stereotypes, race-based bias hidden in AI algorithms results in poor health outcomes for the same communities. Each of the examples provided highlights the pervasiveness and peril of embedding race into data used in healthcare to drive medical decision-making and care. Addressing algorithmic bias in healthcare is an urgent issue for all stakeholders in the healthcare industry. The conceptual framework and guiding principles presented here are a starting point for stakeholders to mitigate and prevent further harm from race-based algorithms by creating systems, processes, standards and policies to promote and achieve equity related to the use of AI in healthcare.

Kelva Edmunds-Waller, DNP, RN, CCM, has nearly 40 years of nursing experience, including over 20 years in leadership roles. She has clinical experience in acute care, home health, infusion therapy, public health, managed care, primary care, and long-term acute care. She earned a DNP degree at Loyola University New Orleans. She completed her undergraduate and graduate nursing degrees at Virginia Commonwealth University in Richmond, VA. Kelva serves as President of the Central Virginia Chapter of CMSA and is a member of the CMSA Editorial Board.


Chin, M., Afsar-Manesh, N., Bierman, A., Chang, C., Colón-Rodríguez, C., Dullabh, P., Duran, D., Fair, M., Hernandez-Boussard, T., Hightower, M., Jain, A., Jordan, W., Konya, S., Moore, R., Moore, T., Rodriguez, R., Shaheen, G., Snyder, L., Srinivasan, M., … Ohno-Machado, L. (2023). Guiding Principles to Address the Impact of Algorithm Bias on Racial and Ethnic Disparities in Health and Health Care. JAMA Network Open, 6(12), e2345050.

Colon-Rodriguez, C. (2023). Shedding light on healthcare algorithmic and artificial intelligence bias. Office of Minority Health, U.S. Department of Health and Human Services.

Fawzy, A., Wu, T. D., Wang, K., Robinson, M. L., Farha, J., Bradke, A., Golden, S. H., Xu, Y., & Garibaldi, B. T. (2022). Racial and Ethnic Discrepancy in Pulse Oximetry and Delayed Identification of Treatment Eligibility Among Patients With COVID-19. JAMA Internal Medicine, 182(7), 730-738.

Habehh, H., & Gohel, S. (2021). Machine learning in healthcare. Current Genomics, 22(4), 291-300. https://doi.org/10.2174/1389202922666210705124359

Obermeyer, Z., Powers, B., Vogeli, C., & Mullainathan, S. (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464), 447-453.

Richardson, L. (2021). Artificial intelligence can improve health care, but not without human oversight. The Pew Charitable Trusts.

