Algorithmic bias

As algorithms expand their ability to organize society, politics, institutions, and behavior, sociologists have become concerned with the ways in which unanticipated output and manipulation of data can impact the physical world.

The relative inability of facial recognition technology to accurately identify darker-skinned faces has been linked to multiple wrongful arrests of black men, an issue stemming from imbalanced datasets.

Algorithms are difficult to define,[5] but may be generally understood as lists of instructions that determine how programs read, collect, process, and analyze data to generate output.
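As an illustration of this definition, the sketch below uses an entirely hypothetical loan-scoring rule; none of the feature names or thresholds come from any cited system. It shows how a fixed list of instructions reads input data, processes it, and generates an output, and how each hard-coded choice is a point where bias can enter.

```python
# Minimal sketch of an "algorithm" in the sense used above: a fixed list of
# instructions that reads input data, processes it, and produces an output.
# The feature names and thresholds are hypothetical, chosen only for
# illustration; in a real system each such choice can encode bias.

def loan_decision(applicant: dict) -> str:
    """Read data, compute a score, and return a decision."""
    score = 0
    score += 2 if applicant["income"] > 50_000 else 0      # processing step 1
    score += 1 if applicant["years_employed"] >= 3 else 0  # processing step 2
    score -= 1 if applicant["prior_defaults"] > 0 else 0   # processing step 3
    return "approve" if score >= 2 else "deny"             # output

print(loan_decision({"income": 62_000, "years_employed": 4, "prior_defaults": 0}))
# -> approve
```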

Weizenbaum warned against trusting decisions made by computer programs that a user does not understand, comparing such faith to that of a tourist who can find his way to a hotel room only by turning left or right on a coin toss.

A 2018 study by Joy Buolamwini and Timnit Gebru found that commercial facial analysis systems exhibited error rates of up to 35% when classifying the gender of darker-skinned women, compared with less than 1% for lighter-skinned men.
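The kind of disparity that study reported is found by evaluating a classifier separately for each demographic subgroup rather than in aggregate. The minimal sketch below illustrates such disaggregated evaluation with invented records, not the study's actual data.

```python
# Sketch of disaggregated evaluation: computing a classifier's error rate
# per subgroup instead of overall. The labels and predictions below are
# invented for illustration and are not data from the cited study.

from collections import defaultdict

def error_rates_by_group(records):
    errors, totals = defaultdict(int), defaultdict(int)
    for group, true_label, predicted in records:
        totals[group] += 1
        if predicted != true_label:
            errors[group] += 1
    return {g: errors[g] / totals[g] for g in totals}

records = [
    # (subgroup, true label, model prediction)
    ("lighter-skinned men", "male", "male"),
    ("lighter-skinned men", "male", "male"),
    ("darker-skinned women", "female", "male"),   # misclassification
    ("darker-skinned women", "female", "female"),
]
print(error_rates_by_group(records))
# -> {'lighter-skinned men': 0.0, 'darker-skinned women': 0.5}
```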

Researchers and critics, such as Cathy O'Neil in her book Weapons of Math Destruction (2016), emphasize that these biases can amplify existing social inequalities under the guise of objectivity.

O'Neil argues that opaque, automated decision-making processes in areas such as credit scoring, predictive policing, and education can reinforce discriminatory practices while appearing neutral or scientific.

When interacting with such systems, individuals may accept recommendations that align with their preexisting beliefs and disregard those that do not, thereby perpetuating existing biases and undermining the fairness objectives of algorithmic interventions.

In recent years, the study of the Fairness, Accountability, and Transparency (FAT) of algorithms has emerged as its own interdisciplinary research area with an annual conference called FAccT.

"[44] Luo et al.'s work[44] shows that current large language models, as they are predominately trained on English-language data, often present the Anglo-American views as truth, while systematically downplaying non-English perspectives as irrelevant, wrong, or noise.

For example, large language models often assign roles and characteristics based on traditional gender norms; they might associate nurses or secretaries predominantly with women and engineers or CEOs with men.
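Researchers often quantify such associations by comparing the similarity of word representations to gendered reference words. The sketch below illustrates the idea with tiny made-up vectors; real analyses use learned embeddings such as word2vec or GloVe.

```python
# Sketch of how gendered occupation associations are often measured:
# compare an occupation vector's similarity to "he" vs. "she". The
# 3-dimensional vectors below are invented for illustration only;
# real analyses use learned embeddings (e.g., word2vec, GloVe).

import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

vectors = {
    "he":       [0.9, 0.1, 0.0],
    "she":      [0.1, 0.9, 0.0],
    "nurse":    [0.2, 0.8, 0.1],  # toy vector placed nearer "she"
    "engineer": [0.8, 0.2, 0.1],  # toy vector placed nearer "he"
}

for word in ("nurse", "engineer"):
    bias = cosine(vectors[word], vectors["he"]) - cosine(vectors[word], vectors["she"])
    print(word, round(bias, 3))  # positive -> leans "he", negative -> leans "she"
```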

A recent focus in research has been on the complex interplay between the grammatical properties of a language and real-world biases that can become embedded in AI systems, potentially perpetuating harmful stereotypes and assumptions.

For example, AI systems used in hiring, law enforcement, or healthcare may disproportionately disadvantage certain racial groups by reinforcing existing stereotypes or underrepresenting them in key areas.

Such biases can manifest as facial recognition systems misidentifying individuals of certain racial backgrounds, or as healthcare algorithms underestimating the medical needs of minority patients.

Addressing racial bias requires careful examination of data, improved transparency in algorithmic processes, and efforts to ensure fairness throughout the AI development lifecycle.

Legal scholar Jonathan Zittrain has warned that this could create a "digital gerrymandering" effect in elections, "the selective presentation of information by an intermediary to meet its agenda, rather than to serve its users", if intentionally manipulated.

In 2012, the department store franchise Target was cited for gathering data points to infer when women customers were pregnant, even if they had not announced it, and then sharing that information with marketing partners.

A study conducted by researchers at UC Berkeley in November 2019 revealed that mortgage algorithms discriminated against Latino and African American borrowers through measures of "creditworthiness" rooted in U.S. fair-lending law, which allows lenders to use such measures to determine whether an individual is worthy of receiving a loan.

Facebook was also found to allow ad purchasers to target "Jew haters" as a category of users, which the company said was an inadvertent outcome of algorithms used in assessing and categorizing data.

One study found that 85 of 100 examined subreddits tended to remove various norm violations, including misogynistic slurs and racist hate speech, highlighting the prevalence of such content in online communities.

As platforms like Reddit update their hate speech policies, they must balance free expression with the protection of marginalized communities, emphasizing the need for context-sensitive moderation and nuanced algorithms.

Surveillance camera software may be considered inherently political because it requires algorithms to distinguish normal from abnormal behaviors, and to determine who belongs in certain locations at certain times.

A 2017 study conducted at Stanford University tested machine learning algorithms that were claimed to be able to detect an individual's sexual orientation based on facial images.

The lack of historical depth in defining disability, collecting its incidence and prevalence through questionnaires, and establishing recognition adds to the controversy and ambiguity in its quantification and calculation.

In response to this tension, researchers have suggested taking more care in the design and use of systems that draw on potentially biased algorithms, with "fairness" defined for specific applications and contexts.
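One reason application-specific definitions matter is that formal fairness criteria can disagree with one another. The sketch below, using invented predictions, evaluates the same classifier against two common criteria, demographic parity and equal opportunity.

```python
# Sketch of two application-specific fairness criteria evaluated on the
# same (invented) predictions. Demographic parity compares positive-
# prediction rates across groups; equal opportunity compares true-positive
# rates among qualified individuals. A system can satisfy one criterion
# while violating the other.

def rates(records, group):
    rows = [r for r in records if r["group"] == group]
    positive_rate = sum(r["pred"] for r in rows) / len(rows)
    qualified = [r for r in rows if r["label"] == 1]
    true_positive_rate = sum(r["pred"] for r in qualified) / len(qualified)
    return positive_rate, true_positive_rate

records = [
    {"group": "A", "label": 1, "pred": 1},
    {"group": "A", "label": 0, "pred": 1},
    {"group": "A", "label": 1, "pred": 1},
    {"group": "B", "label": 1, "pred": 1},
    {"group": "B", "label": 1, "pred": 0},
    {"group": "B", "label": 0, "pred": 0},
]

for g in ("A", "B"):
    pos, tpr = rates(records, g)
    print(f"group {g}: positive rate={pos:.2f}, true-positive rate={tpr:.2f}")
# group A: positive rate=1.00, true-positive rate=1.00
# group B: positive rate=0.33, true-positive rate=0.50
```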

A significant barrier to tackling bias in practice is that categories, such as the demographics of individuals protected by anti-discrimination law, are often not explicitly considered when collecting and processing data.

Furthermore, false and accidental correlations can emerge from a lack of understanding of protected categories; for example, insurance rates based on historical car accident data may overlap, strictly by coincidence, with residential clusters of ethnic minorities.
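When protected-attribute data is available for auditing, one way to probe for such accidental proxies is to measure how strongly each seemingly neutral input feature correlates with a protected category. The minimal sketch below uses invented figures.

```python
# Sketch of a proxy-variable audit: measure how strongly a seemingly
# neutral pricing feature (a per-area accident rate) correlates with a
# protected attribute. All numbers below are invented for illustration.

import math

def pearson(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# One row per residential area: the pricing feature and the share of
# residents belonging to a protected minority (audit data, invented).
accident_rate  = [0.12, 0.08, 0.15, 0.05, 0.14]
minority_share = [0.60, 0.20, 0.70, 0.10, 0.65]

r = pearson(accident_rate, minority_share)
print(f"correlation = {r:.2f}")  # a high value flags a potential proxy
```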

A study of 84 policy guidelines on ethical AI found that fairness and the "mitigation of unwanted bias" were common points of concern, addressed through a blend of technical solutions, transparency and monitoring, rights to remedy, increased oversight, and diversity and inclusion efforts.

Critiques of simple inclusivity efforts suggest that diversity programs cannot address overlapping forms of inequality, and have called for applying a more deliberate lens of intersectionality to the design of algorithms.

The order outlines a coordinated, government-wide approach to harness AI's potential while mitigating its risks, including fraud, discrimination, and national security threats.

Image captions:
A flow chart showing the decisions made by a recommendation engine, c. 2001.[1]
A 1969 diagram of how a simple computer program makes decisions, illustrating a very simple algorithm.
This card was used to load software into an old mainframe computer. Each byte (the letter 'A', for example) is entered by punching holes. Though contemporary computers are more complex, they reflect this human decision-making process in collecting and processing data.[23]: 70 [24]: 16
Facial recognition software used in conjunction with surveillance cameras was found to display bias in recognizing Asian and black faces over white faces.[32]: 191