Interest in the field of critical data studies began in 2011, when scholars danah boyd and Kate Crawford posed a set of questions for the critical study of big data and recognized its potentially threatening impacts on society and culture.[2] It was not until 2014, after further exploration and conversation, that the term critical data studies was coined by scholars Craig Dalton and Jim Thatcher.[1] They placed a strong emphasis on understanding the context of big data in order to approach it more critically.[3] Other key scholars in the discipline include Rob Kitchin and Tracey P. Lauriault, who focus on reevaluating data through different spheres.
Furthermore, big data as a technological tool and the information that it yields are not neutral, according to Dalton and Thatcher, making it worthy of critical analysis in order to identify and address its biases.
Building on this idea, another justification for a critical approach is that the relationship between big data and society is an important one, and therefore worthy of study.
Desmond Upton Patton and colleagues applied their own classification system in Chicago communities to help identify and reduce violence among young teenagers on Twitter. They enlisted students from those communities to help decipher the teens' terminology and emojis, in order to flag language in tweets that was followed by violence away from the screen.[1][11] Data plays a pivotal role in the emerging knowledge economy, driving productivity, competitiveness, efficiency, sustainability, and capital accumulation.
The ethical, political, and economic dimensions of data dynamically evolve across space and time, influenced by changing regimes, technologies, and priorities.
This technological advancement raises concerns about data quality, encompassing validity, reliability, authenticity, usability, and lineage.
Addressing these issues often requires scholars to make edits and assumptions about the data to ensure its reliability and relevance.
The research team may have inadequate skills or organizational capabilities, which can cause the analytics performed on the dataset to be biased. This can also lead to ecological fallacies, in which an assumption is made about an individual based on data or results from a larger group of people.
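As an illustration (a hypothetical toy example, not drawn from the sources above; all names and figures are invented), an ecological fallacy arises when a group-level pattern is assumed to hold for the individuals inside the group:

```python
# Hypothetical illustration of an ecological fallacy: inferring an
# individual-level relationship from group-level (aggregated) data.

# Toy data: average income and average health score per neighborhood.
neighborhoods = {
    "A": {"avg_income": 30_000, "avg_health": 60},
    "B": {"avg_income": 60_000, "avg_health": 80},
}

# Group-level view: higher average income goes with higher average health,
# so one might (wrongly) conclude each higher-income person is healthier.

# Individual-level data for neighborhood B tells a different story.
individuals_b = [
    {"income": 40_000, "health": 90},
    {"income": 80_000, "health": 70},
]

# The individual records reproduce the aggregate exactly...
avg_income_b = sum(p["income"] for p in individuals_b) / len(individuals_b)
avg_health_b = sum(p["health"] for p in individuals_b) / len(individuals_b)
print(avg_income_b, avg_health_b)  # 60000.0 80.0

# ...yet within B the higher-income individual is the *less* healthy one,
# so the group-level inference fails at the individual level.
richer = max(individuals_b, key=lambda p: p["income"])
print(richer["health"])  # 70
```

The point is that aggregated data can be internally consistent with its summary statistics while supporting the opposite conclusion about any given individual.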
The algorithms used demonstrated “a clear racial bias against Black patients,” which caused estimated “health expenditures [to be] based on historical data structured by systemic racism and perpetuating that bias in access to care management.”[21] In many trained machine learning and artificial intelligence models, there is no standard reporting procedure to properly document performance characteristics.
The use of model cards aims to provide important information to its users about the capabilities and limitations of machine learning systems and ways to promote fair and inclusive outcomes with the use of machine learning technology.
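As a rough sketch of the idea (field names and figures here are illustrative, not a standard schema; real model cards are often maintained with dedicated tooling), a minimal model card might record a model's intended use, known limitations, and performance disaggregated by group, so that gaps like the one described above become visible:

```python
# A minimal, illustrative model card as a plain dictionary.
# All field names and values are hypothetical.
model_card = {
    "model_details": {
        "name": "risk-score-v1",  # hypothetical model name
        "version": "1.0",
        "type": "gradient-boosted trees",
    },
    "intended_use": "Triage support only; not for fully automated decisions.",
    "training_data": "De-identified claims data, 2015-2018 (illustrative).",
    "performance": {
        # Reporting metrics per group, not just overall, helps surface
        # the kind of disparity described above.
        "overall_auc": 0.84,
        "auc_by_group": {"group_a": 0.86, "group_b": 0.78},
    },
    "limitations": [
        "Expenditure-based labels can encode historical access disparities.",
    ],
}

def flag_performance_gap(card, threshold=0.05):
    """Return True if any group's AUC trails the overall AUC by > threshold."""
    overall = card["performance"]["overall_auc"]
    groups = card["performance"]["auc_by_group"]
    return any(overall - auc > threshold for auc in groups.values())

print(flag_performance_gap(model_card))  # True: group_b trails by 0.06
```

Keeping disaggregated metrics alongside the model makes a simple automated check like `flag_performance_gap` possible, rather than leaving disparities buried in a single headline number.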
This framework focuses on ways to approach and understand how data is collected, processed, and used, emphasizing ethical perspectives and the protection of individuals' information.
According to José van Dijck, it highlights the transformation of social actions into digital data, allowing real-time tracking and predictive analysis. It also examines how society changes as digital data becomes more prevalent in everyday life.
Häußler says that users focus on how algorithms can produce discriminatory outcomes, particularly with respect to race, gender, age, and other characteristics, and can reinforce social inequities and unjust practices.
Generally, the framework has several key components: bias identification, data quality, impact assessment, fairness and equity, transparency, remediation, and implications.