Iris' AI research

Iris Dominguez-Catena research website

I have completed my PhD

I’m excited to share that on October 9th, 2024, I successfully defended my doctoral thesis “Demographic bias in machine learning: measuring transference from dataset bias to model predictions” at the Public University of Navarre, receiving the highest distinction of outstanding cum laude. The full text of the thesis is available here. Most of the findings are also collected in four papers, the first three of which are already published (see the publications section).

In my research, I tackled the challenging issue of algorithmic fairness in artificial intelligence systems, with a particular focus on how demographic biases manifest and propagate in machine learning models. Using Facial Expression Recognition (FER) as my primary case study, I developed new ways to measure and analyze bias both in the datasets we use to train AI models and in the models’ predictions.

One of my key findings was the identification and characterization of two main types of demographic bias in datasets: representational bias, where certain demographic groups are underrepresented (think having many more white faces than faces of other races in a dataset), and stereotypical bias, where there are inappropriate associations between demographic groups and specific outcomes (like women being shown more frequently expressing happiness and men expressing anger).
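These two notions can be made concrete with a small sketch. The metrics below (group shares for representational bias, and normalized pointwise mutual information for group–label association) are illustrative choices rather than the thesis’s exact formulations, and the toy data is invented:

```python
import math
from collections import Counter

# Toy FER-style dataset: (demographic_group, expression_label) pairs.
# The data is invented purely to illustrate the two bias types.
samples = [
    ("woman", "happy"), ("woman", "happy"), ("woman", "happy"), ("woman", "angry"),
    ("man", "happy"), ("man", "angry"), ("man", "angry"), ("man", "angry"),
]
n = len(samples)
group_counts = Counter(g for g, _ in samples)
label_counts = Counter(l for _, l in samples)
pair_counts = Counter(samples)

# Representational bias: each group's share of the dataset.
# Equal shares mean no representational bias.
representation = {g: c / n for g, c in group_counts.items()}

def npmi(group, label):
    """Normalized PMI between a group and a label:
    0 = independence, +1 = perfect association, -1 = never co-occur."""
    p_xy = pair_counts[(group, label)] / n
    p_x = group_counts[group] / n
    p_y = label_counts[label] / n
    return math.log(p_xy / (p_x * p_y)) / -math.log(p_xy)

print(representation)                    # both groups at 0.5: balanced
print(round(npmi("woman", "happy"), 3)) # > 0: a stereotypical association
```

Note that this toy dataset is perfectly balanced in representation yet still stereotypically biased: “happy” co-occurs with “woman” far more often than independence would predict, which is exactly why the two bias types need separate measurement.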

What’s particularly interesting is that stereotypical bias, which has received much less attention in research, actually has a stronger and more persistent impact on model predictions than representational bias. This is especially relevant given the current trend of building datasets from Internet images: while these datasets are more diverse overall than the older “laboratory-gathered” ones, they tend to reinforce existing stereotypes, making this type of bias predominant.

To address these challenges, I developed several new tools and methodologies, described in detail in the thesis and the associated papers.

I also found a strong correlation between increased bias in the datasets and decreased model accuracy. This challenges the long-held belief that making AI systems fairer always comes at the cost of performance; in fact, my experiments showed that unbiased systems can actually be more accurate. I suspect that these observations are largely an artifact of biased test sets, but that is still under research.
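The kind of relationship described above can be checked with a simple correlation over per-dataset measurements. The numbers below are invented for illustration only (they are not results from the thesis); the point is just the shape of the computation:

```python
import math
import statistics

# Hypothetical measurements across five datasets: a dataset-level bias
# score (e.g., an NPMI-based stereotypical bias summary) and the test
# accuracy of a model trained on each. Values are illustrative.
bias =     [0.05, 0.10, 0.20, 0.35, 0.50]
accuracy = [0.78, 0.75, 0.72, 0.67, 0.60]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(bias, accuracy)
print(round(r, 2))  # strongly negative for these illustrative numbers
```

A strongly negative coefficient, as in this toy example, is the pattern consistent with the finding that higher dataset bias goes hand in hand with lower accuracy.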

This work has significant implications for developing fairer AI systems. It’s not enough to ensure balanced representation of different demographic groups - we need to examine and correct potential stereotypical biases as well. I hope that the tools and methodologies I’ve developed make it easier to detect these issues both during development and after deployment of AI systems.

I’m grateful for the opportunity to contribute to this important field, and I look forward to seeing how these findings might help shape the development of more equitable AI systems in the future.