By June Ip
In 2017, The Economist declared that “the world’s most valuable resource is no longer oil, but data,” citing the more than $25 billion in quarterly profit captured by companies like Google, Amazon, Facebook, and Apple (often grouped under the acronym GAFA), much of it built on monetizing our personal data. In 2018, we learned that organizations like Cambridge Analytica held enough data on many of us to build behavioural profiles accurate enough to manipulate our voting patterns.
Much like access to and control over oil often justified political interference in decades past, data has now proven its worth not only to businesses but also to political interests. Indeed, data is the new oil.
While the Cambridge Analytica scandal demonstrates the nefarious ways in which data is shaping our world, data can also create plenty of value, like services that monitor trends in air travel prices so you can get the best deal on your flights.
In a world where economic (and socio-political) value is increasingly reliant on data and data analysis, the lack of gender-specific and disaggregated data is concerning. In her 2019 best-selling book, Invisible Women: Exposing Data Bias in a World Designed for Men, Caroline Criado Perez details how these data gaps have led, and continue to lead, to policy failures and even life-threatening scenarios for women across all walks of life.
These data gaps become even more alarming when one examines the realm of artificial intelligence (AI) and machine learning, which is invading every sphere of life, from work to school to home. If you’ve ever had a voice assistant tell you it couldn’t understand you because you speak with a thick accent or have a speech impediment, then you’ve been a victim of data bias in technology. For a computer to recognize and learn from patterns, it must be given training data it can extrapolate from. Too often, that training data is not reflective of larger society – drawn, for example, from the pool of mostly heteronormative, able-bodied male developers working in the Silicon Valleys of the world.
Therefore, the level of ‘intelligence’ of an AI is directly related to how varied and diverse its training data is. If there is a recognized dearth of data about a particular segment of the population – like the gaps Criado Perez documents for women – then it is nearly impossible for any AI to be truly helpful to that group. In fact, this data bias can become downright harmful, as when hiring algorithms trained on biased historical data end up reinforcing gender and racial stereotypes. The toy sketch below shows how the mechanism works.
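To see this concretely, here is a purely illustrative Python sketch with synthetic numbers – no real product, dataset, or voice assistant is being modelled. A system “trained” on data that is 90% one group learns a notion of a “normal” input that works for the majority and fails for the minority:

```python
# Purely illustrative toy example (synthetic data, not any real system):
# when one group dominates the training data, the model's idea of a
# "normal" input sits close to that group and far from everyone else.
import numpy as np

rng = np.random.default_rng(0)

def make_samples(n, centre):
    # n two-dimensional feature vectors clustered around `centre`
    # (stand-ins for, say, the voice features of one speaker group).
    return rng.normal(loc=centre, scale=1.0, size=(n, 2))

# Group A supplies 90% of the training data; group B only 10%.
train = np.vstack([
    make_samples(900, centre=[0.0, 0.0]),   # majority group A
    make_samples(100, centre=[3.0, 3.0]),   # under-represented group B
])

# "Training" here is just averaging everything the model has seen,
# so the learned centre lands near group A.
model_centre = train.mean(axis=0)

def recognised(x, threshold=2.5):
    # The model only "understands" inputs close to what it was trained on.
    return np.linalg.norm(x - model_centre) < threshold

for name, centre in [("majority group A", [0.0, 0.0]),
                     ("minority group B", [3.0, 3.0])]:
    tests = make_samples(200, centre)
    rate = np.mean([recognised(x) for x in tests])
    print(f"recognition rate, {name}: {rate:.0%}")
```

Running this prints a high recognition rate for the majority group and a very low one for the minority group. Raising the under-represented group’s share of the training data closes that gap, which is precisely why representative, disaggregated data matters.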
As one can imagine, with so much money tied up in these technologies and with their profound social impacts, the solutions to this data bias are complicated. It is therefore more important than ever that women become more familiar with the issues at stake and advocate for their interests when it comes to gender-specific data collection, availability, and representation.
Organizations like Women in Identity, Women in Data, and Stemettes are all doing their part to bring about better representation of women in technology, which will go a long way toward ensuring women’s experiences are considered in product design decisions and policy outcomes. And while we can’t all be data scientists and programmers, we can still be strong allies by learning more about the impacts of data bias and bringing these issues to the attention of policymakers.
--
June Ip (she/her) is a marketing executive and educator with a background in political economy and social justice. Born in Toronto to immigrant Chinese parents, June draws on her lived experiences to advocate for and amplify the voices of racialized women in the city. She resides in the King West neighbourhood with her husband and dog, and can often be found eating sushi and/or noodles.
Image by @ThisIsEngineering