Surveying Bias in Word Embeddings of Political News Media Dataset
In recent years, there has been an increasing academic and public interest in the field of AI fairness. One
popular area of study is word embedding bias. Many studies have demonstrated the presence of bias in word
embedding models, reflecting the implicit social bias encoded in the training text. For instance, Bolukbasi et al. [1]
found concerning gender biases in analogy tasks using a model trained on the Google News dataset. In this
project, we ask whether the political leaning of a news source plays any role in such bias. We
survey different types of social bias found in two models: one built with a corpus of “liberal” articles, and the
other built with a corpus of “conservative” articles.
Introduction
In this section, we hope to provide more
background on our task. Word embedding
is a term used for a learned representation
of text data that encodes word semantics in the
form of real-valued vectors. Consider the sentences “I am going to the shop to get eggs” and
“I am going to the store to get eggs.” The words
“shop” and “store” share similar meanings
because they share similar neighboring words,
and word embedding models aim to capture
such relationships. Ideally, the word vectors for “shop” and “store” would have high
similarity using measures like cosine similarity.
Using the learned vectors, one can also
perform arithmetic like “king” – “man” +
“woman” to output a vector that will be most
similar to the vector that represents the word
“queen”. This is an example of an analogy task
that can be read “man is to a king as woman is
to a queen.” Word embeddings are often used
to enhance various machine learning and NLP
tasks.
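As an illustrative sketch, gensim's KeyedVectors API exposes both the similarity check and the analogy arithmetic directly. The snippet below assumes a set of trained vectors has already been saved to the hypothetical path "vectors.kv":

from gensim.models import KeyedVectors

# Load previously trained word vectors; the path is a placeholder.
wv = KeyedVectors.load("vectors.kv")

# "shop" and "store" should have a high cosine similarity.
print(wv.similarity("shop", "store"))

# king - man + woman: positive words are added, negative words subtracted.
# The top result should be "queen" if the embedding captures the analogy.
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))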
Although word embeddings’ ability to
capture semantics is powerful, they often encode
many biases. For instance, the paper “Man
is to Computer Programmer as Woman is to
Homemaker? Debiasing Word Embeddings”
[1] found concerning gender biases in
analogy tasks using a model trained on the Google
News dataset. Given that entities like occupations are neutral, we would hope the model does not
distinguish them based on protected attributes
like gender. However, the text data used to
train the model may contain implicit human
bias because historically some occupations,
for instance, have been more male-dominant
and others more female-dominant. Blind
application of word embedding models can
lead to serious consequences like reinforcing
and amplifying harmful social stereotypes. In
this project, we ask whether the political
leaning of a news source plays any
role in such bias. We survey different types of
social bias found in two models: one built with
a corpus of “liberal” articles, and the other
built with a corpus of “conservative” articles.
Dataset
We processed two corpora, one made of
news articles from “liberal” media and
the other from “conservative” media [2][3].
The liberal corpus consists of 30,000 articles
from CNN, 30,000 articles from the Washington
Post, and 30,000 articles from BuzzFeed News.
The conservative corpus consists of 12,229
articles from Fox News, 30,000 articles from
Breitbart, and 30,000 articles from the New York Post.
The majority of articles on both sides were
published between 2016 and 2020. We obtained our
dataset from Components, a publication and
research group.
We faced a few challenges building our corpora for this project. First, in our project proposal
we intended to perform diachronic word
embedding analysis to examine the shift in
bias over time, but we did not have
the resources to gather a large number of
liberal and conservative news articles from
older dates. Second, we noticed that most
of the publicly available media-specific news
datasets cover “liberal” sources, with far fewer
covering “conservative” sources. Finally, we are
aware of the limitations of our representation
of “liberal” and “conservative” news, as we
only have three media sources from each side.
In addition, not all articles and authors are
politically motivated. Still, we believe it is
interesting to examine whether there exists any
difference between media that are publicly
perceived as “liberal” or “conservative.”
Methods and Experiments
To build our word embedding models,
we used the Word2vec algorithm via
the Python library gensim. To preprocess
our corpora, we lower-cased all text and
used the NLTK library to tokenize it into
the sentences of word tokens that gensim
requires. We set the dimensionality
of the word vectors to 200, as the convention is between
100 and 300, and we ignored words with a
frequency of less than 5.
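A minimal sketch of this pipeline is shown below. The sample articles and the output path are placeholders, and the parameter names assume gensim 4:

import nltk
from nltk.tokenize import sent_tokenize, word_tokenize
from gensim.models import Word2Vec

nltk.download("punkt")  # tokenizer models required by NLTK

def corpus_to_sentences(articles):
    """Convert raw article strings into lists of lower-cased word tokens."""
    sentences = []
    for article in articles:
        for sent in sent_tokenize(article.lower()):
            sentences.append(word_tokenize(sent))
    return sentences

# Placeholder texts; in practice these come from the Components dataset.
liberal_articles = [
    "I am going to the shop to get eggs.",
    "I am going to the store to get eggs.",
]
sentences = corpus_to_sentences(liberal_articles)

# 200-dimensional vectors; words appearing fewer than 5 times are ignored.
model = Word2Vec(sentences, vector_size=200, min_count=5)
model.wv.save("liberal.kv")  # keep only the word vectors for later analysis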
Initially, we attempted to explore bias in the models by hand-picking several examples to compare cosine similarities or analogies. For instance, the famous analogy "man is to a doctor as woman is to a nurse" was replicated in both the liberal and conservative models.
We also tried performing PCA dimensionality reduction to visualize the existence of bias in the models. We graphed sports that are stereotypically associated with either males or females, along with the words "he" and "she." The words we used for "male sports" were [football, baseball, basketball, soccer], and the words we used for "female sports" were [cheerleading, softball, volleyball, gymnastics]. Both the conservative and liberal models exhibited a similar result. The fact that one could draw a line dividing stereotypically male sports like "football" along with the term "he" from stereotypically female sports like "cheerleading" along with the term "she" hinted at the possibility of bias in both models.
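A sketch of this PCA visualization follows, assuming `wv` holds the trained vectors of one model; the plotting details are illustrative:

import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

male_sports = ["football", "baseball", "basketball", "soccer"]
female_sports = ["cheerleading", "softball", "volleyball", "gymnastics"]
words = male_sports + female_sports + ["he", "she"]

# Project the 200-dimensional word vectors onto 2 principal components.
coords = PCA(n_components=2).fit_transform([wv[w] for w in words])

# Scatter-plot each word at its 2D coordinates, labeled with the word.
for word, (x, y) in zip(words, coords):
    plt.scatter(x, y)
    plt.annotate(word, (x, y))
plt.show()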
To give a more comprehensive measurement
of bias in the word embedding models, we
used the Word Embedding Association Test
(WEAT) proposed by Caliskan et al. [6]. The WEAT
aims to detect implicit bias by measuring the
association between two sets of target concepts
and two sets of attributes. For instance,
suppose we are interested in quantifying
gender bias in arts and science. The attributes
would be a set of words that describe “male”,
such as “male”, “man”, “boy”, “brother”,
“he”, “him”, “his”, ..., and a set of words
that describe “female”, such as “female”,
“woman”, “girl”, “sister”, “she”, “her”, ....
For the target concepts, we are inspecting arts
and science, so a set of words for arts may
be “poetry”, “art”, “Shakespeare”, “dance”,
“literature”, ..., and a set of words for science
may be “science”, “technology”, “physics”,
“chemistry”, “NASA”, .... In essence, the
WEAT tries to summarize whether
male-related terms are more related to science
than female-related terms (i.e., whether female-related
terms are more related to arts than male-related
terms). Formally, the calculation is done
with the equation below, which
we borrow from [4].
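Following the definitions in [4][6], the statistic compares each target word's mean cosine similarity to the two attribute sets:

s(X, Y, A, B) = \sum_{x \in X} s(x, A, B) - \sum_{y \in Y} s(y, A, B)

s(w, A, B) = \mathrm{mean}_{a \in A} \cos(\vec{w}, \vec{a}) - \mathrm{mean}_{b \in B} \cos(\vec{w}, \vec{b})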
X, Y refer to the target sets, such as arts and
science, while A, B refer to the attribute sets,
such as male and female. For each word w in the target
sets X and Y (e.g., "literature" vs. "chemistry"),
we compute s(w, A, B) and compare the
difference of the sums. s(w, A, B) is calculated by
taking the cosine similarity of the target
word with each word in the two attribute sets
(e.g., "boy" vs. "girl") and comparing the difference
of the means. In practice, the score is
normalized as an effect size, but we
leave the details out of this report. The resulting values
typically range from -2 to 2, where 0 indicates
the absence of bias. In the formulation of X, Y,
A, B, a positive value indicates that X is closer
to A (i.e., Y is closer to B), and a negative value
indicates that X is closer to B (i.e., Y is closer to
A).
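As a minimal sketch of this computation (not the implementation we actually used), the raw statistic can be written in numpy as follows, with `wv` again assumed to map words to their vectors:

import numpy as np

def cos(u, v):
    """Cosine similarity between two vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def s(w, A, B, wv):
    """Mean cosine similarity of word w to set A minus its mean to set B."""
    return (np.mean([cos(wv[w], wv[a]) for a in A])
            - np.mean([cos(wv[w], wv[b]) for b in B]))

def weat_score(X, Y, A, B, wv):
    """Raw WEAT statistic: sum of s over X minus sum of s over Y."""
    return sum(s(x, A, B, wv) for x in X) - sum(s(y, A, B, wv) for y in Y)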
We used the Python library wefe (the Word Embedding Fairness Evaluation Framework) to calculate WEAT [7]. For our analysis, we tested the following attribute set pairs: (male, female), (islam, christianity), (white, black), (white, asian), (LGBTQ, straight), and (old, young). For the neutral terms, we tested the following target set pairs: (strong, weak), (terrorism, peace), (normal, abnormal), and (intelligence, appearance). The words in each target or attribute set were either borrowed from other studies [4][6] or curated by us. The exact list of words in each set can be found in our code. We graphed our results.
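For reference, a wefe query looks roughly like the following sketch; the word lists are abbreviated and the vector path is a placeholder:

from gensim.models import KeyedVectors
from wefe.word_embedding_model import WordEmbeddingModel
from wefe.query import Query
from wefe.metrics import WEAT

# Wrap the trained gensim vectors for wefe; the path is a placeholder.
kv = KeyedVectors.load("liberal.kv")
model = WordEmbeddingModel(kv, "liberal")

# Target sets first, then attribute sets, then their display names.
query = Query(
    [["intelligent", "smart"], ["beautiful", "pretty"]],
    [["man", "he", "him"], ["woman", "she", "her"]],
    ["Intelligence", "Appearance"],
    ["Male", "Female"],
)

result = WEAT().run_query(query, model)
print(result)  # dictionary containing the WEAT score for this query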
In the resulting graphs, the blue line refers to the liberal model,
and the red line refers to the conservative
model. For interpretation, consider Figure 4.
Comparing "male and female" with respect to
"intelligence and appearance", both the liberal
and conservative models gave a similar positive
WEAT score, indicating that male terms are
closer to intelligence terms (i.e., female terms
are closer to appearance terms).
In general, the degree of bias or the absence
of bias was similar between the liberal and
conservative models. Still, there were a couple of
comparisons where we could see some difference.
For instance, consider the normal vs. abnormal
graph. Normal terms include words like
"natural", "right", "normal", while abnormal
terms include words like "weird", "abnormal",
"wrong". For the male and female pair, the
conservative model had a much higher score,
suggesting that male is closer to normal and
female is closer to abnormal. On the other
hand, for LGBTQ and straight, the liberal model had
a much more negative score, suggesting that
LGBTQ is closer to abnormal and straight is
closer to normal. Another noticeable result
was Islam and Christianity with respect to
terrorism and peace. The conservative model
was much more biased in associating Islam
with terrorism and Christianity with peace.
A limitation of the WEAT score is that it
cannot be interpreted directly. Like cosine similarity, it is only a comparative measure.
Conclusion
In this project, we used the WEAT method as
a primary tool to diagnose bias. We found
that the degree of bias was generally similar between liberal and conservative models, though
there were noticeable differences in some comparisons. We also found bias in both models
through analogies like "man is to a doctor as
woman is to a nurse."
We would like to note that using analogies
for bias detection has been challenged by some researchers. For instance, one study [5] makes the
following argument: if we form the analogy
"man is to computer programmer as woman is
to x", is there a correct output for x? In the traditional analogy task A : B :: C : D, all four terms
are forced to be distinct. Therefore, forcing the
fourth term to be different from the second can
be problematic. In addition, the scope of the attribute and target terms we used for WEAT is
limited as well as hand-picked by humans,
so it is difficult to make a comprehensive statement regarding bias in word embeddings. This
goes to show that the area of AI fairness still
has a lot to be explored.
References
[1] Bolukbasi, T., Chang, K.-W., Zou, J., Saligrama, V., and Kalai, A. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. arXiv preprint arXiv:1607.06520.
[2] Components. 2.7 million news articles and essays. https://components.one/datasets/all-the-news-2-news-articles-dataset/
[3] All the News: 143,000 articles from 15 American publications. https://www.kaggle.com/snapcrack/all-the-news
[4] Anthony Rios, Reenam Joshi, and Hejin Shin. 2020. Quantifying 60 years of gender bias in biomedical research with word embeddings. In Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing, pages 1–13, Online. Association for Computational Linguistics.
[5] Malvina Nissim, Rik van Noord, and Rob van der Goot. Fair is better than sensational: Man is to doctor as woman is to doctor. arXiv preprint arXiv:1905.09866, 2019.
[6] A. Caliskan, J. J. Bryson, and A. Narayanan. Semantics derived automatically from language corpora contain humanlike biases. Science, 356(6334):183–186, 2017.
[7] P. Badilla, F. Bravo-Marquez, and J. Pérez. WEFE: The Word Embeddings Fairness Evaluation Framework. In Proceedings of the 29th International Joint Conference on Artificial Intelligence and the 17th Pacific Rim International Conference on Artificial Intelligence (IJCAI-PRICAI 2020), Yokohama, Japan.