Facebook Researchers Built New AI That Can Identify Gender Bias In Text

Navin Bondade

In a recently published paper, Multi-Dimensional Gender Bias Classification, Facebook researchers describe an AI system that detects gender bias in text along several pragmatic and semantic dimensions, which they used to annotate data sets and evaluate gender bias classifiers.

The researchers propose an AI-based framework that breaks gender bias in text down along three dimensions: bias from the gender of the person being spoken about, bias from the gender of the person being spoken to, and bias from the gender of the speaker.

The Facebook team says this breakdown reflects how humans collaboratively and socially construct language and gender identities. The paper refers to the three dimensions as ABOUT (the person being spoken about), TO (the person being spoken to), and AS (the speaker).

In this way, the framework attempts to capture three things: that the adjectives, verbs, and nouns used to describe women differ from those used to describe men; that an addressee’s gender affects how others converse with them; and that gender is important to a person’s identity.

Using this framework and Facebook’s ParlAI, the researchers developed classifiers that decompose the bias in a sentence along these dimensions (bias from the gender of the person being discussed, and so on), while also including gender information that falls outside the male-female binary.
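One way to picture the label space this framework implies is a record holding one gender label per dimension. The sketch below is illustrative only: the names `GenderLabel` and `UtteranceLabels` are hypothetical and do not come from ParlAI or the paper's released code, the three label values simply mirror the masculine, feminine, and neutral classes used in the evaluation, and the label attached to the example sentence is an assumption made for demonstration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class GenderLabel(Enum):
    """Classes mirrored from the evaluation described in the article."""
    MASCULINE = "masculine"
    FEMININE = "feminine"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class UtteranceLabels:
    """Hypothetical container: one label per dimension, None if unannotated."""
    text: str
    about: Optional[GenderLabel] = None  # gender of the person spoken about
    to: Optional[GenderLabel] = None     # gender of the person spoken to
    as_: Optional[GenderLabel] = None    # gender of the speaker

    def annotated_dimensions(self) -> Dict[str, GenderLabel]:
        """Return only the dimensions that carry a label."""
        dims = {"about": self.about, "to": self.to, "as": self.as_}
        return {name: label for name, label in dims.items() if label is not None}


# Illustrative label only; it is not taken from the MDGender data.
example = UtteranceLabels(
    text="Hey, I went for a coffee with my friend and her dog.",
    about=GenderLabel.FEMININE,
)
print(example.annotated_dimensions())
```

Keeping the three dimensions as separate optional fields reflects the idea that a single sentence can carry a gender signal along one dimension and none along the others.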

The team trained the classifiers on text drawn from Wikipedia, Funpedia (a less formal version of Wikipedia), Yelp reviews, OpenSubtitles (dialogue from movies), and other sources. All of this text contained information about the gender of the author and addressee that could inform the model’s decision-making.

The team also created a specialized evaluation corpus, MDGender, by collecting conversations between two volunteer speakers. Each was given a persona description containing gender information and asked to adopt that persona while discussing sections of a Wikipedia biography.

Annotators were asked to rewrite each turn in the dialogue to make it clear they were speaking about a man or a woman, speaking as a man or a woman, and speaking to a man or a woman.

For example, a response to “How are you today? I just got off work” might have been rewritten as “Hey, I went for a coffee with my friend and her dog.”

In experiments, the team evaluated the gender bias classifiers against MDGender, measuring the percentage accuracy for masculine, feminine, and neutral classes.

They found that the best-performing model, a so-called multitask model, correctly decomposed sentences 77% of the time across all data sets and 81.82% of the time on Wikipedia only.
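Per-class accuracy of the kind described here can be computed along the following lines. This is a minimal sketch, not the authors' evaluation code: the function name `per_class_accuracy` is hypothetical, the toy labels are invented for illustration, and the unweighted mean across classes is an assumption, since the article does not say how the per-class figures are combined into the overall number.

```python
from collections import defaultdict


def per_class_accuracy(gold, predicted):
    """Return {label: fraction of gold examples with that label predicted correctly}."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for g, p in zip(gold, predicted):
        total[g] += 1
        if g == p:
            correct[g] += 1
    return {label: correct[label] / total[label] for label in total}


# Toy labels invented for illustration; they are not results from the paper.
gold = ["masculine", "feminine", "neutral", "feminine", "masculine", "neutral"]
pred = ["masculine", "feminine", "masculine", "neutral", "masculine", "neutral"]

scores = per_class_accuracy(gold, pred)
# Assumed aggregation: unweighted mean over the three classes.
average = sum(scores.values()) / len(scores)
print(scores, round(average, 4))
```

Reporting each class separately matters because a classifier that does well on one class can hide weak performance on another behind a single overall figure.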

In another set of tests, the team applied the best-performing classifier to control the genderedness of generated text, detect biased text in Wikipedia, and explore the interplay between offensive content and genderedness.

They report that training the classifier on a data set containing 250,000 text snippets from Reddit enabled it to generate gendered sentences, for instance, “Awwww, that sounds wonderful” and “You can do it, bro!”

Separately, the model scored paragraphs from a set of biographies to identify which were masculine in the “about” dimension. The results skewed 74% toward masculine, but the classifier was more confident in the femininity of pages about women, suggesting that women’s biographies contained more gendered text.

Lastly, after training and applying the classifier to a popular corpus of explicitly gendered words, they found that 25% of masculine words fell into “offensive” categories like “sexual connotation.”

“In an ideal world, we would expect little difference between texts describing men, women, and people with other gender identities, aside from the use of explicitly gendered words, like pronouns or names. An ML model, then, would be unable to pick up on statistical differences among gender labels, because such differences would not exist. Unfortunately, we know this is not the case,” wrote the coauthors.

“We provide a finer-grained framework for this purpose, analyze the presence of gender bias in models and data, and empower others by releasing tools that can be employed to address these issues for numerous text-based use-cases.”

Link to paper:

Multi-Dimensional Gender Bias Classification