Binary

Classifying text sequences as biased/fair.

Overview of Task:

Binary classification is the foundation of many bias detection frameworks, and in this case refers to classifying an entire text sequence as "biased" or "unbiased."

This is typically implemented with an encoder-only model, such as BERT, which creates encodings (i.e. contextual representations) that capture "the meaning" of a sentence. These encodings are passed to one or more classifier layers with a single output feature (a 0-to-1 probability of one class: "Biased").

🤖 Models:

One of the UnBIAS findings is that ternary classification (see Multi-Class) is a stronger approach, but the binary classification model remains a solid option.

UnBIAS Classifier

UnBIAS is a framework started in 2023 by Raza et al. at the Vector Institute. It is a refresh of the technology proposed in Dbias.

🤗 HF Space to Test UnBIAS Classifier

Base Model: bert-base-uncased, Dataset: BEADs (3.67M rows)

🤗 Hugging Face Model

newsmediabias/UnBIAS-classifier · Hugging Face

📄 Research Paper

Dbias: Detecting biases and ensuring Fairness in news articles (arXiv.org)

Use UnBIAS Classifier:

# pip install transformers torch
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

tokenizer = AutoTokenizer.from_pretrained("newsmediabias/UnBIAS-classification-bert")
model = AutoModelForSequenceClassification.from_pretrained("newsmediabias/UnBIAS-classification-bert")

# Use the first GPU if one is available, otherwise fall back to the CPU (-1).
device = 0 if torch.cuda.is_available() else -1
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer, device=device)
classifier("Anyone can excel at coding.")

Dbias

Dbias proposed an architecture in 2022 for addressing news media bias, with a framework that combined binary classification, named-entity recognition, bias masking, and word recommendation (Raza et al.).

While reimplementations have made changes in approach, Dbias was a trailblazer, especially for binary classification (the first phase in the image above).

Base Model: bert-base-uncased, Dataset: MBAD Dataset

🤗 Hugging Face Model

d4data/bias-detection-model · Hugging Face

📄 Research Paper

Dbias: Detecting biases and ensuring Fairness in news articles (arXiv.org)

Use Dbias Bias Classification:

Dbias has a PyPI package.

# pip install Dbias
# pip install https://huggingface.co/d4data/en_pipeline/resolve/main/en_pipeline-any-py3-none-any.whl
from Dbias.bias_classification import *

# Returns the classification label for a given sentence fragment.
classifier("Tall people are so clumsy.")

💾 Datasets:

Bias Evaluation Across Domains (BEADs) Dataset

3.67M rows | 2024

The BEADs corpus was gathered from the datasets: MBIC, Hyperpartisan news, Toxic comment classification, Jigsaw Unintended Bias, Age Bias, Multi-dimensional news (Ukraine), Social biases.

It was annotated by humans, then with semi-supervised learning, and finally human verified.

It's one of the largest and most up-to-date datasets for bias and toxicity classification, though it's currently private, so you'll need to request access through Hugging Face.

🤗 Hugging Face Dataset

newsmediabias/news-bias-full-data · Datasets at Hugging Face

📑 Contents

| Field | Description |
| --- | --- |
| text | The sentence or sentence fragment. |
| dimension | Descriptive category of the text. |
| biased_words | A compilation of words regarded as biased. |
| aspect | Specific sub-topic within the main content. |
| label | Indicates the degree of bias. The label is ternary: highly biased, slightly biased, or neutral. |
| toxicity | Indicates the presence (True) or absence (False) of toxicity. |
| identity_mention | Mention of any identity, based on word matching. |

While BEADs doesn't have a binary label for bias, the ternary labels (neutral, slightly biased, and highly biased) of the label field can be collapsed into biased (1) or unbiased (0). Additionally, the toxicity field contains binary labels.
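As a rough illustration of that collapse, the sketch below maps ternary labels to 0/1. The exact label strings ("neutral", "slightly biased", "highly biased") and the sample rows are assumptions made for the example; check the actual values in the dataset's label field before relying on them.

```python
# Assumed label strings; the real dataset may spell or encode them differently.
BIASED_LABELS = {"slightly biased", "highly biased"}
NEUTRAL_LABELS = {"neutral"}


def to_binary(label: str) -> int:
    """Map a ternary bias label to 1 (biased) or 0 (unbiased)."""
    normalized = label.strip().lower()
    if normalized in BIASED_LABELS:
        return 1
    if normalized in NEUTRAL_LABELS:
        return 0
    # Fail loudly rather than silently mislabel unexpected values.
    raise ValueError(f"Unexpected label: {label!r}")


# Hypothetical rows shaped like the fields described above.
rows = [
    {"text": "Example sentence A.", "label": "Neutral"},
    {"text": "Example sentence B.", "label": "Slightly Biased"},
    {"text": "Example sentence C.", "label": "Highly Biased"},
]
binary_labels = [to_binary(row["label"]) for row in rows]
print(binary_labels)
```

Raising on unknown values is a deliberate choice here: a silent default would hide labeling problems in the training data.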

📄 Research Paper

Navigating News Narratives: A Media Bias Analysis Dataset (arXiv.org)

How it Works:

Train your own binary classification model: 📝 Blog post - 💻 Training .ipynb

BERT (and other encoder models) process an input sequence into an encoding sequence as shown in the figure below, where self-attention heads encode the contextual words' meaning into each token representation.
1. These encodings are the foundation of many NLP tasks, and it's common (in BERT sequence classification) to then classify the CLS encoding into the desired classes (e.g. Biased, Neutral).
   - The CLS token is a built-in pooling mechanism (in Hugging Face BERT, pooler_output is the CLS encoding passed through an extra dense layer and tanh), but you can also use your own pooling mechanism (e.g. averaging all the representations for a mean-pooled representation).
2. bert-base-uncased has 768 output features (for each token), and we can pass the CLS token into a (768 -> 1) dense layer.
3. The output logit of that classification head is activated (typically with a sigmoid function for a single output; softmax applies when the head has one output per class), for a probability that falls between 0 and 1.
   - A threshold is sometimes applied to the output (e.g. probability > 0.5 is "Biased").
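The head described above can be sketched without any framework. This is a minimal, standard-library-only illustration, not the UnBIAS or Dbias implementation: random numbers stand in for real BERT encodings and for trained weights, and the helper names (mean_pool, classify) are hypothetical.

```python
import math
import random

HIDDEN_SIZE = 768  # output features per token in bert-base-uncased


def mean_pool(token_encodings):
    """Average all token vectors into one sequence-level vector."""
    count = len(token_encodings)
    return [sum(column) / count for column in zip(*token_encodings)]


def sigmoid(x):
    """Squash a logit into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-x))


def classify(pooled, weights, bias, threshold=0.5):
    """Apply a (768 -> 1) dense layer, a sigmoid, then a threshold."""
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    probability = sigmoid(logit)
    label = "Biased" if probability > threshold else "Neutral"
    return probability, label


# Stand-in data: random values instead of real encodings and trained weights.
rng = random.Random(0)
token_encodings = [
    [rng.uniform(-1, 1) for _ in range(HIDDEN_SIZE)] for _ in range(6)
]
cls_encoding = token_encodings[0]            # option 1: the first (CLS) position
mean_encoding = mean_pool(token_encodings)   # option 2: mean pooling
weights = [rng.uniform(-0.05, 0.05) for _ in range(HIDDEN_SIZE)]
bias = 0.0

probability, label = classify(cls_encoding, weights, bias)
print(f"P(Biased) = {probability:.3f} -> {label}")
```

In practice the dense layer would be trained jointly with the encoder, and the threshold can be tuned on a validation set instead of being fixed at 0.5.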

Metrics:

When evaluating a model's performance at binary classification, you should understand how positive (biased) and negative (neutral) predictions split into correct (true) and incorrect (false) predictions. This gives four counts: true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN).

Your individual requirements will guide your interpretation (e.g. maybe you REALLY want to avoid false positives).

Confusion Matrix: Used to visualize the counts of correct and incorrect classifications made (TP, FP, TN, FN).
Precision: $\frac{TP}{TP + FP}$
Recall: $\frac{TP}{TP + FN}$
F1 Score: $2 \times \frac{precision \times recall}{precision + recall}$
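The formulas above can be checked by hand on a small example. The sketch below uses plain Python with made-up labels (1 = biased, 0 = neutral); the function names are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen for this example.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = biased, 0 = neutral)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1; return 0.0 where undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up ground truth and predictions for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"TP={tp} FP={fp} TN={tn} FN={fn}")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

If avoiding false positives matters most, watch precision; if missing biased text is the bigger risk, watch recall.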
