Classifying Fake News with Tensorflow

Introduction

Hello Everyone!

Today we will be learning how to classify fake news with Tensorflow! Let’s begin by importing the necessary packages and data.

Importing necessary packages and data

import numpy as np
import pandas as pd
import tensorflow as tf
import re
import string
import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords

from tensorflow.keras import layers
from tensorflow.keras import losses

from tensorflow.keras.layers.experimental.preprocessing import TextVectorization
from tensorflow.keras.layers.experimental.preprocessing import StringLookup

from sklearn.model_selection import train_test_split
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
train_url = "https://github.com/PhilChodrow/PIC16b/blob/master/datasets/fake_news_train.csv?raw=true"

train_data = pd.read_csv(train_url)
train_data

The data we are importing contains article titles, text, and a column telling us if they are fake news or not.

Unnamed: 0 title text fake
0 17366 Merkel: Strong result for Austria's FPO 'big c... German Chancellor Angela Merkel said on Monday... 0
1 5634 Trump says Pence will lead voter fraud panel WEST PALM BEACH, Fla.President Donald Trump sa... 0
2 17487 JUST IN: SUSPECTED LEAKER and “Close Confidant... On December 5, 2017, Circa s Sara Carter warne... 1
3 12217 Thyssenkrupp has offered help to Argentina ove... Germany s Thyssenkrupp, has offered assistance... 0
4 5535 Trump say appeals court decision on travel ban... President Donald Trump on Thursday called the ... 0
... ... ... ... ...
22444 10709 ALARMING: NSA Refuses to Release Clinton-Lynch... If Clinton and Lynch just talked about grandki... 1
22445 8731 Can Pence's vow not to sling mud survive a Tru... () - In 1990, during a close and bitter congre... 0
22446 4733 Watch Trump Campaign Try To Spin Their Way Ou... A new ad by the Hillary Clinton SuperPac Prior... 1
22447 3993 Trump celebrates first 100 days as president, ... HARRISBURG, Pa.U.S. President Donald Trump hit... 0
22448 12896 TRUMP SUPPORTERS REACT TO DEBATE: “Clinton New... MELBOURNE, FL is a town with a population of 7... 1

22449 rows × 4 columns

Next, we will be making a function to turn our data into a tensor object! This will allow us to create our models.

Make function to make data tensor object

Our function will first remove stopwords from our titles and text, and then we will turn it into a tensor object. Title and text will be our input, and fake will be our output.

def make_dataset(df):
  # Remove stopwords from the article text and title. 
  # A stopword is a word that is usually considered to be uninformative, such as “the,” “and,” or “but.” 
  # Construct and return a tf.data.Dataset with two inputs and one output. The input should be of the form (title, text), 
  # and the output should consist only of the fake column. You may find it helpful to consult lecture notes or this tutorial for 
  # reference on how to construct and use Datasets with multiple inputs

  #stop words
  stop = stopwords.words('english')

  #remove stop words from article title
  df["title"] = df['title'].apply(lambda x: ' '.join([word for word in x.split() if word not in (stop)]))
  #remove stop words from article text
  df["text"] = df['text'].apply(lambda x: ' '.join([word for word in x.split() if word not in (stop)]))

  data = tf.data.Dataset.from_tensor_slices(
   ( # dictionary for input data/features
    #our two inputs: title and text
    { "title": df[["title"]],
     "text": df["text"]
    },
    # dictionary for output data/labels
    # one output: fake
    { "fake": df[["fake"]]
        
    }   
   ) 
  ) 

  return data.batch(100)

#batch our data to make training faster; train on chunks of data rather than individual rows
data = make_dataset(train_data)
#check size of dataset
len(data)
225

Now, we will be splitting our data into a training and validation set. Our validation set will be 20% of the data.

Split data into training and validation sets

#shuffle dataset
import random
random.seed(10)
data = data.shuffle(buffer_size = len(data))

#20% validation
val_size  = int(0.2*len(data))

val = data.take(val_size)
train = data.skip(val_size).take(len(data) - val_size)

#check size of training and validation sets
print(len(train), len(val))
180 45

Base rate - Labels Iterator, fake text, count of labels, on training data

Now, let’s look at our base rate by looking at the “fake” labels in our training data.

# Base rate 
## similar to previous hw; can get true and fake from training data, labels iterator on fake column

labels_iterator= train.unbatch().map(lambda input, output: output).as_numpy_iterator()

train_data2 = train_data.sample(n=1800)

len(train_data2[train_data2["fake"] == 1]) / 1800
0.5355555555555556

The base rate appears to be about 53.6%, which indicates that a little more than half of the artiles in the training data is fake while the other half is true.

Model Creation

Next we will create our models, but before we do so, we must perform standardization and text vectorization.

# Text vectorization

#preparing a text vectorization layer for tf model
size_vocabulary = 2000

def standardization(input_data):
    lowercase = tf.strings.lower(input_data)
    no_punctuation = tf.strings.regex_replace(lowercase,
                                  '[%s]' % re.escape(string.punctuation),'')
    return no_punctuation 

## Title Vectorization
title_vectorize_layer = TextVectorization(
    standardize=standardization,
    max_tokens=size_vocabulary, # only consider this many words
    output_mode='int',
    output_sequence_length=500) 

title_vectorize_layer.adapt(train.map(lambda x, y: x["title"]))

## Text Vectorization

text_vectorize_layer = TextVectorization(
    standardize=standardization,
    max_tokens=size_vocabulary, 
    output_mode='int',
    output_sequence_length=500) 

text_vectorize_layer.adapt(train.map(lambda x, y: x["text"]))

We will be creating 3 models: one with the title only, one with the text only, and one with both the text and title. The purpose of this is to see which method will classify fake news the best.

Model 1: Title only

For our first model, we will focus on just using the title. We must first create an input.

title_input = tf.keras.Input(
    shape=(1,),
    name = "title", # same name as the dictionary key in the dataset
    dtype = "string"
)

Next, we will add our layers. Since our titles are categorical and not numerical, we will add GlobalAveragePooling. For this, we will include an embedding and begin with 0.2 for our dropout. I decided to begin with these parameters based on the lecture notes.

# layers for processing the title
title_features = title_vectorize_layer(title_input)

# Add embedding layer, dropout
title_features = layers.Embedding(size_vocabulary, output_dim = 3, name="embedding1")(title_features)
title_features = layers.Dropout(0.2)(title_features)
title_features = layers.GlobalAveragePooling1D()(title_features)
title_features = layers.Dropout(0.2)(title_features)
title_features = layers.Dense(2, activation='relu')(title_features)

output = layers.Dense(2, name="fake")(title_features) 

Now, we create our model and check its summary.

model1 = tf.keras.Model(
    inputs = title_input,
    outputs = output
)
from tensorflow.keras import utils

model1.summary()
utils.plot_model(model1)
Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 title (InputLayer)          [(None, 1)]               0         
                                                                 
 text_vectorization_2 (TextV  (None, 500)              0         
 ectorization)                                                   
                                                                 
 embedding1 (Embedding)      (None, 500, 3)            6000      
                                                                 
 dropout_2 (Dropout)         (None, 500, 3)            0         
                                                                 
 global_average_pooling1d_1   (None, 3)                0         
 (GlobalAveragePooling1D)                                        
                                                                 
 dropout_3 (Dropout)         (None, 3)                 0         
                                                                 
 dense_1 (Dense)             (None, 2)                 8         
                                                                 
 fake (Dense)                (None, 2)                 6         
                                                                 
=================================================================
Total params: 6,014
Trainable params: 6,014
Non-trainable params: 0
_________________________________________________________________

output_18_1.png

model1.compile(optimizer="adam",
              loss = losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])

Now, let’s fit our data.

history = model1.fit(train, validation_data=val, epochs=20)
Epoch 1/20


/usr/local/lib/python3.7/dist-packages/keras/engine/functional.py:559: UserWarning: Input dict contained keys ['text'] which did not match any model input. They will be ignored by the model.
  inputs = self._flatten_to_reference_inputs(inputs)


180/180 [==============================] - 3s 11ms/step - loss: 0.6922 - accuracy: 0.5199 - val_loss: 0.6909 - val_accuracy: 0.5244
Epoch 2/20
180/180 [==============================] - 2s 9ms/step - loss: 0.6889 - accuracy: 0.5577 - val_loss: 0.6845 - val_accuracy: 0.5988
Epoch 3/20
180/180 [==============================] - 2s 9ms/step - loss: 0.6764 - accuracy: 0.6732 - val_loss: 0.6650 - val_accuracy: 0.8622
Epoch 4/20
180/180 [==============================] - 2s 9ms/step - loss: 0.6485 - accuracy: 0.8005 - val_loss: 0.6267 - val_accuracy: 0.9313
Epoch 5/20
180/180 [==============================] - 2s 10ms/step - loss: 0.6015 - accuracy: 0.8615 - val_loss: 0.5699 - val_accuracy: 0.9285
Epoch 6/20
180/180 [==============================] - 2s 9ms/step - loss: 0.5442 - accuracy: 0.8794 - val_loss: 0.5050 - val_accuracy: 0.9322
Epoch 7/20
180/180 [==============================] - 2s 10ms/step - loss: 0.4829 - accuracy: 0.8984 - val_loss: 0.4426 - val_accuracy: 0.9338
Epoch 8/20
180/180 [==============================] - 2s 9ms/step - loss: 0.4248 - accuracy: 0.9200 - val_loss: 0.3848 - val_accuracy: 0.9338
Epoch 9/20
180/180 [==============================] - 2s 10ms/step - loss: 0.3756 - accuracy: 0.9275 - val_loss: 0.3347 - val_accuracy: 0.9520
Epoch 10/20
180/180 [==============================] - 2s 13ms/step - loss: 0.3306 - accuracy: 0.9383 - val_loss: 0.2911 - val_accuracy: 0.9531
Epoch 11/20
180/180 [==============================] - 2s 10ms/step - loss: 0.2931 - accuracy: 0.9438 - val_loss: 0.2575 - val_accuracy: 0.9642
Epoch 12/20
180/180 [==============================] - 2s 10ms/step - loss: 0.2606 - accuracy: 0.9494 - val_loss: 0.2229 - val_accuracy: 0.9651
Epoch 13/20
180/180 [==============================] - 2s 10ms/step - loss: 0.2355 - accuracy: 0.9514 - val_loss: 0.1958 - val_accuracy: 0.9671
Epoch 14/20
180/180 [==============================] - 2s 9ms/step - loss: 0.2149 - accuracy: 0.9535 - val_loss: 0.1829 - val_accuracy: 0.9667
Epoch 15/20
180/180 [==============================] - 2s 10ms/step - loss: 0.1938 - accuracy: 0.9570 - val_loss: 0.1662 - val_accuracy: 0.9700
Epoch 16/20
180/180 [==============================] - 2s 10ms/step - loss: 0.1809 - accuracy: 0.9582 - val_loss: 0.1478 - val_accuracy: 0.9718
Epoch 17/20
180/180 [==============================] - 2s 10ms/step - loss: 0.1665 - accuracy: 0.9608 - val_loss: 0.1375 - val_accuracy: 0.9684
Epoch 18/20
180/180 [==============================] - 2s 10ms/step - loss: 0.1552 - accuracy: 0.9621 - val_loss: 0.1265 - val_accuracy: 0.9693
Epoch 19/20
180/180 [==============================] - 2s 11ms/step - loss: 0.1441 - accuracy: 0.9630 - val_loss: 0.1164 - val_accuracy: 0.9747
Epoch 20/20
180/180 [==============================] - 2s 10ms/step - loss: 0.1364 - accuracy: 0.9666 - val_loss: 0.1104 - val_accuracy: 0.9747

As we can see, we have a pretty good validation accuracy (around 97%) and the models don’t appear to be overfitted! Let’s visualize this process then move onto our next model.

from matplotlib import pyplot as plt
plt.plot(history.history["accuracy"], label="training data")
plt.plot(history.history["val_accuracy"], label="validation data")
plt.gca().set(xlabel = "epoch", ylabel = "accuracy")
plt.legend()

output_21_1.png

Model 2: Text Only

Our next model is using just text. We repeat the same process as above, but with the text instead of titles.

text_input = tf.keras.Input(
    shape=(1,),
    name = "text",
    dtype = "string"
)

For our layers, I initially tried the same parameters as the title model, but decided to try 0.3 and 0.4 for the dropout because the model seemed to be a bit overfitted. 0.4 had the best accuracy rates, so I went with that one! After this, let’s fit our model and visualize it.

# layers for processing the title
text_features = text_vectorize_layer(text_input)

# Add embedding layer, dropout
text_features = layers.Embedding(size_vocabulary, output_dim = 3, name="embedding2")(text_features)
text_features = layers.Dropout(0.4)(text_features)
text_features = layers.GlobalAveragePooling1D()(text_features)
text_features = layers.Dropout(0.4)(text_features)
text_features = layers.Dense(2, activation='relu')(text_features)

output = layers.Dense(2, name="fake")(text_features) 
model2 = tf.keras.Model(
    inputs = text_input,
    outputs = output
)
from tensorflow.keras import utils

model2.summary()
utils.plot_model(model2)
Model: "model_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 text (InputLayer)           [(None, 1)]               0         
                                                                 
 text_vectorization_3 (TextV  (None, 500)              0         
 ectorization)                                                   
                                                                 
 embedding2 (Embedding)      (None, 500, 3)            6000      
                                                                 
 dropout_4 (Dropout)         (None, 500, 3)            0         
                                                                 
 global_average_pooling1d_2   (None, 3)                0         
 (GlobalAveragePooling1D)                                        
                                                                 
 dropout_5 (Dropout)         (None, 3)                 0         
                                                                 
 dense_2 (Dense)             (None, 2)                 8         
                                                                 
 fake (Dense)                (None, 2)                 6         
                                                                 
=================================================================
Total params: 6,014
Trainable params: 6,014
Non-trainable params: 0
_________________________________________________________________

output_26_1.png

model2.compile(optimizer="adam",
              loss = losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])
history = model2.fit(train, validation_data=val, epochs=20)
Epoch 1/20


/usr/local/lib/python3.7/dist-packages/keras/engine/functional.py:559: UserWarning: Input dict contained keys ['title'] which did not match any model input. They will be ignored by the model.
  inputs = self._flatten_to_reference_inputs(inputs)


180/180 [==============================] - 4s 19ms/step - loss: 0.6907 - accuracy: 0.5299 - val_loss: 0.6852 - val_accuracy: 0.5364
Epoch 2/20
180/180 [==============================] - 3s 18ms/step - loss: 0.6636 - accuracy: 0.7319 - val_loss: 0.6298 - val_accuracy: 0.8787
Epoch 3/20
180/180 [==============================] - 3s 18ms/step - loss: 0.5893 - accuracy: 0.8307 - val_loss: 0.5279 - val_accuracy: 0.9204
Epoch 4/20
180/180 [==============================] - 3s 17ms/step - loss: 0.4990 - accuracy: 0.8798 - val_loss: 0.4294 - val_accuracy: 0.9272
Epoch 5/20
180/180 [==============================] - 3s 17ms/step - loss: 0.4213 - accuracy: 0.9020 - val_loss: 0.3582 - val_accuracy: 0.9516
Epoch 6/20
180/180 [==============================] - 3s 17ms/step - loss: 0.3631 - accuracy: 0.9121 - val_loss: 0.2960 - val_accuracy: 0.9602
Epoch 7/20
180/180 [==============================] - 4s 22ms/step - loss: 0.3200 - accuracy: 0.9188 - val_loss: 0.2558 - val_accuracy: 0.9567
Epoch 8/20
180/180 [==============================] - 3s 18ms/step - loss: 0.2850 - accuracy: 0.9242 - val_loss: 0.2293 - val_accuracy: 0.9595
Epoch 9/20
180/180 [==============================] - 4s 19ms/step - loss: 0.2611 - accuracy: 0.9284 - val_loss: 0.2031 - val_accuracy: 0.9676
Epoch 10/20
180/180 [==============================] - 3s 19ms/step - loss: 0.2408 - accuracy: 0.9313 - val_loss: 0.1800 - val_accuracy: 0.9691
Epoch 11/20
180/180 [==============================] - 3s 18ms/step - loss: 0.2269 - accuracy: 0.9288 - val_loss: 0.1676 - val_accuracy: 0.9691
Epoch 12/20
180/180 [==============================] - 3s 17ms/step - loss: 0.2123 - accuracy: 0.9371 - val_loss: 0.1555 - val_accuracy: 0.9684
Epoch 13/20
180/180 [==============================] - 3s 19ms/step - loss: 0.1982 - accuracy: 0.9368 - val_loss: 0.1382 - val_accuracy: 0.9731
Epoch 14/20
180/180 [==============================] - 3s 18ms/step - loss: 0.1908 - accuracy: 0.9388 - val_loss: 0.1335 - val_accuracy: 0.9717
Epoch 15/20
180/180 [==============================] - 3s 17ms/step - loss: 0.1822 - accuracy: 0.9383 - val_loss: 0.1325 - val_accuracy: 0.9720
Epoch 16/20
180/180 [==============================] - 3s 18ms/step - loss: 0.1768 - accuracy: 0.9361 - val_loss: 0.1191 - val_accuracy: 0.9753
Epoch 17/20
180/180 [==============================] - 3s 18ms/step - loss: 0.1724 - accuracy: 0.9398 - val_loss: 0.1058 - val_accuracy: 0.9776
Epoch 18/20
180/180 [==============================] - 3s 17ms/step - loss: 0.1653 - accuracy: 0.9404 - val_loss: 0.1061 - val_accuracy: 0.9773
Epoch 19/20
180/180 [==============================] - 3s 18ms/step - loss: 0.1639 - accuracy: 0.9401 - val_loss: 0.0997 - val_accuracy: 0.9782
Epoch 20/20
180/180 [==============================] - 3s 17ms/step - loss: 0.1585 - accuracy: 0.9399 - val_loss: 0.0978 - val_accuracy: 0.9802

As we can see, the validation accuracy is pretty high, sitting at around 98%! This looks promising, but let’s try our final model.

plt.plot(history.history["accuracy"], label="training data")
plt.plot(history.history["val_accuracy"], label="validation data")
plt.gca().set(xlabel = "epoch", ylabel = "accuracy")
plt.legend()

output_29_2.png

Model 3: Title and Text

Now, we create a model that uses both the title and text. We begin by concatenating our title and text feattures.

main = layers.concatenate([title_features, text_features], axis = 1)

Next, we will create layers for this concatenation and use it for our output.

main = layers.Dense(4, activation='relu')(main)
output = layers.Dense(4, name="fake")(main) 
model3 = tf.keras.Model(
    inputs = [title_input, text_input],
    outputs = output
)
model3.compile(optimizer="adam",
              loss = losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])
history = model3.fit(train, 
                    validation_data=val,
                    epochs = 20)
Epoch 1/20
180/180 [==============================] - 8s 40ms/step - loss: 1.0807 - accuracy: 0.7412 - val_loss: 0.8972 - val_accuracy: 0.8238
Epoch 2/20
180/180 [==============================] - 4s 22ms/step - loss: 0.8153 - accuracy: 0.8341 - val_loss: 0.6897 - val_accuracy: 0.9233
Epoch 3/20
180/180 [==============================] - 4s 22ms/step - loss: 0.6401 - accuracy: 0.9288 - val_loss: 0.5430 - val_accuracy: 0.9631
Epoch 4/20
180/180 [==============================] - 4s 22ms/step - loss: 0.5208 - accuracy: 0.9593 - val_loss: 0.4383 - val_accuracy: 0.9766
Epoch 5/20
180/180 [==============================] - 4s 21ms/step - loss: 0.4267 - accuracy: 0.9735 - val_loss: 0.3624 - val_accuracy: 0.9858
Epoch 6/20
180/180 [==============================] - 4s 21ms/step - loss: 0.3540 - accuracy: 0.9797 - val_loss: 0.2913 - val_accuracy: 0.9898
Epoch 7/20
180/180 [==============================] - 4s 21ms/step - loss: 0.2939 - accuracy: 0.9842 - val_loss: 0.2426 - val_accuracy: 0.9927
Epoch 8/20
180/180 [==============================] - 4s 21ms/step - loss: 0.2490 - accuracy: 0.9837 - val_loss: 0.2031 - val_accuracy: 0.9921
Epoch 9/20
180/180 [==============================] - 4s 22ms/step - loss: 0.2139 - accuracy: 0.9854 - val_loss: 0.1734 - val_accuracy: 0.9927
Epoch 10/20
180/180 [==============================] - 4s 21ms/step - loss: 0.1824 - accuracy: 0.9874 - val_loss: 0.1494 - val_accuracy: 0.9940
Epoch 11/20
180/180 [==============================] - 4s 22ms/step - loss: 0.1607 - accuracy: 0.9876 - val_loss: 0.1281 - val_accuracy: 0.9929
Epoch 12/20
180/180 [==============================] - 4s 21ms/step - loss: 0.1408 - accuracy: 0.9880 - val_loss: 0.1161 - val_accuracy: 0.9915
Epoch 13/20
180/180 [==============================] - 4s 22ms/step - loss: 0.1216 - accuracy: 0.9896 - val_loss: 0.0978 - val_accuracy: 0.9947
Epoch 14/20
180/180 [==============================] - 4s 21ms/step - loss: 0.1098 - accuracy: 0.9904 - val_loss: 0.0926 - val_accuracy: 0.9921
Epoch 15/20
180/180 [==============================] - 4s 21ms/step - loss: 0.0972 - accuracy: 0.9906 - val_loss: 0.0795 - val_accuracy: 0.9949
Epoch 16/20
180/180 [==============================] - 4s 22ms/step - loss: 0.0895 - accuracy: 0.9900 - val_loss: 0.0719 - val_accuracy: 0.9947
Epoch 17/20
180/180 [==============================] - 4s 22ms/step - loss: 0.0786 - accuracy: 0.9920 - val_loss: 0.0666 - val_accuracy: 0.9944
Epoch 18/20
180/180 [==============================] - 4s 22ms/step - loss: 0.0724 - accuracy: 0.9904 - val_loss: 0.0542 - val_accuracy: 0.9956
Epoch 19/20
180/180 [==============================] - 4s 21ms/step - loss: 0.0676 - accuracy: 0.9909 - val_loss: 0.0494 - val_accuracy: 0.9960
Epoch 20/20
180/180 [==============================] - 7s 37ms/step - loss: 0.0627 - accuracy: 0.9913 - val_loss: 0.0469 - val_accuracy: 0.9953

As we can see, this model performed the best, sitting at 99.5%!

from matplotlib import pyplot as plt
plt.plot(history.history["accuracy"], label="training data")
plt.plot(history.history["val_accuracy"], label="validation data")
plt.gca().set(xlabel = "epoch", ylabel = "accuracy")
plt.legend()

output_36_1.png

Looking at all of our models, it appears that the model using both text and title scored the highest and is our best model! This makes sense, as using both the text and title gives our model more information to learn from and thus may be more helpful. Let’s evaluate our model using test data and see how it performs!

Model Evaluation on Test Data

Let’s read in our test data.

test_url = "https://github.com/PhilChodrow/PIC16b/blob/master/datasets/fake_news_test.csv?raw=true"

test_data = pd.read_csv(test_url)
test_data

test = make_dataset(test_data)

Now, we will evaluate our best model on the testing data:

test_evaluate = model3.evaluate(test)
225/225 [==============================] - 3s 12ms/step - loss: 0.0541 - accuracy: 0.9930

We get an accuracy of about 99.3%. This tells us that the model performed very well; if we used the model as a fake news predictor, we would be right about 99.3% of the time.

Now, let’s create an embedding visualization.

Embedding Visualization

weights = model2.get_layer('embedding2').get_weights()[0] # get the weights from the embedding layer
vocab = text_vectorize_layer.get_vocabulary()                # get the vocabulary from our data prep for later

from sklearn.decomposition import PCA
pca = PCA(n_components=2)
weights = pca.fit_transform(weights)

embedding_df = pd.DataFrame({
    'word' : vocab, 
    'x0'   : weights[:,0],
    'x1'   : weights[:,1]
})
import plotly.express as px 
fig = px.scatter(embedding_df, 
                 x = "x0", 
                 y = "x1", 
                 size = [2]*len(embedding_df),
                # size_max = 2,
                 hover_name = "word")

fig.show()

word_plot.png

Five words that I found interpretable and interesting in this visualization are “apparently,” “fox,” “reportedly,” “radical,” and “21wire,” all of which reside towards the left side of the visualization. “fox” and “21wire” most likely refers to Fox News and 21st Century Wire, which are both news sources that have been criticized for spreading propoganda and false or exhaggerated information. “apparently” and “reportedly” also make sense to me because these are words that are often used by writers when they can’t be sure about the information; these words allow them to detach themselves from involvement. “radical” also makes sense as a fake news word because a lot of fake news articles tend to attack “radical leftists.”

That’s it for today! Thank you so much for reading!

Written on May 21, 2022