Revisiting ProPublica’s Report on Algorithmic Hate Speech

Last year ProPublica investigated Facebook’s hate speech algorithms and found that moderators were being trained to treat “white men,” but not “black children,” as a protected class. The report is worth revisiting because it shows how the complexities of the English language confound machine logic.

Machines correlate without causation. That’s a key concept in Interpersonal Divide’s critique of “artificial intelligence.” Technical systems are adept at answering four of the five “Ws” and the “H” of mass communication: Who, What, When, Where and How.

Those are the only qualifiers you need to make a sale. Social media platforms, especially Facebook, sell to us and surveil us simultaneously whenever we feed their algorithms. If we receive a new pair of shoes in the mail for our birthday and display them in a post thanking Grandma, the machine knows who got what gift, when, how and from where. That’s the point. That’s how social networks create value via consumer narratives.

Interpersonal Divide cites computer scientist and author Jaron Lanier’s explanation. Machines with copious amounts of data may be able to discern odd commercial truths: People with bushy eyebrows who like purple toadstools in spring might hanker for hot sauce on mashed potatoes in autumn. That would enable a hot sauce vendor to place a link in front of bushy-eyebrowed Facebookers posting toadstool photos, increasing the chance of a sale, “and no one need ever know why.”[1]

The narrative knows:

  • Who: people with bushy eyebrows
  • What: hot sauce
  • When: autumn
  • Where: a Facebook IP address
  • How: on mashed potatoes

No one ever need know Why. A sale is a sale is a sale.
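To make Lanier’s point concrete, here is a minimal sketch in Python of what a correlation-only ad rule might look like. The profile fields, the vendor and the matching rule are hypothetical illustrations of his example, not actual platform code.

```python
# Hypothetical sketch of a correlation-only ad rule, after Lanier's example.
# Field names and the rule itself are illustrative assumptions, not real ad-platform code.

profile = {
    "who":   {"eyebrows": "bushy", "likes": ["purple toadstools"]},
    "when":  "autumn",
    "where": "Facebook feed",  # inferred from the posting IP address
}

def should_show_hot_sauce_ad(p):
    """Match observed attributes only; the rule never models WHY they correlate."""
    return (
        p["who"].get("eyebrows") == "bushy"
        and "purple toadstools" in p["who"].get("likes", [])
        and p["when"] == "autumn"
    )

if should_show_hot_sauce_ad(profile):
    print("Serve ad: hot sauce on mashed potatoes (what/how)")
```

The rule answers who, what, when, where and how, yet nothing in it represents why the correlation holds, which is exactly the point.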

When it comes to Facebook’s algorithm, however, we do know why “white men” outrank “black children” according to machine logic. The algorithm, which purportedly has been tweaked since the ProPublica report, bases hate speech decisions on what seems at first blush a logical foundation. If a suspected hate message targets a group defined only by protected categories, such as race and gender (white men), that trumps a group modified by a non-protected subset such as age (black children).
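A minimal sketch of that rule, as ProPublica described it, might look like the following. The category lists and function name are illustrative assumptions, not Facebook’s actual code.

```python
# Hypothetical sketch of the moderation rule ProPublica described.
# Category lists and names are illustrative assumptions, not Facebook's code.

PROTECTED = {"race", "sex", "gender identity", "religious affiliation",
             "national origin", "ethnicity", "sexual orientation",
             "serious disability or disease"}

def is_protected_group(categories):
    """A group counts as protected only if EVERY category describing it
    is protected; a single non-protected modifier removes protection."""
    return all(c in PROTECTED for c in categories)

# "white men" = race + sex            -> protected, so an attack gets removed
print(is_protected_group({"race", "sex"}))                      # True
# "black children" = race + age       -> not protected, so an attack stays up
print(is_protected_group({"race", "age"}))                      # False
# "radicalized Muslims" = religion + a non-protected modifier -> not protected
print(is_protected_group({"religious affiliation", "radicalized"}))  # False
```

Under that logic, the subset modifier, not the severity of the attack, decides whether a post comes down.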

Of course, the English language doesn’t work this way, especially since one word may have multiple meanings that change based on its position in a sentence. Rearrange the words of this sentence–“Stop drinking that this instant; tea is better for you!”–and you get several variations, such as “Better stop drinking that; this instant tea is for you.”

As ProPublica noted, Facebook allowed U.S. Congressman Clay Higgins to threaten “radicalized” Muslims with this post: “Hunt them, identify them, and kill them. Kill them all. For the sake of all that is good and righteous. Kill them all.”

However, Facebook removed this post from Boston poet and Black Lives Matter activist Didi Delgado: “All white people are racist. Start from this reference point, or you’ve already failed.”

Why? Human monitors, trained by the machine to think like one, followed the algorithmic rule that “white people” plus an attack (“racist”) trumped “radicalized” (a subset) Muslims. Everyone seemed to miss “hunt” and “kill them all.”

This illustration depicts how that could have happened.

[Illustration: Facebook Bias]

Interpersonal Divide asks readers to understand technology from a programming rather than a consumer perspective in order to explain “why” things happen in the age of the machine.

This is one small incident, but it points to a larger issue: machines drawing correlations from biased data with flawed computer logic. You can read more about Facebook’s rules in the documents referenced in the ProPublica report.


[1] Jaron Lanier, Who Owns the Future? (New York: Simon & Schuster, 2013), p. 115.

 
