Polite warnings are surprisingly good at reducing hate speech on social media

Researchers from NYU’s Center for Social Media and Politics had an idea: What if you tracked the followers of accounts that were banned for hate speech, and sent those followers (who also used hate speech in their tweets) a warning about their own behavior? Would these users feel pressured into changing what they posted? It turns out the answer is yes, at least for a brief time after receiving the warning. The researchers’ findings were published Monday in the journal Perspectives on Politics.

“One of the tradeoffs that we always face in these public policy conversations about whether to suspend accounts is what’s going to happen to these people on other platforms,” says Joshua Tucker, co-director of NYU’s Center for Social Media and Politics and an author on the paper. “There’s been more recent research showing that after a bunch of right-wing white nationalists in Britain were suspended, there was a big uptick in the amount of activity among these groups on Telegram.”

They wanted to come up with a solution that would hit the “sweet spot” where the accounts wouldn’t necessarily be banned, but would receive some kind of push to stop them from using hate speech, says Mikdat Yildirim, a PhD student at NYU and the first author on this study. This way, the intervention would “not limit their rights to express themselves, and also prevent them from migrating to more radical platforms.” In other words, it was a warning, not a silencing.

The plan? Create a set of six Twitter accounts that operated like virtual vigilante patrollers, finding, announcing, and tagging offenders on their public feeds. The warnings these accounts posted shared a common structure: each tagged the full username of an account that had used hate speech, warned the user that an account they followed had recently been suspended for similar language, and cautioned that they could be next if they kept tweeting that way. Each account worded its warnings slightly differently.
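
As a rough illustration, a warning along these lines might be composed as in the sketch below. The paper’s exact phrasings are not reproduced here, so the template, handles, and helper name are all hypothetical stand-ins rather than the researchers’ actual messages or code.

```python
# Hypothetical sketch of the warning structure described above; the study's
# real wording is not reproduced here, so this template is illustrative only.
WARNING_TEMPLATE = (
    "@{target} The user @{suspended}, who you follow, was recently "
    "suspended, and I suspect this was because of hateful language. "
    "If you continue to use hate speech, you might get suspended too."
)

def compose_warning(target: str, suspended: str) -> str:
    """Fill in the two handles that the warning structure calls for."""
    return WARNING_TEMPLATE.format(target=target, suspended=suspended)

# Example: warn a follower about a suspended account they follow.
print(compose_warning("example_follower", "suspended_account"))
```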

But first, the researchers had to identify the potential offenders who were likely to get suspended. On July 21, 2020, the team downloaded more than 600,000 tweets that had been posted over the previous week and narrowed them down to tweets containing at least one word from the hateful-language dictionaries used in previous research (these centered on racial or sexual hate). They then followed around 55 accounts and managed to gather the follower lists of 27 of them before those accounts were suspended.
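
A minimal sketch of that dictionary-filtering step might look like the following, assuming tweets arrive as simple records with a text field. The placeholder lexicon terms and data format are assumptions for illustration, not the researchers’ actual code or lexicons.

```python
# Hypothetical sketch: keep only tweets containing at least one word from a
# hateful-language lexicon, as the article describes. Terms are stand-ins.
import re

HATE_LEXICON = {"slurword", "hateterm"}  # placeholders for real entries

def contains_hate_term(text: str, lexicon: set[str]) -> bool:
    """True if the tweet text contains at least one lexicon word."""
    words = re.findall(r"[a-z']+", text.lower())
    return any(word in lexicon for word in words)

def filter_tweets(tweets: list[dict], lexicon: set[str]) -> list[dict]:
    """Keep only the tweets containing at least one flagged word."""
    return [t for t in tweets if contains_hate_term(t["text"], lexicon)]

# In the study, the input would be the ~600,000 tweets from the prior week.
sample = [{"id": 1, "text": "contains slurword"}, {"id": 2, "text": "benign"}]
print(filter_tweets(sample, HATE_LEXICON))  # keeps only tweet 1
```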

“We didn’t send these messages to all of their followers; we sent these messages to only those followers who employed hate speech in more than 3 percent of their tweets,” Yildirim explains. That filter left a total of around 4,400 users who became part of the study. Of these, 700 were placed in a control group that received no warnings at all, and the other 3,700 were warned by one of the six researcher-run Twitter accounts.
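
The selection and group split could be sketched as follows. Only the 3 percent threshold and the 700-user control group come from the article; the data shapes, helper function, and random seed are assumptions for illustration.

```python
# Hypothetical sketch of follower selection and control/treatment split.
import random
import re

def hate_ratio(tweets: list[str], lexicon: set[str]) -> float:
    """Fraction of a user's tweets containing at least one lexicon word."""
    if not tweets:
        return 0.0
    hits = sum(
        any(w in lexicon for w in re.findall(r"[a-z']+", t.lower()))
        for t in tweets
    )
    return hits / len(tweets)

def assign_groups(followers: dict[str, list[str]], lexicon: set[str],
                  threshold: float = 0.03, n_control: int = 700,
                  seed: int = 0) -> tuple[list[str], list[str]]:
    """Keep followers above the hate-speech threshold, then split them at
    random into a control group and a group to be warned."""
    eligible = [user for user, tweets in followers.items()
                if hate_ratio(tweets, lexicon) > threshold]
    rng = random.Random(seed)
    rng.shuffle(eligible)
    return eligible[:n_control], eligible[n_control:]  # control, treated
```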

Source: https://www.popsci.com/technology/nyu-researches-hate-speech-warnings-twitter/
