2021
Proceedings of the National Academy of Sciences

Despite heightened awareness of the detrimental impact that hate speech on social media platforms has on affected communities and public discourse, there is little consensus on approaches to mitigate it. While content moderation, whether by governments or social media companies, can curb online hostility, such policies may suppress valuable as well as illicit speech and might disperse rather than reduce hate speech. As an alternative strategy, an increasing number of international and nongovernmental organizations (I/NGOs) are employing counterspeech to confront and reduce online hate speech. Despite the growing popularity of counterspeech strategies, there is scant publicly available experimental evidence on their effectiveness and design. Modeling our interventions on current I/NGO practice, we randomly assign English-speaking Twitter users who have sent messages containing xenophobic (or racist) hate speech to one of three counterspeech strategies (empathy, warning of consequences, or humor) or to a control group. Our intention-to-treat analysis of 1,350 Twitter users shows that empathy-based counterspeech messages can increase the retrospective deletion of xenophobic hate speech by 0.2 SD and reduce the prospective creation of xenophobic hate speech over a 4-week follow-up period by 0.1 SD. We find, however, no consistent effects for strategies using humor or warning of consequences. Together, these results advance our understanding of the central role of empathy in reducing exclusionary behavior and inform the design of future counterspeech interventions.
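
To make the design concrete, the following is a minimal sketch of random assignment to four arms and an intention-to-treat (ITT) effect expressed in SD units. It is illustrative only and not the authors' analysis code: the function names `assign_arms` and `itt_effect_in_sd`, the equal arm sizes, and the choice to scale by the control-group SD are all assumptions, since the abstract does not state how the arms were balanced or how outcomes were standardized.

```python
import random
import statistics

# Arm labels follow the abstract; equal-sized arms are an assumption.
ARMS = ["control", "empathy", "warning", "humor"]


def assign_arms(user_ids, seed=0):
    """Shuffle users, then deal them round-robin into the four arms."""
    rng = random.Random(seed)
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    return {uid: ARMS[i % len(ARMS)] for i, uid in enumerate(shuffled)}


def itt_effect_in_sd(assignment, outcomes, arm):
    """Mean outcome in `arm` minus mean outcome in control, divided by the
    control-group sample SD (an assumed standardization).

    ITT means every user is analyzed under the arm they were assigned to,
    whether or not they actually saw or read the counterspeech message.
    """
    treated = [outcomes[u] for u, a in assignment.items() if a == arm]
    control = [outcomes[u] for u, a in assignment.items() if a == "control"]
    difference = statistics.mean(treated) - statistics.mean(control)
    return difference / statistics.stdev(control)


# Assign 1,350 hypothetical user IDs (the sample size reported in the abstract).
assignment = assign_arms(range(1350))
```

Given a dictionary `outcomes` mapping each user ID to a measured outcome (for example, the number of hate tweets deleted), `itt_effect_in_sd(assignment, outcomes, "empathy")` would return the empathy effect relative to control in control-group SD units, which is the scale on which the 0.2 SD and 0.1 SD figures are reported.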