Critical Review: Automatic Detection of Hate Speech

Abstract:

This research emphasizes that among the significant issues on social media are abusive and harassing text messages, including controversial topics, swearing, abusive language, and taboo words that are harmful to human beings. Social media is an open platform where people can post their thoughts as text messages without knowing what effect they will have on others' minds and behavior. A vast majority of people who regularly engage with social media platforms will have encountered a harasser; even the biggest enthusiasts concede that harassing behavior is a widespread phenomenon. In this research paper, the researcher used text mining and machine learning algorithms to detect and identify harassing behavior and abusive text messages. The researcher also focused on an automated process of harassment classification that can take supervised action against harassers, and discussed some of the significant research issues and challenges in hate speech: how to identify abusive text and how to detect the harassing behavior of people who use social media.

Keywords: NLP, Machine Learning, Social Media

I. Introduction

Social media has made it simple for us to communicate quickly and effectively with family, friends, and colleagues, as well as to share experiences and tell others of our feelings and beliefs. These opinions and beliefs may concern world events or local issues, politics or religion, interests, affiliations, associations, products, people, and a wide variety of other subjects. Our discussions and comments can be closely targeted or broadcast so widely that, depending on the subject [1], they can go viral. Unfortunately, social media is also widely used by abusers, for exactly the reasons listed above. Many perpetrators 'hide' behind the fact that they may not be readily identified, saying things that they would not consider saying face to face, which could be viewed as cowardly. Online abuse takes several forms, and victims are not restricted to public figures. They can hold any job, be of any age, gender, sexual orientation, or social or ethnic background, and live anywhere [2].

II. Literature Review

Cyberbullying can happen online only, or as part of more general harassment. Cyberbullies may be people known to you or strangers. Like all bullies, they frequently try to induce others to participate. You could be harassed for your religious or political beliefs, race or skin colour, or body image, because you have a mental or physical disability, or for no clear reason at all [3].

Cyberbullying mostly involves sending threatening or otherwise hurtful messages or other communications to people via social media, gaming sites, text, or email; posting humiliating or embarrassing videos on hosting sites such as YouTube or Vimeo; or harassing through repeated texts, instant messages, or chats. Increasingly, it is carried out by posting or sending pictures, videos, or private details obtained via sexting, without the victim's permission. Some cyberbullies set up Facebook pages and other social media accounts purely to bully others [4] [5].

The effects of cyberbullying range from annoyance and mild distress to, in the most extreme cases, self-harm and suicide. This can be a reality for vulnerable people, or indeed anyone made to feel vulnerable through cyberbullying or other personal circumstances [5].

Chikashi Nobata et al. (2016) emphasized that the detection of abusive language in user-generated online content has become an issue of increasing importance in recent years. Most current commercial methods use blacklists and regular expressions; however, these measures fall short when contending with more subtle, less ham-fisted examples of hate speech. In this work, the authors developed a machine-learning-based method to detect hate speech in online user comments from two domains, which outperforms a state-of-the-art deep learning approach. They also developed a corpus of user comments annotated for abusive language, the first of its kind. Finally, they used their detection tool to analyze abusive language over time and in different settings, to further enhance our knowledge of this behavior [1].

Hossein Hosseini et al. (2017) focused on social media platforms providing an environment where people can freely engage in discussions. Unfortunately, these platforms also enable several problems, such as online harassment. Recently, Google and Jigsaw started a project called Perspective, which uses machine learning to automatically detect toxic language. A demonstration website has also been launched, which allows anyone to type a phrase in the interface and instantly see its toxicity score [2].

In this paper, the researchers proposed an attack on the Perspective toxicity detection system based on adversarial examples. They demonstrate that an adversary can subtly modify a highly toxic phrase so that the system assigns it a significantly lower toxicity score. They apply the attack to the sample phrases given on the Perspective website and show that the toxicity scores can consistently be reduced to the level of non-toxic phrases. The existence of such adversarial examples is very harmful for toxicity detection systems and seriously undermines their usability [2].

B. Sri Nandhini and J. I. Sheeba (2015) stated that social networking sites (SNS) have been growing rapidly in recent years, providing a platform to connect people all over the world and share their interests. However, social networking sites also provide opportunities for cyberbullying activities. Cyberbullying is harassing or insulting a person by sending messages of a hurtful or threatening nature using electronic communication. Cyberbullying poses a significant threat to the physical and mental health of its victims. Detection of cyberbullying and the provision of subsequent preventive measures are the main courses of action to combat it. The proposed technique is an effective method to detect cyberbullying activities on social media: the detection method can identify the presence of cyberbullying terms and classify cyberbullying activities in social networks, such as flaming, harassment, racism, and terrorism, using fuzzy logic and genetic algorithms [3].

Divya Bansal and Sanjeev Sofat (2016) stressed that social spam is a huge and complicated problem plaguing social networking sites in several ways. This includes posts, reviews, or blogs containing product promotions and contests, adult content, and general spam. It has been found that social media sites such as Twitter also act as distributors of pornographic content, even though this is against their own stated policy. In this paper, the authors reviewed the case of Twitter and found that spammers contributing pornographic content follow legitimate Twitter users and send URLs that link users to pornographic sites. Social analysis of such spammers was conducted using graph-based as well as content-based information, obtained using simple text operators, to study their characteristics. In the study, around 74,000 tweets containing explicit adult content posted by around 18,000 users were collected and analyzed. The analysis demonstrates that the users posting explicit content fulfill the characteristics of spammers as stated by the rules and guidelines of Twitter. It has been observed that the illegitimate use of social media for spreading social spam has been growing at a fast pace, with the network companies turning a blind eye to this growing problem. Clearly, there is an enormous need to build an effective solution for removing questionable and defamatory content from social networking sites, to promote and protect public integrity and the welfare of children and adults. It is also essential to improve the experience of genuine users of social media and shield them from damage to their public identity on the World Wide Web. Further in this paper, classification of pornographic spammers and legitimate users was performed using machine learning techniques. Experimental results demonstrate that a Random Forest classifier can predict explicit spammers with a reasonably high precision of 91.96%. To the best of the authors' knowledge, this is the first attempt to investigate and classify the behavior of pornographic users on Twitter as spammers; so far, work has been done on identifying spammers in general, but not specifically pornographic spammers [4].

Karthik Dinakar et al. (2012) underscored that cyberbullying (harassment on social networks) is widely recognized as a serious social problem, especially for adolescents. It is as much a threat to the viability of online social networks for youth today as spam used to be to email in the early days of the Internet. Current work to tackle this problem has involved social and psychological studies on its prevalence as well as its negative effects on adolescents. While real solutions rest on teaching youth to have healthy personal relationships, few have considered innovative design of social network software as a tool for mitigating this problem. Mitigating cyberbullying involves two key components: robust techniques for effective detection, and reflective user interfaces that encourage users to think about their behavior and their choices [5][4].

Spam filters have been successful by applying statistical approaches such as Bayesian networks and hidden Markov models. They can, like Google's Gmail, aggregate human spam judgments, because spam is sent nearly identically to many people. Bullying is more personalized, varied, and contextual. In this work, the authors present an approach for bullying detection based on state-of-the-art natural language processing and a common-sense knowledge base, which permits recognition over a broad spectrum of topics in everyday life. They analyze a narrower range of particular subject matter associated with bullying (e.g., appearance, intelligence, racial and ethnic slurs, social acceptance, and rejection), and construct BullySpace, a common-sense knowledge base that encodes particular knowledge about bullying situations. They then perform joint reasoning with common-sense knowledge about a wide range of everyday life topics, analyzing messages using their novel AnalogySpace common-sense reasoning technique. They also consider social network analysis and other factors. They evaluate the model on real-world instances that have been reported by users on Formspring, a social networking site that is popular with teenagers. On the intervention side, they explore a set of reflective user-interaction paradigms with the goal of promoting empathy among social network participants. They propose an 'air traffic control'-like dashboard, which alerts moderators to large-scale outbreaks that appear to be escalating or spreading and helps them prioritize the current deluge of user complaints. For potential victims, they provide educational material that informs them about how to cope with the situation and connects them with emotional support from others. A user evaluation shows that in-context, targeted, and dynamic help during cyberbullying situations fosters end-user reflection that promotes better coping strategies [5].

Paula Fortuna, Sérgio Nunes (2018) emphasized that the scientific study of hate speech, from a computer science point of view, is recent. This survey organizes and describes the current state of the field, providing a structured overview of previous approaches, including core algorithms, methods, and main features used. This work also discusses the complexity of the concept of hate speech, defined in many platforms and contexts, and provides a unifying definition. This area has an unquestionable potential for societal impact, particularly in online communities and digital media platforms. The development and systematization of shared resources, such as guidelines, annotated datasets in multiple languages, and algorithms, is a crucial step in advancing the automatic detection of hate speech. [6]

Anna Schmidt and Michael Wiegand (2017) examined the term hate speech. The researchers decided in favour of using this term since it can be considered a broad umbrella term for the numerous kinds of insulting user-created content addressed in the individual works they summarize. Hate speech is also the most frequently used expression for this phenomenon, and is even a legal term in several countries. They also list other terms used in the NLP community, which should help readers find further literature on the task. Hate speech is commonly defined as any communication that disparages a person or a group on the basis of some characteristic such as race, color, ethnicity, gender, sexual orientation, nationality, religion, or other characteristics (Nockleby, 2000) [7].

III. Methodology

Text Mining Approaches in Automatic Hate Speech Detection. In this section, the researcher describes algorithms for hate speech detection, together with other studies focusing on related concepts (e.g., cyberbullying). Finding the right features for a classification problem can be one of the more demanding tasks in machine learning, so this section is devoted to the features already used by other authors. We divide the features into two categories: general features used in text mining, which are common to other text mining fields, and specific hate speech detection features, which we found in hate speech detection papers and which are intrinsically related to the characteristics of this problem. We present our analysis below.

  1. General Features Used in Text Mining. The majority of the papers we found try to adapt strategies already known in text mining to the specific problem of automatic detection of hate speech. We define general features as the features commonly used in text mining. We start with the most simplistic approaches, which use dictionaries and lexicons.
  2. Dictionaries. One strategy in text mining is the use of dictionaries. This approach consists of compiling a list of words (the dictionary) that are searched for and counted in the text. These frequencies can be used directly as features or to compute scores.
  3. In the case of hate speech detection, this has been done using content words (such as insults and swear words, reaction words, and personal pronouns) collected from www.noswearing.com;
  4. the number of profane words in the text, with a dictionary consisting of 414 words, including acronyms and abbreviations, the majority of which are adjectives and nouns;
  5. label-specific features, consisting of frequently used forms of verbal abuse as well as widely used stereotypical utterances; and
  6. the Ortony lexicon, used for negative affect detection. The Ortony lexicon contains a list of words denoting a negative connotation and can be useful because not every rude comment necessarily contains profanity, yet it can be equally harmful.

This methodology can be used with an additional normalization step, dividing counts by the total number of words in each comment (a minimal sketch follows below). It is also possible to combine this kind of approach with regular expressions, rule-based approaches, sentiment analysis, and deep learning. Among the features specific to hate speech detection, we found mainly othering language, assertions of in-group superiority, and a focus on stereotypes. We also observed that the majority of the studies consider only generic features and do not use features particular to hate speech. This can be problematic, because hate speech is a complex social phenomenon in constant evolution, carried by language nuances. Finally, we identified challenges and opportunities in this field, namely the scarcity of open-source code and platforms that automatically classify hate speech, the lack of comparative studies evaluating the existing approaches, and the absence of studies in languages other than English.
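As a minimal illustration of the dictionary approach described above, the following Python sketch counts dictionary hits and applies the normalization step; the tiny word list is purely illustrative, standing in for a curated lexicon such as the noswearing.com list:

    import re

    # Illustrative dictionary; a real system would load a curated lexicon.
    PROFANITY = {"idiot", "stupid", "moron"}

    def dictionary_features(comment):
        """Count dictionary hits and normalize by the comment's word count."""
        tokens = re.findall(r"[a-z']+", comment.lower())
        hits = sum(1 for t in tokens if t in PROFANITY)
        return {
            "profanity_count": hits,
            # Normalization step: divide by the total number of words.
            "profanity_ratio": hits / len(tokens) if tokens else 0.0,
        }

    print(dictionary_features("You are a stupid idiot"))
    # {'profanity_count': 2, 'profanity_ratio': 0.4}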

IV. Cases of hate speech

Hate speech has become a prominent topic in recent years. This is reflected not only in increased media coverage of the problem but also in growing political attention. There are several reasons to focus on automatic hate speech detection, which we discuss in the following list: • European Union Commission directives. In recent years, the European Union Commission has conducted various initiatives to decrease hate speech, and several programs are being funded to fight it.

European regulators have also accused Twitter of not doing enough to remove hate speech from its platform.

  1. Lack of automatic techniques. Automated techniques aim to programmatically classify text as hate speech, making its detection easier and faster for those responsible for protecting the public [9, 65]. Such techniques can give a response in less than 24 hours, as required by the directives mentioned in the previous point. Some studies have been conducted on the automatic detection of hate speech, but the tools provided are scarce.
  2. Lack of data about hate speech. There is a general lack of systematic monitoring, documentation, and data collection on hate and violence, namely against LGBTI (lesbian, gay, bisexual, transgender, and intersex) people.
  3. Nevertheless, detecting hate speech is a very important task, because it is connected with actual hate crimes, and automatic hate speech detection in text can also provide data about this phenomenon.
  4. Hate speech removal. Some companies and platforms may be interested in hate speech detection and removal. For instance, online media publishers and online platforms in general need to attract advertisers and therefore cannot risk becoming known as platforms for hate speech.

V. Issues and challenges

Hate speech is a complex phenomenon, and its detection is problematic. Some challenges and difficulties highlighted by the authors of the surveyed papers are listed below:

  1. Low agreement in hate speech classification by humans, indicating that this classification would be harder for machines.
  2. The task requires expertise about culture and social structure.
  3. The evolution of social phenomena and language makes it difficult to track all racial and minority insults.
  4. Language evolves quickly, in particular among young populations that communicate frequently in social networks.
  5. Despite the offensive nature of hate speech, abusive language may be very fluent and grammatically correct, and it can cross sentence boundaries.

VI. Discussion and conclusion

This research article is based on a critical overview of how the automatic detection of hate speech in text has evolved over the past years. First, we analyzed the concept of hate speech in different contexts, from social network platforms to other organizations. Based on our analysis, we proposed a unified and clearer definition of this concept that can help in building a model for the automatic detection of hate speech. Additionally, we presented examples and classification rules found in the literature, together with the arguments for or against those rules. Our critical view points out that we adopt a more inclusive and general definition of hate speech than other perspectives found in the literature, because we propose that subtle forms of discrimination on the internet and in online social networks should also be spotted. With our analysis, we also concluded that it is important to compare hate speech with cyberbullying, abusive language, discrimination, toxicity, flaming, extremism, and radicalization; the lack of common definitions makes it more difficult to compare results from different studies. Nevertheless, we found three available datasets, in English and German. Additionally, we compared the diverse studies using algorithms for hate speech detection and ranked them in terms of performance, with the goal of reaching conclusions about which approaches are most successful. However, in part due to the lack of standard datasets, we find that no particular approach proves to reach better results across the several articles.

In this paper, the researcher presented a critical review of the automatic detection of hate speech. This task is usually framed as a supervised learning problem. Fairly generic features, such as bag of words or embeddings, systematically yield reasonable classification performance, and character-level approaches work better than token-level approaches (a short illustration follows below). Lexical resources, such as lists of slurs, may help classification, but usually only in combination with other types of features. Various complex features using more linguistic knowledge, such as dependency parse information, or features modelling specific linguistic constructs, such as imperatives or politeness, have also been shown to be effective. Information derived from text may not be the only cue suggesting the presence of hate speech; it may be complemented by meta-information or information from other modalities (e.g., images attached to messages). Making judgments about the general effectiveness of many of the complex features is difficult, however, since they have often been evaluated on different datasets.
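As an illustration of the claim that character-level features often outperform token-level ones, the following sketch contrasts the two feature types using scikit-learn; the four-comment corpus and its labels are invented for demonstration only:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    texts = ["you are all vermin", "have a great day",
             "go back home scum", "lovely weather today"]
    labels = [1, 0, 1, 0]  # 1 = hateful, 0 = benign (toy labels)

    # Token-level unigrams/bigrams versus character n-grams within word boundaries.
    word_clf = make_pipeline(
        TfidfVectorizer(analyzer="word", ngram_range=(1, 2)), LogisticRegression())
    char_clf = make_pipeline(
        TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5)), LogisticRegression())

    for clf in (word_clf, char_clf):
        clf.fit(texts, labels)
        # Character n-grams are more robust to unseen spellings such as "scumm".
        print(clf.predict(["you vermin scumm"]))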

References

  1. Chikashi Nobata, Joel Tetreault, Achint Thomas, Yashar Mehdad, Yi Chang (2016). Abusive Language Detection in Online User Content. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland. ISBN: 978-1-4503-4143-1, doi:10.1145/2872427.2883062.
  2. Hossein Hosseini, Sreeram Kannan, Baosen Zhang, Radha Poovendran (2017). Deceiving Google's Perspective API Built for Detecting Toxic Comments. arXiv preprint, 27 February 2017.
  3. B. Sri Nandhini, J. I. Sheeba (2015). Online Social Network Bullying Detection Using Intelligence Techniques. Procedia Computer Science, Volume 45, 2015, Pages 485-492, Elsevier. https://doi.org/10.1016/j.procs.2015.03.085
  4. Divya Bansal, Sanjeev Sofat (2016). Behavioural analysis and classification of spammers distributing pornographic content in social media. Social Network Analysis and Mining, 24 June 2016, Springer.
  5. Karthik Dinakar, Birago Jones, Catherine Havasi, Henry Lieberman, Rosalind Picard (2012). Common Sense Reasoning for Detection, Prevention, and Mitigation of Cyberbullying. ACM Transactions on Interactive Intelligent Systems (TiiS), Special Issue on Common Sense for Interactive Systems, Volume 2, Issue 3, September 2012.
  6. Paula Fortuna, Sérgio Nunes (2018). A Survey on Automatic Detection of Hate Speech in Text. ACM Computing Surveys, Vol. 51, No. 4, Article 85, July 2018.
  7. Anna Schmidt, Michael Wiegand (2017). A Survey on Hate Speech Detection using Natural Language Processing. In Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media, pages 1-10, Valencia, Spain, April 3-7, 2017.

Discursive Essay: Is the Current Law of Hate Speech Sufficient in the UK

In the UK, is the current law on hate speech sufficient? I will argue that it is, and that further restrictions would intrude on freedom of speech; the current hate speech laws therefore do not need amendment.

The current law on hate speech in the UK is the Public Order Act 1986, Section 18 of Part 3 of which states that: 'A person who uses threatening, abusive or insulting words or behaviour, or displays any written material which is threatening, abusive or insulting, is guilty of an offence if (a) he intends thereby to stir up racial hatred, or (b) having regard to all the circumstances racial hatred is likely to be stirred up thereby.' Offences carry a maximum of seven years in prison, a fine, or both.

Further, the Criminal Justice and Public Order Act amended the POA 1986 by adding Section 4A, which states that: '(1) A person is guilty of an offence if, with intent to cause a person harassment, alarm or distress, he (a) uses threatening, abusive or insulting words or behaviour, or disorderly behaviour, or (b) displays any writing, sign or other visible representation which is threatening, abusive or insulting, thereby causing that or another person harassment, alarm or distress. (5) A person guilty of an offence under this section is liable on summary conviction to imprisonment for a term not exceeding six months or to a fine not exceeding level 5 on the standard scale or to both.'

In 2012, a campaign was started by the Christian Institute and the National Secular Society to remove the word 'insulting' from Section 5. The campaigners protested, asking whether the police and courts really need to deal with insults. There were many victims of the provision: for example, a 16-year-old protester held a placard saying 'Scientology is not a religion, it is a dangerous cult'; this was reported, and the Crown Prosecution Service said it was 'abusive and insulting', though the court later decided to drop the charges. This relates directly to hate speech, as defined in POA 1986 Section 4A(b). It really should not be in the hands of the police and courts to decide whether such words are insulting unless and until they provoke violence or threaten society. Another example supporting removal of these words is that of an Oxford student arrested under Section 5 for saying to a policeman: 'Excuse me. Do you realise your horse is gay?' In a democratic society you should have every right to call a horse gay. This shows that the police or the courts would otherwise be deciding whether you or someone else might feel insulted. Finally, on 1 February 2014, the change came into force, and mere insult was no longer illegal under Section 5. The current law on hate speech is sufficient because the prohibition of hate speech is often anathematised by advocates of free speech. In the USA, there is no specific law against hate speech, because the government holds that such a law would intrude on the first and foremost right of freedom of speech.

By comparison, many European countries do have a specific law against hate speech. I strongly agree that there should be a hate speech law, because hate speech not only promotes violence but also plays a huge role in increasing the crime rate, as the crime rates of countries like Jamaica suggest. Those who do not abide by the law are punished with imprisonment, a fine, or both. In 2017, 19-year-old Chelsea Russell was arrested because she quoted the line 'Kill a snitch Nigga, rob a rich Nigga' on her Instagram page. She did so to pay tribute to 13-year-old Frankie Murphy, who was killed in a car accident. Hate crime investigators charged Russell with 'sending a grossly offensive message by means of a public electronic communication network'. In 2018, a district judge found Russell guilty and imposed a fine of 385 pounds and a curfew. In 2019, Russell's conviction was overturned on appeal. In her defence, she argued that she used the N-word just as she uses 'mate', and that nobody she knows considers the N-word abusive or insulting. The UK has many laws on hate speech. Hate speech is a difficult term to understand because it encompasses many terms and is vague; if any change is needed, it is to improve the law's precision. Companies with hate speech policies include Facebook and YouTube.

Several activists and scholars have criticised the practice of limiting hate speech. Civil liberties activist Nadine Strossen says that while efforts to censor hate speech have the goal of protecting the most vulnerable, they are ineffective and may have the opposite effect: disadvantaged and ethnic minorities end up being charged with violating laws against hate speech. Kim Holmes, Vice President of the conservative Heritage Foundation and a critic of hate speech theory, has argued that it 'assumes bad faith on the part of people regardless of their stated intentions' and that it 'obliterates the ethical responsibility of the individual'. Rebecca Ruth Gould, a professor of Islamic and Comparative Literature at the University of Birmingham, argues that laws against hate speech constitute viewpoint discrimination, as the legal system punishes some viewpoints but not others; however, other scholars such as Gideon Elford argue instead that 'in so far as hate speech regulation targets the consequences of speech that are contingently connected with the substance of what is expressed then it is viewpoint discriminatory in only an indirect sense'. Hate speech breeds not only division but also bigotry, and it damages the people who are discriminated against. Just because a person is different and falls outside 'society's standards', that person must not be discriminated against, humiliated, or required to listen to hate speech from others simply because the others can say anything they want. Therefore, people who practise hate speech should be punished by law. I strongly believe that instead of constantly amending the laws on hate speech, the government should educate people and make them aware of the consequences of hate speech, since it is directly related to rising crime rates and to constant abuse and threats.

There has been much debate over freedom of speech, hate speech, and hate speech legislation. As mentioned, a majority of developed democracies have hate speech laws; countries such as Sweden, Norway, Denmark, France, and Germany follow them, as does the UK. The USA, on the other hand, has no specific law against hate speech. Reform Section 5 campaign director Simon Calvert told LBC Radio: 'Most people are amazed when you tell them the British law outlaws insults, because we all recognize insult is such a vague and subjective term.' I disagree with this statement, because I think insult is not a big issue; there are graver issues, such as murder and suicide, to be considered, and an insult can be handled by oneself, while if it leads to a physical fight, the police should intervene and help. Hate speech can sometimes lead to dreadful violence, such as physical fights and attacks. However, there is no internationally agreed definition of hate speech. An often-used definition is the one outlined in the Council of Europe's Committee of Ministers' Recommendation 97(20): 'the term "hate speech" shall be understood as covering all forms of expression which spread, incite, promote or justify racial hatred, xenophobia, anti-Semitism or other forms of hatred based on intolerance, including: intolerance expressed by aggressive nationalism and ethnocentrism, discrimination and hostility against minorities, migrants and people of immigrant origin.'

At the extreme, denial of the Holocaust and the promotion of Nazi ideology are illegal in some countries, and people have been prosecuted simply for engaging in online hate speech. I think what we need is to implement these laws firmly and apply them strictly, so that the right person lands in prison or is fined for violating the law. In conclusion, the current law on hate speech is sufficient, and further restriction would intrude on freedom of speech. The campaigners against Section 5 asked whether the police and courts really needed to deal with insults, which shows that the police or the courts would otherwise be deciding whether you or someone else might feel insulted. The current law is also sufficient because the prohibition of hate speech is often anathematised by advocates of free speech, as the 2017 Chelsea Russell case illustrates. Hate speech is a difficult term to understand because it encompasses many terms and is vague; if any change is needed, it is to improve the law's precision. I strongly believe that instead of constantly amending the laws on hate speech, the government should educate people and make them aware of the consequences of hate speech, since it is directly related to rising crime rates and to constant abuse and threats. However, there is no internationally agreed definition of hate speech.

In the UK, expressions of hatred toward someone on account of that person's colour, race, disability, nationality (including citizenship), ethnic or national origin, religion, gender identity, or sexual orientation are forbidden, as is any communication which is threatening or abusive and is intended to harass, alarm, or distress someone. The penalties for hate speech include fines, imprisonment, or both. The police and the CPS have formulated a definition of hate crimes and hate incidents, with hate speech forming a subset of these; a hate incident becomes a hate crime if it crosses the boundary of criminality.

The Scottish government has held an independent review of hate crime laws, which it intends to use as the basis for a wider consultation on new legislation. In England and Wales and in Scotland, the Public Order Act 1986 prohibits, by its Part 3, expressions of racial hatred, defined as hatred against a group of persons by reason of the group's colour, race, nationality (including citizenship) or ethnic or national origins; Section 18 of the Act, quoted above, sets out the offence. The Criminal Justice and Public Order Act 1994 inserted Section 4A into the Public Order Act 1986; the Racial and Religious Hatred Act 2006 added Part 3A; and the Criminal Justice and Immigration Act 2008 amended Part 3A. The campaign to reform Section 5 was backed by a number of high-profile activists, including comedian Rowan Atkinson and former Shadow Home Secretary David Davis.

Free Speech Versus Hate Speech on Social Media

Hate speech is often demeaning, highly critical, and offensive. Whenever hate speech becomes clear intimidation and threat against particular citizens, legal action needs to be taken. In addition, any form of malicious and persistent harassment focused on an individual is hate speech and should be prosecuted under the law. Individuals who send threatening messages over the internet to another individual, or who post public messages on a website showing intent to commit an act of violence, need to be prosecuted (Pohjonen, 2017).

On the other hand, free speech is not a right to speak anonymously. Social media platforms such as Twitter and Facebook have become key avenues for users to exercise their free speech rights. Some legislators and commentators have raised concerns about the digital public forums offered by these social media platforms, while others have argued that these platforms have unfairly restricted and banned access to valuable speech. Violation of free speech is evident, especially when some sites' decisions create significant barriers (Davidson, 2017).

Hate speech versus free speech on social media can become controversial, especially when speech becomes a real threat. Social media companies aim to create safe environments for their users while also upholding the value of free speech, yet very little seems to be done by these companies to keep hate speech off their platforms (Mathew, 2019).

Moreover, the internet is revolutionizing the means through which people share information and communicate with one another. Providing such platforms for people to communicate has also paved a path for different kinds of speech. For instance, misogynists, racists, terrorists, and xenophobes have used the internet to communicate noxious opinions aimed at harassing other people. Some platforms have prompted violence against lesbians and gays, while others rally against Islam, Christianity, and other religions. Considering cyberbullying and terrorists who use the internet to recruit others, hate speech presents a great challenge (Brown, 2018).

Consequently, this battleground raises unique concerns about the future of hate speech and free speech. The Columbia Law Review has described free speech as a triangle because it involves three kinds of actors: internet infrastructure companies, nation-states, and the variety of individual speakers (Guiora, 2017).

Subsequently, social media has turned into a platform where people can express their views irrespective of how abrasive or favorable they are. On the one hand, people are given the freedom to express their hatred toward a particular individual or group of people; however, if the hatred is directed at a specific individual, especially in violent terms, it becomes hate speech, because such speech involves threats concerning a plan or an action (Olteanu, 2017).

Ultimately, social media platforms should be an avenue where free speech is protected and cherished, but if speech goes too far, a problem arises. At that point, the government needs to step in, as it is obligated to protect its citizens (Schieb, 2016).

Consequently, regulating free and hate speech on social media has turned out to be futile and dangerous. A majority of people believe that companies should be regulated. Despite such claims, companies such as Google have political biases that affect their operations. Whereas some computer programmers may create algorithms that are discriminatory in nature, ultimately the curation and collection of social preferences will turn into adaptive algorithms linked to societal biases (ElSherief, 2018).

In this regard, the impact of hate speech versus free speech is significant. It calls for governments to lay down restrictions on social media and technology companies. Although these have proved controversial, with opponents arguing that they undermine freedom of speech on social media platforms, restrictions should be applied to both modern-day social media and traditional media, because this provides a fair balance with radio and TV channels. This reduces the risk of violating free speech, provided the requirements on social media platforms are met (Gençoğlu Onbaşi, 2015).

In Ethics in Information Technology, George Reynolds presents a case involving Finkel and her classmates. The extent to which students have the right to exercise free speech through social media platforms, and the restrictions put in place to curb that freedom, were among the key questions the court had to answer. Such arguments have attracted worldwide attention because of the consequences linked to the rulings in such cases, which carry serious implications both for the persons in question and for the social media sites (Mondal, 2017).

Furthermore, most governments have reacted to the effects caused by both free speech and hate speech. In 2005, a student in Florida committed suicide after she was teased and bullied online at her school for three years. In response, in 2008, the state laid down tough laws that prohibited social exclusion, teasing, threatening, stalking, intimidation, religious harassment, public humiliation, and racial harassment (Khiabany, 2015).

Next, there exists a close link between free speech and hate speech. In most countries, free speech protections do not extend to hate speech. For instance, the promotion of Nazi ideology is a serious crime in Germany, and denying the Holocaust is illegal in most European countries. Hence, government authorities in Canada, Britain, Germany, and France have charged people over crimes involving free and hate speech on social media (Sultana, 2018).

In America, by contrast, speech that is merely demeaning and annoying enjoys clear protection under the First Amendment. Legal recourse can only take place when hate speech becomes a clear threat and starts to intimidate specific citizens (Alkiviadou, 2019).

Consequently, anonymous expression covers the opinions of people who do not reveal their identity. Being able to speak freely on social media without fear of reprisal is important in any democratic society. Anonymous expression is crucial in nations that do not permit free speech; however, in the wrong hands, it can be used to commit unethical and illegal activities (Silva, 2016).

Firstly, the major difference between free speech and hate speech is that free speech is a concept one cannot be legally prosecuted for. People cannot be stopped from saying things that are racist or prejudiced in nature. Hate speech, by contrast, is not just an opinion or a view; it goes beyond that. It is characterized as words or views that could plausibly cause harm to an individual (Silva, 2016).

Clearly, an individual could tell a fellow person that they do not like them based on their ethnic background and cultural heritage. That is his or her right, and he or she cannot be arrested for speaking such words. However, if the individual goes on to threaten the other person, saying they will kill them because of their ethnic background and heritage, that goes beyond free speech. It becomes an actual threat, which carries serious penalties (Sap, 2019).

In practice, there can be little distance between hate speech and free speech. Whenever language or a person's ideas become too cynical and hurtful to other people, free speech becomes hate speech. This ultimately breaks down a society rather than helping people grow and evolve. The differences between free speech and hate speech are distinct yet completely intertwined: the amendments protect the individual only while he or she engages in free speech, but when hate speech is used, the speaker is answerable (Tarn, 2019).

Most importantly, hate speech and free speech are encountered by almost everyone. Although the line between them sometimes becomes fuzzy, it is important to create awareness of the social standards in each environment regarding the kinds of messages that are disrespectful to others. Hate speech is often derived from larger societal issues facing people, and it plays a major role in segregating individuals who appear to be different from each other (Ponterotto, 2017).

In addition, hate speech and free speech appear to be more influential on students on campus than in other environments. In particular, hate speech is seen to negatively influence the social progress of individuals, thereby excluding the idea of a world that accommodates all kinds of people. While students are figuring out and learning who they are and what they may want to become, the presence of demeaning and hateful speech can hurt, confuse, and stunt their education and self-growth (Flickinger, 2018).

In conclusion, freedom of speech is one thing, but hate speech is another. Hate speech should be taken seriously by society, not just dismissed as something that at its worst only hurts the feelings of certain individuals; with it comes the threat of violence, and more and more people exposed to hate speech end up committing suicide. This can only be described as a dehumanizing effect on the lives of individuals. Alongside that, there is also room for individuals to take part in free speech, which should be done with moderation. Ultimately, we need to celebrate one another, not tear each other down (Brumpton, 2019).

Bibliography

  1. Alkiviadou, N. (2019). Hate speech on social media networks: towards a regulatory framework? Information & Communications Technology Law, 28(1).
  2. Brown, A. (2018). What is so special about online (as compared to offline) hate speech? Ethnicities, 18(3).
  3. Brumpton, B., et al. (2019). Within-family studies for Mendelian randomization: avoiding dynastic, assortative mating, and population stratification biases. bioRxiv, p. 602516.
  4. Davidson, T., Warmsley, D., Macy, M., and Weber, I. (2017). Automated hate speech detection and the problem of offensive language. In Proceedings of the Eleventh International AAAI Conference on Web and Social Media.
  5. ElSherief, M., Kulkarni, V., Nguyen, D., Wang, W. Y., and Belding, E. (2018). Hate lingo: A target-based linguistic analysis of hate speech in social media. In Proceedings of the Twelfth International AAAI Conference on Web and Social Media.
  6. Flickinger, T., et al. (2018). Addressing stigma through a virtual community for people living with HIV: A mixed methods study of the PositiveLinks mobile health intervention. AIDS and Behavior, 22(10).
  7. Gençoğlu Onbaşi, F. (2015). Social media and the Kurdish issue in Turkey: Hate speech, free speech, and human security. Turkish Studies, 16(1).
  8. Guiora, A., and Park, E. (2017). Hate speech on social media. Philosophia, 45(3).
  9. Khiabany, G., and Williamson, M. (2015). Free speech and the market state: Race, media, and democracy in new liberal times. European Journal of Communication, 30(5).
  10. Mathew, B., Dutt, R., Goyal, P., and Mukherjee, A. (2019). Spread of hate speech in online social media. In Proceedings of the 10th ACM Conference on Web Science.
  11. Mondal, M., Silva, L., and Benevenuto, F. (2017). A measurement study of hate speech in social media. In Proceedings of the 28th ACM Conference on Hypertext and Social Media.
  12. Olteanu, A., Talamadupula, K., and Varshney, K. (2017). The limits of abstract evaluation metrics: The case of hate speech detection. In Proceedings of the 2017 ACM Web Science Conference.
  13. Pohjonen, M., and Udupa, S. (2017). Extreme speech online: An anthropological critique of hate speech debates. International Journal of Communication, 11.
  14. Ponterotto, J. (2017). Ethical and legal considerations in psychobiography. American Psychologist, 72(5).
  15. Sap, M., Card, D., Gabriel, S., Choi, Y., and Smith, N. A. (2019). The risk of racial bias in hate speech detection. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.
  16. Schieb, C., and Preuss, M. (2016). Governing hate speech by means of counterspeech on Facebook. Presented at the 66th ICA Annual Conference, Fukuoka, Japan.
  17. Silva, L., Mondal, M., Correa, D., Benevenuto, F., and Weber, I. (2016). Analyzing the targets of hate in online social media. In Proceedings of the Tenth International AAAI Conference on Web and Social Media.
  18. Sultana, F. (2018). The false equivalence of academic freedom and free speech: Defending academic integrity in the age of white supremacy, colonial nostalgia, and anti-intellectualism. ACME: An International E-Journal for Critical Geographies, 17(2).
  19. Tarn, J., et al. (2019). Symptom-based stratification of patients with primary Sjögren's syndrome: multi-dimensional characterization of international observational cohorts and reanalyses of randomized clinical trials. The Lancet Rheumatology, 1(2).

Pragmatic Supervised Learning Methodology of Hate Speech Detection in Social Media


1G. Priyadharshini, 2Dr. M. Balamurugan

1Research Scholar, 2Professor and Head

1School of Computer Science, Engineering and Applications

1Bharathidasan University, Tiruchirappalli, India

_____________________________________________________________________________________________________

Abstract: In recent decades, information technology has undergone a huge evolution, with wide adoption of online social networks and social media platforms. Such progress has revolutionized the way communication takes place by enabling rapid, easy, and almost costless digital interaction between users. Despite its numerous advantages, the anonymity associated with these interactions often leads to the adoption of more aggressive and hateful communication styles. These emerge at a fast and uncontrollable pace and usually cause severe damage to their targets, so it is crucial that governments and social network platforms are able to successfully detect and regulate the aggressive and hateful behaviors occurring regularly on multiple online platforms. The detection of this type of speech is far from trivial due to the topic's abstractness. This paper therefore aims to deliver and complement current methodologies and solutions for the detection of online hate speech, focusing on social media.

Index Terms – Preprocessing, Feature Extraction, Machine Learning, Classification.

________________________________________________________________________________________________________

1. INTRODUCTION

Hate speech is language that attacks or diminishes, that incites violence or hate against groups, based on specific characteristics such as physical appearance, religion, descent, national or ethnic origin, sexual orientation, or gender identity, and it can occur in different linguistic styles, even in subtle forms or through humour. Any distinct group may be targeted: hate comes in different shapes and formats, targeting several different groups and minorities. A systematic large-scale measurement study of the main targets of hate speech was conducted on the social media platforms Twitter and Whisper, capturing not only the common targets of hate but also their frequency on these platforms.

This paper provides a summarized overview of the pragmatic approaches to automatic hate speech detection currently in existence. It should be useful for newcomers to NLP research who want to keep themselves aware of the current state of the art.

2. TEXT PREPROCESSING TECHNIQUES

Extracting features consists of building a set of derived values from a collection of raw data and is often a decisive step in improving the performance of machine learning systems; before extraction, the raw text is normalized using the preprocessing techniques described below.

2.1. Tokenization: Tokenization is defined as slicing a stream of text into pieces, called tokens. Tokenization varies from language to language, and lexical characteristics such as colloquialisms (e.g., 'u' instead of 'you'), contractions (e.g., 'aren't' instead of 'are not'), and others (e.g., 'O'Neil') make the task harder. Sometimes the removal of the least frequent tokens in the data is also included. A minimal tokenizer is sketched below.
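A tokenizer can be sketched with a regular expression; note how contractions and colloquial spellings survive as single tokens (production systems would typically rely on a library tokenizer instead):

    import re

    def tokenize(text):
        # Keep word-internal apostrophes so "aren't" and "O'Neil" stay whole.
        return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)

    print(tokenize("u aren't O'Neil, r u?"))
    # ['u', "aren't", "O'Neil", 'r', 'u']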

2.2. Filtering: This involves the removal of punctuation marks and irrelevant and/or invalid characters (e.g., '?|%&!'), and the removal of stop words, i.e., frequently used words whose commonness and lack of meaning make them useless. This filtering is important since such tokens do not contribute to the classification task.

2.3. Stemming: Stemming is the process of reducing inflected words to a common base form (e.g., 'ponies' turns into 'poni' and 'cats' into 'cat'). Stemming also improves performance by reducing the dimensionality of the data, since words such as 'fishing', 'fished', and 'fisher' are all treated as the same word, 'fish'.

2.4. Spellchecking: Misspelling is common on online platforms due to their informal nature. A spellchecker is needed to recover unidentified or intentionally camouflaged words (e.g., 'niggr', 'fck').

2.5. Lemmatization: Although very similar to stemming, lemmatization considers the morphological analysis of the words. While stemming would reduce 'studies' and 'studying' to the truncated stem 'studi', lemmatization maps both to the dictionary form 'study'. The contrast is sketched below.
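The contrast can be reproduced with NLTK (assuming the library is installed and the wordnet corpus has been downloaded):

    from nltk.stem import PorterStemmer, WordNetLemmatizer
    # nltk.download('wordnet') may be required on first use.

    stemmer = PorterStemmer()
    lemmatizer = WordNetLemmatizer()

    for word in ["studies", "studying", "ponies"]:
        print(word, stemmer.stem(word), lemmatizer.lemmatize(word))
    # studies  -> studi / study
    # studying -> studi / studying (pass pos="v" to lemmatize it to "study")
    # ponies   -> poni  / pony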

2.6. PoS tagging: Part-of-speech tagging is a technique to extract the part of speech associated with each word of the corpus. It is common to remove words belonging to certain parts of speech that tend not to be relevant (e.g., pronouns).

2.7. Lowercasing: This converts a stream of text to lowercase, which improves classification performance by reducing the dimensionality of the data. Not applying this technique may raise problems, such as 'tomorrow', 'TOMORROW', and 'ToMoRroW' being treated as different words. Taken together, these steps form a typical preprocessing pipeline, sketched below.
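A compact sketch of such a pipeline, combining lowercasing, filtering, tokenization, stop-word removal, and stemming (the stop-word list is abbreviated for illustration):

    import re
    from nltk.stem import PorterStemmer

    STOPWORDS = {"the", "a", "an", "is", "are", "to", "and", "of"}  # abbreviated
    stemmer = PorterStemmer()

    def preprocess(text):
        text = text.lower()                                 # 2.7 lowercasing
        tokens = re.findall(r"[a-z']+", text)               # 2.1 tokenization + 2.2 filtering
        tokens = [t for t in tokens if t not in STOPWORDS]  # 2.2 stop-word removal
        return [stemmer.stem(t) for t in tokens]            # 2.3 stemming

    print(preprocess("The ponies ARE fishing?!"))
    # ['poni', 'fish']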

3. FEATURE EXTRACTION TECHNIQUES

Feature extraction consists of collecting derived values (features) from the input data (text, in this scenario) and generating distinctive properties that are, hopefully, informative and non-redundant, in order to improve the learning and generalization ability of machine learning algorithms. After extraction, there is usually a subset of features that contains most of the relevant information. Some of the most frequently used feature extraction approaches are presented here.

3.1. N-Grams: N-grams are one of the most used techniques in automatic hate speech detection and related tasks [1,3,14]. The most common n-gram approach consists of combining sequential words into lists of size N; the goal is to enumerate all expressions of size N and count their occurrences. This improves the classifiers' performance because it incorporates, to some degree, the context of each word. Instead of words, it is also possible to build n-grams from characters or syllables, an approach that is less susceptible to spelling variations. In one study, character n-gram features proved to be more predictive than token n-gram features for the specific problem of abusive language detection [2]. A sketch of both variants follows.
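Both variants can be sketched with scikit-learn's CountVectorizer; the two-document corpus is illustrative only:

    from sklearn.feature_extraction.text import CountVectorizer

    docs = ["you people disgust me", "you are wonderful people"]

    # Word bigrams: every sequence of N=2 adjacent tokens becomes a feature.
    word_ngrams = CountVectorizer(analyzer="word", ngram_range=(2, 2))
    print(word_ngrams.fit(docs).get_feature_names_out())

    # Character trigrams (within word boundaries) are more robust to
    # spelling variation, e.g. "disgust" versus "d1sgust".
    char_ngrams = CountVectorizer(analyzer="char_wb", ngram_range=(3, 3))
    print(char_ngrams.fit_transform(docs).shape)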

3.2. Bag of Words: Bag of words (BoW) is a representation of text that disregards grammar and word order while keeping multiplicity. Similarly to n-grams, BoW can be encoded using TF-IDF, token counts, or a hashing function. Although it is typically used to group textual elements such as tokens, it can also group other representations, such as parts of speech.

3.3. TF-IDF: Term frequency-inverse document frequency is a numerical statistic that measures the importance of a certain word in a data corpus. This can be an important feature for understanding the importance of certain words in expressing specific types of speech (e.g., 'hate') [29]. A short sketch follows.
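A sketch of TF-IDF weighting with scikit-learn, which applies a smoothed, L2-normalized variant of the classic formula tf-idf(t, d) = tf(t, d) x log(N / df(t)); the three-document corpus is illustrative:

    from sklearn.feature_extraction.text import TfidfVectorizer
    import numpy as np

    docs = ["i hate you", "i love you", "hate hate hate"]

    vec = TfidfVectorizer()
    X = vec.fit_transform(docs)
    print(vec.get_feature_names_out())  # ['hate' 'love' 'you']
    # 'hate' dominates the third document's weight vector.
    print(np.round(X.toarray(), 2))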

3.4. Word Embeddings: A word embedding is a learned representation for text in which words with similar meaning have similar representations. It is a class of techniques where individual words are represented as real-valued vectors in a predefined vector space; each word is mapped to one vector, and the vector values are typically learned by a neural network. One word embedding technique that has attracted particular interest from text mining researchers is Word2Vec.

· Word2Vec: The granularity of the embedding is word-wise, generating a vector for each word of the corpus. There are two possible models: CBOW (continuous bag of words), which learns to predict a word from its context, and skip-gram, which is designed to predict the context from the word. According to [22], CBOW is faster to train and has slightly better accuracy for frequent words, whereas skip-gram works well with small amounts of training data and represents even rare words or phrases well. Most of the approaches that used Word2Vec [20] apply the skip-gram model, as in the sketch below.
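A sketch using the gensim library (assuming it is installed; sg=1 selects the skip-gram model, sg=0 CBOW, and the two-sentence corpus is purely illustrative):

    from gensim.models import Word2Vec

    sentences = [["hate", "speech", "hurts"], ["free", "speech", "matters"]]

    # Skip-gram model (sg=1); CBOW would be sg=0.
    model = Word2Vec(sentences, vector_size=50, window=2,
                     min_count=1, sg=1, epochs=50)

    print(model.wv["speech"].shape)         # (50,) -> one real-valued vector per word
    print(model.wv.most_similar("speech"))  # nearest neighbours in embedding space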

3.5. Sentiment Analysis: It is important to grasp the sentiment behind a message; otherwise, its true meaning will probably be misunderstood and/or misinterpreted (e.g., sarcasm). Users, mainly on social media, tend to formulate opinions on a diversity of topics, especially when they express an extremist attitude, which includes hate speech. Regarding social media, sentiment analysis approaches usually focus on identifying the polarity (positive or negative connotation) of comments and sentences as a whole, as sketched below.
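A minimal polarity sketch using NLTK's VADER analyzer (the vader_lexicon resource must be downloaded once):

    from nltk.sentiment import SentimentIntensityAnalyzer
    # nltk.download('vader_lexicon') is required on first use.

    sia = SentimentIntensityAnalyzer()
    scores = sia.polarity_scores("I hate you so much")
    print(scores)  # a compound score below 0 indicates negative polarity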

3.6. Template-Based Strategy: The basic idea of this strategy is to build a corpus of words and, for each word in the corpus, collect the K words occurring around it. This information can then be used as context, as in the sketch below.
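A sketch of collecting the K surrounding words of each token as its context:

    from collections import defaultdict

    def context_windows(tokens, k=2):
        """For each word, collect the k words before and after it as context."""
        contexts = defaultdict(list)
        for i, word in enumerate(tokens):
            window = tokens[max(0, i - k):i] + tokens[i + 1:i + 1 + k]
            contexts[word].append(window)
        return contexts

    tokens = "those people are ruining this country".split()
    print(context_windows(tokens, k=2)["are"])
    # [['those', 'people', 'ruining', 'this']]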

4. CLASSIFICATION METHODOLOGY

Hate speech detection in text is mostly treated as supervised classification using machine learning algorithms. The use of deep learning approaches has increased significantly because of their accuracy, leading to the large-scale adoption of neural networks for text classification.

4.1 TRADITIONAL SUPERVISED LEARNING METHOD:

4.1.1 Support Vector Machines: SVMs are widely used in classification problems; the algorithm finds a hyperplane that separates the input data (text, in this case) into categories. In 2017, SVMs held the best results for text classification tasks, but in 2018 deep learning took over, especially in hate speech detection, as described in [24].
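A common baseline of this kind can be sketched with scikit-learn, pairing TF-IDF features with a linear SVM; the four texts and their labels are invented toy data, not a real corpus:

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

texts = ["I hate you all", "what a nice day",
         "go back to your country", "great match tonight"]
labels = [1, 0, 1, 0]   # toy labels: 1 = hateful, 0 = not

# TF-IDF features feeding a linear SVM: a common hate speech baseline.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(texts, labels)
print(clf.predict(["you people should leave"]))
```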

4.1.2 Logistic Regression: Logistic regression is a (predictive) regression analysis that estimates the parameters of a logistic model, a statistical model that uses a logistic function to model a binary dependent variable [28].

4.1.3 Naive Bayes: This is an algorithm based on the Bayes’ theorem with strong naive independence assumptions between the features of the data. It generally assumes that a particular feature in a class is unrelated to any other feature. Naive Bayes is a model useful for large datasets and does well despite being a simple method.

4.1.4 Random Forest: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest [27]. This model requires almost no input preparation, performs implicit feature selection and is very quick to train, performing well overall.

4.1.5 Decision Tree: This algorithm provides support for decision making through a tree-like model of decisions, their possible consequences, and other measures (e.g. resource cost, utility). Decision trees are often used because their output is usually readable, being simple for humans to understand and interpret. They are also fast and perform well on large datasets, but they are prone to overfitting.

4.1.6 Gradient Boosting: This is a prediction model consisting of an ensemble of weak prediction models, typically decision trees (which is why it may also be called gradient boosted trees), in which the predictions are made not independently (as in bagging) but sequentially. The sequential modeling allows each model to learn from the mistakes made by the previous one [23].
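To illustrate how the traditional classifiers above can be compared on the same features, here is a sketch using scikit-learn with an invented eight-example corpus; with so little data the scores are meaningless and serve only to show the mechanics:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

texts = [
    "I hate you all", "you people disgust me", "go away idiot", "they are vermin",
    "what a nice day", "great match tonight", "see you tomorrow", "lovely weather",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]   # toy labels: 1 = hateful, 0 = not

X = TfidfVectorizer().fit_transform(texts).toarray()  # dense for simplicity

for model in (LogisticRegression(max_iter=1000), MultinomialNB(),
              DecisionTreeClassifier(), RandomForestClassifier(),
              GradientBoostingClassifier()):
    scores = cross_val_score(model, X, labels, cv=2, scoring="f1")
    print(f"{type(model).__name__:28s} F1 = {scores.mean():.2f}")
```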

4.2 DEEP LEARNING METHODOLOGY:

4.2.1 CNN (Convolutional Neural Networks): A class of deep feed-forward artificial neural networks. A CNN consists of an input and an output layer plus multiple hidden layers, which comprise convolutional layers, pooling layers, and fully connected layers [26].
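A minimal sketch of such an architecture for binary hate speech classification, using Keras; the vocabulary size, sequence length, and layer widths are illustrative assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Embed token ids, slide convolutional filters over the sequence,
# pool to a fixed-size vector, then classify hate / not-hate.
model = tf.keras.Sequential([
    layers.Embedding(input_dim=10000, output_dim=64),  # 10k-token vocabulary
    layers.Conv1D(filters=128, kernel_size=3, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.build(input_shape=(None, 200))  # padded sequences of 200 tokens
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```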

4.2.2 RNN (Recurrent Neural Networks): Unlike CNNs, RNNs are able to handle sequential data, producing temporal dynamic behavior along a time sequence. The connections between nodes form a directed graph, and feedback loops in the recurrent layer act as a memory mechanism. Despite this, long-term temporal dependencies are hard for the standard architecture to capture, because the gradient of the loss function decays exponentially with time (the vanishing gradient problem). For this reason, new architectures have been introduced.

· LSTM: Long short-term memory networks are a type of RNN that add special units to the standard ones, including a memory cell able to keep information in memory for long periods of time. A set of gates controls when information enters the memory, when it is output, and when it is forgotten, enabling this architecture to learn longer-term dependencies, as detailed in [25] and [26].

· GRU: Gated recurrent unit networks are similar to LSTMs, but their structure is slightly simpler: although they also use a set of gates to control the flow of information, these gates are fewer than in LSTMs [25,26].

RNNs have a sequential architecture, whereas CNNs have a hierarchical one. GRU and CNN results can be compared with respect to text size: GRU performs better when the sentences are somewhat longer. Finally, the authors of [26] concluded that deep neural network performance is highly dependent on tuning the hyperparameters.
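A sketch of the two gated architectures in Keras, showing that a GRU layer of the same width has fewer parameters than its LSTM counterpart; all sizes are illustrative assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

def recurrent_classifier(cell):
    """Identical skeleton with either an LSTM or a GRU recurrent layer."""
    model = tf.keras.Sequential([
        layers.Embedding(input_dim=10000, output_dim=64),
        cell(64),                            # gates mitigate the vanishing gradient
        layers.Dense(1, activation="sigmoid"),
    ])
    model.build(input_shape=(None, 200))     # padded sequences of 200 tokens
    return model

lstm_model = recurrent_classifier(layers.LSTM)
gru_model = recurrent_classifier(layers.GRU)  # fewer gates, fewer parameters
print(lstm_model.count_params(), gru_model.count_params())
```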

5. PERFORMANCE METRICS

The measures for evaluating the performance of a machine learning algorithm are built from a confusion matrix, where the output can comprise two or more classes. The confusion matrix records which samples of the data have been correctly and incorrectly predicted for each class.

Accuracy is a generic performance measure that assesses the overall effectiveness of the algorithm by computing the number of correct predictions over all predictions made. Although commonly used, accuracy does not distinguish between classes; consequently, it may be misleading, especially when the classes in the data are unbalanced.

There is a subset of performance metrics that consider classes. These are usually more useful on datasets with unbalanced classes, since the performance of the algorithm can be assessed class-wise; this is quite often the case in hate speech datasets. The most used class-wise performance measures in hate speech detection are:

Recall (R), also known as Sensitivity or True Positive Rate, is defined as the proportion of real positives that are correctly predicted as positive. Precision (P) denotes the proportion of predicted positive cases that are actually positive.

The F1 score is defined as the harmonic mean of Precision and Recall and, unlike accuracy, accounts for class imbalance, hence its wide usage in hate speech detection.
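In terms of the confusion matrix counts (true positives TP, true negatives TN, false positives FP, and false negatives FN), these measures can be written as:

```latex
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot P \cdot R}{P + R}
```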

Using these performance metrics, a graphical visualization of the algorithm's predictions can be computed, known as the ROC (Receiver Operating Characteristic) curve. It shows the relation between the sensitivity and the specificity of the algorithm and is created by plotting the true positive rate (TPR) against the false positive rate (FPR). The higher the TPR at each FPR, the higher the area under the ROC curve, also known as AUC (Area Under Curve).
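These metrics can be computed directly with scikit-learn; the labels and probabilities below are invented toy values:

```python
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]                    # gold labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]                    # hard predictions
y_prob = [0.9, 0.2, 0.8, 0.4, 0.1, 0.6, 0.7, 0.3]    # scores for the ROC curve

print("P   =", round(precision_score(y_true, y_pred), 2))
print("R   =", round(recall_score(y_true, y_pred), 2))
print("F1  =", round(f1_score(y_true, y_pred), 2))
print("AUC =", round(roc_auc_score(y_true, y_prob), 2))
```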

6. RELATED WORK

This section presents a comprehensive review of the key works and existing studies related to automatic hate speech detection, in the English language in particular. For English, hate speech detection has been intensively investigated by more than 14 contributors across all categories of hate speech (racial, sexist, religious, and general hate). Hate speech in other languages such as Dutch, German, Italian, Turkish, Indonesian, Arabic, and Portuguese has also been investigated, but only in a limited number of studies. This paper therefore surveys hate speech detection in English, which holds the majority of the research.

6.1 DATASETS AND ANNOTATION:

One of the issues in hate speech detection in text is dataset availability. The majority of existing works were executed on privately collected datasets, often built for different problems. [3] claimed to have created the largest dataset for abusive language by annotating comments posted on Yahoo!. That dataset was later used by [2]; however, it is not publicly available. Currently, the only publicly available hate speech datasets are those reported in [1,4,14,17,21]. All of these publicly available corpora were collected from Twitter by searching for tweets containing terms that frequently occur (based on some manual analysis) in tweets that contain hate speech and references to specific entities.

In order to annotate a dataset manually, either expert annotators or crowdsourcing services, such as Amazon Mechanical Turk (AMT), are employed. Crowdsourcing has obvious economic and organizational advantages, especially for a task as time-consuming as this one, but annotation quality might suffer from employing non-expert annotators.

[14] annotated 16,914 tweets, labeling 3,383 as 'sexist', 1,972 as 'racist' and 11,559 as 'neither'; the annotation was crowdsourced over more than 600 users. The dataset was later expanded in [21], where some 6,900 tweets were collected, of which about 4,000 were new with respect to the previous dataset. This dataset was annotated by two groups to create two different versions: domain experts, who are either feminist or anti-racism activists, and crowdsourced amateurs. Experiments show that amateur annotators are more likely than expert annotators to label tweets as hate speech. Later, in [17], the authors merged both expert and amateur annotations in this dataset by majority vote, giving expert annotations double weight; and in [4], the dataset in [14] was merged with the expert annotations in [21] to create a single dataset. [1] annotated some 24,000 tweets as 'hate speech', 'offensive language but not hate', and 'neither'. They found that distinguishing hate speech from non-hateful offensive language is a challenging task, as hate speech does not always contain offensive words, while offensive language does not always express hate.

In addition to the issues mentioned above, which to some extent challenge the comparability of research conducted on different datasets, the lack of a commonly accepted definition of hate speech further exacerbates the situation. Previous works remain fairly vague about the annotation guidelines their annotators were given. Even when annotators are provided with a definition of hate speech, they still fail to produce annotations at an acceptable level of reliability.

6.2 SUMMARY AND ANALYSIS:

The next table presents a summary of all the discussed papers on the English language across all categories of hate speech (racial, sexist, religious, and general hate). It can serve as a quick reference for the key works on automatic detection in social media; all approaches and their respective experimental results are listed in a concise manner.

Table 1: Summary of the current state of anti-social behaviour detection and respective results, in the metrics Precision (P), Recall (R) and F1-score (F1). A dash (–) indicates a value not reported.

AUTHOR | YEAR | PLATFORM | FEATURE REPRESENTATION | ALGORITHM | P | R | F1
------ | ---- | -------- | ---------------------- | --------- | ---- | ---- | ----
[4] | 2017 | Twitter | Character and Word2vec | Hybrid CNN | 0.71 | 0.75 | 0.73
[5] | 2017 | Youtube, MySpace, SlashDot | Word embeddings | FastText | – | – | 0.76
[6] | 2018 | Twitter, Wikipedia, UseNet | Lexical, Linguistics and Word embeddings | SVM | 0.82 | 0.80 | 0.81
[7] | 2011 | Youtube | Tf-idf, lexicon, PoS tag, bigram | SVM | – | – | 0.66
[8] | 2018 | FormSpring | Bag of Words | M-NB and Stochastic Gradient Descent | – | – | 0.90
[9] | 2018 | Twitter | Semantic Context | SVM | 0.85 | 0.84 | 0.85
[10] | 2013 | Yahoo News Group | Template-based, PoS tagging | SVM | 0.59 | 0.68 | 0.63
[11] | 2013 | Twitter | Unigram | Naïve Bayes | – | – | –
[12] | 2014 | Twitter | BOW, Dependencies, Hateful Terms | Bayesian Logistic Regression | 0.89 | 0.69 | 0.77
[13] | 2015 | Yahoo Finance | Paragraph2vec, CBOW | Logistic regression | – | – | –
[14] | 2016 | Twitter | Character ngrams | Logistic regression | 0.72 | 0.77 | 0.78
[15] | 2018 | Twitter | Sentiment Based, Semantic, Unigram | J48graft | 0.79 | 0.78 | 0.78
[16] | 2018 | Twitter | N-grams, Skipgrams, hierarchical word clusters | RBF kernel SVM | 0.78 | 0.80 | 0.79
[17] | 2017 | Twitter | Character Ngrams, word2vec | CNN | 0.85 | 0.72 | 0.78
[18] | 2017 | Twitter | Random Embedding | LSTM and GBDT | 0.93 | 0.93 | 0.93
[19] | 2018 | Twitter | Word-based frequency vectorization | RNN and LSTM | 0.90 | 0.87 | 0.88
[20] | 2018 | Twitter | Word embeddings | CNN + GRU | – | – | 0.94

7. RESULTS AND DISCUSSION

Choosing the most appropriate machine learning approach is another challenging decision. Previous works employed nearly all varieties of techniques; according to the table, the majority of researchers relied on supervised machine learning approaches for the automatic detection task. One major factor is the size of the corpus, as some ML algorithms work quite well with small datasets, while others, such as neural networks, need more intensive and complex training.

Recent research is oriented towards deep learning to solve complex learning tasks. Researchers have claimed that deep learning is powerful at finding data representations for classification, and it clearly has a promising future in the field of automatic detection. Choosing to adopt deep learning requires commitment to both preparing and training the model with a large amount of data. Generally, there are two main deep neural network architectures usually utilized for NLP tasks: RNNs and CNNs. In the table above, four hate speech studies adopted deep learning, two of them RNN-based and two CNN-based, and these studies concluded that both approaches are effective. For that reason, more investigation is needed to make the appropriate choice of deep learning architecture.

8. CHALLENGES

· Low agreement in hate speech classification by humans, indicating that this classification would be harder for machines

· The task of annotating a dataset is also more difficult because it requires expertise about culture and social structure.

· The evolution of social phenomena and language makes it difficult to track all racial and minority insults. Besides, language evolves quickly mainly among young populations that communicate frequently in social networks.

· Despite its offensive nature, abusive language may be very fluent and grammatically correct, can cross sentence boundaries, and commonly makes use of sarcasm.

· The majority of the studies focus on English; only isolated studies have been conducted in other languages such as German, Dutch, and Italian. Research in other languages commonly used on the internet is also needed (e.g. French, Mandarin, Portuguese, Spanish).

· Finally, hate speech detection is more than simple keyword spotting.

9. CONCLUSIONS AND FUTURE WORK

This paper was established with the goal of understanding the state of the art and the opportunities in the field of automatic hate speech detection. We presented a comprehensive study of the methodology of automatic hate speech detection in social networks. We also investigated some challenges, which can serve as a guide for the implementation of more accurate hate speech detection. Additionally, in order to obtain a picture of the state of the art in the field, we conducted a systematic literature review. We concluded that the number of studies and papers published on automatic hate speech detection in text is limited, and those works usually regard the problem as a machine learning classification task. In this field, researchers tend to start by collecting and classifying new messages, and the resulting datasets often remain private. This slows down progress in the field because less data is available, and it also makes it more difficult to compare results across studies.

Future work will include incorporating the latest deep learning architectures to build a model that is capable of detecting and classifying hate speech in languages other than English. Comparative studies and surveys are also scarce in the area and deserve more attention. Finally, for better comparability of different features and methods, we argue for a benchmark dataset for hate speech detection.

REFERENCES

[1] Thomas Davidson, Dana Warmsley, Michael Macy, and Ingmar Weber. Automated hate speech detection and the problem of offensive language. arXiv preprint arXiv:1703.04009, 2017.

[2] Yashar Mehdad and Joel Tetreault. Do characters abuse more than words? In Proceedings of the SIGdial 2016 Conference: The 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 299–303, 2016.

[3] Chikashi Nobata, Joel Tetreault, Achint Thomas, Yashar Mehdad, and Yi Chang. Abusive language detection in online user content. In Proceedings of the 25th International Conference on World Wide Web, pages 145–153. International World Wide Web Conferences Steering Committee, 2016.

[4] J. H. Park and P. Fung, “One-step and Two-step Classification for Abusive Language Detection on Twitter,” in AICS Conference, 2017.

[5] H. Chen, S. McKeever, and S. J. Delany, “Abusive text detection using neural networks,” in CEUR Workshop Proceedings, 2017, vol. 2086, pp. 258–260.

[6] M. Wiegand, J. Ruppenhofer, A. Schmidt, and C. Greenberg, “Inducing a Lexicon of Abusive Words – a Feature-Based Approach,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2018, pp. 1046–1056.

[7] K. Dinakar, R. Reichart, and H. Lieberman, “Modeling the detection of Textual Cyberbullying.,” Soc. Mob. Web, vol. 11, no. 02, pp. 11–17, 2011.

[8] R. Pawar, Y. Agrawal, A. Joshi, R. Gorrepati, and R. R. Raje, “Cyberbullying Detection System with Multiple Server Configurations,” 2018 IEEE Int. Conf. Electro/Information Technol., pp. 90–95, 2018.

[9] M. Fernandez and H. Alani, “Contextual semantics for radicalisation detection on Twitter,” CEUR Workshop Proc., vol. 2182, 2018.

[10] W. Warner and J. Hirschberg, “Detecting Hate Speech on the World Wide Web,” no. Lsm, pp. 19–26, 2012.

[11] I. Kwok and Y. Wang, “Locate the Hate: Detecting Tweets against Blacks,” Twenty-Seventh AAAI Conf. Artif. Intell., pp. 1621–1622, 2013.

[12] P. Burnap and M. L. Williams, “Hate Speech, Machine Classification and Statistical Modelling of Information Flows on Twitter: Interpretation and Communication for Policy Decision Making,” in Proceedings of the Conference on the Internet, Policy & Politics, 2014, pp. 1–18

[13] N. Djuric, J. Zhou, R. Morris, M. Grbovic, V. Radosavljevic, and N. Bhamidipati, “Hate Speech Detection with Comment Embeddings,” in Proceedings of the 24th International Conference on World Wide Web, 2015, pp. 29–30.

[14] Z. Waseem and D. Hovy, “Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter,” Proc. NAACL Student Res. Work., pp. 88–93, 2016.

[15] H. Watanabe, M. Bouazizi, and T. Ohtsuki, “Hate Speech on Twitter: A Pragmatic Approach to Collect Hateful and Offensive Expressions and Perform Hate Speech Detection,” IEEE Access, vol. 6, pp. 13825–13835, 2018

[16] S. Malmasi and M. Zampieri, “Challenges in Discriminating Profanity from Hate Speech,” J. Exp. Theor. Artif. Intell., vol. 30, pp. 187–202, 2018

[17] B. Gambäck and U. K. Sikdar, “Using Convolutional Neural Networks to Classify Hate-Speech,” Assoc. Comput. Linguist., no. 7491, pp. 85–90, 2017.

[18] P. Badjatiya, S. Gupta, M. Gupta, and V. Varma, “Deep Learning for Hate Speech Detection in Tweets,” in Proceedings of the 26th International Conference on World Wide Web Companion, 2017, pp. 759–760

[19] G. K. Pitsilis, H. Ramampiaro, and H. Langseth, “Effective hate-speech detection in Twitter data using recurrent neural networks,” Appl. Intell., vol. 48, no. 12, pp. 4730–4742, Dec. 2018.

[20] Z. Zhang and L. Luo, “Hate Speech Detection: A Solved Problem? The Challenging Case of Long Tail on Twitter,” vol. 1, no. 0, pp. 1–5, 2018.

[21] Zeerak Waseem. 2016. Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter. In Proceedings of the First Workshop on NLP and Computational Social Science. Association for Computational Linguistics, Austin, Texas, 138–142.

[22] Yoav Goldberg and Omer Levy. word2vec explained: deriving Mikolov et al.’s negative-sampling word-embedding method. Computing Research Repository, abs/1402.3722, 2014.

[23] Alexey Natekin and Alois Knoll. Gradient boosting machines, a tutorial. Frontiers in Neurorobotics, 7:21, 2013.

[24] Hajime Watanabe, Mondher Bouazizi, and Tomoaki Ohtsuki. Hate speech on Twitter: a pragmatic approach to collect hateful and offensive expressions and perform hate speech detection. IEEE Access, 2018. ISSN 2169-3536.

[25] Junyoung Chung, Çaglar Gülçehre, Kyung Hyun Cho, and Yoshua Bengio. Empirical evaluation of gated recurrent neural networks on sequence modeling. Computing Research Repository, abs/1412.3555, 2014.

[26] Wenpeng Yin, Katharina Kann, Mo Yu, and Hinrich Schütze. Comparative study of CNN and RNN for natural language processing. Computing Research Repository, abs/1702.01923, 2017.

[27] Leo Breiman. Random forests. Machine Learning, 45(1):5–32, Oct 2001. ISSN 1573-0565. doi: 10.1023/A:1010933404324.

[28] Sandro Sperandei. Understanding logistic regression analysis. Biochemia medica, 24(1):12–18, 2014.

[29] Sanjana Sharma, Saksham Agrawal, and Manish Shrivastava. Degree based classification of harmful speech using twitter data. Computing Research Repository, abs/1806.04197, 2018.


Misogyny In India: A Virulent Form Of Hate Speech

Over time, our supposedly egalitarian society has nourished misogynist attitudes and beliefs and pushed ideologies that glorify the speaker as a maverick while inflicting hatred on women for being as unfortunate as they are: women in a man’s world. The laws that govern Indian women are dictated by social perceptions formed by self-acclaimed censorious champions of morality. This exclusively masculine and misogynist society, which has tied women in the fetters of these laws and endorsed sexism, has been complicit with archaic Indian law in perpetuating patriarchal codes of conduct.

Sexist and misogynist speech takes the form of hate speech through practices like slut-shaming, sexualised threats of death, rape, or other violence, and comments meant to ridicule and humiliate the target. The current legal framework for gender-based violence needs to be seen through the prism of hate speech to do justice to the myriad of anguishing experiences that women face. This piece is an attempt to argue that misogynist speech that enforces patriarchy fits squarely within the category of hate speech. In doing so, I assert that women are innumerable times the targets of hate speech, and that a legal system that perfunctorily dismisses this is oppressive. I dare to place the misogynist ideologies of the regressive Indian society in an explicitly feminist paradigm that situates violence induced by speech within the continuum of violence.

DEFINING HATE SPEECH: DEATH BY WORDS IS VIOLENCE

“We’re told they’re ‘only words,’ but we live and die by them.” Words simply put together cannot constitute speech; speech is constituted by social reality. A form of speech that is characteristically hostile and aims to silence, malign, disparage, humiliate, intimidate, incite violence, or vilify is liable to be termed hate speech. Human Rights Watch defines hate speech as any form of expression that is regarded as offensive to various groups, including women. It is evident from these definitions that hate speech may oppress through copious means, including but by no means limited to violence and subordination. One such means is rooted in the deep desire of anti-feminist elements in our country to pass off their bitter hostility towards women through words and then bury the realities of this systemic misogyny by refusing to recognise it as hate speech.

While adjudicating cases of rape and sexual assault, a woman’s character is mercilessly besmirched and she is made to sit through incessant reiteration of what is “unbecoming” of an Indian woman. Humiliations as harsh as this are nothing short of hate speech conveniently disguised as misogyny. What should fill a judge with nothing but great indignation and censure instead becomes the subject of a nonchalant opinion about a woman’s “promiscuity”, “adventurism”, and “experimentation in sexual encounters.” Victims standing in Indian courtrooms are regular Indian girls who often find themselves on the seemingly wrong side of a law that traces its inception back to the paternalistic minds of old men. Every word, though carefully uttered by them, is met with a demand for strong corroboration, and the ‘tale’ they tell is put aside for ‘credibility deficit.’ The sad irony is that the deeply entrenched misogyny and male privilege evident in judges’ conduct is not seen as ‘unbecoming’ of a judge.

When a girl or a woman digresses from the socially acceptable norms of femininity, or, at the risk of sounding shamelessly crude, when she engages in casual sexual relations with multiple partners, she is accorded the label of ‘slut.’ In a society where slut-shaming should be tantamount to hate speech, it is instead treated as a playful insult, one that can become an accusation so strong that it is synonymous with misogyny. Calling a woman an ‘easy lay’ because of her sexual choices is a classic example of slut-shaming by people with fragile egos who need to hyper-‘masculinate’ at every available opportunity. The ‘Bois locker room’ incident yet again reinforced the reality that, for some, the best way to express one’s masculinity is to denigrate women through expletives and brazen objectification. Far from convicting perpetrators of this nameless and unrecognized crime, our government attempts to disable the feminist movement against it and, without advocating actual physical violence, causes permanent psychological and emotional harm by robbing women of their rights, discriminating against them in the wage system, and confining them to idealised domestic spaces.

GENDERED HATE SPEECH: TO BE NORMALISED OR CRIMINALISED?

As a society, we have normalized gendered hate speech to an extent that it is no longer recognised as a form of hate speech. The sense of superiority of one group over another resulting in the former’s urge for dominance over the latter gives rise to hate speech. Gendered opinions, however, are excused from this domain on account of being simply parochial.

Section 153A of the Indian Penal Code (IPC) criminalises the promotion of enmity between groups of people on grounds of religion, race, place of birth, residence, language, caste or community or any other ground whatsoever, and carrying out of acts prejudicial to maintenance of harmony. The primary lacuna in this law is the blatant exclusion of ‘gender’ as an essential aspect of a person’s identity. Inclusion of the phrase ‘any other ground whatsoever’ does not promote a more inclusive hate speech law and is a sham in the name of a legal solution that strengthens the laws around gendered hate speech. Remarks against women, as scornful as “fortune huntress mafia moll only used as a sex bait to trap rich men” are not simply ‘sexist.’ These are misogynist comments fuelled by unwarranted hate that are considered only offensive by a law that provides no remedy for speech that not only promotes gender stereotypes but irreparably injures the self-esteem of an entire group. Section 14 of the Press Council of India (PCI) Act, 1978 entrusts the PCI with the ‘power to censure’ based on complaints against news agencies for offending journalistic ethics, public taste, among others. A law as infirm as this provides no real legal recourse to women as subjects of hate speech because it empowers the council to merely ‘disapprove’ of journalistic conduct. The Criminal Law (Amendment) Bill, 2017 proposes the insertion of Sections 153C and 505A in the IPC to criminalise incitement of hatred and violence on the grounds of sex, gender identity, and sexual orientation, among others. The Law Commission relies on the status of the authors and victims of the speech, its potentiality, and context, as well as the extremity of speech, as criteria used to distinguish offensive speech from hate speech. However, the objective that this amendment seeks to achieve is blurred due to the lack of a clear definition of hate speech identifying the harm that it may cause and thereby makes the provision ambiguous.

The aforementioned laws vest an arbitrary discretion in the judiciary to determine what constitutes hate speech and what does not. The ire of an Indian judge is aroused only in cases where hate speech results in incitement of violence, especially physical. An incitement to discriminate and harm by verbally abusing women with rape threats fails to make the cut. The regrettable reality that we are faced with is that any person who vilifies and disparages women might be a nuisance, to the extent of becoming dangerous, but his/her actions will not constitute a toxic social situation that could be punishable as hate speech. A person’s misogynist ideology might result in actions that are punishable as crimes but the ideology remains protected.

Hate speech has itself become a part of mainstream discourse in a manner that is increasingly acceptable to a large section of the society thereby defeating the objective of the law. This assertion is strengthened by the Court’s ignorant observation that existing laws in India are sufficient to tackle hate speeches.

CONCLUSION

To accomplish the goal of recognizing misogyny as a form of hate speech, we need to let go of our ‘what-about-it’ approach. Attempting to discredit a woman’s stance on indecent and derogatory comments against her, by refuting them or trying to disprove them, further strengthens the ingrained prejudice against women in our society. Our constitution recognises dignity as intrinsic to a person, and misogyny and patriarchal notions of sexual control should find no place in this constitutional order.

The discussion of gender justice and critique of our misogynist system is punctured by the ignorant law governing hate speech. It is rarely followed by the discussion and plea to develop a rigorous hate speech law that takes into consideration gender as one’s identity. Legal recognition has been accorded to bad speech that causes harm, but misogyny, which is just a stealthier way of harassing, intimidating, and ridiculing women, has often failed the test of being dangerous enough. Misogyny, as ubiquitous as it is in our phallocentric society, enjoys impunity. To treat misogynist speech as “merely words” is to fail to acknowledge the reason for speaking and the impact we want to make. It is violence against women in the rawest and the most shameful form. Addressing it as ‘only misogyny’ as it falls short of the manifold criteria to qualify as hate speech, would be an oversimplification.

Essay on Why Hate Speech Should Be Illegal

I. Introduction

I.I. Communication & The Internet

It has been almost 30 years since the internet was invented. People of this generation have access to information, and the ability to share information, in a way that was never possible before. Even though there are arguments against this, online communication adds to the volume of contact beyond traditional offline modes of communication, through e-mails, chatrooms, instant messaging, and social media.

This sophisticated, multi-disciplinary tool enables individuals to connect with people from all over the world. The internet doesn’t have borders, and individuals aren’t tied down to their geographical location; it brings together topic-based communities. The internet is the largest ungoverned space. There are innumerable pros to this invention; on the other hand, it is also manipulated by cynical forces. People hide behind screens, there is a lack of accountability, and it provides a platform to spread malicious, hateful, and deceitful content.

I.II. Harassment and hateful content online

In the midst of the social media revolution, hate speech on the internet has grown as well. People experience online harassment on different levels and in different forms: in its milder forms it harbors negativity, and in its severe forms it taints the reputation of an individual, raises privacy concerns, and sometimes even poses a threat to physical safety.

According to a Pew Research Center survey, 41% of Americans have been subjected to some form of online harassment. The internet has changed the way people communicate, connecting the world instantaneously, but it undoubtedly has negative effects too. This inter-connectedness makes it possible to target individuals, communities, or races and to influence opinions or spread hate speech.

I.III. Laws around Hate Speech & Online Harassment

More often than not, hate speech and online harassment fall under free expression. While governments around the world have difficulty passing laws that restrict free speech, European countries have laws in place against hate speech. As baseless conspiracies, polarizing statements, false ideas, and offensive commentary infiltrate our social media, the impending question is whether there should be laws to erase these messages. Should posts like these constitute free expression? Should action be taken by non-governmental entities at the organization level, or at the governmental level?

The manifestation of targeted hate messaging is a form of psychological terrorism that exists on a spectrum of severity, and those who have experienced severe forms of cyber harassment or cyberbullying have dramatically different reactions and attitudes towards the issue. The Supreme Court has never defined a category of speech as hateful conduct: speech labeled as hate speech is not excluded from First Amendment protection. Cyber harassment, however, is not protected as free speech, and harassers can be sued for defamation. But lawsuits require substantial resources, perpetrators are hard to identify, and law enforcement must employ a considerable amount of forensic expertise to track down individuals who engage in harassment anonymously.

II. Hate Speech on Social Media

II.I. Proliferation of Hate Speech on Social Media

Social media platforms are an open playing field for online harassment and hate speech, which frequently target one’s personal characteristics, appearance, race and ethnicity, and gender. A recent study by HateLab shows that an increase in hate speech on social media leads to more crimes against minorities in the physical world. The real-world consequences of hate that spreads on social media are another reason why the issue should be taken seriously.

Language is weaponized and used as a means to inflict violence, and anonymity is a facilitating factor in the spread of harassment. Online harassment can take many forms, from name-calling to targeted campaigns. People who experience ‘harassment’ are often uncertain whether what happened to them qualifies as such; the topics of online harassment and hate speech can be highly subjective, depending on an individual’s perception of the two.

II.II. Social Media Community Guidelines

II.II.a. Facebook

Facebook presents itself as a place where people feel empowered to communicate, and it takes its role of keeping abuse off the service very seriously. Its guidelines are broken into six categories; safety and objectionable content are the clauses most important for this project.

Under safety, Facebook acknowledges that bullying and harassment come in different forms, from threats to releasing personally identifiable material, and unwanted malicious content. They believe context and intent matter and have a self-reporting system in place. They also have a bullying prevention hub which is a resource for teenagers, parents, and educators.

Objectionable content covers hate speech, defined as an attack against people based on their characteristics. Hate speech may be shared to raise awareness or for humor, but if the intention isn’t clear the post is taken down. Facebook separates attacks into three tiers of severity, and apart from monitoring the content on the platform it also has a built-in reporting system.

II.II.b. Instagram

Instagram strives to foster a positive and diverse environment by removing content and comments that have possible threats, hate speech, target individuals, degrade them, or shame them. They do allow for stronger conversations to take place. Attacks based on one’s race, ethnicity, nationality, sex, gender, or sexual orientation are not OK.

Instagram doesn’t allow nudity, or photos, videos, or digitally created content that shows sexual intercourse or genitals. Pictures and videos of kids who are nude or partially nude may be taken down due to possible unanticipated usage by others, even if they are shared with the right intentions.

Instagram has a built-in report feature that can be used when you see something that violates guidelines. They have a global team that reviews these reports and removes them as quickly as possible. They may remove comments, imagery, or the entire post associated with the comments.

Instagram is a subsidiary of Facebook. Facebook has more extensive guidelines in place, and there is no clear mention of whether they apply to Instagram or not.

II.II.c. Twitter

Twitter’s purpose is to promote public conversation. Violence and harassment in any form diminish the value of public conversation. Twitter’s rules are to ensure all people can participate in the public conversation freely and safely.

II.III. Where do we draw the line?

Social media platforms have played a huge role in social and political protests. From Occupy Wall Street and #MeToo to Black Lives Matter, it is evident that social media has the potential to create social change. There is also a growing concern about media sexism. While these platforms have guidelines in place, they still suffer from harassment, misogyny, hatred, trolling, and online stalking. This has proliferated to the extent that a recent poll shows 51% of Americans think the First Amendment is outdated and should be rewritten, and 48% believe “hate speech” should be illegal.

Free speech shouldn’t be used as a shield from the social consequences of one’s words. Even though a large percentage of people believe that hate speech should be illegal, we should take into account the subjectivity of the matter. So, where do we draw the line? Humans haven’t been able to reliably distinguish between hate speech and offensive language, nor have they reached a consensus on what constitutes hate speech. The insidious nature of hate speech is that it can take different shapes depending on the context. While opinions on hate speech itself differ, Facebook has decided to control it using machine learning for its detection.

III. Fighting against online harassment & hate speech

III.I. Machine Learning for Hate Speech Detection

Machines aren’t like human beings: their understanding of language is highly mathematical, and although algorithms can classify text, they are highly sensitive to change. A crucial challenge for machine learning is understanding the context in which something is communicated. The algorithms are still immature and susceptible to deception; such systems are easy to evade.

Perspective is a publicly available machine learning model for hate speech detection that uses natural language processing to determine the toxicity of a word, sentence, or paragraph. Google Perspective’s algorithms treat profanity as toxic, so when profanity is used in a non-hateful sentence it is still flagged as violating community guidelines. It also doesn’t handle typos or missing or extra white space, and the aggregate toxicity scores it assigns to sentences aren’t always accurate, producing false positives. These are some of the limitations of the current state of hate speech detection. Nevertheless, Facebook claims that by using machine learning its removal rates for hate speech content have increased, and that the company removes 72 percent of the illegal hate speech on its platform. I acknowledge that by using existing models the prediction of hate speech might not be fully accurate.
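For illustration, a toxicity request to the Perspective API might look like the following sketch; the request shape reflects the public v1alpha1 documentation as best understood here, and the API key is a hypothetical placeholder:

```python
import requests

API_KEY = "YOUR_API_KEY"  # hypothetical placeholder; a real key comes from Google Cloud
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       f"comments:analyze?key={API_KEY}")

payload = {
    "comment": {"text": "you are an idiot"},
    "languages": ["en"],
    "requestedAttributes": {"TOXICITY": {}},
}

response = requests.post(URL, json=payload).json()
score = response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
print(f"toxicity = {score:.2f}")  # probability-like score in [0, 1]
```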

III.II. Project proposal

The project is called ‘Your Enemies Love You’ and what it aims to do is modify messages of hate and harassment into sentences of self-love and endearment on a social platform. The purpose of the project is to start a bigger conversation about the effects of positive and negative communication amongst communities online. “Your enemies love you” is a project born out of several explorations during the 7-in-7s.

Project 1: A hate plug-in was a conceptual idea to create a social media credit system based on usage and reporting. If a person has been reported several times and their posts are flagged as harassment or hate speech, their credit score is reduced. This would inherently increase accountability online.

Project 2: Minor interventions that prompt before posting, notifying the user that a particular message might not adhere to community guidelines. These would also ask the user whether they want to view a comment on their post, or targeted at them, that might not adhere to the guidelines.

The project will be built as a web extension, so users have the option of opting in or not. It will use a sentiment analysis API to recognize hate speech and harassment, then work towards making individuals rethink before posting or viewing deceitful comments, while providing users a history of comments that do not adhere to community guidelines.

The project aims at increasing accountability in these ungoverned spaces, promoting ethical speech and expression.

    1. This is a victim-first approach. The person most affected by online harassment and hate speech is the person being targeted. Intervention 1 introduces a prompt that alerts the individual before viewing a comment and gives them the choice to view the content.
    2. Think before acting. The second intervention is a prompt that runs hate detection before posting, letting the user know that a comment or post may violate community guidelines. This keeps the user from posting something out of haste or anger and gives them a few seconds to reconsider before publishing something that might be hateful (a minimal sketch of this check appears after this list).
    3. Intervention 3 is what increases the accountability of the user. It uses hate speech detection to track the comments and posts of an individual and lists all those that violate the guidelines. This not only increases the accountability of a profile but also helps with the subjectivity of the issue: since humans can't agree on what hate speech is, they now have a list they can refer to and decide for themselves.
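A minimal sketch of the gating logic behind the second intervention, assuming some toxicity scorer is available; the threshold and the toy scorer are hypothetical design choices, not part of the actual extension:

```python
def intervention_prompt(text, score_fn, threshold=0.7):
    """Warn before posting when the text looks hateful.

    score_fn is any toxicity/sentiment scorer returning a value in [0, 1];
    the threshold is a hypothetical design choice, not a fixed rule.
    """
    if score_fn(text) >= threshold:
        return ("This post may violate community guidelines. "
                "Do you still want to publish it?")
    return None  # post goes through without a prompt

toy_scorer = lambda t: 0.9 if "idiot" in t.lower() else 0.1  # stands in for a real API
print(intervention_prompt("you are an idiot", toy_scorer))
```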

IV. Conclusion

IV.I. Significance of the project

I believe this project is important: public speech that expresses hate shouldn't be ignored. The growing awareness of the topic, and studies showing that hate speech has increased worldwide and can have real-life impact, are reason enough to address online harassment and hate speech. This is a sincere effort to reduce online misconduct. At the other end of a computer screen is another human being; it is crucial to see the person behind the screen and not hide behind the anonymity the internet offers. This would make social media safer, more inclusive, and a little more equitable.

IV.II. Limitations of the project

The Chrome extension is not fully built yet. The project was originally supposed to be built on Twitter, but owing to a lack of technical knowledge and the difficulty of developing against Twitter's platform, this had to be foregone. The project aims to use existing hate speech detection models, which are not fully accurate at this date.

Bibliography

    1. “Building a Feminist Data Set for a Feminist AI.” Akademie Schloss Solitude: Schlosspost, 1 Nov. 2017, https://schloss-post.com/building-feminist-data-set-feminist-ai/.
    2. Duggan, Maeve. “Online Harassment 2017.” Pew Research Center: Internet, Science & Tech, Pew Research Center, 4 Jan. 2018, https://www.pewresearch.org/internet/2017/07/11/online-harassment-2017/.
    3. eschulze9. “EU Says Facebook, Google and Twitter Are Getting Faster at Removing Hate Speech Online.” CNBC, CNBC, 4 Feb. 2019, https://www.cnbc.com/2019/02/04/facebook-google-and-twitter-are-getting-faster-at-removing-hate-speech-online-eu-finds–.html.
    4. Matsakis, Louise. “To Break a Hate-Speech Algorithm, Try ‘Love’.” Wired, Conde Nast, 27 Sept. 2018, https://www.wired.com/story/break-hate-speech-algorithm-try-love/.
    5. “Online Hate Speech Predicts Hate Crimes on the Streets.” HateLab, https://hatelab.net/2019/10/14/online-hate-speech-predicts-hate-crimes-on-the-streets/.
    6. Seaquist, Carla. “Free Speech vs. Responsible Speech: We Need to Talk, Again.” HuffPost, 5 Apr. 2015, https://www.huffpost.com/entry/free-speech-vs-responsible-speech_b_6563162.
    7. “The War Against Online Trolls.” The New York Times, The New York Times, 3 Dec. 2014, https://www.nytimes.com/roomfordebate/2014/08/19/the-war-against-online-trolls/free-speech-does-not-protect-cyberharassment.
    8. Gröndahl, Tommi, et al. ‘All You Need Is “Love”: Evading Hate-Speech Detection’. ArXiv:1808.09115 [Cs], Nov. 2018. arXiv.org, http://arxiv.org/abs/1808.09115.