Hate speech can take many forms; in language it can also be very subtle. Credit: Roberto Schirdewahn
Artificial intelligence is good at identifying swear words. But can it also recognize more hidden forms of linguistic violence?
"Piss off, you bitch!"" I'll get the bum. I'll stab you."" You should all pop them off. "Just a few examples of the form that language can take on social media. People are insulted, threatened or incited to crime. Prof. is interested in what distinguishes hate speech and other forms of damaging language from a linguistic perspective and how you can automatically recognize them. Dr. Tatjana Scheffler. She conducts research at the RUB in the field of digital forensic linguistics.
"Language processing in general has made big leaps in recent years," says Scheffler. Anyone who uses translation programs such as Google Translator or language assistants such as Siri today will achieve significantly better results than a few years ago. The classification of texts is now working quite well. Artificial intelligence algorithms can learn to assign statements to different categories. For example, you can decide whether a text passage contains a direct insult or not. The algorithms learn the categories using large training data sets that people have previously classified. Later they can transfer the knowledge of the learned categories to new data.
Tatjana Scheffler is an expert in digital forensic linguistics. Credit: RUB, Marquard
In the worst case, such a mood can turn into real-world action. A prominent example is the storming of the Capitol by supporters of then US President Donald Trump on January 6, 2021. Social media are held partly responsible for the escalation of the situation.
Telegram chat of Trump supporters analyzed
Tatjana Scheffler's team examined how well existing algorithms can identify harmful language in this data set. To evaluate the algorithms' hit rate, the researchers analyzed about a fifth of the messages by hand and compared their results with those of the automated processes. They distinguished five forms of harmful language.
Five categories of harmful language
The first category covered inciting language, such as "violence is 100,000% justified now". The second included derogatory terms such as "scum" or "retarded". In the third category, the team grouped expressions that are not derogatory in themselves but were meant derogatorily in the context in which they appeared, such as "they are a sickness". A fourth category was devoted to so-called othering: comments used to set one group of people apart from another, as in the example "Are women banned from this chat? If not, why the fuck not?". The last category covered insider formulations that a group of like-minded people uses to set itself apart from others and to strengthen the sense of belonging; Trump supporters, for example, use the term "patriot" in a specific way.
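One compact way to represent such an annotation scheme in code is as a fixed label set. The names below are my own shorthand for the five categories described above, not necessarily the exact tags used in the study.

```python
from enum import Enum

class HarmfulLanguage(Enum):
    """Shorthand labels for the five categories of harmful language
    described above (illustrative naming, not the study's exact tag set)."""
    INCITEMENT = "inciting language"            # e.g. calls for violence
    SLUR = "derogatory term"                    # inherently derogatory words
    CONTEXTUAL_INSULT = "derogatory in context" # e.g. "they are a sickness"
    OTHERING = "us-versus-them framing"         # excluding another group
    INSIDER_TERM = "in-group marker"            # e.g. "patriot" used as a badge
```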
Humans and automated processes compared
The researchers also had the comments coded in this way labeled by established automated processes, of the kind tech companies use to find hate speech or offensive language. A total of 4,505 messages were included in the comparison. Both the researchers and the automated processes classified 3,395 of them as not harmful, and for 275 they agreed that the messages contained harmful language. For 835 messages, however, humans and machines judged differently: about half of these the algorithms incorrectly classified as hate speech or insults; the rest, unlike the researchers, they failed to recognize as harmful language.
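From the reported counts, the overall rate of agreement between the human annotators and the automated tools can be worked out directly. A back-of-the-envelope calculation:

```python
# Back-of-the-envelope agreement rate from the counts reported above.
agree_not_harmful = 3395   # both humans and machines: not harmful
agree_harmful = 275        # both humans and machines: harmful
disagree = 835             # human and machine labels differ
total = agree_not_harmful + agree_harmful + disagree   # 4,505 messages

agreement = (agree_not_harmful + agree_harmful) / total
print(f"{agreement:.1%}")  # roughly 81% of messages received the same label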
The automated processes were often wrong, particularly when it came to inciting comments, insider terms and othering. "Seeing in which cases established methods make mistakes helps us to build better algorithms in the future," summarizes Tatjana Scheffler. With her team, she is also developing automated processes intended to recognize harmful language even more reliably. This requires better training data for the artificial intelligence on the one hand, and optimization of the algorithms themselves on the other. This is where linguistics comes into play again: "Certain grammatical structures can indicate, for example, that a term is meant to be derogatory," explains Scheffler. "If I say 'you leek', that is different from just saying 'leek'."
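One way such a grammatical cue could be turned into a feature is to check whether a noun is used as a direct second-person form of address, as in the "you leek" example. The sketch below uses a crude regular expression purely for illustration; a real system would rely on proper syntactic analysis, and this is not the team's actual approach.

```python
import re

# Rough sketch: flag words used as a direct second-person address
# ("you leek", "you scum"), which can hint at derogatory use.
ADDRESS_PATTERN = re.compile(r"\byou\s+(\w+)\b", re.IGNORECASE)

def second_person_address_terms(text: str) -> list[str]:
    """Return words that appear immediately after 'you' as a form of address."""
    return ADDRESS_PATTERN.findall(text)

print(second_person_address_terms("When I say 'you leek' it is different."))
# ['leek']
```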
Improving the algorithms
Tatjana Scheffler is looking for such linguistic features in order to feed the next generation of algorithms with additional background knowledge. Contextual information could also help the machines find harmful language. Who made the comment? Has that person previously made derogatory remarks about others? Who is being addressed: a politician or a journalist? These groups are particularly often the target of verbal attacks. Such information could further increase the hit rate of artificial intelligence.
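Such context could be supplied to a classifier as extra features alongside the text itself. The sketch below uses invented feature names to show the idea; which signals actually help, and how they are best encoded, is exactly the kind of open question the research addresses.

```python
# Sketch with invented feature names: encoding non-textual context that
# might make a harmful reading more or less likely (illustrative only).
def context_features(author_prior_flags: int, addressee_role: str) -> dict:
    """Turn contextual information about a comment into classifier features."""
    return {
        "author_prior_flags": author_prior_flags,  # earlier derogatory comments
        "addressee_is_politician": addressee_role == "politician",
        "addressee_is_journalist": addressee_role == "journalist",
    }

print(context_features(author_prior_flags=2, addressee_role="journalist"))
```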
Tatjana Scheffler is convinced that the problem of harmful language cannot be tackled without machine support: the volume of comments is simply too large for people to review and evaluate them all. "But it won't work without human expertise either," the researcher points out, because there will always be cases in which the machines are wrong or unsure.
Original publication:
Tatjana Scheffler, Veronika Solopova, Mihaela Popa-Wyatt: The Telegram chronicles of online harm, in: Journal of Open Humanities Data
Source/Credit: Ruhr University Bochum