A Decidedly Non-Toxic Research Visit to Sheffield University
Those of us who are involved in Natural Language Processing (NLP) research are likely aware of the General Architecture for Text Engineering (GATE) toolkit that is maintained by the University of Sheffield. Back when I was just finishing my undergraduate degree, GATE was my introduction to the field of NLP research. I therefore have very fond associations with the GATE toolkit, and enormous respect for the research behind it. When the opportunity to pursue a research project with these academics presented itself through the SoBigData++ TNA, I jumped at the chance.
These days I work on the detection of criminal hate speech online. This is an ethically challenging area to work in for a variety of reasons. Labelling content as hateful is a serious accusation against the original poster, so when we automate the process of classifying content into these categories, it is important to be precise. Training high-quality hate speech classifiers requires data, and that data must come from somewhere. Often it is content published online through platforms such as WhatsApp, Twitter, and Telegram.
I would highlight three major potential issues with hate speech datasets gathered in this manner and their use for training hate speech classifiers:
- Classifiers will generally learn that a given word or phrase is considered hateful. In some cases, targeted communities adopt hateful phrases to describe themselves in an effort to reclaim such terms. If a classifier is not trained with enough context to understand the nuance of how a term is being used, members of the targeted community can find their own posts blocked, making it harder for them to fight back against hate.
- Hate speech datasets are often harvested by searching for relevant terms. This means they can lack examples of subtler toxicity, which classifiers trained on them then fail to detect.
- Gathering hate speech datasets from online platforms can erode people’s rights to privacy.
My proposal is therefore to establish practices for generating synthetic hate speech data which can be used to augment real-world data. This will improve the quality of classifiers that we can train and enhance our ability to deal with the ethical challenges around respecting people’s rights to privacy.
The researchers at the University of Sheffield are experts both in the various applications of GenAI and in the detection of criminal hate speech. From my first day on the university campus I was engaged in deep, insightful conversations about how we could use GenAI to produce the kind of content my work required. We were able to establish that steering (a process of influencing a model's output by directly intervening on the internal activations of the transformer network) was not an appropriate approach for generating the kind of outputs we needed. Rather, Retrieval-Augmented Generation (RAG) seemed to be a more appropriate solution; a sketch of such a pipeline follows below.
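To make the contrast concrete, here is a minimal sketch of what a RAG-style pipeline for persona-conditioned synthetic text could look like. The encoder model, the prompt template, and the `generate()` wrapper are illustrative assumptions, not the exact pipeline we built at Sheffield.

```python
# Minimal RAG sketch: retrieve relevant context passages, then condition the
# generator on a persona plus that context. All names here are illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed retrieval encoder

def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    """Return the k corpus passages most similar to the query."""
    corpus_emb = encoder.encode(corpus, normalize_embeddings=True)
    query_emb = encoder.encode([query], normalize_embeddings=True)[0]
    scores = corpus_emb @ query_emb  # cosine similarity on normalised vectors
    top = np.argsort(scores)[::-1][:k]
    return [corpus[i] for i in top]

def build_prompt(persona: str, topic: str, context: list[str]) -> str:
    """Assemble a persona-conditioned prompt from the retrieved passages."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        f"You are role-playing the following persona: {persona}\n"
        f"Reference material:\n{joined}\n"
        f"Write a short social-media post about: {topic}"
    )

def generate(prompt: str) -> str:
    """Placeholder for whichever LLM is actually used; project-specific."""
    raise NotImplementedError("plug in the chosen LLM here")
```

The design point is simply that the generator is steered by what it is shown at inference time (the persona and the retrieved passages), rather than by editing the model's internals.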
The generation of personas (identities that the model would embody) presented an interesting ethical challenge as we sought to remove our own biases from the design of synthetic individuals. We identified several ethical issues around how these personas should be assembled, and the resulting conversation was a deep and fruitful one. When it comes to GenAI, we are to some degree writing the ethical guidelines as we work. I am delighted that at Sheffield I was able to have these ethical conversations, and happier still to report that they are ongoing and continue to yield useful research outputs.
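One way to make the bias question tangible is to treat a persona as an explicit, auditable record whose attributes are sampled from declared value sets, so that every design choice is visible and open to challenge. The sketch below is purely illustrative; the attribute names and values are assumptions, not the scheme we settled on.

```python
# Illustrative only: personas as explicit records with attributes drawn from
# declared value sets, so the sampling procedure itself can be reviewed.
import random
from dataclasses import dataclass

@dataclass
class Persona:
    age_band: str
    region: str
    interests: list[str]

ATTRIBUTE_SPACE = {  # hypothetical attribute space, open to scrutiny
    "age_band": ["18-24", "25-34", "35-49", "50+"],
    "region": ["urban", "suburban", "rural"],
    "interests": ["sport", "gaming", "politics", "music", "cooking"],
}

def sample_persona(rng: random.Random) -> Persona:
    """Draw each attribute from its declared value set rather than writing
    personas ad hoc, keeping the designer's choices explicit."""
    return Persona(
        age_band=rng.choice(ATTRIBUTE_SPACE["age_band"]),
        region=rng.choice(ATTRIBUTE_SPACE["region"]),
        interests=rng.sample(ATTRIBUTE_SPACE["interests"], k=2),
    )
```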
Speaking briefly about Sheffield itself, I can honestly say that I have fallen in love with the city. The people are friendly and warm. I had wonderful conversations with everyone I met, from the people I worked with, to the people I lived with, to the random people I encountered on the streets, in the climbing centres, and at the various cafés dotted around. In Sheffield I found an environment where I could discuss and share my ideas. I found an environment where I felt welcome. I found an environment to which I very much hope to return one day.