Conference paper
AI in the Gray: Exploring Moderation Policies in Dialogic Large Language Models vs. Human Answers in Controversial Topics [Poster]
Author(s) | Ghafouri, Vahid; Agarwal, Vibhor; Zhang, Yong; Sastry, Nishanth; Such, José; Suarez Tangil, Guillermo |
Coordinator/Director | Varela Vaca, Ángel Jesús; Ceballos Guerrero, Rafael; Reina Quintero, Antonia María |
Publication date | 2024 |
Deposit date | 2024-08-26 |
Published in | Jornadas Nacionales de Investigación en Ciberseguridad (JNIC) (9th, 2024, Sevilla), pp. 486-487 |
ISBN/ISSN | 978-84-09-62140-8 |
Abstract | The increasing sophistication of Large Language Models (LLMs), particularly ChatGPT, has revolutionized how users interact with information and make decisions. However, when addressing controversial topics without universally accepted answers, such as religion, gender identity, or freedom of speech, these models face the challenge of potential bias. Biased responses in these complex domains can amplify misinformation, fuel harmful ideologies, and undermine trust in AI systems. This paper investigates the biases embedded within LLMs like ChatGPT when responding to controversial questions. We use the Kialo social debate platform as a benchmark, comparing AI-generated responses to human discussions. Our analysis reveals significant progress in reducing explicit biases in recent ChatGPT versions. However, residual implicit biases, including subtle right-wing leanings, call for further moderation. These findings hold substantial cybersecurity implications, emphasizing the need to mitigate the spread of misinformation or the promotion of extremist viewpoints through AI-powered systems. |
Citation | Ghafouri, V., Agarwal, V., Zhang, Y., Sastry, N., Such, J. and Suarez Tangil, G. (2024). AI in the Gray: Exploring Moderation Policies in Dialogic Large Language Models vs. Human Answers in Controversial Topics [Poster]. In Jornadas Nacionales de Investigación en Ciberseguridad (JNIC) (9th, 2024, Sevilla) (pp. 486-487), Sevilla: Universidad de Sevilla. Escuela Técnica Superior de Ingeniería Informática. |
Files | Size | Format | View | Description |
---|---|---|---|---|
JNIC24_504.pdf | 1.152Mb | [PDF] | View | |