Abstracts
“Feels Feminine to Me”: Understanding Perceived Gendered Style through Human Annotations
Authors: Hongyu Chen, Neele Falk, Michael Roth, Agnieszka Falenska
In most NLP tasks, language–gender associations are grounded in the author's gender identity, inferred from their language use. However, this identity-based framing risks reinforcing stereotypes and marginalizing individuals who do not conform to normative gender–language associations. To address this, we operationalize the language–gender association as the perceived gender expression of language, focusing on how such expression is externally interpreted by humans, independent of the author's gender identity. We present the first dataset of its kind: 3,100 human annotations of perceived gendered style, i.e., human-written texts rated on a five-point scale from very feminine to very masculine. While perception is inherently subjective, our analysis identifies textual features, namely expressive syntactic structures and lower emotional intensity, that are associated with higher agreement among annotators. Moreover, annotator demographics also influence perception; for example, women annotators are more likely to label texts as feminine. Feature analysis further highlights that perceived gendered style is shaped by both affective and structural properties of text; for example, neutral style is characterized by moderate emotional intensity. Our findings lay the groundwork for operationalizing gendered style through human annotation, while also highlighting the inherent subjectivity involved in this process.
AI Argues Differently: Distinct Argumentative and Linguistic Patterns of LLMs in Persuasive Contexts
Authors: Esra Dönmez, Maximilian Maurer, Gabriella Lapesa, Agnieszka Falenska
Distinguishing LLM-generated text from human-written text is a key challenge for safe and ethical NLP, particularly in high-stakes settings such as persuasive online discourse. While recent work focuses on detection, real-world use cases also demand interpretable tools that help humans understand and distinguish LLM-generated texts. To this end, we present an analysis framework that compares human- and LLM-generated arguments using two easily interpretable feature sets: general-purpose linguistic features (e.g., lexical richness, syntactic complexity) and domain-specific features related to argument quality (e.g., logical soundness, engagement strategies). Applied to /r/ChangeMyView arguments written by humans and three LLMs, our method reveals clear patterns: LLM-generated counter-arguments show lower type-token and lemma-token ratios but higher emotional intensity, particularly in anticipation and trust. They more closely resemble textbook-quality arguments: cogent, justified, explicitly respectful toward others, and positive in tone. Moreover, counter-arguments generated by LLMs converge more closely with the original post's style and quality than those written by humans. Finally, we demonstrate that these differences enable a lightweight, interpretable, and highly effective classifier for detecting LLM-generated comments in CMV.