
Kieron Kretschmar

Mr., M.Sc.

PhD student
Stuttgart Research Focus "Interchange Forum for Reflecting on Intelligent Systems"
IRIS3D / AI Safety / Investigating Hawthorne effects in large reasoning models

Contact

Universitätsstraße 32
70569 Stuttgart
Room: 00.118

Vaugrante, Laurène; Niepert, Mathias; Hagendorff, Thilo (2024): A Looming Replication Crisis in Evaluating Behavior in Language Models? Evidence and Solutions. In arXiv:2409.20303, pp. 1–23.

Vaugrante, Laurène; Carlon, Francesca; Menke, Maluna; Hagendorff, Thilo (2025): Compromising Honesty and Harmlessness in Language Models via Deception Attacks. In arXiv:2502.08301, pp. 1–14.
