JudgeRail: Harnessing Open-Source LLMs for Fast Harmful Text Detection with Judicial Prompting and Logit Rectification

Under review, 2025

Recommended citation: Zhongjie Ba, Hongye Fu, Yiqi Yang, Hongcheng Chen, Qinglong Wang, **Peng Cheng**, Zhan Qin, Kui Ren. (2025). "JudgeRail: Harnessing Open-Source LLMs for Fast Harmful Text Detection with Judicial Prompting and Logit Rectification." *Under Review*.