JudgeRail: Harnessing Open-Source LLMs for Fast Harmful Text Detection with Judicial Prompting and Logit Rectification
Under review, 2025
Recommended citation: Ba, Z., Fu, H., Yang, Y., Chen, H., Wang, Q., Cheng, P., Qin, Z., Ren, K. (2025). "JudgeRail: Harnessing Open-Source LLMs for Fast Harmful Text Detection with Judicial Prompting and Logit Rectification." *Under Review*.