Abstract
Artificial intelligence (AI) is rapidly advancing scientific discovery, but this progress carries risks of misuse, such as the creation of harmful substances or the circumvention of established regulations. In this paper, we first demonstrate these risks with real-world examples of AI misuse in chemical science, which underscore the need for effective safety alignment of scientific AI models. In response, we propose SciGuard, an agent-based guardrail that employs large language models, tools, and external knowledge to assess and control risks in scientific AI interactions. To enable a fair comparison, we introduce SciMT (Scientific Multi-Task), a benchmark that assesses both the safety and the utility of different AI systems. SciGuard achieves a state-of-the-art harmlessness score on red-teaming queries while maintaining high performance on benign tasks, without sacrificing scientific knowledge. Finally, we call for continued research and dialogue to ensure the safe deployment of AI in science.
