GPT-4 Can Exploit Real Vulnerabilities By Reading Security Advisories
AI agents, which combine large language models with automation software, can successfully exploit real-world security vulnerabilities by reading security advisories, academics have claimed.
In a newly released paper, four University of Illinois Urbana-Champaign (UIUC) computer scientists — Richard Fang, Rohan Bindu, Akul Gupta, and Daniel Kang — report that OpenAI’s GPT-4 large language model (LLM) can autonomously exploit vulnerabilities in real-world systems if given a CVE advisory describing the flaw. “To show this, we collected a dataset of 15 one-day vulnerabilities that include ones categorized as critical severity in the CVE description,” the US-based authors explain in their paper. “When given the CVE description, GPT-4 is capable of exploiting 87 percent of these vulnerabilities compared to 0 percent for every other model we test (GPT-3.5, open-source LLMs) and open-source vulnerability scanners (ZAP and Metasploit)….”
The researchers’ work builds upon prior findings that LLMs can be used to automate attacks on websites in a sandboxed environment. GPT-4, said Daniel Kang, assistant professor at UIUC, in an email to The Register, “can actually autonomously carry out the steps to perform certain exploits that open-source vulnerability scanners cannot find (at the time of writing).”
The researchers wrote: “Our vulnerabilities span website vulnerabilities, container vulnerabilities, and vulnerable Python packages. Over half are categorized as ‘high’ or ‘critical’ severity by the CVE description….”
The Register reports: “Kang and his colleagues computed the cost to conduct a successful LLM agent attack and came up with a figure of $8.80 per exploit.”