Vulnhuntr’s innovative static code analysis reveals over a dozen zero-day vulnerabilities in popular open-source AI projects on GitHub, raising significant security alarms.
Discovery of Zero-Day Vulnerabilities in Open-Source AI Projects Using Vulnhuntr
In a startling revelation, Vulnhuntr, a cutting-edge static code analyser leveraging large language models (LLMs), has uncovered over a dozen zero-day vulnerabilities in prominent open-source AI projects hosted on GitHub. These projects, each boasting more than 10,000 stars, were examined over the span of just a few hours, highlighting a crucial issue within the realm of software security.
The vulnerabilities detected encompass a range of serious security threats including Local File Inclusion (LFI), Cross-Site Scripting (XSS), Server-Side Request Forgery (SSRF), Remote Code Execution (RCE), Insecure Direct Object Reference (IDOR), and Arbitrary File Overwrite (AFO). This discovery not only underscores the potential risks inherent in widely used software but also showcases the efficacy of advanced tools in identifying such hidden threats.
Vulnhuntr’s success in identifying these vulnerabilities stems from its advanced use of LLMs to scrutinise Python codebases. The security tool circumvents the traditional limitations posed by the context window size within LLMs by fragmenting the code into smaller segments, which are then examined systematically. This innovative approach enables Vulnhuntr to reconstruct the entire call chain from user input to server output, enhancing the detection of vulnerabilities through prompt engineering techniques designed to guide the LLM towards a thorough analysis.
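The fragment-and-walk idea can be sketched in a few lines. The snippet below is a minimal illustration, not Vulnhuntr’s actual implementation: it assumes a hypothetical toy source file and uses Python’s `ast` module to split a module into per-function fragments, then walks caller-to-callee links so each fragment could be shown to an LLM in turn without exceeding a context window.

```python
import ast

# Hypothetical toy codebase: user input flows from handler() into a file sink.
SOURCE = '''
def handler(request):
    path = request["path"]
    return read_file(path)

def read_file(path):
    with open(path) as f:
        return f.read()
'''

def fragment_functions(source):
    """Split a module into per-function source fragments plus the names of
    the functions each fragment calls, so a call chain (user input -> sink)
    can be walked one fragment at a time."""
    tree = ast.parse(source)
    fragments = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            calls = [n.func.id for n in ast.walk(node)
                     if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)]
            fragments[node.name] = {
                "source": ast.get_source_segment(source, node),
                "calls": calls,
            }
    return fragments

def walk_chain(fragments, entry):
    """Follow callees breadth-first from an entry point, yielding the order
    in which fragments would be presented for analysis."""
    seen, queue, chain = set(), [entry], []
    while queue:
        name = queue.pop(0)
        if name in seen or name not in fragments:
            continue
        seen.add(name)
        chain.append(name)
        queue.extend(fragments[name]["calls"])
    return chain

frags = fragment_functions(SOURCE)
print(walk_chain(frags, "handler"))  # ['handler', 'read_file']
```

In a real analysis each fragment’s source would be inserted into a prompt alongside the chain gathered so far; here the chain is simply printed.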
At present, Vulnhuntr concentrates its capabilities on Python and specific types of vulnerabilities, offering a marked improvement over conventional static code analysers. Its methodology also reduces the frequency of false positives and negatives, making it a valuable asset for developers seeking to fortify their projects against complex multi-step vulnerabilities.
In their investigative endeavour, researchers explored using Retrieval-Augmented Generation (RAG) and fine-tuning LLMs to trace vulnerability call chains within code. However, RAG’s accuracy was compromised by ambiguous function names, whereas fine-tuning the models resulted in a high incidence of false positives and difficulties in handling multi-file vulnerabilities. Static parsing also faced challenges, particularly with dynamically typed languages like Python, due to runtime modifications and inherent limitations of static analysis tools.
To address these challenges, the solution involved supplying the LLM with the specific line of code where a function is called, alongside the function name, which aids in pinpointing files and function locations more effectively within a project, enhancing call chain precision.
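A rough sketch of why the call-site line helps, under invented assumptions (the two-file `PROJECT` index and the `resolve_definition` helper are hypothetical, not Vulnhuntr’s code): when two files define a function with the same name, the name alone is ambiguous, but the module path visible on the calling line can pick the right file.

```python
import re

# Hypothetical project index: file path -> file contents. Both files
# define a function named save(), so the bare name is ambiguous.
PROJECT = {
    "routes/upload.py": "def save(data):\n    ...\n",
    "storage/s3.py": "def save(data):\n    ...\n",
}

def resolve_definition(func_name, call_line, project):
    """Find candidate definitions of func_name; use module-path hints on
    the call-site line (e.g. 'storage.s3.save(...)') to disambiguate."""
    pattern = rf"^def {re.escape(func_name)}\("
    candidates = [path for path, src in project.items()
                  if re.search(pattern, src, re.M)]
    if len(candidates) == 1:
        return candidates[0]
    for path in candidates:
        module = path[:-3].replace("/", ".")  # 'storage/s3.py' -> 'storage.s3'
        if module in call_line:
            return path
    return candidates[0] if candidates else None

print(resolve_definition("save", "storage.s3.save(payload)", PROJECT))
# storage/s3.py
```

Without the call-site line, a name-only lookup would have to guess between the two candidates; with it, the lookup lands on the correct file, which is the precision gain the article describes.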
Among the discovered vulnerabilities, a notable Remote Code Execution (RCE) flaw was highlighted. This vulnerability could allow attackers to execute arbitrary code due to insufficient input validation within a custom component’s functionality. Additionally, Server-Side Request Forgery (SSRF) vulnerabilities were identified, where improperly sanitised user-controlled URLs could potentially grant access to internal resources.
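For the SSRF class specifically, the usual mitigation is to validate user-controlled URLs before fetching them. The check below is a generic illustrative sketch, not taken from any of the affected projects; the allowlisted host is invented for the example.

```python
import ipaddress
from urllib.parse import urlparse

ALLOWED_HOSTS = {"api.example.com"}  # hypothetical allowlist for this sketch

def is_safe_url(url):
    """Reject URLs that could reach internal services (a common SSRF
    mitigation): require http(s), reject private/loopback/link-local IP
    literals, and require an allowlisted hostname."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        ip = ipaddress.ip_address(parsed.hostname)
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            return False
    except ValueError:
        pass  # hostname is a DNS name, not an IP literal
    return parsed.hostname in ALLOWED_HOSTS

print(is_safe_url("http://169.254.169.254/meta-data"))   # False
print(is_safe_url("https://api.example.com/v1/items"))   # True
```

A production check would also resolve DNS names and re-validate the resulting addresses, since a hostname can point at an internal IP; the sketch keeps to literal-IP and allowlist checks for brevity.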
Moreover, an Insecure Direct Object Reference (IDOR) vulnerability within a PUT endpoint was found to permit unauthorised modifications to messages. LFI and AFO vulnerabilities were noted in relation to inadequate filename sanitisation during file uploads. The presence of Cross-Site Scripting (XSS) vulnerabilities, resulting from a lack of output encoding and the handling of user-controlled content, was also documented, presenting a significant security risk.
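The LFI and AFO findings both trace back to trusting user-supplied filenames. A minimal sanitisation sketch, assuming a hypothetical upload directory (this is a generic pattern, not code from the affected projects): strip directory components, then confirm the resolved path stays inside the upload directory.

```python
import os

UPLOAD_DIR = "/srv/app/uploads"  # hypothetical upload directory

def safe_upload_path(filename):
    """Strip directory components from a user-supplied filename and verify
    the final path stays inside UPLOAD_DIR, blocking traversal sequences
    such as '../../etc/passwd'."""
    name = os.path.basename(filename.replace("\\", "/"))
    if not name or name in (".", ".."):
        raise ValueError("empty or invalid filename")
    root = os.path.realpath(UPLOAD_DIR)
    path = os.path.realpath(os.path.join(root, name))
    if os.path.commonpath([path, root]) != root:
        raise ValueError("path escapes upload directory")
    return path

print(safe_upload_path("../../etc/passwd"))  # /srv/app/uploads/passwd
```

The traversal prefix is discarded rather than rejected here; a stricter variant could refuse any filename containing a path separator outright, which is often preferable for audit logging.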
Vulnhuntr assigns confidence scores ranging from 1 to 10 to its findings, indicating the probability of an identified issue being a genuine vulnerability. As LLMs continue to evolve and gain power, the reliance on static code parsing might diminish, yet an emphasis on the call chain between user input and server output is anticipated to continue enhancing the accuracy of vulnerability detection.
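In practice a team would triage such scored findings by confidence. The snippet below is a hypothetical sketch of that workflow (the findings list and the threshold of 7 are invented for illustration, not Vulnhuntr output):

```python
# Hypothetical scanner report: each finding carries a 1-10 confidence score.
findings = [
    {"type": "RCE",  "file": "server.py", "confidence": 9},
    {"type": "XSS",  "file": "views.py",  "confidence": 4},
    {"type": "SSRF", "file": "fetch.py",  "confidence": 7},
]

def triage(findings, threshold=7):
    """Keep findings at or above the confidence threshold, sorted so the
    most likely true positives are reviewed first."""
    likely = [f for f in findings if f["confidence"] >= threshold]
    return sorted(likely, key=lambda f: -f["confidence"])

for f in triage(findings):
    print(f["confidence"], f["type"], f["file"])
```

Lower-scored findings are not discarded outright; they would typically be queued for a slower manual pass.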
The findings illuminated by Vulnhuntr and partnered experts bring to light vital security concerns within popular AI projects, emphasising the importance of ongoing vigilance and improvement in software security practices.
Source: Noah Wire Services


