Centris, a new tool developed by a global team of researchers from Korea University and the Georgia Institute of Technology, is designed to make the reuse of open source software components more manageable and secure.
Centris uses a novel approach to track open source components in software projects, even when only part of a component is integrated or its structure has been modified. It has already managed to root out old vulnerabilities in hundreds of GitHub projects, the developers say.
The DevSecOps-friendly tool was introduced in a paper on the arXiv preprint server earlier this month and will be presented at the International Conference on Software Engineering (ICSE) later this year.
The challenges of managing open source software
The reuse of open source software (OSS) has many advantages, including cutting development time and opening applications up to public scrutiny, which can help improve security.
OSS reuse, however, does have its own challenges, especially when components are not used in their original form.
Many programs integrate part of an OSS project or use OSS components in a nested format, where one component contains part of another open source project.
To complicate matters further, some developers change the file names and directory hierarchy of the open source projects they integrate into their code. All this makes it hard to keep track of changes in OSS components.
“We discovered that modified OSS reuse accounts for 95% of the total OSS reuse in the popular OSS ecosystem,” Seunghoon Woo, lead author of the Centris paper, told The Daily Swig.
Traditional tools for managing OSS elements in software projects generally miss modified components because they assume the code is being used in its original form. Tools that rely on code clone detection techniques, on the other hand, generate too many false positives.
“Approaches that considered only unmodified OSS components resulted in missing many modified components (i.e., low recall), or misinterpreted that an OSS, which was actually not reused, was a component (i.e., low precision),” Woo said.
Open source security issues
Losing track of OSS dependencies can quickly become a serious security problem. When a vulnerability crops up in an untracked OSS component, it tends to remain in an application for a long time.
For instance, the researchers found that Godot Engine, a GitHub project with more than 36,000 stars, was reusing a single file from an open source JPEG-compressor that had a vulnerability with a 7.8 CVSS score dating back to 2017.
According to the researchers, the exploit “could be reproduced by simply uploading a malicious image file to the Godot project”. And since Godot was using a single file from the JPEG-compressor project, OSS dependency trackers didn’t spot the dependency and vulnerability.
“As another example, NMAP reused PCRE with modification, and this modified PCRE has not been properly managed and updated for over 10 years,” Woo said.
Centris: A new way to detect OSS modification
Centris has a component database, which is composed of functions extracted from more than 10,000 GitHub repositories and spanning more than 80 billion lines of code.
All versions of the projects are processed and distilled to eliminate redundancies and minimize the required space to store the functions.
Centris uses this database to spot reused OSS functions and their respective versions in target projects. This granular approach allows Centris to identify OSS components in software projects regardless of whether all or parts of the codebase are reused.
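The function-level idea can be illustrated with a minimal Python sketch. It is not the researchers' implementation: the exact-match SHA-256 fingerprint, the comment and whitespace normalization, the `threshold` parameter, and the names `build_index` and `detect_reuse` are all assumptions made for illustration, and version tracking is left out entirely.

```python
import hashlib
import re


def normalize(body: str) -> str:
    """Drop C-style comments and all whitespace so cosmetic edits do not change the hash."""
    body = re.sub(r"/\*.*?\*/", "", body, flags=re.DOTALL)  # block comments
    body = re.sub(r"//[^\n]*", "", body)                     # line comments
    return re.sub(r"\s+", "", body)


def fingerprint(body: str) -> str:
    """Fingerprint one function body (assumption: exact hash of the normalized text)."""
    return hashlib.sha256(normalize(body).encode("utf-8")).hexdigest()


def build_index(components: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each component name to the set of its function fingerprints."""
    return {
        name: {fingerprint(func) for func in funcs}
        for name, funcs in components.items()
    }


def detect_reuse(
    target_funcs: list[str],
    index: dict[str, set[str]],
    threshold: float = 0.1,  # hypothetical cut-off, not a value from the article
) -> dict[str, float]:
    """Report components whose share of functions found in the target meets the threshold."""
    target = {fingerprint(func) for func in target_funcs}
    hits = {}
    for name, prints in index.items():
        if not prints:
            continue
        ratio = len(target & prints) / len(prints)
        if ratio >= threshold:
            hits[name] = ratio
    return hits
```

Because the comparison works on individual functions rather than files or directory layouts, a project that copies a single function from a component, renames the file, and reformats the code would still be matched in this sketch.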
According to the researchers, Centris can identify reused OSS components with 91% precision and 94% recall, even when modified OSS reuse is prominent.
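For readers unfamiliar with the two metrics, here is a short sketch of how they are conventionally computed. The function name and the example inputs are hypothetical and are not taken from the paper's evaluation.

```python
def precision_recall(reported: set[str], actual: set[str]) -> tuple[float, float]:
    """Compare the components a tool reports against those actually reused."""
    true_positives = len(reported & actual)
    # Precision: the share of reported components that are really reused.
    precision = true_positives / len(reported) if reported else 0.0
    # Recall: the share of really reused components that the tool reported.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical example: four components reported, five actually reused, three in common.
print(precision_recall({"a", "b", "c", "d"}, {"a", "b", "c", "e", "f"}))
```

A tool that only recognizes unmodified components tends to lose recall, while one that flags any loosely similar code tends to lose precision.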
“Centris discovered that 572 OSS projects contain at least one other vulnerable OSS component. Among them, 27 OSS projects are still reusing the vulnerable OSS in their latest version,” the researchers wrote in their paper.
Improving vulnerability detection
Woo told The Daily Swig that Centris can be an effective way to tackle vulnerability propagation and license violations.
It can also help mitigate software supply chain attacks, in which hackers spread malicious payloads through legitimate software distribution channels.
For instance, if attackers manage to upload malicious code into an OSS repository, that code will propagate to all software projects that depend on it.
If developers clearly identify and manage the components being reused in their software, a goal that Centris pursues, they will be able to cope with supply chain attacks much faster and more efficiently, Woo says.
Doubling down
In the future, the researchers are considering adding more security features to Centris. “We are considering a way to provide alerts when a new vulnerability is detected (this can be checked through crawling public vulnerability database information) in components identified by Centris,” Woo says.
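An alerting step of this kind might look roughly like the sketch below. It assumes the identified components are available as a name-to-version mapping and that advisories have already been collected from a public database. The `Advisory` type, the `find_alerts` function, and the placeholder identifiers are hypothetical and do not describe how Centris will work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    """One entry from a vulnerability feed (hypothetical shape)."""
    advisory_id: str
    component: str
    affected_versions: frozenset[str]


def find_alerts(identified: dict[str, str], advisories: list[Advisory]) -> list[str]:
    """Return an alert for each advisory that affects an identified component version."""
    alerts = []
    for advisory in advisories:
        version = identified.get(advisory.component)
        if version is not None and version in advisory.affected_versions:
            alerts.append(
                f"{advisory.advisory_id}: {advisory.component} {version} is affected"
            )
    return alerts


# Placeholder data for illustration only.
identified = {"jpeg_lib": "1.0", "regex_lib": "7.6"}
advisories = [
    Advisory("EXAMPLE-0001", "jpeg_lib", frozenset({"1.0", "1.1"})),
    Advisory("EXAMPLE-0002", "regex_lib", frozenset({"8.0"})),
]
for alert in find_alerts(identified, advisories):
    print(alert)
```

The check is only as good as the component list it starts from, which is why identifying partially reused and modified components comes first.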
The researchers also plan to combine Centris with VUDDY, another vulnerable code detection method they developed in 2017. This will enable developers to resolve vulnerability propagation problems more efficiently, Woo says.
Finally, the researchers are planning to integrate Centris with a soon-to-launch vulnerability detection platform.
“We plan to provide a public open web service of Centris for free in an automated vulnerability analysis platform soon, so that anyone can identify the components in their software freely to eliminate the potential threats before using open-source software,” Woo says.