More than 35,000 Java packages, amounting to over 8% of the Maven Central repository (the most significant Java package repository), have been impacted by the recently disclosed log4j vulnerabilities (1, 2), with widespread fallout across the software industry. The vulnerabilities allow an attacker to perform remote code execution by exploiting the insecure JNDI lookups feature exposed by the logging library log4j. This exploitable feature was enabled by default in many versions of the library.
As far as ecosystem impact goes, 8% is enormous. The average ecosystem impact of advisories affecting Maven Central is 2%, with the median less than 0.1%.
Direct dependencies account for around 7,000 of the affected artifacts, meaning that any of its versions depend upon an affected version of log4j-core or log4j-api, as described in the CVEs. The majority of affected artifacts come from indirect dependencies (that is, the dependencies of one’s own dependencies), meaning log4j is not explicitly defined as a dependency of the artifact, but gets pulled in as a transitive dependency.
We counted an artifact as fixed if the artifact had at least one version affected and has released a greater stable version (according to semantic versioning) that is unaffected. An artifact affected by log4j is considered fixed if it has updated to 2.16.0 or removed its dependency on log4j altogether.
At the time of writing, nearly five thousand of the affected artifacts have been fixed. This represents a rapid response and mammoth effort both by the log4j maintainers and the wider community of open source consumers.
That leaves over 30,000 artifacts affected, many of which are dependent on another artifact to patch (the transitive dependency) and are likely blocked.
Why is fixing the JVM ecosystem hard?
Another difficulty is caused by ecosystem-level choices in the dependency resolution algorithm and requirement specification conventions.
In the Java ecosystem, it’s common practice to specify “soft” version requirements — exact versions that are used by the resolution algorithm if no other version of the same package appears earlier in the dependency graph. Propagating a fix often requires explicit action by the maintainers to update the dependency requirements to a patched version.
This practice is in contrast to other ecosystems, such as npm, where it’s common for developers to specify open ranges for dependency requirements. Open ranges allow the resolution algorithm to select the most recently released version that satisfies dependency requirements, thereby pulling in new fixes. Consumers can get a patched version on the next build after the patch is available, which propagates up the dependencies quickly. (This approach is not without its drawbacks; pulling in new fixes can also pull in new problems.)
How long will it take for this vulnerability to be fixed across the entire ecosystem?
But things are looking promising on the log4j front. After less than a week, 4,620 affected artifacts (~13%) have been fixed. This, more than any other stat, speaks to the massive effort by open source maintainers, information security teams and consumers across the globe.
Thanks and congratulations are due to the open source maintainers and consumers who have already upgraded their versions of log4j. As part of our investigation, we pulled together a list of 500 affected packages with some of the highest transitive usage. If you are a maintainer or user helping with the patching effort, prioritizing these packages could maximize your impact and unblock more of the community.
We encourage the open source community to continue to strengthen security in these packages by enabling automated dependency updates and adding security mitigations. Improvements such as these could qualify for financial rewards from the Secure Open Source Rewards program.
You can explore your package dependencies and their vulnerabilities by using Open Source Insights.