Showing content from https://github.com/pmd/pmd/issues/2704 below:

[core] Analysis cache classpath checksum should check ABI and not jar binary blob · Issue #2704 · pmd/pmd · GitHub

The current implementation hashes the complete jars on the classpath, in order:

    private long computeClassPathHash(final URL... classpathEntry) {
        final Adler32 adler32 = new Adler32();
        for (final URL url : classpathEntry) {
            try (CheckedInputStream inputStream = new CheckedInputStream(url.openStream(), adler32)) {
                // Just read it, the CheckedInputStream will update the checksum on its own
                while (IOUtils.skip(inputStream, Long.MAX_VALUE) == Long.MAX_VALUE) {
                    // just loop
                }
            } catch (final FileNotFoundException ignored) {
                LOG.warning("Auxclasspath entry " + url.toString() + " doesn't exist, ignoring it");
            } catch (final IOException e) {
                // Can this even happen?
                LOG.log(Level.SEVERE, "Incremental analysis can't check auxclasspath contents", e);
                throw new RuntimeException(e);
            }
        }
        return adler32.getValue();
    }

However, this could be further improved.

We do care about jar order in the classpath, and we do care about the contents of each class in the JAR. However, the jar itself is a zip, and as such contains not only the contents of the class files, but also metadata. In particular:

  1. The order of the entries (class files) within the jar may differ, but as each file name is unique, this change on its own means nothing.
  2. The timestamp of each file entry may differ, but once again, this change on its own means nothing.
  3. Non-class resources and entries may be present (META-INF, properties files, etc.), and these have no impact on the ABI.
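
To make the three points above concrete, here is a minimal sketch of a per-jar fingerprint that ignores entry order, entry timestamps, and non-class resources. It is not PMD's implementation: the class name `JarContentFingerprint` and the method `fingerprint` are hypothetical, and Adler32 is kept only to match the existing code. Order across jars would still be handled by the caller, for example by combining the per-jar values in classpath order.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.zip.Adler32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper, for illustration only. */
public final class JarContentFingerprint {

    private JarContentFingerprint() {
    }

    /** Checksum over the names and bytes of the .class entries of one jar. */
    public static long fingerprint(final Path jar) throws IOException {
        final Adler32 adler32 = new Adler32();
        final byte[] buffer = new byte[8192];
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            // Collect class entry names sorted, so physical order in the zip is irrelevant.
            final SortedSet<String> classNames = new TreeSet<>();
            final Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                final String name = entries.nextElement().getName();
                // Skips META-INF, properties files, directories, and other resources.
                if (name.endsWith(".class")) {
                    classNames.add(name);
                }
            }
            for (final String name : classNames) {
                // Only the name and the uncompressed bytes are hashed; timestamps are never read.
                adler32.update(name.getBytes(StandardCharsets.UTF_8));
                try (InputStream in = zip.getInputStream(zip.getEntry(name))) {
                    int read;
                    while ((read = in.read(buffer)) != -1) {
                        adler32.update(buffer, 0, read);
                    }
                }
            }
        }
        return adler32.getValue();
    }
}
```

Two jars built from the same class files at different times, or packed in a different order, should yield the same value here, while a change to any class file's bytes should change it.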

Furthermore, for type resolution analysis we only care about the ABI, not the implementation, so a further refinement would be to check only the exposed symbols and signatures. There is a trade-off, however, between the analysis time saved by keeping the cache valid and the effort needed to fingerprint the classpath. Multi-level hashes could be used in an optimistic approach (i.e. first try a complete binary match for each jar and, only if that differs, look deeper to see whether the invalidation is avoidable).
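
The optimistic, multi-level idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: `TwoLevelJarCheck`, `Cached`, and `DeepFingerprint` are hypothetical names, and the deeper hash is passed in as a callback so that it could be either a class-content hash or a real ABI-level fingerprint (which this sketch does not attempt).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.Adler32;

/** Hypothetical two-level check, for illustration only. */
public final class TwoLevelJarCheck {

    /** Hypothetical hook for the slower, content-aware or ABI-aware fingerprint. */
    @FunctionalInterface
    public interface DeepFingerprint {
        long of(Path jar) throws IOException;
    }

    /** Hypothetical cache record holding both hashes from the previous run. */
    public static final class Cached {
        public final long rawChecksum;
        public final long deepFingerprint;

        public Cached(final long rawChecksum, final long deepFingerprint) {
            this.rawChecksum = rawChecksum;
            this.deepFingerprint = deepFingerprint;
        }
    }

    private TwoLevelJarCheck() {
    }

    /** Level 1: checksum of the jar treated as an opaque byte stream. */
    public static long rawChecksum(final Path jar) throws IOException {
        final Adler32 adler32 = new Adler32();
        final byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(jar)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                adler32.update(buffer, 0, read);
            }
        }
        return adler32.getValue();
    }

    /** Returns true if the cached analysis results can still be trusted for this jar. */
    public static boolean isUnchanged(final Path jar, final Cached cached,
            final DeepFingerprint deep) throws IOException {
        // Cheap path: a byte-identical jar needs no further inspection.
        if (rawChecksum(jar) == cached.rawChecksum) {
            return true;
        }
        // Level 2: the bytes differ, so pay for the deeper look before invalidating.
        return deep.of(jar) == cached.deepFingerprint;
    }
}
```

In the common case of an untouched jar only the cheap checksum is computed, so the extra cost of the deeper fingerprint is paid only when the binary has actually changed.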

Looking more deeply into the jars may therefore avoid cache invalidations.

