Skip to content

Releases: kucherenko/jscpd

v5.0.9

13 Jun 06:07

Choose a tag to compare

New Features

  • GitHub Action for jscpd (Rust v5) — jscpd-copy-paste-detector action for GitHub Actions Marketplace. Scan your repo for copy/paste in CI with uses: kucherenko/jscpd/.github/workflows/action.yml@v5

Bug Fixes

  • Resolve platform binary resolution when cpd is installed as a nested dependency (e.g. in a project's node_modules via a parent package). The runner now correctly locates the platform-specific binary relative to the installed package rather than assuming a top-level install. Fixes #816

Release v5.0.8

12 Jun 06:29

Choose a tag to compare

Bug Fixes

  • Prevent mmap exhaustion crashes when scanning repositories with more files than (default 131 072 on Linux). The walker previously held a live per discovered file; each rayon worker now opens and drops its mapping within the processing closure, capping concurrent mappings to the thread-pool size (typically 8–32). Fixes #813
  • Fix not matching relative paths when the scan root is absolute (e.g. CWD). Patterns like now match correctly by comparing against both the relative path and the full absolute path, and bare patterns like gain a prefix to match at any depth. Fixes #811

Release v5.0.7

11 Jun 05:20

Choose a tag to compare

Bug Fixes

  • Prevent stack overflow when scanning directories containing deeply-nested JS/TS files (e.g. Bun's with 320K+ nested for-loops). OXC's recursive-descent parser allocates one stack frame per AST nesting level; pathological inputs now exceed the default 8 MiB thread stack. Fixed by building a local rayon with 64 MiB stacks instead of using the global pool (which silently fails on re-init)
  • Default to — files exceeding the limit are skipped at walk time, consistent with jscpd v4's behavior. This prevents OXC from ever seeing megabyte-scale generated files that would overflow the stack
  • now correctly takes effect on every call (previously silently no-op'd after the first invocation)

Release v5.0.6

10 Jun 07:52

Choose a tag to compare

New Features

  • v4 config backward compatibility — fields , , , and are now read and applied, matching jscpd v4 behavior
  • and are now distinct: matches file-level globs, matches code-level regex patterns (previously conflated)
  • path config support — reads scan directories from the field, resolving relative paths against the config file's directory
  • npm wrapper package — publishes the same Rust binary under the name on npm with v5.x versioning
  • now matches v4 behavior: accepts optional integer value ( exits 1, exits 2); and are now independent
  • Performance improvements: memory-mapped file I/O (via ) eliminates heap copies of file contents; SIMD-accelerated line counting (via ); parallel detection pipeline uses to avoid intermediate allocations; JS tokenizer no longer clones source strings before parsing (thanks to @auterium, #808)

Bug Fixes

  • Fixed to match jscpd v4's behavior (was boolean, now optional integer)
  • Fixed unique temp dir generation in reporter tests (added PID to prevent race conditions under parallel test runners)

Release v5.0.4

08 Jun 07:54

Choose a tag to compare

New Features

  • CLI alignment with jscpd v4: new --absolute, --ignore-case, --formats-exts, --formats-names flags; fixed --threshold, improved --max-size
  • Detection and statistics aligned with jscpd for consistent output across Rust and TypeScript versions
  • Side-by-side blame comparison in console-full reporter
  • Clone list display in console reporter

Bug Fixes

  • HTML reporter now outputs jscpd-report.html at the output_dir root
  • Resolved all clippy warnings across workspace
  • Fixed unique temp dir generation in tests (use as_nanos() instead of subsec_nanos())

Release v4.2.5

07 Jun 17:27

Choose a tag to compare

Bug Fixes

  • JSON reporter duplicate token counts — was always reported as in JSON output; now computed from token positions () (#801).
  • Gitignore parent-directory walk — files in parent directories up to the repo root are now read and combined with scan-directory files. Also reads and the global for full parity with Git's ignore resolution (#741).
  • Commander v15 migration — CLI option parsing migrated from direct property access (, etc.) to the API required by Commander v8+. The / flag handling was rewritten to use Commander's native negation support instead of inspection.
  • Vitest 4.1.0 — bumped from 3.2.4 to address CVE-2026-47429.
  • Commander v15 — bumped from v5 to v15, enabling modern Node.js compatibility.
  • Pug 3.0.4, node-sarif-builder 4.1.0, nodemon 3.1.14 — dependency bumps for security and compatibility.

jscpd v4.2.0

14 May 11:30

Choose a tag to compare

Breaking Changes

  • Vue SFC tokenization.vue files are no longer tokenized as markup. Each block is now dispatched to its own sub-format: <script>javascript, <script lang="ts">typescript, <template>markup, <style>css, <style lang="scss">scss, <style lang="less">less. Clone reports for .vue files now appear under these resolved sub-format names. Any tooling or configuration that relied on .vue clones being reported under markup must be updated.
  • --formatsExts users — custom mappings that pointed .vue to markup (e.g. "formatsExts": { "markup": ["vue"] }) will no longer take effect because .vue is handled by the dedicated vue format processor. Remove or update such mappings.

New Features

  • Custom tokenizer backend — replaced the prismjs npm package with a self-contained reprism-based grammar engine. ~11.5% faster tokenization on real projects (avg 1126 ms → 997 ms on a 548-file, 223-format scan).
  • Cross-format detection — Vue SFC (.vue), Svelte (.svelte), Astro (.astro), and Markdown files are now tokenized per-block/per-section. A <script> block in a .vue file can match a .ts file; a fenced code block in Markdown can match a .py file.
  • 223 supported formats — Apex, CFML/ColdFusion, GDScript, Svelte, Astro, and 70+ additional languages added (up from 152). See supported_formats.md.
  • Shebang detection — extensionless executable scripts (e.g. /usr/bin/env python3) are auto-detected by their #! shebang line and tokenized in the correct language.
  • --store-path — configure a custom directory for the LevelDB cache, eliminating collisions when multiple jscpd processes run in parallel on the same machine.
  • --skipComments — shorthand flag for --mode weak, which strips comments before detection.
  • --formats-names — map specific filenames (e.g. Makefile, Dockerfile) to a detection format.

Bug Fixes

  • Entire-file duplicates silently dropped (@jscpd/core #728) — RabinKarp flushed the pending clone on a store hit at end-of-file instead of on a miss. Files that are complete copies of each other were undetected. Fixed.
  • ReDoS hang on Lisp/Elisp files (@jscpd/tokenizer #737) — the Lisp string regex /"(?:[^"\\]*|\\.)*"/ could catastrophically backtrack (O(2ⁿ)) on unterminated strings. Replaced with a linear /"(?:[^"\\]|\\[\s\S])*"/ pattern.
  • Process crash on malformed package.json (#739) — readJSONSync threw an unhandled SyntaxError when package.json contained invalid JSON, killing the process. Now emits a warning and continues with an empty config.
  • Vue SFC cross-file detection broken — the detector used the file-level format (vue) as the store namespace for all SFC blocks, preventing a <script> block in one .vue file from ever matching a <script> block in another. The namespace now reflects each block's resolved sub-format.
  • Vue SFC incorrect column numbers — tokens on the first line of a block carried block-relative column 1 instead of file-absolute column numbers. Fixed in @jscpd/tokenizer.
  • 50 dependency security vulnerabilities remediated across the monorepo (Dependabot batches).

Known Limitations

  • Malformed SFC blocks (e.g. unclosed tags, invalid attributes) are silently skipped and do not contribute tokens.