Releases: kucherenko/jscpd
v5.0.9
New Features
- GitHub Action for jscpd (Rust v5): the jscpd-copy-paste-detector action for GitHub Actions Marketplace. Scan your repo for copy/paste in CI with uses: kucherenko/jscpd/.github/workflows/action.yml@v5
Bug Fixes
- Fix platform binary resolution when cpd is installed as a nested dependency (e.g. in a project's node_modules via a parent package). The runner now locates the platform-specific binary relative to the installed package rather than assuming a top-level install. Fixes #816
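The nested-dependency fix above can be pictured as a walk up the directory tree from the wrapper package, checking each enclosing node_modules directory. This is only a sketch of that idea in Python, not the runner's actual code; the function name, the platform package name, and the bin/ layout are hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_platform_binary(package_dir: Path, platform_pkg: str, binary: str) -> Optional[Path]:
    """Look for <ancestor>/node_modules/<platform_pkg>/bin/<binary>, nearest ancestor first.

    Starting from the wrapper package's own directory covers both layouts:
    a nested install (node_modules/parent/node_modules/...) and a hoisted,
    top-level install. The bin/ sub-directory is an assumed layout.
    """
    for ancestor in [package_dir, *package_dir.parents]:
        candidate = ancestor / "node_modules" / platform_pkg / "bin" / binary
        if candidate.is_file():
            return candidate
    return None
```

Searching relative to the installed package, rather than from the project root, is what makes the lookup independent of where the package manager placed the wrapper.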
Release v5.0.8
Bug Fixes
- Prevent mmap exhaustion crashes when scanning repositories with more files than (default 131 072 on Linux). The walker previously held a live per discovered file; each rayon worker now opens and drops its mapping within the processing closure, capping concurrent mappings to the thread-pool size (typically 8–32). Fixes #813
- Fix not matching relative paths when the scan root is absolute (e.g. CWD). Patterns like now match correctly by comparing against both the relative path and the full absolute path, and bare patterns like gain a prefix to match at any depth. Fixes #811
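The mmap fix above moves the mapping's lifetime into the worker. As a rough illustration in Python (jscpd v5 itself is Rust and uses rayon), the sketch below has the walker hand out paths only, while each worker maps, processes, and unmaps a file inside one function call. The names count_lines and scan are hypothetical.

```python
import mmap
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict, Iterable


def count_lines(path: Path) -> int:
    """Map the file, count newline bytes, and unmap before returning."""
    if path.stat().st_size == 0:
        return 0  # mmap cannot map an empty file
    with open(path, "rb") as fh, mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as view:
        count = 0
        pos = view.find(b"\n")
        while pos != -1:
            count += 1
            pos = view.find(b"\n", pos + 1)
        return count  # leaving the with-block closes the mapping


def scan(paths: Iterable[Path], workers: int = 8) -> Dict[Path, int]:
    """Only paths are queued, so at most `workers` mappings are live at once."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(paths, pool.map(count_lines, paths)))
```

Because a mapping never outlives the call that created it, the number of concurrent mappings is bounded by the pool size regardless of how many files the walker discovers.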
Release v5.0.7
Bug Fixes
- Prevent stack overflow when scanning directories containing deeply-nested JS/TS files (e.g. Bun's with 320K+ nested for-loops). OXC's recursive-descent parser allocates one stack frame per AST nesting level; pathological inputs now exceed the default 8 MiB thread stack. Fixed by building a local rayon with 64 MiB stacks instead of using the global pool (which silently fails on re-init)
- Default to — files exceeding the limit are skipped at walk time, consistent with jscpd v4's behavior. This prevents OXC from ever seeing megabyte-scale generated files that would overflow the stack
- now correctly takes effect on every call (previously silently no-op'd after the first invocation)
Release v5.0.6
New Features
- v4 config backward compatibility — fields , , , and are now read and applied, matching jscpd v4 behavior
- and are now distinct: matches file-level globs, matches code-level regex patterns (previously conflated)
- path config support — reads scan directories from the field, resolving relative paths against the config file's directory
- npm wrapper package — publishes the same Rust binary under the name on npm with v5.x versioning
- now matches v4 behavior: accepts optional integer value ( exits 1, exits 2); and are now independent
- Performance improvements: memory-mapped file I/O (via ) eliminates heap copies of file contents; SIMD-accelerated line counting (via ); parallel detection pipeline uses to avoid intermediate allocations; JS tokenizer no longer clones source strings before parsing (thanks to @auterium, #808)
Bug Fixes
- Fixed to match jscpd v4's behavior (was boolean, now optional integer)
- Fixed unique temp dir generation in reporter tests (added PID to prevent race conditions under parallel test runners)
Release v5.0.4
New Features
- CLI alignment with jscpd v4: new --absolute, --ignore-case, --formats-exts, --formats-names flags; fixed --threshold, improved --max-size
- Detection and statistics aligned with jscpd for consistent output across Rust and TypeScript versions
- Side-by-side blame comparison in console-full reporter
- Clone list display in console reporter
Bug Fixes
- HTML reporter now outputs jscpd-report.html at the output_dir root
- Resolved all clippy warnings across workspace
- Fixed unique temp dir generation in tests (use as_nanos() instead of subsec_nanos())
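The distinction matters because the sub-second part of a timestamp only ranges over one second and can repeat, while the full nanosecond count since the epoch keeps growing. A minimal Python sketch of the same naming idea follows, combined with the process ID that v5.0.6 later added. The helper name and prefix are hypothetical, and the real tests are written in Rust.

```python
import os
import tempfile
import time
from pathlib import Path


def unique_temp_dir(prefix: str) -> Path:
    """Create a temp directory named from the PID and the full nanosecond clock."""
    # time.time_ns() is the whole timestamp, the analogue of as_nanos();
    # the PID separates test processes that start at the same instant.
    name = f"{prefix}-{os.getpid()}-{time.time_ns()}"
    path = Path(tempfile.gettempdir()) / name
    path.mkdir()  # raises FileExistsError instead of silently sharing a directory
    return path
```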
Release v4.2.5
Bug Fixes
- JSON reporter duplicate token counts — was always reported as in JSON output; now computed from token positions () (#801).
- Gitignore parent-directory walk — files in parent directories up to the repo root are now read and combined with scan-directory files. Also reads and the global for full parity with Git's ignore resolution (#741).
- Commander v15 migration — CLI option parsing migrated from direct property access (, etc.) to the API required by Commander v8+. The / flag handling was rewritten to use Commander's native negation support instead of inspection.
- Vitest 4.1.0 — bumped from 3.2.4 to address CVE-2026-47429.
- Commander v15 — bumped from v5 to v15, enabling modern Node.js compatibility.
- Pug 3.0.4, node-sarif-builder 4.1.0, nodemon 3.1.14 — dependency bumps for security and compatibility.
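The gitignore parent-directory walk listed above can be sketched as follows. This is an illustration in Python, not jscpd's TypeScript implementation: it treats the nearest ancestor containing a .git entry as the repo root and returns the .gitignore files from the root down to the scan directory. The function name is hypothetical, and the other ignore sources mentioned in that item are left out.

```python
from pathlib import Path
from typing import List


def collect_gitignore_files(scan_dir: Path) -> List[Path]:
    """Return existing .gitignore files, ordered from repo root to scan_dir."""
    scan_dir = scan_dir.resolve()
    chain = []
    for directory in [scan_dir, *scan_dir.parents]:
        chain.append(directory)
        if (directory / ".git").exists():
            break  # reached the repo root; do not look above it
    else:
        chain = [scan_dir]  # not inside a repository: use the scan directory only
    candidates = [d / ".gitignore" for d in reversed(chain)]
    return [f for f in candidates if f.is_file()]
```

Returning the files root-first lets a caller apply them in order, so rules from deeper directories can override rules from their parents, as Git does.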
jscpd v4.2.0
Breaking Changes
- Vue SFC tokenization: .vue files are no longer tokenized as markup. Each block is now dispatched to its own sub-format: <script> → javascript, <script lang="ts"> → typescript, <template> → markup, <style> → css, <style lang="scss"> → scss, <style lang="less"> → less. Clone reports for .vue files now appear under these resolved sub-format names. Any tooling or configuration that relied on .vue clones being reported under markup must be updated.
- --formatsExts users: custom mappings that pointed .vue to markup (e.g. "formatsExts": { "markup": ["vue"] }) will no longer take effect because .vue is handled by the dedicated vue format processor. Remove or update such mappings.
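The block-to-sub-format mapping described above can be sketched as a small splitter. This is a simplified Python illustration, not the @jscpd/tokenizer code: the regular expressions and the split_sfc name are assumptions, it does not handle nested <template> tags, and it falls back to the tag's default format for any lang value the release note does not list.

```python
import re
from typing import List, Tuple

# Top-level SFC blocks; a block with no closing tag simply never matches.
BLOCK_RE = re.compile(
    r"<(?P<tag>script|template|style)(?P<attrs>[^>]*)>(?P<body>.*?)</(?P=tag)>",
    re.DOTALL,
)
LANG_RE = re.compile(r"""\blang\s*=\s*["']([^"']+)["']""")

# Mapping taken from the release note.
DEFAULT_FORMAT = {"script": "javascript", "template": "markup", "style": "css"}
LANG_FORMAT = {
    ("script", "ts"): "typescript",
    ("style", "scss"): "scss",
    ("style", "less"): "less",
}


def split_sfc(source: str) -> List[Tuple[str, str]]:
    """Return (sub_format, block_body) pairs in document order."""
    blocks = []
    for match in BLOCK_RE.finditer(source):
        tag = match.group("tag")
        lang = LANG_RE.search(match.group("attrs"))
        key = (tag, lang.group(1)) if lang else None
        blocks.append((LANG_FORMAT.get(key, DEFAULT_FORMAT[tag]), match.group("body")))
    return blocks
```

Keying clone storage on the returned sub-format rather than on vue is what lets a <script lang="ts"> block be compared with a .ts file.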
New Features
- Custom tokenizer backend: replaced the prismjs npm package with a self-contained reprism-based grammar engine. ~11.5% faster tokenization on real projects (avg 1126 ms → 997 ms on a 548-file, 223-format scan).
- Cross-format detection: Vue SFC (.vue), Svelte (.svelte), Astro (.astro), and Markdown files are now tokenized per-block/per-section. A <script> block in a .vue file can match a .ts file; a fenced code block in Markdown can match a .py file.
- 223 supported formats: Apex, CFML/ColdFusion, GDScript, Svelte, Astro, and 70+ additional languages added (up from 152). See supported_formats.md.
- Shebang detection: extensionless executable scripts (e.g. /usr/bin/env python3) are auto-detected by their #! shebang line and tokenized in the correct language.
- --store-path: configure a custom directory for the LevelDB cache, eliminating collisions when multiple jscpd processes run in parallel on the same machine.
- --skipComments: shorthand flag for --mode weak, which strips comments before detection.
- --formats-names: map specific filenames (e.g. Makefile, Dockerfile) to a detection format.
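Shebang detection of the kind listed above might look like the following. It is a sketch only: the interpreter-to-format table and the function name are hypothetical, and jscpd's actual table and parsing rules may differ.

```python
import posixpath
import re
from typing import Optional

# Hypothetical interpreter-to-format table, for illustration only.
INTERPRETER_FORMAT = {
    "python": "python",
    "node": "javascript",
    "bash": "bash",
    "sh": "bash",
    "ruby": "ruby",
}


def format_from_shebang(first_line: str) -> Optional[str]:
    """Map a script's first line to a format name, or None if it has no usable shebang."""
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    if not parts:
        return None
    interpreter = posixpath.basename(parts[0])
    if interpreter == "env":
        # Skip env options (e.g. -S) and VAR=value assignments.
        args = [p for p in parts[1:] if not p.startswith("-") and "=" not in p]
        if not args:
            return None
        interpreter = posixpath.basename(args[0])
    # Drop a trailing version suffix: python3 and python3.11 both become python.
    interpreter = re.sub(r"[0-9.]+$", "", interpreter)
    return INTERPRETER_FORMAT.get(interpreter)
```

Only the first line is needed, so a caller can read just that line from each extensionless file before deciding whether to tokenize it.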
Bug Fixes
- Entire-file duplicates silently dropped (@jscpd/core #728): RabinKarp flushed the pending clone on a store hit at end-of-file instead of on a miss. Files that are complete copies of each other were undetected. Fixed.
- ReDoS hang on Lisp/Elisp files (@jscpd/tokenizer #737): the Lisp string regex /"(?:[^"\\]*|\\.)*"/ could catastrophically backtrack (O(2ⁿ)) on unterminated strings. Replaced with a linear /"(?:[^"\\]|\\[\s\S])*"/ pattern.
- Process crash on malformed package.json (#739): readJSONSync threw an unhandled SyntaxError when package.json contained invalid JSON, killing the process. Now emits a warning and continues with an empty config.
- Vue SFC cross-file detection broken: the detector used the file-level format (vue) as the store namespace for all SFC blocks, preventing a <script> block in one .vue file from ever matching a <script> block in another. The namespace now reflects each block's resolved sub-format.
- Vue SFC incorrect column numbers: tokens on the first line of a block carried block-relative column 1 instead of file-absolute column numbers. Fixed in @jscpd/tokenizer.
- 50 dependency security vulnerabilities remediated across the monorepo (Dependabot batches).
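The two Lisp string patterns from the ReDoS fix above can be compared directly. The sketch below ports them to Python's re module, which is also a backtracking engine, as an illustration rather than the tokenizer's own code; the helper names are hypothetical. The old pattern nests a star inside a star, so an unterminated string lets the engine split the same run of characters in exponentially many ways before it gives up. The new pattern consumes exactly one character or one escape per iteration.

```python
import re
import time

# Both patterns are quoted from the release note.
OLD_LISP_STRING = re.compile(r'"(?:[^"\\]*|\\.)*"')
NEW_LISP_STRING = re.compile(r'"(?:[^"\\]|\\[\s\S])*"')


def find_strings(pattern: "re.Pattern[str]", source: str) -> list:
    """Return every string literal the pattern finds in source."""
    return [m.group(0) for m in pattern.finditer(source)]


def time_unterminated(pattern: "re.Pattern[str]", n: int) -> float:
    """Seconds spent searching an opening quote followed by n plain characters."""
    text = '"' + "a" * n  # no closing quote, so the match must fail
    start = time.perf_counter()
    pattern.search(text)
    return time.perf_counter() - start


# On well-formed input both patterns agree.
SAMPLE = r'(message "hi \"there\"") (concat "a" "b")'
assert find_strings(OLD_LISP_STRING, SAMPLE) == find_strings(NEW_LISP_STRING, SAMPLE)
```

To observe the difference, call time_unterminated with the old pattern and a small n (around 20), raising it one step at a time, since the cost grows quickly; the new pattern handles n in the thousands without noticeable delay.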
Known Limitations
- Malformed SFC blocks (e.g. unclosed tags, invalid attributes) are silently skipped and do not contribute tokens.