fix: share one exchanged WIF credential across spawned Claude processes #1407

Open
KeisukeYamashita wants to merge 1 commit into
anthropics:main from
KeisukeYamashita:fix/wif-shared-credentials-cache
@KeisukeYamashita

Fixes #1406

Problem

GitHub OIDC tokens are single-use at the Anthropic token-exchange endpoint (the same jti cannot be exchanged twice). When the plugins / plugin_marketplaces inputs are configured, the action spawns several short-lived claude processes per job (claude plugin marketplace add, one claude plugin install per plugin, then the main query). Each process resolved workload identity federation from the bare env vars and exchanged the same identity-token file independently: the first exchange succeeded and every later process got 401 (jti_reused), which the main query retried for ~3 minutes before failing the job. Details, eBPF process traces, and request-ids in #1406.
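
The failure mode described above can be reproduced with a small in-memory model. This is only a sketch: `SingleUseExchange` and `simulateProcesses` are hypothetical names, and the model stands in for the real token-exchange endpoint only in that it rejects a repeated jti with 401 (jti_reused).

```typescript
// Hypothetical in-memory model of a single-use token exchange.
// It is NOT the real endpoint; it only mimics the "same jti twice => 401" rule.
type ExchangeResult =
  | { status: 200; accessToken: string }
  | { status: 401; error: "jti_reused" };

class SingleUseExchange {
  private seenJtis = new Set<string>();

  exchange(jti: string): ExchangeResult {
    if (this.seenJtis.has(jti)) {
      // The identity token was already exchanged once: reject the replay.
      return { status: 401, error: "jti_reused" };
    }
    this.seenJtis.add(jti);
    return { status: 200, accessToken: `access-for-${jti}` };
  }
}

// Every spawned process reads the same identity-token file, so each one
// presents the same jti to the endpoint.
function simulateProcesses(
  endpoint: SingleUseExchange,
  jti: string,
  processCount: number,
): number[] {
  const statuses: number[] = [];
  for (let i = 0; i < processCount; i++) {
    statuses.push(endpoint.exchange(jti).status);
  }
  return statuses;
}

// Three processes sharing one token: the first gets 200, the rest get 401.
console.log(simulateProcesses(new SingleUseExchange(), "jti-abc", 3));
```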

Fix

The SDK only enables its on-disk credentials cache (<config_dir>/credentials/<profile>.json, shared across processes) when federation is loaded from a profile config file, not from bare env vars. setupWorkloadIdentity() now additionally writes a profile pointing at the identity-token file and selects it via ANTHROPIC_CONFIG_DIR + ANTHROPIC_PROFILE, so the first process exchanges once and every other process reuses the cached access token.

  • The profile lives in an action-owned dir under RUNNER_TEMP (0700 dir / 0600 file), next to the identity token.
  • ANTHROPIC_IDENTITY_TOKEN_FILE and the federation env vars are kept as a fallback for CLIs that predate profile support — older CLIs degrade to today's behavior, never worse.
  • The existing 4-minute identity-token refresh is unchanged; after the cached access token expires, the next exchange picks up the refreshed (new-jti) token from the same file path recorded in the profile.
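
A rough sketch of the profile setup described in the points above, for readers who want to see its shape. It is not the code in this PR: `writeFederationProfile`, the `anthropic-config` directory name, the `<profile>.json` file name, and the `identity_token_file` key are all placeholders, since the SDK's actual profile schema is not spelled out here. Only the env var names and the 0700/0600 permissions come from the description.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

interface FederationEnv {
  ANTHROPIC_CONFIG_DIR: string;
  ANTHROPIC_PROFILE: string;
  ANTHROPIC_IDENTITY_TOKEN_FILE: string;
}

// Hypothetical helper. The profile file name and JSON key are assumptions,
// not the SDK's documented profile format.
function writeFederationProfile(
  baseDir: string,
  profileName: string,
  identityTokenFile: string,
): FederationEnv {
  const configDir = path.join(baseDir, "anthropic-config");
  fs.mkdirSync(configDir, { recursive: true, mode: 0o700 });
  // The mode option only applies on creation, so tighten a pre-existing dir too.
  fs.chmodSync(configDir, 0o700);

  const profilePath = path.join(configDir, `${profileName}.json`);
  const profile = { identity_token_file: identityTokenFile }; // placeholder key
  fs.writeFileSync(profilePath, JSON.stringify(profile, null, 2), { mode: 0o600 });
  fs.chmodSync(profilePath, 0o600);

  return {
    ANTHROPIC_CONFIG_DIR: configDir,
    ANTHROPIC_PROFILE: profileName,
    // Kept as a fallback for CLIs that predate profile support.
    ANTHROPIC_IDENTITY_TOKEN_FILE: identityTokenFile,
  };
}

// Usage: place everything under RUNNER_TEMP (falling back to the OS temp dir).
const runnerTemp = process.env.RUNNER_TEMP ?? os.tmpdir();
const actionDir = fs.mkdtempSync(path.join(runnerTemp, "wif-"));
const wifEnv = writeFederationProfile(
  actionDir,
  "github-wif", // hypothetical profile name
  path.join(actionDir, "identity-token"),
);
console.log(Object.keys(wifEnv));
```

The returned variables would then be exported to every spawned process, so that all of them resolve federation from the same profile rather than from the bare env vars.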

Verification

  • bun test passes 702/702 (including 2 new unit tests for the profile file); bun run typecheck and bun run format:check are both green.
  • Local reproduction against a mock exchange endpoint that enforces single-use jti, using two sequential claude processes that share one identity token. On the env-var path, the second process retries jti_reused for ~3 minutes and dies, which matches the production failures exactly. On the profile path, the second process reuses the cached credential with zero additional exchanges and succeeds. Verified with both Claude Code 2.1.167 and 2.1.173.
  • On a GitHub-hosted runner with a real federation rule: the exact configuration that previously failed on 100% of runs (two trailofbits plugins + WIF, on both v1.0.139 and v1.0.144) succeeds with this branch — is_error: false, main query completes in ~6s. An eBPF trace shows the same five claude processes contacting api.anthropic.com, now with a single token exchange.
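
The "single token exchange" result is what a cache-first lookup produces. The sketch below illustrates that pattern under stated assumptions: `getAccessToken`, the `CachedCredential` shape, and the cache JSON layout are hypothetical, and the SDK's real cache logic may differ. Only the `<config_dir>/credentials/<profile>.json` location is taken from the description.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed cache record; the SDK's real on-disk format is not documented here.
interface CachedCredential {
  accessToken: string;
  expiresAtMs: number;
}

// Return a cached token if it is still valid; otherwise exchange once and cache it.
function getAccessToken(
  cachePath: string,
  exchange: () => CachedCredential,
  nowMs: number = Date.now(),
): string {
  try {
    const cached = JSON.parse(fs.readFileSync(cachePath, "utf8")) as CachedCredential;
    if (cached.expiresAtMs > nowMs) {
      return cached.accessToken;
    }
  } catch {
    // Missing or unreadable cache: fall through to a fresh exchange.
  }
  const fresh = exchange();
  fs.mkdirSync(path.dirname(cachePath), { recursive: true, mode: 0o700 });
  fs.writeFileSync(cachePath, JSON.stringify(fresh), { mode: 0o600 });
  return fresh.accessToken;
}

// Five sequential "processes" sharing one cache file.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "wif-cache-"));
const demoCache = path.join(demoDir, "credentials", "github-wif.json");
let exchangeCount = 0;
const fakeExchange = (): CachedCredential => {
  exchangeCount += 1;
  return { accessToken: `token-${exchangeCount}`, expiresAtMs: Date.now() + 60_000 };
};
for (let i = 0; i < 5; i++) {
  getAccessToken(demoCache, fakeExchange);
}
console.log(exchangeCount); // 1: only the first call exchanged
```

Because the lookups run one after another, only the first call reaches the exchange function; this mirrors the sequential spawning the action relies on.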

Notes / limitations

  • A cold-start race (two processes exchanging before the first cache write) is still theoretically possible, but the action spawns its subprocesses sequentially, so in practice the first one exchanges and the rest reuse. A complete fix (serializing the exchange / locking the credentials file) would belong in the CLI/SDK.
  • If a user has pre-set ANTHROPIC_CONFIG_DIR / ANTHROPIC_PROFILE, they are overridden while federation inputs are configured (federation inputs are an explicit opt-in, and previously the env-var path was similarly authoritative).
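
For illustration only, serializing the exchange could look roughly like the exclusive-create lock below. This is not something the CLI/SDK is known to do: `tryAcquireLock` and `releaseLock` are hypothetical helpers, and stale-lock recovery (a holder that crashes before releasing) is deliberately left out.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: take the lock by creating the file exclusively.
// The "wx" flag makes the open fail with EEXIST if the file already exists.
function tryAcquireLock(lockPath: string): boolean {
  try {
    const fd = fs.openSync(lockPath, "wx", 0o600);
    fs.closeSync(fd);
    return true;
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "EEXIST") {
      return false; // another process holds the lock
    }
    throw err;
  }
}

function releaseLock(lockPath: string): void {
  fs.rmSync(lockPath, { force: true });
}

// Usage: only the lock holder performs the exchange; others would wait and
// then re-read the credentials cache instead of exchanging again.
const lockDir = fs.mkdtempSync(path.join(os.tmpdir(), "wif-lock-"));
const lockFile = path.join(lockDir, "exchange.lock");
if (tryAcquireLock(lockFile)) {
  try {
    // ... exchange the identity token and write the cache here ...
  } finally {
    releaseLock(lockFile);
  }
}
```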

GitHub OIDC tokens are single-use at the Anthropic token-exchange
endpoint (the same jti cannot be exchanged twice). With plugins
configured, the action spawns several short-lived claude processes
(plugin marketplace add, one plugin install per plugin, then the main
query). Each resolved federation from bare env vars and exchanged the
same identity-token file independently: the first exchange succeeded
and every later process got 401 (jti_reused), which the main query
retried for ~3 minutes before failing the job.

The SDK only enables its on-disk credentials cache when federation is
loaded from a profile config file, not from bare env vars. Write a
profile pointing at the identity-token file and select it via
ANTHROPIC_CONFIG_DIR / ANTHROPIC_PROFILE so the first process exchanges
once and the rest reuse the cached access token. The env vars are kept
as a fallback for CLIs that predate profile support.
Development

Successfully merging this pull request may close these issues.

WIF federation fails with 401 (jti_reused) when plugins are configured — each spawned claude process re-exchanges the single-use OIDC token