Skip to content

fix: AuthChallengeBadToken on concurrent session restore#321348

Open
mukaschultze wants to merge 2 commits into
microsoft:mainfrom
mukaschultze:fix/concurrent-session-restores
Open

fix: AuthChallengeBadToken on concurrent session restore#321348
mukaschultze wants to merge 2 commits into
microsoft:mainfrom
mukaschultze:fix/concurrent-session-restores

Conversation

@mukaschultze

Copy link
Copy Markdown

Restoring multiple remote SSH workspace sessions concurrently fails with AuthChallengeBadToken due to a race condition. VS Code doesn't use the cached _connections until they're successfully connected, which lead to multiple connection creations simultaneously.

My proposed fix adds a new _pendingConnects map with the pending connections that serves as a cache similar to the existing _connections map, so only one connection is attempted at once.

Fixes #249861

Copilot AI review requested due to automatic review settings June 14, 2026 23:17
@mukaschultze

Copy link
Copy Markdown
Author

@microsoft-github-policy-service agree

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds in-flight connection de-duplication for SSH remote agent host connects to prevent concurrent establishment races when restoring multiple windows for the same host.

Changes:

  • Introduced _pendingConnects to share a single in-progress connect across concurrent callers.
  • Refactored connect flow by extracting establishment logic into _doConnect.
  • Added tests covering concurrent connect de-duping and post-establishment fast-path reuse.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/vs/platform/agentHost/node/sshRemoteAgentHostService.ts Adds pending-connect tracking and refactors connect logic to deduplicate concurrent establishments.
src/vs/platform/agentHost/test/node/sshRemoteAgentHostService.test.ts Adds regression tests ensuring concurrent connects share one establishment and subsequent connects reuse the established connection.

Comment on lines +711 to +721
// No established connection yet. Share a single in-flight
// establishment across concurrent callers for the same host so that
// restoring multiple windows doesn't spawn competing agent hosts.
const pending = this._pendingConnects.get(connectionKey);
if (pending) {
this._logService.info(`${LOG_PREFIX} Reusing in-flight connect for ${connectionKey}`);
return pending;
}

const establishment = this._doConnect(config, connectionKey, replaceRelay);
this._pendingConnects.set(connectionKey, establishment);

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of a way for incorporating replaceRelay wihtout re-adding the race condition that this PR fixes.

Comment on lines +582 to +592
/**
* In-flight first-time establishments, keyed by `connectionKey`. The
* {@link _connections} map only dedups *completed* connections; without
* this, multiple windows restoring the same remote at once each run the
* full establishment in parallel, every one spawning its own competing
* `code agent host` (a machine-wide singleton) and clobbering the shared
* remote lockfile/token — so all but one fail to connect (see #249861).
* Concurrent callers for the same key share a single establishment and
* therefore a single agent host.
*/
private readonly _pendingConnects = new Map<string, Promise<ISSHConnectResult>>();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says "first-time establishments", but the implementation applies whenever there is no entry in _connections (which could also happen after disconnect/cleanup, not just first-time). Consider rewording to reflect the actual condition (e.g., "in-flight establishments when no completed connection exists") to avoid misleading future readers.

Muéstrame todo el contenido en español

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Restoring multiple SSH workspaces fails when restoring more than one workspace

4 participants