
Add support for compression dictionary transport #1854

Open
pmeenan wants to merge 20 commits into whatwg:main from pmeenan:dictionaries

pmeenan commented Aug 25, 2025

Contributor

Add processing steps for handling HTTP Compression Dictionary Transport content encoding and dictionary negotiation (RFC pending publication).

This adds a processing layer between the HTTP cache and network fetch that handles most of the dictionary-based content encoding (including matching dictionaries to outgoing requests).

Additionally, it adds processing above the HTTP cache for storing the dictionaries for future use and defines the "compression-dictionary" initiator and destination (the matching HTML spec update is in progress).

Support for clearing the caches through clear-site-data is in this PR.

Fix #1739, #1839



pmeenan commented Sep 3, 2025

Contributor Author

The RFC is pending publication so this will have to wait until that happens (should be any day now). RFC number has been assigned and final edits are complete: https://www.rfc-editor.org/auth48/rfc9842

pmeenan commented Nov 11, 2025

Contributor Author

RFC has published now so this should be ready to go (just rebased it).

pmeenan requested a review from annevk November 11, 2025 16:47

pmeenan commented Nov 13, 2025

Contributor Author

From the TPAC 2025 discussions, I added support for opaque responses to use compression dictionaries when the Cross-Origin-Resource-Policy response header is set to cross-origin.

Comment thread fetch.bs Outdated

pmeenan commented Mar 19, 2026

Contributor Author

Sorry for dragging this out. I think I addressed all of the questions/issues. The main question outstanding right now is if the no-cors path using CORP is "approved" or if there are more steps to getting that integrated before this can land.

I filed it as a question with the TAG but I'm happy to do a more formal review somewhere if necessary.

pmeenan commented Apr 28, 2026

Contributor Author

I switched back to only allowing dictionary use for non-opaque requests. It's cleaner from the browser's perspective and harder to get wrong for sites. This is what the browsers have implemented anyway and what the current WPTs test.

The spec change should be ready to go now and match Chrome's and Mozilla's implementations.

If we want to bring dictionary support to third-party embeds in some way (the main pain point for no-cors), we can solve that separately.

Comment thread fetch.bs Outdated

<li><p>Let <var>pattern</var> be the result of
<a for=/>creating a URL pattern</a> from <var>dictionaryValue</var>["<code>match</code>"]
and <var>request</var>'s <a for=request>current URL</a>.


It seems the text for callers of this algorithm is written that way: "creating a URL pattern, given dictionaryValue["match"], request's current URL and an empty map".

Also, request's current URL is a URL, not a string, so we should perform a serialization similar to https://urlpattern.spec.whatwg.org/#other-specs-http (IIUC we can't use that one, because dictionary["match"] is a dictionary value, not an HTTP structured field value).

Contributor Author

Thanks. This should be fixed now. I changed it to extract the "item" from match and to serialize the URL. Let me know if you think a different serialization is needed than the one I used.

fred-wang added a commit to fred-wang/WebKit that referenced this pull request May 20, 2026
https://bugs.webkit.org/show_bug.cgi?id=295249

- Add build/runtime flag for compression dictionary transport.

- Add "compression-dictionary" destination type.
  whatwg/fetch#1854

- Add Link rel "compression-dictionary".
  whatwg/html#11619
Comment thread fetch.bs
the <a lt="URL serializer">serialization</a> of <var>request</var>'s <a for=request>current URL</a>,
and an empty map.

<li><p>If <var>pattern</var> is failure or <var>pattern</var> <a for=/>has regexp groups</a>,

"creating a URL pattern" uses algorithms from https://url.spec.whatwg.org/ that can return failure, but when that happens, does it actually throw an exception rather than return failure?

I wanted to ask about these potential exceptions the other day, but couldn't find an obvious input here that would make the algo throw... Anyway, I find this spec indeed deals with the case when an exception is thrown so I guess we probably want the same here: https://wicg.github.io/connection-allowlists/#abstract-opdef-parse-a-connection-allowlist-header

Member

It also seems we lack tests for this. We should have somewhat exhaustive tests for error conditions.

Contributor Author

Thanks. I added similar language that it should return response if the URL Pattern creation throws an exception. I'll add WPT tests for that and the rest of the edge cases that aren't currently covered now (will take a few days to work their way through).

Contributor Author

I just landed a (hopefully comprehensive) set of WPT tests to cover all of the edge cases I could think of: web-platform-tests/wpt#60164

  • Invalid and out-of-scope match properties (syntax failures and cross-origin patterns).
  • Invalid/unsupported dictionary type tokens.
  • Comprehensive match-dest parsing (unknown destinations, matching/non-matching destinations, and explicit wildcards).
  • Dictionary id maximum length validation (exactly 1024 characters vs. 1025 characters, which exceeds the limit).
  • Robustness against unknown additional dictionary parameters.
  • Rejection of entirely malformed structured headers (Use-As-Dictionary: ?0) and non-cacheable responses (max-age=0).
  • Opaque response tainting resulting from cross-origin redirects under no-cors mode correctly ignoring registration.
  • Evaluation of Available-Dictionary on matching redirect targets across redirect chains.
  • Precedence evaluation when matching overlapping pattern scopes.
  • Decoding failures resulting from dictionary hash mismatches throwing clean network errors for both dcb and dcz.


Thank you so much for these tests! In general they look good but I have some suggestions/comments/questions:

Dictionary with unknown match-dest is not used for fetch() API

So this is aligned with your PR because we return the response if matchDestList is empty (so similar to "Dictionary registration with empty match-dest list acts as wildcard"). But can you please also add a similar test that does not hit the empty case:

compression_dictionary_promise_test(async (t) => {
  const match_dest = encodeURIComponent('("asdf", "")');
  ...

As I read your PR, "asdf" would be removed but the dictionary registration would still succeed, right? (similar to "Dictionary registration with matching fetch destination")

Dictionary registration with 1024 character dictionary ID

For completeness, can we also have a test that checks the character range? That would also verify that parsing/serialization of the double quotes and backslash is done properly per RFC 9651:

+compression_dictionary_promise_test(async (t) => {
+  // https://www.rfc-editor.org/info/rfc9651/#name-parsing-a-string
+  // double quotes and backslash are escaped in headers per RFC9651.
+  let dictionaryIDsuffix = " !#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~";
+  let dictionaryID = `\\"\\\\${dictionaryIDsuffix}`;
+  const dict = await (await fetch(`${kRegisterDictionaryPath}?id=${encodeURIComponent(dictionaryID)}`)).text();
+  // Wait until `available-dictionary` header is available.
+  assert_equals(
+      await waitUntilAvailableDictionaryHeader(t, {}),
+      kDefaultDictionaryHashBase64);
+  assert_equals(await checkHeader('dictionary-id', {}), `"${dictionaryID}"`);
+}, `Dictionary registration with dictionary ID (valid characters and backslash escaping)`);
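
For illustration, the escaping that the suggested test exercises can be sketched as a small String serializer. This is a minimal sketch based on the String rules in RFC 9651, not code from the PR or from WPT; the function name `serializeSfString` is hypothetical.

```javascript
// Sketch of RFC 9651 String serialization (hypothetical helper, not part of WPT).
function serializeSfString(value) {
  let out = '"';
  for (const ch of value) {
    const code = ch.codePointAt(0);
    // Only space and visible ASCII (0x20-0x7E) may appear in a String.
    if (code < 0x20 || code > 0x7e) {
      throw new RangeError(`character not allowed in a String: U+${code.toString(16)}`);
    }
    // Backslash and double quote are escaped with a preceding backslash.
    if (ch === '\\' || ch === '"') {
      out += '\\';
    }
    out += ch;
  }
  return out + '"';
}
```

Under this sketch, an ID containing a double quote or backslash round-trips through the header with one extra backslash per escaped character, while an ID containing a character such as '€' cannot be serialized at all.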

Dictionary with 1025 character dictionary ID is not registered

This test does not seem aligned with your PR? The ID would be ignored, so that should just behave like the "Simple dictionary registration and unregistration" case?

Also we can have a test for invalid character range (like '€') but in that case https://httpwg.org/specs/rfc9651.html#error-handling says the entire field should be ignored, so we would indeed really check that there is no available-dictionary header here (as opposed to a successful registration without ID).

Note that per https://httpwg.org/specs/1.html#error-handling the spec author can decide whether they want to ignore the entire field or define other handling (here remove the id from the dictionary).

Overlapping match patterns prioritize the more specific dictionary

Would be nice to test a bit more the priority from https://www.rfc-editor.org/info/rfc9842/#section-2.2.3

At least I think

  • A Use-As-Dictionary with "match-dest" has priority over a Use-As-Dictionary without "match-dest" (with a longer "match" length even).
  • Two Use-As-Dictionaries with the same match's length and matchDest's emptiness but different fetch time (the most recent wins).

Would be nice to test for the "best match" algorithms that involve more than 2 dictionaries too.

Contributor Author

Thanks, I'll update the tests (may take a week to make it through the plumbing).

As far as priority goes, path specificity is the only factor in prioritization (with freshness being a tie-breaker). match-dest is just a filter for what candidates can be considered.

For the unknown types, I'll see if something is broken with the current spec language and update it but the intent is:

  • Default to match-everything (by defaulting to an empty list)
  • If match-dest is specified and it is not an empty list, ONLY match if the current dest is in the list.

For unsupported dests, this likely means we don't want to register them at all since there is nothing they could match and we don't want them to be treated like a wildcard. This might need some additional logic in the registration flow to remember the initial state of the match-dest list (empty or not) before filtering for known dests and then drop the registration if the list is empty but it wasn't initially empty.

We want to avoid the case of a new dest being introduced and dictionaries that are targeting that dest accidentally behaving like a wildcard (and maybe taking priority over all other dictionaries for browsers that don't support the new dest).
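
The intent described above could be sketched roughly as follows. This is an illustration only, not the spec text: the function name `filterMatchDest`, the `null` return convention, and the shortened `KNOWN_DESTINATIONS` set are all assumptions.

```javascript
// Destinations this hypothetical client understands (a subset, for illustration).
const KNOWN_DESTINATIONS = new Set(['', 'document', 'script', 'style', 'font', 'image']);

// Returns the filtered match-dest list, or null when the registration should be dropped.
function filterMatchDest(matchDestList) {
  // An absent or empty list means "match every destination".
  if (matchDestList === undefined || matchDestList.length === 0) {
    return [];
  }
  const known = matchDestList.filter((dest) => KNOWN_DESTINATIONS.has(dest));
  // The list was non-empty but nothing survived filtering:
  // drop the registration instead of falling back to a wildcard.
  return known.length === 0 ? null : known;
}
```

With this shape, `("asdf")` drops the registration, while `("asdf" "")` keeps it with only the empty-string destination.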


> As far as priority goes, path specificity is the only factor in prioritization (with freshness being a tie-breaker). match-dest is just a filter for what candidates can be considered.

But https://www.rfc-editor.org/info/rfc9842/#name-multiple-matching-dictionar actually does use match-dest non-emptiness as the 1st factor for prioritization, with match length being second and freshness being third:

  1. For clients that support request destinations, a dictionary that specifies and matches a "match-dest" takes precedence over a match that does not use a destination
  2. Given equivalent destination precedence [...]
  3. Given equivalent destination and match length precedence [...]
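
The three-level ordering described here (match-dest first, match length second, freshness third) can be sketched as a sort comparator. This is a minimal sketch under assumptions: the candidates are taken to already match the request, and the record shape (`matchDest`, `match`, `fetchTime`) and the function name are hypothetical.

```javascript
// Picks the best dictionary among candidates that already match the request.
// Each candidate is a hypothetical record: { matchDest: string[], match: string, fetchTime: number }.
function pickBestDictionary(candidates) {
  const sorted = [...candidates].sort((a, b) => {
    // 1. A dictionary that specifies a match-dest beats one that does not.
    const destDiff = Number(b.matchDest.length > 0) - Number(a.matchDest.length > 0);
    if (destDiff !== 0) return destDiff;
    // 2. Then the longer "match" string wins.
    const lengthDiff = b.match.length - a.match.length;
    if (lengthDiff !== 0) return lengthDiff;
    // 3. Then the most recently fetched dictionary wins.
    return b.fetchTime - a.fetchTime;
  });
  return sorted.length > 0 ? sorted[0] : null;
}
```

Note that under this ordering a dictionary with a short "match" and a matching "match-dest" is preferred over one with a longer "match" and no "match-dest".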

Contributor Author

Yikes - thanks for pointing that out - Chrome's implementation probably needs to be fixed then (sorry, been working on it for so many years at this point that I'm forgetting parts). I'll fix the tests to make sure they test the behavior (and fix Chrome's implementation).

Contributor Author

FYI, I updated the PR to fail registration when id is invalid to match the test. Generally we want to ignore unknown fields (or unknown values in fields where we expect the set of values to be extended in the future), but for known fields with invalid values we should fail the registration.
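
As a rough illustration of that policy, the registration check might look like the sketch below. It is not the PR's algorithm: `validateRegistration`, the plain-object input, and the `null` return convention are hypothetical, and it assumes "raw" is the only supported type.

```javascript
const MAX_ID_LENGTH = 1024;
// Assumption: "raw" is the only dictionary type this client supports.
const SUPPORTED_TYPES = new Set(['raw']);

// `params` is a hypothetical plain object holding already-parsed Use-As-Dictionary members.
// Returns the accepted registration, or null when registration should fail.
function validateRegistration(params) {
  // "match" is required.
  if (typeof params.match !== 'string') return null;
  // Known field with an invalid value: fail the registration.
  const id = params.id ?? '';
  if (typeof id !== 'string' || id.length > MAX_ID_LENGTH) return null;
  const type = params.type ?? 'raw';
  if (!SUPPORTED_TYPES.has(type)) return null;
  // Only known keys are copied, so unknown keys are ignored.
  return { match: params.match, id, type };
}
```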


> FYI, I updated the PR to fail registration when id is invalid to match the test.

> For unsupported dests, this likely means we don't want to register them at all since there is nothing they could match and we don't want them to be treated like a wildcard.

OK I think these address my remaining concerns regarding mismatch between spec and tests. For match-dest, I believe the match-dest change is still pending, but probably you can just do an early check for the emptiness of match-dest rather than remembering the initial state (see my other comment above).

Comment thread fetch.bs
"<code>dictionary</code>", and <var>response</var>'s <a for=response>header list</a>.

<li><p>If <var>dictionaryValue</var> is null or <var>dictionaryValue</var>["<code>match</code>"]
does not <a for=map>exist</a>, then return <var>response</var>.


What about other parameters?

For example https://www.rfc-editor.org/rfc/rfc9842#name-type says a client should not deal with a type value it does not understand, so I guess we should bail out if that happens.

For id what happens if it exceeds the 1024 characters limit?

For match-dest, I guess clients would just ignore unknown destinations.

Contributor Author

Added details on how to handle invalid values for the other parameters. I don't explicitly call out that unknown keys should be ignored but I can add an explicit step to delete unknown keys if you think it would help.


Yeah, I was not sure what is the common way to handle this, however https://httpwg.org/specs/rfc9651.html#preserving-extensibility mentions this:

Specifications that use Dictionaries can also allow for forward compatibility by requiring that the presence of -- as well as value and type associated with -- unknown keys be ignored. Subsequent specifications can then add additional keys, specifying constraints on them as appropriate.

So maybe the spec should be explicit about ignoring unknown keys.

Incidentally, https://httpwg.org/specs/rfc9651.html#error-handling says ignoring the entire field is the default behavior when field-specific constraints are violated so it's indeed good the spec is now explicit for id, type, etc that we only ignore the corresponding key.

(note: probably rfc9651 should also have similar wording)

Comment thread fetch.bs Outdated
Comment thread fetch.bs

<li><p>If <var>key</var> is null, then return null.

<li><p>Return the unique compression-dictionary cache associated with <var>key</var>. [[!RFC9842]]

Member

I think we should make compression-dictionary cache a link.

Contributor Author

This mirrors the section above on HTTP cache partitions where it does "Return the unique HTTP cache associated with...". The RFC itself doesn't go into detail about the cache itself and how it operates. Is that something that needs to be fully defined somewhere?

Comment thread fetch.bs Outdated
<li><p>If <var>compressionDictionaryCache</var> is null, then return <var>response</var>.

<li><p>Let <var>pattern</var> be the result of
<a for=/>creating a URL pattern</a> given the bare item of <var>dictionaryValue</var>["<code>match</code>"],

Member

What is bare item?

Contributor Author

Added a link to the structured field parsing reference of bare item.

Comment thread fetch.bs Outdated

<li><p>If <var>availableDictionaryItem</var> is null, then return a <a>network error</a>.

<li><p>Let <var>availableDictionaryHash</var> be the bare item of <var>availableDictionaryItem</var>.

Member

This could maybe be inlined, though again we need more clarity on bare item.

annevk commented May 22, 2026

Member

I vaguely remember we discussed the token name incorrectly containing a hyphen, but I can't find it. Do you remember where that is? I suspect it's not worth correcting at this stage, but we should call it out in the specification as some kind of error we regret in the name of web compatibility so that we don't make it again for a new value.

pmeenan commented May 22, 2026

Contributor Author

> I vaguely remember we discussed the token name incorrectly containing a hyphen, but I can't find it. Do you remember where that is? I suspect it's not worth correcting at this stage, but we should call it out in the specification as some kind of error we regret in the name of web compatibility so that we don't make it again for a new value.

@annevk the link relation discussion was here which is where the token came from AFAIK.

fred-wang added a commit to fred-wang/WebKit that referenced this pull request May 23, 2026
https://bugs.webkit.org/show_bug.cgi?id=295249

- Add build/runtime flag for compression dictionary transport.

- Add "compression-dictionary" destination type.
  whatwg/fetch#1854

- Add Link rel "compression-dictionary".
  whatwg/html#11619
fred-wang added a commit to fred-wang/WebKit that referenced this pull request May 29, 2026
https://bugs.webkit.org/show_bug.cgi?id=295249

- Add build/runtime flag for compression dictionary transport.

- Add "compression-dictionary" destination type.
  whatwg/fetch#1854

- Add Link rel "compression-dictionary".
  whatwg/html#11619
fred-wang added a commit to fred-wang/WebKit that referenced this pull request Jun 2, 2026
https://bugs.webkit.org/show_bug.cgi?id=295249

- Add build/runtime flag for compression dictionary transport.

- Add "compression-dictionary" destination type.
  whatwg/fetch#1854

- Add Link rel "compression-dictionary".
  whatwg/html#11619
Comment thread fetch.bs
with an algorithm that verifies that the dictionary hash in the stream matches
<var>availableDictionaryHash</var> and decodes the rest of the stream with the applicable
algorithm as defined in [[!RFC9842]]. If verification or decoding fails,
error the transformed stream.


I'm not sure what's implied by this "error the transformed stream". Shouldn't we return a network error if decoding/verification fails? If not what do we expect for newBody in that case?

Also, about the hash verification, I don't see a lot from RFC 9842... Looking for "hash" or "available-dictionary", I find these paragraphs:

and in particular "The dictionary is validated using an SHA-256 hash of the content to make sure that the client and server are both using the same dictionary." from which one can infer the server is expected to send back the same hash if it has found the same on its side.

So can we actually just add a previous step that checks availableDictionaryHash matches the Available-Dictionary in request's header at step 9 and return a network error otherwise? Or am I missing something?

Contributor Author

availableDictionaryHash and Available-Dictionary are both generated by the client and attributes of the request. The issue we need to protect against is a server responding with a dcb or dcz stream that was compressed with a dictionary other than the one requested (has happened to more than a few people when deploying because of incorrectly-configured Vary response headers).

I'm happy to change it to be a network error of some kind if there's a sensible way to plumb that. Where it gets a bit complicated is that it's a problem with the stream, not necessarily the HTTP-level response container and is closer to being a corrupt payload (like sending a brotli payload with Content-Encoding: zstd). It also won't show up until the body starts being processed/read.

That feels like it's a problem at the stream level rather than the network level but I'm happy to plumb it however it best fits into the spec.


Thank you, I see! So this is referring to the remaining instances of "hash" (that I disregarded yesterday because I thought they were irrelevant):

https://www.rfc-editor.org/info/rfc9842/#name-dictionary-compressed-brotl
https://www.rfc-editor.org/info/rfc9842/#name-dictionary-compressed-zstan

The header consists of a fixed 4-byte sequence and a 32-byte hash of the external dictionary that was used.

A "Dictionary-Compressed Zstandard" stream is a binary stream that starts with a 40-byte fixed header

So what do you think of these minor clarifications:

that first verifies that the dictionary hash in the stream's header matches availableDictionaryHash and decodes the rest of the stream with the applicable algorithm as defined in §4. Dictionary-Compressed Brotli and §5. Dictionary-Compressed Zstandard of [[!RFC9842]].

So now I agree this is more an error at the stream level. Still I don't understand the implication of "error the transformed stream" (probably because I'm not familiar with the spec terminologies). What will be newBody after such an error?
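
To make the header check concrete, here is a minimal sketch of the verification step, assuming the layout quoted above: a fixed byte sequence followed by a 32-byte hash, for a total of 36 bytes for dcb and 40 bytes for dcz. The function name and error handling are hypothetical, the fixed byte sequence itself is skipped rather than validated, and the actual Brotli/Zstandard decoding is out of scope.

```javascript
const HASH_LENGTH = 32;
// Length of the fixed byte sequence that precedes the hash, derived from the
// quoted header sizes (4 + 32 for dcb, 40 - 32 = 8 for dcz).
const PREFIX_LENGTH = { dcb: 4, dcz: 8 };

// Checks the embedded dictionary hash and returns the bytes after the header.
// Throws a TypeError when the header is truncated or the hash does not match.
function verifyStreamHeader(encoding, streamBytes, expectedHash) {
  const prefix = PREFIX_LENGTH[encoding];
  if (prefix === undefined) throw new TypeError(`unsupported encoding: ${encoding}`);
  if (expectedHash.length !== HASH_LENGTH) throw new TypeError('expected a 32-byte hash');
  const headerLength = prefix + HASH_LENGTH;
  if (streamBytes.length < headerLength) throw new TypeError('truncated header');
  for (let i = 0; i < HASH_LENGTH; i++) {
    if (streamBytes[prefix + i] !== expectedHash[i]) {
      throw new TypeError('dictionary hash mismatch');
    }
  }
  // The remaining bytes would be handed to the Brotli or Zstandard decoder.
  return streamBytes.subarray(headerLength);
}
```

In a streaming implementation this check can only run once enough body bytes have arrived, which is consistent with the point that the failure surfaces while the body is being read rather than when the response headers are processed.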

Contributor Author

I'll see if I can find a better way to plumb the error that isn't hand-wavy (FWIW, I'm not all that well versed in writing specs so thanks for pushing on these unclear cases).


Comment thread fetch.bs
<p>If <var>dictionaryValue</var>["<code>match-dest</code>"] <a for=map>exists</a>:

<ol>
<li><p>Let <var>matchDestList</var> be <var>dictionaryValue</var>["<code>match-dest</code>"][0].


What does the [0] mean here?

Contributor Author

I'll see if I can find a better way to word it but it is coming from the tuple definition from the structured field parsing of an item where the individual items of the parsed dictionary are tuples.


OK, I see thanks


fred-wang added a commit to fred-wang/WebKit that referenced this pull request Jun 6, 2026
https://bugs.webkit.org/show_bug.cgi?id=295249

- Add build/runtime flag for compression dictionary transport.

- Add "compression-dictionary" destination type.
  whatwg/fetch#1854

- Add Link rel "compression-dictionary".
  whatwg/html#11619
Comment thread fetch.bs
};

-enum RequestDestination { "", "audio", "audioworklet", "document", "embed", "font", "frame", "iframe", "image", "json", "manifest", "object", "paintworklet", "report", "script", "sharedworker", "style", "text", "track", "video", "worker", "xslt" };
+enum RequestDestination { "", "audio", "audioworklet", "compression-dictionary", "document", "embed", "font", "frame", "iframe", "image", "json", "manifest", "object", "paintworklet", "report", "script", "sharedworker", "style", "text", "track", "video", "worker", "xslt" };


If you are adding a new destination, that might affect the internal priority calculated by step 15 of https://fetch.spec.whatwg.org/#fetching ; currently that's completely implementation-defined so don't really need to say anything in the spec and that's probably off topic for this PR... But anyway I'm curious for WebKit and Firefox to know how Chromium is calculating fetch priority for the "compression-dictionary" destination? (naively, I would suspect it is low priority).

Contributor Author

Chrome defaults it to an idle-level priority request (and I believe it tries to wait until after the document onload has fired but I need to check to verify).


Comment thread fetch.bs
<ol>
<li><p>Let <var>matchDestList</var> be <var>dictionaryValue</var>["<code>match-dest</code>"][0].

<li><p>For each <var>dest</var> of <var>matchDestList</var>: if <var>dest</var>'s <a>bare item</a>


So per what you said earlier, condition this and the next item on whether matchDestList is nonempty?


Development

Successfully merging this pull request may close these issues.

Add compression dictionary negotiation and decoding to the fetch processing model

3 participants