design-reviews#721: Design Review: Speculation Rules (Prefetch)

Opened Mar 16, 2022

Past reviews: Speculation Rules, Prerendering

Braw mornin' TAG!

I'm requesting a TAG review of Speculation Rules (prefetch).

Speculation Rules is a flexible syntax for defining what outgoing links/URLs are eligible to be prepared speculatively before navigation (e.g., prefetched).

This request specifically covers the use of this feature to trigger prefetching. The specification attempts to define prefetching consistent with partitioned storage (cross-partition prefetches are isolated) and with IP anonymization (implementation-defined, but e.g. via a proxy service).

Explainer (minimally containing user needs and example code): https://github.com/jeremyroman/alternate-loading-modes/blob/main/triggers.md
Specification URL:
- https://wicg.github.io/nav-speculation/speculation-rules.html
- https://wicg.github.io/nav-speculation/prefetch.html
Tests: in progress; will likely be speculation-rules/prefetch/ in WPT
User research: n/a
Security and Privacy self-review: https://github.com/jeremyroman/alternate-loading-modes/blob/main/speculation-rules-security-privacy-questionnaire.md (completed about 1 year ago for previous review)
GitHub repo (if you prefer feedback filed there): https://github.com/WICG/nav-speculation/issues
Primary contacts (and their relationship to the specification):
- Jeremy Roman (@jeremyroman), Google
Organization(s)/project(s) driving the specification: Google Chrome
Key pieces of existing multi-stakeholder review or discussion of this specification:
- Early TAG review of speculation rules: https://github.com/w3ctag/design-reviews/issues/611
- TAG review of prerendering: https://github.com/w3ctag/design-reviews/issues/667
External status/issue trackers for this specification (publicly visible, e.g. Chrome Status): https://chromestatus.com/feature/5740655424831488

Further details:

I have reviewed the TAG's Web Platform Design Principles
Relevant time constraints or deadlines: no hard deadlines, but if things go well we may request shipping in Chromium in March or April
The group where the work on this specification is currently being done: WICG
The group where standardization of this work is intended to be done (if current group is a community group or other incubation venue): WHATWG
Major unresolved issues with or opposition to this specification: some concerns are tracked in https://github.com/WICG/nav-speculation/issues
This work is being funded by: Google

We'd prefer the TAG provide feedback as:

☂️ open a single issue in our GitHub repo for the entire review

Discussions

Comment by @otherdaniel Mar 25, 2022 (See Github)

This proposal uses a <script> element to host JSON content that describes the prefetch rules. This potentially conflicts with CORB and the proposed ORB security mechanisms. Both try to prevent loading JSON resources into unexpected contexts. And JSON in <script> is certainly unexpected.

It's not entirely clear to me whether there is actually a conflict or whether this is a near miss, but in either case I believe the interaction with CORB/ORB requires a close look. (Possibly CSP, also.)

Since this concern is merely about rule representation, there should be numerous ways to avoid the issues without touching the substance of the proposal: Using something other than <script>, or having a unique mimetype and strictly require it, or insisting that speculation rules are always inline and won't be fetched. One could also try to modify CORB/ORB in order to accommodate Speculation Rules. The explainer thankfully already touches on these issues, so I'm hopeful this can be resolved.

Comment by @jeremyroman Mar 25, 2022 (See Github)

@otherdaniel As you've noted, the explainer touches on this, and the spec also has an issue to expand the section on type confusion to cover MIME issues.

Given the precedent from import maps (which also uses <script>), I think this would likely resolve to requiring a MIME type whose essence is application/speculationrules+json explicitly. AFAIK this is consistent with the guidance for CORB/ORB; is there more work that is required there?

Comment by @otherdaniel Mar 25, 2022 (See Github)

I think this would likely resolve to requiring a MIME type whose essence is application/speculationrules+json explicitly. AFAIK this is consistent with the guidance for CORB/ORB; is there more work that is required there?

I agree, I think that's a viable solution.

(The "requiring" bit is important, though. The problems CORB/ORB try to solve ultimately stem from the fact that browsers try to be extra clever and accept content with missing or inappropriate MIME types. So the required MIME type should be a must, not a should. But I think that's what you meant)
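
To make the "must, not should" point concrete, here is a minimal sketch of a strict check on the MIME type essence. The function names are hypothetical and the parsing is deliberately simplified (it only strips parameters after the first semicolon); it is an illustration of the idea, not the algorithm in the spec.

```typescript
// Hypothetical helper names; only the MIME type string itself comes from the thread.
const SPECULATION_RULES_ESSENCE = "application/speculationrules+json";

// Simplified essence extraction: drop parameters, trim, lowercase.
function mimeEssence(contentType: string): string {
  const semicolon = contentType.indexOf(";");
  const bare = semicolon === -1 ? contentType : contentType.slice(0, semicolon);
  return bare.trim().toLowerCase();
}

// A missing or mismatched type is rejected outright; nothing is sniffed or guessed.
function isAcceptableSpeculationRulesResponse(contentType: string | null): boolean {
  if (contentType === null) return false;
  return mimeEssence(contentType) === SPECULATION_RULES_ESSENCE;
}
```

The design choice that matters is the early `return false` for a missing header: accepting untyped or mistyped content is exactly the leniency CORB/ORB try to compensate for.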

Comment by @hadleybeeman Apr 13, 2022 (See Github)

Linking to our early design review

Discussed Apr 18, 2022 (See Github)

Lea: generalisation of the prefetch/prerender attributes.. using json blob to define these rules in a script element. Weird. There is precedent. Unclear .. syntax is confusing. source list / source document.. things to be said about syntax. Score is unclear how it's used. Question of do we want this logic to lie in a script element as json or should it be an html based syntax. Motivation - I am convinced, mentioned example of quicklink library that determines which links to prefetch automatically. Whole process is explicit, authors need to duplicate links to prefetch in the head. This allows them to handle this en masse to some degree. More extensible because it's json, can add more properties. Main point is do we agree that this is a problem that needs to be solved, and is json in a script element a good way to solve it? Low level stuff we can talk about another time, or just leave a comment. I don't know any history about this.. seen speculation rules in our agenda a few times.. some background?

Dan: based on speculation rules? Leave comments and questions..

Lea: [will leave comment]

Comment by @LeaVerou Apr 18, 2022 (See Github)

I looked at this briefly during a breakout today, but we ran out of time before discussing with the rest of the group, so these are just my own thoughts, and do not necessarily represent TAG consensus (yet):

I do agree that the current syntax for prefetch/prerender is clumsy and tedious, and the Quicklink example is quite compelling. Although, with the current syntax, a library would still be needed to do what Quicklink does, since there is no criteria for "is this link in the user's viewport?"
Any syntax for speculatively prefetching a lot of stuff brings up concerns about sustainability etc. It's a lot of wasted bandwidth, which for some users may be prohibitively expensive. If we are to make this a whole lot easier, which will lead to even more wasted bandwidth, there should be a way for users to opt out.
I agree it's weird to use a different language (JSON) to essentially annotate HTML elements but I also agree that extending HTML with these annotations would be clumsy.
On syntax: Cramming the entire logic for a conditional in a property name is not very extensible. E.g. if_not_selector_matches is essentially a microsyntax for doing negation and specifying what this criteria is going to match on (selector, href, etc). These could be entirely independent if the conditionals are an array of object literals, with one object literal per conditional. This would also allow for additional matching metadata in the future, which you may need for other criteria. E.g. if a proximity to cursor criteria is introduced, you may want to specify distance, velocity etc, if a viewport criteria is introduced you may want to specify offset etc. With the current syntax, each criteria only takes a single argument.

A rule may include a score between 0.0 and 1.0 (inclusive), defaulting to 0.5, which is a hint about how likely the user is to navigate to the URL. It is expected that UAs will treat this monotonically (i.e., all else equal, increasing the score associated with a rule will make the UA speculate no less than before for that URL, and decreasing the score will not make the UA speculate where it previously did not). However, the user agent may select a link with a lower author-assigned score than another if its heuristics suggest it is a better choice.

I would imagine the likelihood of a link being clicked might change throughout the user's interaction, so I wonder if this would make more sense as an <a> attribute? That way it can be updated by script to account for things like is it in the viewport, is the cursor close to it etc or any other arbitrary thing that makes it more or less likely to be clicked.

Comment by @tomayac Apr 19, 2022 (See Github)

Any syntax for speculatively prefetching a lot of stuff brings up concerns about sustainability etc. It's a lot of wasted bandwidth, which for some users may be prohibitively expensive. If we are to make this a whole lot easier, which will lead to even more wasted bandwidth, there should be a way for users to opt out.

One way apps could make this more sustainable could be by checking for the Save-Data header or the navigator.connection.saveData bit.

In tomayac/netinfo/README.md, I have brought up the idea of exposing whether a network is metered via a new navigator.connection.metered bit, which would be more aligned with the way Android handles this: save data is for foreground things like sending lower-res images, and metered network is for background things like not syncing data (or in the concrete case not prefetching).
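
A rough sketch of how a page might gate speculation on these signals. The `ConnectionHints` shape and `shouldPrefetch` are made-up names for illustration; `saveData` mirrors the existing Network Information bit, while `metered` is only the proposal described above and is not a shipped API.

```typescript
// Hypothetical shape: saveData mirrors navigator.connection.saveData,
// metered is the proposed (not shipped) bit from the comment above.
interface ConnectionHints {
  saveData?: boolean;
  metered?: boolean;
}

// Returns false as soon as any signal suggests the user would pay for wasted bytes.
function shouldPrefetch(connection: ConnectionHints | undefined): boolean {
  if (connection === undefined) return true; // assumption: no signal means no objection
  if (connection.saveData === true) return false;
  if (connection.metered === true) return false;
  return true;
}
```

In a browser, the argument would be whatever `navigator.connection` exposes, if anything. Treating an absent API as "go ahead" is a judgment call in this sketch; a more conservative page could default to not prefetching.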

Comment by @jeremyroman Apr 21, 2022 (See Github)

I looked at this briefly during a breakout today, but we ran out of time before discussing with the rest of the group, so these are just my own thoughts, and do not necessarily represent TAG consensus (yet):

I do agree that the current syntax for prefetch/prerender is clumsy and tedious, and the Quicklink example is quite compelling. Although, with the current syntax, a library would still be needed to do what Quicklink does, since there is no criteria for "is this link in the user's viewport?"

I'm hopeful that since this is more about heuristics for picking the best links to prefetch/prerender rather than correctness, that user agents will be able to do a decent job of this by default, appropriate for the device's form factor etc. For example, on a UA with a cursor input device (like a desktop computer), hover might be a really good signal (instant.page uses this), whereas on a UA with touch input but a small viewport (like a mobile phone), hover might not be available but the viewport is a really strong signal of user intent.

If that evolution doesn't prove as fruitful as I hope we might benefit from having authors expressly tell UAs which signals are strong hints about user intent.

Any syntax for speculatively prefetching a lot of stuff brings up concerns about sustainability etc. It's a lot of wasted bandwidth, which for some users may be prohibitively expensive. If we are to make this a whole lot easier, which will lead to even more wasted bandwidth, there should be a way for users to opt out.

I agree that users should be able to opt out, for a variety of reasons including cost and privacy. User agents are in a position to help with this too, for example by not prefetching (or by prefetching only very high-probability URLs) when the user is on a metered connection or has low battery life.

This is one advantage of giving a place for UAs to help developers out with this decision -- if we don't, developers can do it anyway, but using APIs that don't make it easy for the UA to intercede on the user's behalf because it's less distinguishable from fetches critical to the immediate user intent.

I agree it's weird to use a different language (JSON) to essentially annotate HTML elements but I also agree that extending HTML with these annotations would be clumsy.

It's certainly imperfect but JSON has great tooling in both server- and client-side web technologies, and there is similar use of it in import maps and web bundles.

On syntax: Cramming the entire logic for a conditional in a property name is not very extensible. E.g. if_not_selector_matches is essentially a microsyntax for doing negation and specifying what this criteria is going to match on (selector, href, etc). These could be entirely independent if the conditionals are an array of object literals, with one object literal per conditional. This would also allow for additional matching metadata in the future, which you may need for other criteria. E.g. if a proximity to cursor criteria is introduced, you may want to specify distance, velocity etc, if a viewport criteria is introduced you may want to specify offset etc. With the current syntax, each criteria only takes a single argument.

That's a fair point. If you never need two conditions of the same type, then I do think you can get somewhat far with the syntax sketch I had there (it's not specified or implemented yet), but your proposal certainly has its advantages. What I am hoping to avoid is making a boolean algebra microsyntax of arbitrary complexity (or having to write a little language that needs a parser, even) -- but I'm not strongly attached to this yet.

{
  "prefetch": [
    { "source": "document", "if": [{"not": {"href_matches": "..."}}, {"selector_matches": "..."}] }
  ]
}

Definitely a tough balance to strike so that it's as simple as possible while still being useful enough. At some point I start to be reminded of JSON Schema.

Filed WICG/nav-speculation#160 for this question.
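
For illustration only, here is a small evaluator for the condition shape sketched above (an array of object literals, with `not` as a wrapper). Everything in it is an assumption layered on that sketch: `href_matches` is treated as a plain prefix test, selector matching is delegated to a caller-supplied callback, and the array is treated as an implicit AND. None of this is specified behavior.

```typescript
// Condition shapes taken from the JSON sketch above; semantics are assumed.
type Condition =
  | { not: Condition }
  | { href_matches: string }
  | { selector_matches: string };

// Hypothetical stand-in for a link element, so the sketch runs without a DOM.
interface LinkInfo {
  href: string;
  matchesSelector: (selector: string) => boolean;
}

function matchesCondition(cond: Condition, link: LinkInfo): boolean {
  if ("not" in cond) return !matchesCondition(cond.not, link);
  // Assumption: prefix match. The sketch does not say what pattern syntax applies.
  if ("href_matches" in cond) return link.href.indexOf(cond.href_matches) === 0;
  return link.matchesSelector(cond.selector_matches);
}

// Assumption: every condition in the array must hold.
function matchesAll(conditions: Condition[], link: LinkInfo): boolean {
  return conditions.every((c) => matchesCondition(c, link));
}
```

Because each condition is its own object, a future criterion that needs extra arguments (distance, offset, and so on) could add another variant to `Condition` without inventing new property-name microsyntax.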

A rule may include a score between 0.0 and 1.0 (inclusive), defaulting to 0.5, which is a hint about how likely the user is to navigate to the URL. It is expected that UAs will treat this monotonically (i.e., all else equal, increasing the score associated with a rule will make the UA speculate no less than before for that URL, and decreasing the score will not make the UA speculate where it previously did not). However, the user agent may select a link with a lower author-assigned score than another if its heuristics suggest it is a better choice.

I would imagine the likelihood of a link being clicked might change throughout the user's interaction, so I wonder if this would make more sense as an <a> attribute? That way it can be updated by script to account for things like is it in the viewport, is the cursor close to it etc or any other arbitrary thing that makes it more or less likely to be clicked.

Very possibly! I think it depends a little on whether developers find it more useful to give live-updating hints as the user interacts or whether they just want to provide a bit of a bump to built-in heuristics. This can to some extent be emulated by developers defining e.g. classes for "low likelihood", "medium likelihood", "high likelihood" and then keying off those in rules. But that's awkward if this is a common case, in which case this absolutely should become an attribute as you suggest.

Filed WICG/nav-speculation#159 for this question.

Discussed Apr 25, 2022 (See Github)

reading response

Hadley: using a lot of bandwidth in connections that are not able to handle it?

Amy: i think we did bring this up and privacy stuff for prerender.. but don't think most of the privacy concerns apply to prefetch.... assume it would take note of the user's context and not prefetch or prerender stuff on low bandwidth

Dan: left comment we will revisit at plenary

[update - bumped to 9th May]

Comment by @torgo Apr 25, 2022 (See Github)

Question: is this an "early review"? What is the time-line? Also it says in Chrome status that you're in the middle of an origin trial. Has there been any feedback from this that you can share? Also it still shows no signal from other implementers. Can you let us know any multistakeholder status?

Comment by @jeremyroman Apr 29, 2022 (See Github)

#611 was the corresponding early review. We're hoping to ship in Chrome relatively soon.

So far most of the feedback I'm aware of can largely be bucketed as:

related to the effect on the page which was prefetched (unsurprisingly, most obviously reduced load time metrics)
servers wanting to identify and drop prefetch traffic (resulting in writing down a Sec-Purpose header to replace vendor-specific headers, and not storing non-ok responses by default)
servers receiving anonymized traffic (much of which is presently tracked in https://github.com/buettner/private-prefetch-proxy, such as https://buettner.github.io/private-prefetch-proxy/traffic-advice.html to limit traffic from polite proxies)
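
As a sketch of the second bucket, a server that wants to spot prefetch traffic could look at the Sec-Purpose request header. `isPrefetchRequest` is a hypothetical name, and the parsing is a simplification: it looks only at the leading token and ignores any parameters after a semicolon rather than doing full structured-field parsing.

```typescript
// Hypothetical helper; takes the raw Sec-Purpose header value, if any.
function isPrefetchRequest(secPurpose: string | undefined): boolean {
  if (secPurpose === undefined) return false;
  // Keep only the leading token; parameters (if present) are ignored in this sketch.
  const [first = ""] = secPurpose.split(";");
  return first.trim() === "prefetch";
}
```

A server that decides to drop such a request would answer with a non-ok status, which, per the feedback summarized above, is not stored by default.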

We've asked on multiple occasions for engagement from other vendors, though there hasn't been a lot. The WICG proposal for this repository captures some of that. We've requested Mozilla and WebKit positions, without response.

Discussed May 9, 2022 (See Github)

Dan: Reviewing status ... no info on multistakeholder but some developer feedback has come in.

[punted to B hope to get some feedback from Rossen]

Discussed Jun 6, 2022 (See Github)

latest comment from requestor

Amy: lea left a review - not sure if the answers are satisfactory - I also did a related review.

Dan: let's maybe organise a special session on this? I can reach out to Chris H at Google.

Amy: it's interesting to know why they're not getting interest from other vendors... But doesn't feel like the sort of thing that if only rolled out in Chrome it would break web sites across other browsers.

Comment by @hober Jul 27, 2022 (See Github)

@jeremyroman, thanks for filing WICG/nav-speculation#159 and WICG/nav-speculation#160 back in April, based on @LeaVerou's feedback. It doesn't look like those issues have been touched since, though. Is there anything more you need from us to make progress on them?

Comment by @jeremyroman Aug 17, 2022 (See Github)

@jeremyroman, thanks for filing WICG/nav-speculation#159 and WICG/nav-speculation#160 back in April, based on @LeaVerou's feedback. It doesn't look like those issues have been touched since, though. Is there anything more you need from us to make progress on them?

Not at present, but I expect to incorporate that insight when we extend the feature in that direction, which I think will be relatively soon. Thanks again for the feedback.

Discussed Aug 22, 2022 (See Github)

Peter: marked pending feedback - we got some...

Rossen: says "we'll work on it, thanks for the ping"... i think it's safe to push it back ...

Peter: bumps

Comment by @domenic Sep 2, 2022 (See Github)

Hi TAG. We have a slight expansion to this feature coming up, which I will drop a comment about here instead of opening a new issue. However please let me know if it'd be more helpful to open something new, especially since what I'm asking about is prerendering and this review at least started being about prefetch.

Chrome shipped same-origin prerendering, based on speculation rules, in May. We're now looking to expand this to cover cross-origin same-site prerendering, i.e. cases like https://a.example.com/ prerendering https://b.example.com/. This will include credentials/storage access, since those are site keyed, but it will also require an opt-in from the target site via a new HTTP response header, Supports-Loading-Mode: credentialed-prerender, to protect the origin security boundary.
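
Roughly, the opt-in check could look like the sketch below. The function name and the `relation` parameter are hypothetical (working out whether two URLs are same-site is left to the caller), the header is parsed as a simple comma-separated list, and returning false for the cross-site case is an assumption, since that case is outside what this comment covers.

```typescript
type Relation = "same-origin" | "cross-origin-same-site" | "cross-site";

// supportsLoadingMode is the raw Supports-Loading-Mode response header value, or null if absent.
function mayPrerenderWithCredentials(relation: Relation, supportsLoadingMode: string | null): boolean {
  if (relation === "same-origin") return true; // already shipped, no opt-in needed
  if (relation === "cross-site") return false; // assumption: out of scope for this expansion
  if (supportsLoadingMode === null) return false; // no header means no opt-in
  const tokens = supportsLoadingMode.split(",").map((t) => t.trim());
  return tokens.indexOf("credentialed-prerender") !== -1;
}
```

So https://a.example.com/ could prerender https://b.example.com/ only when the response from b.example.com carries the opt-in token.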

We've updated the spec and explainer in https://github.com/WICG/nav-speculation/commit/16570ff808267383a393064ff951b764911be78f , with perhaps the most relevant reading being:

A new section of the prerendering explainer giving our full analysis of what this expansion does in terms of security and privacy properties.
The overall Supports-Loading-Mode explainer (which includes a few values we aren't yet shipping, such as uncredentialed-prerender).
The Supports-Loading-Mode spec, as well as where it is used in the main fetching part of the prerendering spec.

We've also updated the relevant security & privacy questionnaire, but none of the questions there were directly relevant to this expansion; probably the new section mentioned above is the most useful from a security and privacy perspective.

Thanks for your time!

Comment by @pweis88 Nov 2, 2022 (See Github)

We're now looking to expand this to cover cross-origin same-site prerendering, i.e. cases like https://a.example.com/ prerendering https://b.example.com/. This will include credentials/storage access, since those are site keyed, but it will also require an opt-in from the target site via a new HTTP response header, Supports-Loading-Mode: credentialed-prerender, to protect the origin security boundary.

We're exploring different use cases for prerendering in Google Workspace apps that would require cross-origin same-site support. This is generally needed for cross-app user journeys, a good example of which is prerendering of Google Docs documents (docs.google.com) that users are likely to navigate to from e.g. Drive (drive.google.com) or Gmail.

Discussed Nov 28, 2022 (See Github)

Amy: they reopened the review for prefetch - though it's pre-rendering.

Dan: Same comment left in the Mozilla standards positions and no response. Likewise discussion about prefetch.

Yves: adding a new http header is not an issue in itself. All the things people need to configure is becoming a bit much. Would be good to either reuse something existing by adding a new value. Even that adds to complexity. Or finding a way to .. define profiles that will select a set of behaviours or something like that. Still need to have a more comprehensive description of all the knobs used for security wrt all the http headers. We miss a comprehensive picture of all the http headers and their interactions, not only for this proposal, many recent proposals

<blockquote> We're noting a lack of multi-stakeholder interest in this. Do you have any info on this that you can share? We're concerned about developer complexity when it comes to this feature, especially considering the need for a new HTTP header that requires server configuration. Is there an alternative design that wouldn't require as much complexity? Nevertheless, regarding the design it's good to see it's an opt in. </blockquote>

leaves comment

Comment by @torgo Nov 28, 2022 (See Github)

Hi @domenic we're noting a lack of multi-stakeholder interest in this. Do you have any info on this that you can share? We're concerned about developer complexity when it comes to this feature, especially considering the need for a new HTTP header that requires server configuration. Is there an alternative design that wouldn't require as much complexity? Nevertheless, regarding the design it's good to see it's an opt in.

Comment by @domenic Dec 5, 2022 (See Github)

we're noting a lack of multi-stakeholder interest in this. Do you have any info on this that you can share?

Nothing beyond the standards positions linked in the original post, sorry!

Edit: I realized they were not linked in the original post after all. Here they are:

In general we've found that second and third implementers often take a while to follow on these sort of progressive enhancement/for-performance features.

We're concerned about developer complexity when it comes to this feature, especially considering the need for a new HTTP header that requires server configuration. Is there an alternative design that wouldn't require as much complexity?

It depends on what you mean. Fundamentally, an opt-in is needed, for security reasons. I don't think that opt-in is very complex; it's a single HTTP header, and things don't get much simpler than that.

It's possible that you're referring to the difficulty of configuring HTTP headers, which is e.g. impossible on some older static hosts like GitHub pages, and thus requires the use of other free hosting like Netlify/CloudFlare Pages/etc. (Or to use a non-free host.) We could support even those older static hosts by working on this future extension which the explainer explained, i.e. <meta http-equiv="supports-loading-mode">. Arguably, this adds a good bit of complexity, as in-markup versions come with a lot of restrictions around parsing, appearing within the first few thousand bytes, etc. But it does make the feature easier, at least for those stuck on such older static hosts.

Is that what you were referring to, or was there a different meaning of complexity that I missed?

We have a number of other small enhancements to speculation rules/prefetching/prerendering coming up, e.g. support for customizing the referrer policy used. We're planning to continue pinging this thread with small summaries like I did previously, but if you'd prefer us to hold off (e.g. until the base feature gains more implementers) or start new threads, please let us know!

Comment by @kjmcnee Dec 9, 2022 (See Github)

Hello. We have an extension to the speculation rules syntax to allow the referrer policy of a speculative request to be set explicitly. A key use case for this is to allow a site with a lax referrer policy to adopt cross-site prefetching by using a strict policy specifically for the prefetch.

Explainer: https://github.com/WICG/nav-speculation/blob/main/triggers.md#explicit-referrer-policy Spec: https://wicg.github.io/nav-speculation/speculation-rules.html Tests: Mainly in https://github.com/web-platform-tests/wpt/blob/master/speculation-rules/prefetch/referrer-policy-from-rules.https.html Chrome Status: https://chromestatus.com/feature/4694585584910336

(Note that as of this writing, the most recent version of the spec hasn't yet been published at that link, but should be available soon.)

Please take a look.
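
As a hedged illustration of the use case, the helper below builds a list rule that carries its own referrer policy. `buildPrefetchRules` and the `ListRule` type are invented for this sketch, and the `referrer_policy` key name is how I understand the linked explainer section; please check it against the explainer before relying on it.

```typescript
// Hypothetical type for a list rule; referrer_policy is assumed from the explainer.
interface ListRule {
  source: "list";
  urls: string[];
  referrer_policy?: string;
}

// Returns the JSON text that would go inside a <script type="speculationrules"> element.
function buildPrefetchRules(urls: string[], referrerPolicy?: string): string {
  const rule: ListRule = { source: "list", urls: urls };
  if (referrerPolicy !== undefined) rule.referrer_policy = referrerPolicy;
  return JSON.stringify({ prefetch: [rule] });
}
```

A site whose document-wide policy is lax could call `buildPrefetchRules(["https://other.example/page"], "no-referrer")` so that only the speculative request uses the strict policy.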

Comment by @jeremyroman Dec 16, 2022 (See Github)

Just adding a quick update here. We're planning on launching an origin trial which covers some of the extended aspects of this, some of which TAG previously provided feedback on:

document rules (which incorporated feedback from https://github.com/w3ctag/design-reviews/issues/721#issuecomment-1101600240 to use a more general way of expressing the logic)
the Speculation-Rules header, which is a fairly simple structured field which allows loading rules from a URL provided in the response headers (this is of particular use in cases where it is not convenient to modify document content, but the server operator is a service provider which is well-positioned to specify rules in coordination with the document author)
PerformanceResourceTiming.deliveryType, which is a fairly small API addition we've discussed with Web Perf WG; relevance here is that it can be used to determine whether the current navigation was served from the prefetch cache
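
For the last item, a page could check the delivery type of its navigation entry along these lines. `TimingLike` and `servedFromPrefetch` are hypothetical names, and the string "navigational-prefetch" is my understanding of the value defined for this case; treat it as an assumption and verify against the Resource Timing spec.

```typescript
// Minimal stand-in for a PerformanceResourceTiming entry, so the sketch runs without a browser.
interface TimingLike {
  deliveryType?: string;
}

// True when the entry reports that the navigation was satisfied from the prefetch cache.
function servedFromPrefetch(entry: TimingLike): boolean {
  // Assumption: "navigational-prefetch" is the value used for prefetch-cache hits.
  return entry.deliveryType === "navigational-prefetch";
}
```

In a browser, the entry would typically come from `performance.getEntriesByType("navigation")[0]`; on user agents without the property the function simply returns false.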

Discussed Dec 19, 2022 (See Github)

Tess: looks like they filed a couple of issues based on Lea's feedback.. one of them is closed and the other hasn't had comments...

Dan: parsing Domenic's response - yes the complexity is the http header - requiring server configuration... I'll make that clear.

Comment by @torgo Dec 19, 2022 (See Github)

Hi @domenic thanks for sending the standards positions links. Just quickly on this point of complexity:

It's possible that you're referring to the difficulty of configuring HTTP headers

Yes that's what I'm referring to - it kind of puts it out of reach for "rank and file" web developers...

Comment by @LeaVerou Dec 19, 2022 (See Github)

Hello. We have an extension to the speculation rules syntax to allow the referrer policy of a speculative request to be set explicitly. A key use case for this is to allow a site with a lax referrer policy to adopt cross-site prefetching by using a strict policy specifically for the prefetch.

Explainer: WICG/nav-speculation@main/triggers.md#explicit-referrer-policy Spec: wicg.github.io/nav-speculation/speculation-rules.html Tests: Mainly in web-platform-tests/wpt@master/speculation-rules/prefetch/referrer-policy-from-rules.https.html Chrome Status: chromestatus.com/feature/4694585584910336

(Note that as of this writing, the most recent version of the spec hasn't yet been published at that link, but should be available soon.)

Please take a look.

Hi there,

Could you please submit this as a separate design review (and link to this issue, since it's blocked on this). Thanks!

Comment by @torgo Dec 19, 2022 (See Github)

We have a number of other small enhancements to speculation rules/prefetching/prerendering coming up, e.g. support for customizing the referrer policy used. We're planning to continue pinging this thread with small summaries like I did previously, but if you'd prefer us to hold off (e.g. until the base feature gains more implementers) or start new threads, please let us know!

Yes please do continue to update this thread with summaries of the changes or additions, but also can you please update the explainer when you do this? I think we have more work to do as TAG on this review before we can close it.

Comment by @domenic Dec 20, 2022 (See Github)

Hmm, we seem to be getting conflicting messages :). Should we post new issues, or update this thread? Both?

We always update the explainer (and spec, and tests) whenever we work on new features, as you can see from the above updates wherein we link to the new explainer sections.

Discussed Feb 13, 2023 (See Github)

<blockquote> Hi @domenic - Sorry for the mixed messages. Lea was asking specifically about the [new proposal](https://github.com/w3ctag/design-reviews/issues/721#issuecomment-1344814699) from @kjmcnee and I was referring to general updates of the original subject of the review. So yes, we would very much like to see a separate review opened when there is a new proposal. However the explainer link Kevin provided is to a document fragment, not to a [separate explainer doc](https://tag.w3.org/explainers/). Could you please clarify what we're being asked to review? To be clear: if it's a new piece of functionality that can be written up in an explainer doc, with the specific user needs being addressed, then please open up a new review (with a new explainer). If it's a delta to something we're already reviewing, then please update the current review text/explainer and let us know specifically what changed and why in a comment. </blockquote>

Comment by @torgo Feb 13, 2023 (See Github)

Hi @domenic - Sorry for the mixed messages. Lea was asking specifically about the new proposal from @kjmcnee and I was referring to general updates of the original subject of the review. So yes, we would very much like to see a separate review opened when there is a new proposal. However the explainer link Kevin provided is to a document fragment, not to a separate explainer doc. Could you please clarify what we're being asked to review? To be clear: if it's a new piece of functionality that can be written up in an explainer doc, with the specific user needs being addressed, then please open up a new review (with a new explainer). If it's a delta to something we're already reviewing, then please update the current review text/explainer and let us know specifically what changed and why in a comment.

Comment by @kjmcnee Feb 13, 2023 (See Github)

Hello. My previous comment was about specifying how speculation rules interact with referrer policy, so I would consider that a delta of this review, rather than a new proposal.

The change is the addition of the "Explicit referrer policy" section of the explainer.

Discussed Feb 27, 2023 (See Github)

Amy: they've replied - the person who we thought was making a new proposal says they are making a delta.

Amy: we're supposed to be reviewing the explainer that they keep updating.

Dan: the explainer

Amy: lea left some comments as well.

Dan: this is what we are talking about

Amy: still under active development; shipping in chrome; no one else is interested; explainer has no user needs in it (but may be too late to tell them that)

Lea: No architectural concerns with the actual functionality. However I don't like the syntax. see comment

Dan: have they addressed those issues?

Lea: they opened up issues for some of them. I see one of them is resolved. Others [in tag] should look at syntax as well.

Dan: are there other examples where the same pattern is being used, around the use of JSON?

Peter: import maps

Dan: if there are other examples then maybe it's not..

Lea: import maps are using json to define metadata, but not to essentially annotate html elements. The only precedent for using a different language to annotate html is css (not saying they should have used css for this)

Dan: what about web annotation?

Amy: it's only a data model.

Rossen: proposal to use an <a> attribute? Do we have feedback on what can be better?

Peter: i don't understand json being used to annotate HTML elements? Isn't this one blob of json that's metadata for the document?

Lea: it's like a mini querying language about which elements it applies to.

Peter: selector matches... tagging elements. The rest is more of a map of the site.

Lea: My impression is that any anchor that meets those criteria would be (or not be) pre-rendered.

Peter: yes that example would better be on the element..

Lea: does this apply to link tag? image element?

Amy: it's about pre-rendering things the user might click on? So it would only work on href?

Lea: I think it's only about <a> elements.

Peter: I agree that the selector matching seems like a layering violation.

Lea: will create future features that use same syntax... precedent...

Peter: everything other than the selector matches seems like metadata about the site... that seems like it should be one blob of data somewhere. Could be a map on a site...

Rossen: no well formed opinion.

Amy: an interesting issue in their repo... Alex says "would have expected this to come up in TAG review"...

Lea: the syntax doesn't allow for likelihoods to be changed. My understanding is that the json is read once. Whereas if this was an attribute on the anchor then it would be dynamic.

Peter: how does that work if you have multiple pages...

Lea: something to define in the spec...

Amy: should we invite the requestor to come talk to us?

Peter: if this shipped in chrome and nobody else cares is this a waste of time?

Amy: the conversations are ongoing so it's at a stage where we could influence.

Lea: do they need the whole power of RegExps in href matches?

Peter: wasn't there another proposal for URL matches?

Lea: URL pattern.

Yves: yes in service worker and they ruled out RegExp because it's not cheap.

Lea: if they don't need it then they could just use selector matching... If their entire selection criteria can be described with selectors then why do they need this microsyntax at all?

Dan: does seem like it adds complexity - yet another query syntax..

Peter: does look like they're referring to url pattern spec, not a regex

Peter: should this be in CSS?

Rossen: elaborate

Peter: well selector matching.. it's a language for mapping properties to random HTML .. can be referenced by multiple pages ..

Lea: reactive

Peter: question is can you consider this presentational?

Rossen: that's my issue.

Peter: you could start to consider this presentational? doesn't preclude it from being in CSS.

Lea: i don't think that's a terrible idea though CSS doesn't have URL pattern matching. Maybe CSS could add URL pattern matching... that could be useful for CSS as well.

Points to raise:

Have they considered using CSS?
Concern continues about use of a novel JSON format for annotation
Something about user needs not being clearly defined in the explainer
Meta issue: Things that need to annotate HTML elements en masse that are not necessarily presentational - how should they work?

Amy: explicit referrer - if the user goes on to click the link - if the referrer policy is different - it still uses the referrer policy for pre-rendering.

Amy: if you set the referrer policy in the json and in the link then the pre-renderer referrer policy takes precedence. Are there consequences for this? Maybe means if there is a link to an external site that the author doesn't trust - but pre-rendering is configured differently - then it could have possible privacy implications? could be a stretch. [posted a question]

Hi there,

We looked at this again in a breakout today.

The general consensus was that we agree that the functionality is useful, but we have some architectural concerns about the syntax this is introducing in the Web Platform, especially since it is introducing a novel precedent that could affect the direction of even more future Web Platform features.

One thing that came up in the discussion was that this is syntax that is annotating other HTML elements en masse, and the only precedent in the Web Platform for doing so is CSS. Since it could be argued that this is presentational, we were wondering if you have explored extending CSS for this.

Most of your criteria syntax is CSS selectors anyway, and while Selectors do not include matching for URL patterns, that would be more broadly useful and would make a good addition to Selectors. Since CSS is inherently reactive, this would also naturally afford dynamic scoring for links.

It seems that something like this could reduce the API surface of this feature to a couple of CSS properties, making it simpler to implement and test.

Sorry this has been taking so long. We're trying to make a concerted effort to come back with actionable feedback so we can close this issue as the consensus is generally positive. We look forward to hearing more as the origin trials progress. If you have other specific issues you would like to ask for TAG's feedback on please let us know.

</blockquote>

Comment by @LeaVerou Mar 1, 2023 (See Github)

Hi there,

We looked at this again in a breakout today.

It seems that something like this could reduce the API surface of this feature to a couple of CSS properties, making it simpler to implement and test.

Comment by @rhiaro Mar 1, 2023 (See Github)

We have an extension to the speculation rules syntax to allow the referrer policy of a speculative request to be set explicitly. A key use case for this is to allow a site with a lax referrer policy to adopt cross-site prefetching by using a strict policy specifically for the prefetch.

Are there risks authors or users should know about if the inverse of this was the case? ie. a strict policy by default is overridden by a lax explicit prefetch policy? (perhaps because of a misconfiguration, or because different people configure the server headers to those who author the pages?)

Comment by @kjmcnee Mar 1, 2023 (See Github)

Are there risks authors or users should know about if the inverse of this was the case? ie. a strict policy by default is overridden by a lax explicit prefetch policy? (perhaps because of a misconfiguration, or because different people configure the server headers to those who author the pages?)

If a lax policy is specified in the rule and it's for a same-site prefetch, that's the policy we use. If it's cross-site however, a lax explicit policy would prevent the prefetch attempt due to the sufficiently-strict referrer policy requirement.

So the risk would be that authors cause their prefetch attempts to be ignored. For debuggability, in the chromium implementation, we surface when an attempt is ignored due to this requirement in DevTools.
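The behavior described above can be sketched roughly as follows. This is a minimal illustration, not Chromium's implementation: the helper name `wouldAttemptPrefetch`, the `isCrossSite` flag, and the example URL are hypothetical, and the set of policies treated as "sufficiently strict" is an assumption that should be checked against the prefetch specification. The function expects an already-resolved (non-empty) policy name.

```javascript
// Assumption: policies treated as "sufficiently strict" for cross-site prefetch.
// The specification is authoritative for the actual list.
const SUFFICIENTLY_STRICT = new Set([
  "strict-origin-when-cross-origin",
  "strict-origin",
  "same-origin",
  "no-referrer",
]);

// Hypothetical helper: would a prefetch with this explicit policy be attempted?
function wouldAttemptPrefetch(referrerPolicy, isCrossSite) {
  // Same-site: the explicit policy is simply the one that gets used.
  if (!isCrossSite) return true;
  // Cross-site: a lax explicit policy causes the attempt to be ignored.
  return SUFFICIENTLY_STRICT.has(referrerPolicy);
}

// A rule that a site with a lax default policy might use to opt in to
// cross-site prefetching with a strict policy for just that request.
const rules = {
  prefetch: [
    {
      source: "list",
      urls: ["https://other.example/article"], // hypothetical URL
      referrer_policy: "no-referrer",
    },
  ],
};
```

Under these assumptions, `wouldAttemptPrefetch("unsafe-url", true)` is false, which corresponds to the "ignored attempt" case that is surfaced in DevTools.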

Comment by @jeremyroman Mar 8, 2023 (See Github)

Hi again, and thanks for the feedback.

I assume that you're referring to the "where" condition syntax specifically<sup>1</sup>. For further context on that, we did previously revise it to be more general (see #160, #177) in response to https://github.com/w3ctag/design-reviews/issues/721#issuecomment-1101600240. I did look at options leaning more heavily on CSS (and have taken another look now), but I still don't think it is a great fit here.

Speculation rules provide page authors with the ability to determine, across the page, what sorts of links are suitable to preload and what sorts are not. When that's determined by URL it looks a little bit like Content Security Policy or service worker scopes; when it's determined by page structure, CSS selectors are a useful tool. I actually would tend to expect the former to be more common, and so I'd like for it to have good ergonomics. I'm concerned putting it inside CSS syntax might hurt ergonomics in the URL pattern case<sup>2</sup>. The existing case of explicit URLs (especially for navigation other than to existing links) is not solved by a pure CSS solution.

For those speculation rules that do target links, I definitely want to be reactive to changes in document structure in the same way that CSS is – which is why the current proposal does use the selector syntax to describe that structure and Chromium's implementation relies on the style engine for invalidation. More deeply integrating this concept (related to navigation speculation) into CSS implementations and specifications seems from our experience more likely to increase the coupling with, and thus burden to, CSS implementations and specifications.

Fundamentally, this control over preloading doesn't seem presentational to me. Even though it leverages the structure and semantics of the document using selectors, it doesn't affect the appearance (or aural output, etc) of the page. The precedent we most had in mind when creating this was import maps, which also have wide-ranging effects on a page, via a JSON specifier syntax embedded in a <script> element. The JavaScript querySelectorAll API is another example of leveraging the CSS selector syntax to structurally match the document without being presentational in nature.

<hr>

<sup>1</sup> The JSON syntax generally has been previously discussed, but import maps (which are similarly declarative about behavior on the page) are comparable.

<sup>2</sup> I tried to expand on the idea of adding a :link-href pseudo-class which would do URL pattern matching on the href (after accounting for the document's base URL) of elements which match :any-link.

Proposed syntax:

{"and": [
  {"href_matches": "/*\\?*", "relative_to": "document"},
  {"not": {"href_matches": "/logout?*", "relative_to": "document"}},
  {"not": {"selector_matches": ".no-prefetch *"}}]}

CSS selector:

:link-href("/*\\?*" relative-to document):not(:link-href("/logout?*" relative-to document), .no-prefetch *)

CSS selector (quoted):

":link-href(\"/*\\\\?*\" relative-to document):not(:link-href(\"/logout?*\" relative-to document), .no-prefetch *)"

It would be possible, but a little more awkward yet, to permit the dictionary-style (rather than shorthand) URLPattern construction, since there isn't precedent for embedding a dictionary inside a pseudo-class (existing ones generally take very few parameters).

While I could conceive of this being useful in style (e.g., to automatically style cross-origin or insecure links differently), I haven't previously heard demand for this in the CSS ecosystem. I think including this in CSS might actually increase, rather than reduce, the work required to specify, implement and test.
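To make the structure of the proposed "where" syntax concrete, here is a minimal sketch of how such a condition tree could be evaluated. The function `matchesCondition` and the `matchers` callbacks (`hrefMatches`, `selectorMatches`) are hypothetical stand-ins: a browser would use URL pattern matching and its selector engine instead, and the `relative_to` key is simply passed over here. The `"or"` combinator is assumed to exist alongside `"and"` and `"not"`.

```javascript
// Recursively evaluates a "where" condition against a link.
// `matchers` supplies hypothetical callbacks:
//   hrefMatches(pattern, link) and selectorMatches(selector, link).
function matchesCondition(condition, link, matchers) {
  if ("and" in condition) {
    return condition.and.every((c) => matchesCondition(c, link, matchers));
  }
  if ("or" in condition) {
    return condition.or.some((c) => matchesCondition(c, link, matchers));
  }
  if ("not" in condition) {
    return !matchesCondition(condition.not, link, matchers);
  }
  if ("href_matches" in condition) {
    return matchers.hrefMatches(condition.href_matches, link);
  }
  if ("selector_matches" in condition) {
    return matchers.selectorMatches(condition.selector_matches, link);
  }
  throw new TypeError("Unknown condition: " + JSON.stringify(condition));
}

// The condition from the "Proposed syntax" example, as a JavaScript object.
const where = {
  and: [
    { href_matches: "/*\\?*", relative_to: "document" },
    { not: { href_matches: "/logout?*", relative_to: "document" } },
    { not: { selector_matches: ".no-prefetch *" } },
  ],
};
```

The combinators carry the boolean structure, while the two leaf predicates delegate to existing matching machinery, which is the split the JSON and CSS variants above express differently.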

Comment by @toyoshim Mar 9, 2023 (See Github)

Hi. We have delta updates on how the speculation rules should interact with Content Security Policy.

Explainer: https://github.com/WICG/nav-speculation/blob/main/triggers.md#content-security-policy

We added Content Security Policy section to clarify how the speculation rules interact with existing Content Security Policy, and explain the new source keyword "inline-speculation-rules".

We also added Content Security Policy section to the speculation rules spec, in order to explain the motivation and to show spec patches for Content Security Policy. Spec (diff): https://storage.googleapis.com/spec-previews/WICG/nav-speculation/pull/245/diff/speculation-rules.html

Tests:

Chrome Status: https://chromestatus.com/feature/5182859125456896

In short, we clarify how the speculation rules are handled in CSP, and provide a new source keyword to permit safe inline speculation rules without allowing unsafe inline script under the strict CSP environment. Here is an example use.

<meta http-equiv="Content-Security-Policy" content="script-src 'inline-speculation-rules'">

<!-- this just works!! -->
<script type="speculationrules">
...
</script>

<!-- this causes a CSP violation -->
<script>
console.log('hello.');
</script>
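The decision illustrated by this example can be sketched as a small function. This is a deliberate simplification under stated assumptions: the name `inlineScriptAllowed` is hypothetical, nonces, hashes and other Content Security Policy machinery are ignored, and it is assumed that `'unsafe-inline'` also permits inline speculation rules.

```javascript
// Simplified sketch of the inline check for a script-src source list.
// Ignores nonces, hashes, strict-dynamic, and every other CSP detail.
function inlineScriptAllowed(scriptType, scriptSrcSources) {
  const sources = new Set(scriptSrcSources);
  // Assumption: 'unsafe-inline' permits every kind of inline script element.
  if (sources.has("'unsafe-inline'")) return true;
  // The new keyword permits only inline speculation rules.
  return (
    scriptType === "speculationrules" &&
    sources.has("'inline-speculation-rules'")
  );
}
```

With the policy from the example, the `speculationrules` block is allowed while the classic inline script is not.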

Discussed Apr 1, 2023 (See Github)

We left several comments on the review with things we spotted, including:

Hi,

@cynthia, @LeaVerou, and @hober took a look at this during our Tokyo F2F today.

We are sympathetic to the requirements this sets out to fulfill. The complexity of [document rules](https://github.com/WICG/nav-speculation/blob/main/triggers.md#document-rules) is concerning. While having solutions that can cover the whole spectrum of use-cases is nice, significant added complexity will have adverse effects on adoption - and whenever possible we value [simpler solutions](https://w3ctag.github.io/design-principles/#simplicity) that an average developer could easily understand and make use of.

If you could propose a simpler approach that could cover say, [80% of the use cases](https://lists.w3.org/Archives/Public/public-html/2007Aug/0495.html) as an alternative - we would love to see this. One example that came up in our discussion was an attribute on `<a>` elements instead of an entirely separate technology. After all, more complex approaches can always be added later, if the need arises.

One bit about eagerness - it would be useful to state (maybe not normatively?) that ideally implementations should provide a way for the users to set their prefetch preferences, and user preferences should be treated as higher priority than the page-declared eagerness preference - in particular in low-data/bandwidth scenarios.

Discussed Apr 10, 2023 (See Github)

Rossen: we need to move this forward in a focussed way.

Dan: I assigned Lea. Let's put this on the agenda for next week and tackle it with Lea and Tess, and if possible Rossen (remote), in a breakout at the f2f.

Comment by @hober Apr 20, 2023 (See Github)

The explainer says:

Currently, like import maps, script tags are only used for specifying speculation rules inline; future extensions may allow a src attribute to load external rule sets.

But then you allow external rule sets to be loaded with a Speculation-Rules HTTP header.

This seems inconsistent. If external rule sets are to be discouraged, why have the HTTP header? If they aren't to be discouraged, why not support linking to them in markup?

(Several minutes after writing the above, having gotten farther down the explainer document, I found this text which goes into this a bit. Maybe link to this directly so that people who wonder about this inconsistency can click to the rationale?)

Comment by @cynthia Apr 20, 2023 (See Github)

Hi,

@cynthia, @LeaVerou, and @hober took a look at this during our Tokyo F2F today.

We are sympathetic to the requirements this sets out to fulfill. The complexity of document rules is concerning. While having solutions that can cover the whole spectrum of use-cases is nice, significant added complexity will have adverse effects on adoption - and whenever possible we value simpler solutions that an average developer could easily understand and make use of.

If you could propose a simpler approach that could cover say, 80% of the use cases as an alternative - we would love to see this. One example that came up in our discussion was an attribute on <a> elements instead of an entirely separate technology. After all, more complex approaches can always be added later, if the need arises.

One bit about eagerness - it would be useful to state (maybe not normatively?) that ideally implementations should provide a way for the users to set their prefetch preferences, and user preferences should be treated as higher priority than the page-declared eagerness preference - in particular in low-data/bandwidth scenarios.

Comment by @jeremyroman Apr 24, 2023 (See Github)

Thank you for taking a look. Responses below (not necessarily in order):

This seems inconsistent. If external rule sets are to be discouraged, why have the HTTP header? If they aren't to be discouraged, why not support linking to them in markup? (Several minutes after writing the above, having gotten farther down the explainer document, I found this text which goes into this a bit. Maybe link to this directly so that people who wonder about this inconsistency can click to the rationale?)

It isn't discouraged; it simply isn't supported yet. The rationale is linked from the words "future extensions" in the explainer excerpt originally quoted. I can certainly rephrase that sentence if you think the link could be made more apparent, or it could be otherwise clarified.

it would be useful to state (maybe not normatively?) that ideally implementations should provide a way for the users to set their prefetch preferences, and user preferences should be treated as higher priority than the page-declared eagerness preference

I agree; the specification says so in two places right now, here in the context of eagerness:

"eager" The author believes this is very likely to be worthwhile. User agents should usually enact the candidate, subject only to considerations such as user preferences, device conditions, and resource limits.

and here in the context of privacy:

While efforts have been made to minimize the privacy impact of prefetching, some users may nonetheless prefer that prefetching not occur, even though this may make loading slower. User agents are encouraged to provide a setting to disable prefetching features to accommodate such users.

I've added additional normative text to be more explicit.

If you could propose a simpler approach that could cover say, 80% of the use cases as an alternative - we would love to see this. One example that came up in our discussion was an attribute on <a> elements instead of an entirely separate technology. After all, more complex approaches can always be added later, if the need arises.

Would such a feature be useful for some authors? I do agree it might be, if only because it's very easy to explain. I think it would be quite straightforward to add as a "shorthand".

I'm less confident that it is sufficient to address 80% of use cases well, though, for a few reasons.

Firstly, while adopting such a feature for a single link would be very easy, updating many code paths which emit <a> tags, or existing static content which includes <a> tags, would be extremely tedious for many authors, both large and small. While this can to some extent be done dynamically with script, this is significant work (and there are some non-obvious edge cases), and browsers already have much of the infrastructure to do this matching efficiently. Similarly (though more extremely), an author library which replaced CSS rules and cascade with explicit assignment of styles to individual elements would be possible but tricky to get right and harder yet to make perform well.

Secondly, I expect that many authors would find this technology easier to adopt with support from a service provider or product, such as a CDN, hosted CMS, or application proxy. They are well-positioned to reason about the side effects that requests to particular URLs may have, but not necessarily in a position to modify documents, especially their dynamic content.

For example, a hosted CMS might know that particular URLs simply fetch a blog post or product detail page without side effects. A CDN might be configured to allow prefetching only cached HTML resources, so any same-origin link can be prefetched safely (since both hitting the cache and rejecting the request cannot have side effects on the origin server). These providers can then provide a turnkey solution to their customers, in a way that would be much more difficult if they had to modify each link in the DOM.
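As a rough sketch of the provider scenario described above, a CDN or hosted CMS could generate one document rule from the URL patterns it knows to be free of side effects, without touching individual links. The helper name `buildPrefetchRuleSet` and the example patterns are hypothetical, and the `"or"` combinator is assumed to be available in the condition syntax.

```javascript
// Hypothetical helper a service provider might run when serving a page.
// safePatterns: URL patterns known to have no side effects.
// excludedPatterns: URL patterns that must never be prefetched.
function buildPrefetchRuleSet(safePatterns, excludedPatterns = []) {
  const where = {
    and: [
      { or: safePatterns.map((p) => ({ href_matches: p })) },
      ...excludedPatterns.map((p) => ({ not: { href_matches: p } })),
    ],
  };
  return { prefetch: [{ source: "document", where }] };
}

// Example patterns are illustrative only.
const ruleSet = buildPrefetchRuleSet(["/blog/*", "/products/*"], ["/cart/*"]);

// Text the provider could place inside <script type="speculationrules">.
const scriptBody = JSON.stringify(ruleSet);
```

The point of the sketch is that the rule set is produced once per page (or per site) from knowledge about URLs, rather than by rewriting each `<a>` element in the DOM.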

Discussed May 15, 2023 (See Github)

Peter: looks like we got some feedback.

Dan: aside - we push back a lot about developer complexity. But we don't really have it documented as a design principle. Should we?

Lea: we do have something... consider tradeoffs? Prefer simple solutions

Hadley: and a reference in EWP, about keeping it possible for everyone to make a webpage, not needing a team of hundreds.

Lea: it's about the tradeoff, developer complexity needs to be justified by the benefit. In this case, when we looked at the big picture, they're adding this complex feature and a different syntax, and the benefit seems rather small - automating something you can do with js anyway

Dan: even Prefer Simple Solutions, it doesn't lay it out as the issue is when you add developer complexity you're giving developers a lot more work to do. The point about is it justified. Maybe there's just an addition to 2.1

Lea: something that clarifies, talks about tradeoffs.

Peter: agree it could be stronger, this keeps happening

Lea: what to reply here? I don't think it's justified... Adding an entire block of json with its own syntax with a filter scheme that combines url patterns and css selectors... I'm not sure it's worth it. Neither did Sangwhan or Tess when we looked at it

Dan: It would be one thing if this was the only way to do it and it had support from additional implementers. Given the lack of support for additional implementers... already an issue with multistakeholder

Amy: We could say "please go find more stakeholders and work together to make this less complex"

Peter: It's just in WICG right now...

Lea: I think the bar for adding a new type of json blob that gets added to html, the bar for that should be pretty high. Compared to adding an html attribute or a css property. We've done this for importmaps which is way more of an important feature than this. I could argue that this is also in the same category as linking to a manifest, sort of extension to the web platform. The benefit should be of that magnitude. If it's not, they should work with existing web platform technologies instead of adding new ones. Add html syntax or a css property.

Lea: also it limits reusability. You have to include inline json on every page, you can't link out to it.

Peter: do we have a principle about locality of data? The further apart you keep related information the more likely it is to get out of sync

Lea: that's really good, please file an issue

Peter: this has that issue as well. By centralising the prefetch rules you're taking it away from the things that trigger these rules. As you edit the document it's more likely to get out of sync.

We understand the arguments for a separate syntax; we did not argue that this is not useful. However, we think that the increase in complexity that adding an entirely separate JSON-based syntax adds to the Web Platform should be commensurate with the benefit developers get from it. Similar cases in the past that warranted this kind of increase in complexity have been JS import maps, or PWA manifests. We don't feel the benefit developers get from this is in the same ballpark, to justify this increase in complexity. Furthermore, considering the lack of multi-stakeholder support, it seems like the resulting fragmentation could create additional developer complexity and confusion.

We would like to suggest that you work with additional stakeholders to see if you can both garner additional support and find a less complex design.