#564: Deprecating `document.domain` setter.

Opened Oct 19, 2020

Guten TAG!

I would be thrilled if y'all could peruse the following suggestion that we start seriously looking into removing the document.domain setter from the platform.

We all know that document.domain's setter is a mechanism of weakening the same-origin policy to allow same-site-but-cross-origin documents direct DOM access to each other. This is unfortunate to begin with, but particularly so in light of Spectre and related side-channel attacks that have convinced us that aligning origins to processes is essential. document.domain makes it difficult to accurately commit documents into an origin-bound process, since its level of access to same-site documents could shift at runtime. Rather than asking developers to opt-out of this via COOP/COEP, Origin-Isolated, or Feature/Permission Policy, it would be ideal to first shift to an opt-in model, and then remove the mechanism entirely after usage is sufficiently low.
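
To make the mechanism concrete, here is a minimal sketch of the access check that the setter relaxes. It is a simplified model, not browser code: the `Origin` shape and the `withDomain` helper are assumptions made for illustration, and the check follows the general shape of HTML's "same origin-domain" comparison.

```typescript
// Simplified model of an origin; `domain` stays null until document.domain is set.
type Origin = {
  scheme: string;
  host: string;
  port: number;
  domain: string | null;
};

// Strict same-origin check: scheme, host, and port must all match.
function sameOrigin(a: Origin, b: Origin): boolean {
  return a.scheme === b.scheme && a.host === b.host && a.port === b.port;
}

// Relaxed check used for synchronous DOM access between documents.
function sameOriginDomain(a: Origin, b: Origin): boolean {
  if (a.domain !== null && b.domain !== null) {
    // Both documents set document.domain: host and port are ignored.
    return a.scheme === b.scheme && a.domain === b.domain;
  }
  if (a.domain === null && b.domain === null) {
    return sameOrigin(a, b);
  }
  // Only one side set document.domain: access is denied.
  return false;
}

// Hypothetical helper standing in for `document.domain = value`.
function withDomain(origin: Origin, value: string): Origin {
  return { ...origin, domain: value };
}

const app: Origin = { scheme: "https", host: "app.example.com", port: 443, domain: null };
const cdn: Origin = { scheme: "https", host: "cdn.example.com", port: 443, domain: null };
```

In this model `sameOriginDomain(app, cdn)` is false, but it becomes true once both sides are passed through `withDomain(..., "example.com")`. That runtime shift is what makes it hard to commit a document to an origin-bound process up front.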

The proposal, then, entails the following:

  1. After a sufficient amount of communication with developers, shift the document-domain feature policy's allowlist from * to 'none'. This would break document.domain usage by default, allowing developers to opt-into it via Feature-Policy: document-domain 'src' or similar. (Here, it may be more compatible to shift from throwing an exception when the setter is disabled to simply ignoring its usage.)

  2. After driving down usage in step 1, shift to a more restrictive opt-in (enterprise policy, reverse origin trial, etc).

  3. After more usage disappears in step 2, remove the setter entirely.

  4. 🎉
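
The parenthetical in step 1 (throwing versus silently ignoring when the setter is disabled) can be sketched as follows. This is an illustrative model only: `setDocumentDomain`, `policyAllows`, and `DisabledBehavior` are hypothetical names, and the sketch skips the validation a real setter performs on the requested value.

```typescript
// How a disabled setter reacts: throw (stricter) or ignore (likely more compatible).
type DisabledBehavior = "throw" | "ignore";

// Returns the document's resulting `domain` value after an attempted set.
function setDocumentDomain(
  current: string | null,
  requested: string,
  policyAllows: boolean,
  whenDisabled: DisabledBehavior,
): string | null {
  if (policyAllows) {
    return requested;
  }
  if (whenDisabled === "throw") {
    // Legacy scripts without a try/catch would stop running here.
    throw new Error("SecurityError: document-domain is disabled by policy");
  }
  // Ignoring keeps legacy scripts running, but the origin is not relaxed.
  return current;
}
```

With `whenDisabled` set to `"ignore"`, a page that sets `document.domain` out of habit but never depends on it keeps working, which is the compatibility argument for that option.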


Further details:

  • I have reviewed the TAG's API Design Principles
  • The group where the incubation/design work on this is being done (or is intended to be done in the future): The change is small enough that it would proceed through PRs against WHATWG specifications.
  • The group where standardization of this work is intended to be done ("unknown" if not known): WHATWG for HTML, WebAppSec for Feature/Permission Policy.
  • Existing major pieces of multi-stakeholder review or discussion of this design:
  • Major unresolved issues with or opposition to this design: Usage in the wild is high, something like ~0.4% of Chrome page views.
  • This work is being funded by: Google.

We'd prefer the TAG provide feedback as (please delete all but the desired option):

💬 leave review feedback as a comment in this issue and @-notify moi.

Discussions

Comment by @domenic Oct 19, 2020 (See Github)

shift the document-domain feature policy's allowlist from * to 'none'.

Where did document-domain end up in the feature policy -> { permissions policy, document policy } split? It seems like one of those two would be more suitable for this sort of thing, but I'm never sure which one... /cc @clelland @annevk.

After more usage disappears in step 2, remove the setter entirely.

There are other options which could be slightly less breaking, e.g. making the setter a no-op via [LegacyLenientSetter], or using [Replaceable]. Removing it entirely would be the most clean, though!

Comment by @mikewest Oct 19, 2020 (See Github)

Feature Policy => {Permissions,Document}

For the purposes of this proposal, I think either can work. The opt-in mechanics would be a bit different, but I don't think there would be a substantial difference beyond that.

There are other options which could be slightly less breaking

Indeed! I should have said something like "remove the weird SOP-breaking side-effects of the setter". I can be comfortable with basically any web-facing way of accomplishing that. :)

Comment by @clelland Oct 19, 2020 (See Github)

The biggest difference is the effect on subframes; keeping it in Permissions Policy means that any page which disables it will necessarily disable it in all of its embedded frames as well (with no warning to or opt-in needed from those frames)

Switching to Document Policy would mean that each frame could (generally) set this bit independently from the others. Any requirements imposed by embedding frames would require an explicit opt-in from the embedded page.
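
A rough way to picture the difference is to resolve the same frame tree under both models. The sketch below is a deliberate simplification with assumed names (`Frame`, `ownSetting`, `resolveInherited`, `resolveIndependent`); it ignores allowlists, the `allow` attribute, and the explicit opt-in that Document Policy requires for embedder-imposed requirements.

```typescript
type Frame = {
  name: string;
  // What this frame's own response asks for; undefined means "no header sent".
  ownSetting?: boolean;
  children: Frame[];
};

// Permissions-Policy-style: a frame cannot have the feature if its parent lacks it.
function resolveInherited(
  frame: Frame,
  parentEnabled: boolean,
  out: Map<string, boolean> = new Map(),
): Map<string, boolean> {
  const enabled = parentEnabled && (frame.ownSetting ?? true);
  out.set(frame.name, enabled);
  for (const child of frame.children) {
    resolveInherited(child, enabled, out);
  }
  return out;
}

// Document-Policy-style: each frame decides for itself, falling back to a default.
function resolveIndependent(
  frame: Frame,
  defaultEnabled: boolean,
  out: Map<string, boolean> = new Map(),
): Map<string, boolean> {
  const enabled = frame.ownSetting ?? defaultEnabled;
  out.set(frame.name, enabled);
  for (const child of frame.children) {
    resolveIndependent(child, defaultEnabled, out);
  }
  return out;
}

// A top-level page that disables the feature, embedding a widget that wants it.
const page: Frame = {
  name: "top",
  ownSetting: false,
  children: [{ name: "widget", ownSetting: true, children: [] }],
};
```

Here `resolveInherited(page, true)` leaves the widget disabled because its parent disabled the feature, while `resolveIndependent(page, true)` leaves the widget enabled.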

Comment by @mikewest Oct 19, 2020 (See Github)

Given that the suggestion is to turn it off by default, the core distinction seems to be whether the top-level can opt all its dependencies in, or whether the dependencies need to opt themselves in. The former is probably easier to deploy, but if developers already have to touch top-level pages, asking them to also touch nested pages doesn't seem like a huge additional burden.

That's a long way of saying that I'm comfortable with whichever philosophical position on document-domain y'all end up taking, and will adjust the proposal's recommendation for an opt-in accordingly. It says feature-policy right now only because that's what currently works in Chromium. :)

Comment by @clelland Oct 19, 2020 (See Github)

The former is probably easier to deploy, but if developers already have to touch top-level pages, asking them to also touch nested pages doesn't seem like a huge additional burden.

This may not be true in the cross-origin case -- Permissions policy will also necessarily silently opt in cross-origin frames as well, with no action required from a third party.

That's a long way of saying that I'm comfortable with whichever philosophical position on document-domain y'all end up taking, and will adjust the proposal's recommendation for an opt-in accordingly.

👍

Comment by @annevk Oct 20, 2020 (See Github)

I keep bringing this up whenever we discuss document.domain so I kinda feel like a broken record at this point, but disabling document.domain is not what you are after. Enabling Origin Isolation by default is. In particular, see the second bullet point at https://github.com/WICG/origin-isolation#origin-isolation-explainer which I think might also apply to some other features that are not as well specified (e.g., GPUDevice comes to mind although that requires cross-origin isolated too now, but I'm pretty sure there are other process-bound objects).

Enforcing that on third parties via document policy seems reasonable to me, though I haven't thought too long about it.

Comment by @mikewest Oct 20, 2020 (See Github)

I keep bringing this up whenever we discuss document.domain so I kinda feel like a broken record at this point

I still agree with you that origin isolation is what we're looking for, and that it has implications beyond document.domain. I also still think that calling out document.domain explicitly is important, as it's the way this change is going to bite most developers in the butt. Usage of document.domain is orders of magnitude more common than usage of WASM shared memory, for example (and we ought to ensure that that's safely locked to an origin; my recollection is that folks were on board with that?).

Comment by @annevk Oct 20, 2020 (See Github)

WebAssembly.Module is not shared memory. It's just a data structure people didn't want to have to send across process boundaries. You are thinking of WebAssembly.Memory (which by relying on SharedArrayBuffer sits behind cross-origin isolated).

(There's also no sensible way to lock things to an origin without origin isolation. I explored that for a bit.)

Comment by @mikewest Oct 20, 2020 (See Github)

You are thinking of WebAssembly.Memory

You're right, apologies for the confusion. I suspect that means the usage would be even lower, and therefore more malleable, at least in theory.

Discussed Nov 23, 2020 (See Github)

Yves: the idea makes sense - but apparently there is disagreement about how to meet that goal. There is a WICG proposal ... need to check that.

... needs more time on our side.

[bumped]

Discussed Nov 30, 2020 (See Github)

Dan: we had a brief discussion last week...

Tess: reading through most recent comments... I'm cautious about the compat stuff. Some good comments from Domenic & Anne. I like swapping the ... a way to get back to the old behavior if you need it. Doesn't help sites that aren't being actively maintained. We have to be careful of breaking sites that have no plausible path to being fixed... but this seems like a reasonable way to do it if you're gonna do it.

Tess: will leave a comment.

Comment by @LeaVerou Jan 9, 2021 (See Github)

Some data that may be useful: document.domain appears to be used in ~9.4% of pages in the HTTP Archive corpus (source, filter by feature containing DocumentDomain), though it's slowly declining.

Comment by @mikewest Jan 10, 2021 (See Github)

Some data that may be useful: document.domain appears to be used in ~9.4% of pages in the HTTP Archive corpus (source, filter by feature containing DocumentDomain), though it's slowly declining.

Thanks for pointing this out! document.domain is indeed very widely set, but it typically has no effect on a page's behavior. The DocumentDomainEnabledCrossOriginAccess and DocumentDomainBlockedCrossOriginAccess features count the pages upon which document.domain changed an access control check's result from what it would have been had the setter not been used. As the spreadsheet you've cited above notes, those cases are two orders of magnitude more rare in HTTP Archive, which aligns roughly with Chrome's telemetry.
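
The distinction between "set" and "changed an access check's result" can be sketched as a comparison of the strict and relaxed checks. This is an illustrative model under assumptions, not Chromium's instrumentation: the `Origin` shape and `classifyAccess` are hypothetical, and only the two counter names come from the discussion above.

```typescript
type Origin = {
  scheme: string;
  host: string;
  port: number;
  domain: string | null; // null until document.domain is set
};

type AccessEffect =
  | "unaffected"
  | "DocumentDomainEnabledCrossOriginAccess"
  | "DocumentDomainBlockedCrossOriginAccess";

// Result if document.domain had never been used.
function sameOrigin(a: Origin, b: Origin): boolean {
  return a.scheme === b.scheme && a.host === b.host && a.port === b.port;
}

// Result that takes document.domain into account.
function sameOriginDomain(a: Origin, b: Origin): boolean {
  if (a.domain !== null && b.domain !== null) {
    return a.scheme === b.scheme && a.domain === b.domain;
  }
  if (a.domain === null && b.domain === null) {
    return sameOrigin(a, b);
  }
  return false;
}

// Only accesses whose outcome differs between the two checks are counted.
function classifyAccess(a: Origin, b: Origin): AccessEffect {
  const strict = sameOrigin(a, b);
  const relaxed = sameOriginDomain(a, b);
  if (strict === relaxed) {
    return "unaffected";
  }
  return relaxed
    ? "DocumentDomainEnabledCrossOriginAccess"
    : "DocumentDomainBlockedCrossOriginAccess";
}
```

In this model, a page that sets `document.domain` but only ever touches documents that set the same value from the same origin is classified as `"unaffected"`, which is consistent with the setter being widely used yet rarely consequential.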

I dug (though not to much depth...) into the latter set of cases to form some initial impressions. It seems that there are a handful of entities that could end up being responsible for a substantial number of those pages, which makes me hopeful that change is possible.

Discussed Jan 11, 2021 (See Github)

Mike West's response to Lea

Tess: it sounds like 3x above the "removal" threshold. If we could do this it would be great but we're skeptical for compat reasons. The numbers are promising since its usage is declining. I think Mike has done a really good job of designing the process ... This sounds like a pretty sensible way to go about it.

Dan: so what can we do with this issue?

Hadley: we should write up some kind of response "this seems risky but the approach seems good considering the risk" and propose close.

Yves: document.domain as Anne said is a way to do more isolation... isolation is done through multiple other tweaks... there is still an issue of figuring out all the ways to do isolation here and explain them to developers so they can figure it out. While I agree it would be good, we need to get a better understanding of isolation...

Dan: that could be a part of our feedback - that there should be clear documentation of how to do isolation - e.g. on MDN - for developers.

Comment by @hober Jan 27, 2021 (See Github)

As I wrote yesterday in #578, we recognize the value in moving the web in this direction, and also that it's risky from a compat standpoint. If you decide to try it, let us know how it goes & if it turns out to be doable.

Comment by @ylafon Jan 27, 2021 (See Github)

Also, if people have specific usage patterns for document.domain, it would be good to document how they could fulfil their goal in a better (i.e., more secure) way. Documenting the changes and their implications for the old default is important for migration (even if it is a tryout).

Discussed May 1, 2021 (See Github)

Set to pending external feedback since Mike hasn't gotten back to us since January. Pinged him.

Comment by @hober May 13, 2021 (See Github)

Hi @mikewest! Any updates for us on this?

Comment by @mikewest May 17, 2021 (See Github)

An update: This is still a good idea, and we've made no progress on it due to other priorities. I was hopeful we'd get started on deprecation in earnest in Q2, but Q3 is looking more likely.

Comment by @otherdaniel Sep 10, 2021 (See Github)

Hi all. Just a note that we're picking this up on the Chromium side.

The plan is largely as initially proposed here: Implement a document policy (or whatever it ends up being named) to enable/disable document.domain setting, initially being default-enabled (i.e., opt-out), and to then switch the default (to opt-in) after a bake-in period. So the capability remains but will eventually require an opt-in. This in turn would then allow us to improve origin-level isolation. Timeline is still uncertain, but we're hoping for the first step in one of the coming milestones (M96 or so), and to switch the default a few months later (provided the earlier steps go well).

Comment by @annevk Sep 13, 2021 (See Github)

Why can the existing Origin-Agent-Cluster header not be used?

Comment by @mikewest Sep 13, 2021 (See Github)

Why can the existing Origin-Agent-Cluster header not be used?

If we did that, we'd basically end up inverting the default clustering algorithm, and asking developers to opt-out by setting Origin-Agent-Cluster: ?0? I think it's a reasonable alternative to consider, as it matches the goal we have from the browser's perspective.

The plan we've been running with has been to carve off the specific web APIs that developers may be depending upon, and give them targeted messaging around those options. A document-domain policy feature seemed like a reasonably well-targeted option whose impact developers could understand easily. But, since Chromium is in the process of carving off the only other remaining sticking point (cross-origin module sharing), perhaps it does make sense to jump right to the end.

Practically, that might be a problem for Chromium's implementation, as I think the process allocation heuristics are currently tuned for Origin-Agent-Cluster hints being rare (especially on mobile); I can imagine performance impacts in the short-term if we enabled it across the board. It would be more practical in the short-term to separate the document.domain question from the process allocation question so that we can handle the one and then the other. But philosophically, I can understand the appeal of skipping the document policy in favor of the existing header.

I'd appreciate the TAG's feedback on the developer messaging, as well as input from folks at Google who have been working with developers on this (@lutzvahl and @agektmr, I think?).

Comment by @agektmr Sep 14, 2021 (See Github)

I like the idea of using Origin-Agent-Cluster as it reduces the number of headers developers need to understand. This simplifies things. The fewer things to learn, the easier for developers to adopt.

Comment by @lutzvahl Sep 16, 2021 (See Github)

I'd appreciate the TAG's feedback on the developer messaging, as well as input from folks at Google who have been working with developers on this (@lutzvahl and @agektmr, I think?).

@domenic as the driver of Origin-Agent-Cluster, WDYT? IIRC Origin-Agent-Cluster is used for performance isolation, which could be impacted by this.

Reducing the number of headers sounds reasonable, but we need to be cautious about what features we should combine. For me it seems harder to explain to developers why the Origin-Agent-Cluster header has an impact on document.domain, than adding a new one to the list...

Comment by @domenic Sep 16, 2021 (See Github)

If we use Origin-Agent-Cluster for this purpose then it will effectively become tri-state instead of boolean:

  • Origin-Agent-Cluster: ?1 means "I would really like an origin-keyed agent cluster, ideally with a separate process, because I have done measurements and that helps my site perform better, and I don't use document.domain"
  • Origin-Agent-Cluster: ?0 means "I definitely use document.domain and so I cannot be put into an origin-keyed agent cluster and need to share a process with other same-site cross-origin pages"
  • Missing header means "I don't use document.domain, so I will go into an origin-keyed agent cluster. But, it's up to the browser to apply heuristics and determine whether that means a separate process or a shared process; I have not done any measurements to figure out whether a separate process is a good tradeoff for me".
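
The three cases above can be written down as a small lookup. This is only a sketch of the proposed interpretation as described in this comment; the `Interpretation` type and `interpretOriginAgentCluster` are hypothetical names, and the sketch does not parse real header syntax beyond the three states.

```typescript
// The three states discussed: explicit ?1, explicit ?0, or no header at all.
type OacHeader = "?1" | "?0" | undefined;

type Interpretation = {
  originKeyed: boolean; // placed in an origin-keyed agent cluster
  documentDomainWorks: boolean; // whether the setter relaxes the origin
  process: "prefer-separate" | "shared" | "browser-heuristics";
};

function interpretOriginAgentCluster(header: OacHeader): Interpretation {
  switch (header) {
    case "?1":
      // Explicit request: origin-keyed, ideally with a dedicated process.
      return { originKeyed: true, documentDomainWorks: false, process: "prefer-separate" };
    case "?0":
      // Explicit opt-out: site-keyed, so document.domain keeps working.
      return { originKeyed: false, documentDomainWorks: true, process: "shared" };
    default:
      // No header: origin-keyed by default, process choice left to the browser.
      return { originKeyed: true, documentDomainWorks: false, process: "browser-heuristics" };
  }
}
```

Note that the `"?1"` and missing-header cases differ only in the `process` field, so the third state exists at the process-allocation level rather than in API availability.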

I think this is reasonable, although a bit subtle. Having a separate switch specifically for document.domain is more explicit, but it creates a 2x2 matrix where one of the entries is an error:

  • Origin-Agent-Cluster: ?0 + Document-Policy: document-domain=?0 means heuristics for process/separate agent cluster
  • Origin-Agent-Cluster: ?0 + Document-Policy: document-domain=?1 means shared process/shared agent cluster
  • Origin-Agent-Cluster: ?1 + Document-Policy: document-domain=?0 means separate process/separate agent cluster
  • Origin-Agent-Cluster: ?1 + Document-Policy: document-domain=?1 is a conflict and we'd have to pick one of those to win

(but in this 2x2 matrix version, everything is really a boolean; there is no tri-state "boolean".)
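
For comparison, the 2x2 alternative can be sketched as a function of two booleans. The `Document-Policy: document-domain` switch is the hypothetical header from this comment, and `resolveMatrix` and `MatrixOutcome` are names invented for illustration.

```typescript
type MatrixOutcome =
  | "heuristics" // browser decides on process / agent cluster
  | "shared" // shared process, shared agent cluster
  | "separate" // separate process, separate agent cluster
  | "conflict"; // contradictory signals; one header would have to win

// originAgentCluster: value of Origin-Agent-Cluster (?1 = true, ?0 = false).
// documentDomain: value of the hypothetical Document-Policy document-domain switch.
function resolveMatrix(originAgentCluster: boolean, documentDomain: boolean): MatrixOutcome {
  if (originAgentCluster) {
    return documentDomain ? "conflict" : "separate";
  }
  return documentDomain ? "shared" : "heuristics";
}
```

The `"conflict"` branch is the cost of the more explicit design: a fourth combination exists that has no coherent meaning and needs a tie-breaking rule.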

Comment by @mikewest Sep 17, 2021 (See Github)

I agree completely that Chromium, at least, would implement this as the tri-state enum you spell out, as I don't think we'll initially be capable of supporting process-isolated agent clusters for all origins, all the time, without performance impact. Over time, I expect we'd be able to make it boolean (or remove the header altogether), but that's a ways out.

I don't think, however, that the developer-facing behavior would actually be tri-state. Sites that don't send the header would see the same behavior as those that send Origin-Agent-Cluster: ?1 from the perspective of API availability (e.g. document.domain would be a no-op in both cases), and the OAC header is specified to leave the process model the header supports up to the user agent.

Comment by @annevk Sep 20, 2021 (See Github)

Yeah, I think what matters here is what is observable to web developers (when they are not performing Spectre attacks).

Comment by @domenic Sep 20, 2021 (See Github)

From the point of view of spec authors that's true. From the point of view of implementers and web developers, and very importantly from the point of view of web developer-facing documentation, I think the tri-state nature is important.

Discussed Dec 1, 2021 (See Github)

Dan: proposed text ...

Hi folks - we are looking at this at our virtual face to face and it looks like this has progressed significantly with a new (tri-state) [proposal](https://github.com/mikewest/deprecating-document-domain/). That looks reasonable though we have concerns around compat. Is this work still in flux or is it appropriate for us to re-review at this time based on this proposal?

Comment by @otherdaniel Dec 14, 2021 (See Github)

Update from Chromium-Land: We're proceeding with the Origin-Agent-Cluster-based approach suggested in this thread.

Comment by @torgo Jan 30, 2022 (See Github)

Hi @otherdaniel is there any further status you can share?

Comment by @otherdaniel Feb 1, 2022 (See Github)

Hi @otherdaniel is there any further status you can share?

Yes, indeed. We're proceeding with the deprecation, based on flipping the default for Origin-Agent-Cluster to be default-on (rather than default-off), as suggested above.

Current planning is to start with a console warning in Chromium M100, and to flip the default in M106. There have been concerns about backwards compatibility, so we'll try harder to educate users, particularly large-scale users of document.domain setting. We'll re-evaluate the situation in the M106 timeframe, and will make a final decision then, based on observed usage.

Comment by @torgo Mar 23, 2022 (See Github)

Hi folks - we are just circling back to this at our hybrid F2F. Thanks for taking our feedback on board. We look forward to seeing the results of your experimentation and we would like to again encourage you to produce educational materials aimed at web developers (for example, via MDN) that can help them adapt to this change.

Comment by @yaseenkk Jul 13, 2022 (See Github)

Earlier, the Chromium browser displayed an error message saying that "document.domain" is going to be disabled by default in M106. Now I cannot see this message. Has the change to default to origin-agent-cluster: ?1 been postponed? Could you please update us on when origin-agent-cluster: ?1 will become the default, and in which browser version?

Comment by @otherdaniel Jul 13, 2022 (See Github)

Earlier, the Chromium browser displayed an error message saying that "document.domain" is going to be disabled by default in M106. Now I cannot see this message. Has the change to default to origin-agent-cluster: ?1 been postponed? Could you please update us on when origin-agent-cluster: ?1 will become the default, and in which browser version?

For Chromium, the deprecation of document.domain, aka defaulting origin-agent-cluster: to ?1, is scheduled for M109, slightly postponed from M106. However, the warning/"issue" in the DevTools issues panel should be active. I'm surprised the message would have disappeared. It should occur whenever document.domain is set, as well as when an access based on a modified document.domain is made.

This is scheduled for M109, but note that Blink/Chromium "API owners" will have the final call of whether and when to launch this.
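
For developers who want to check what a given page will experience, one option is to read `window.originAgentCluster`, which reports whether the window is in an origin-keyed agent cluster. The sketch below is a minimal, hedged example: `WindowLike` and `documentDomainSetterStatus` are names made up for illustration, and the structural type exists only so the function can run outside a browser.

```typescript
// Minimal structural type; in a browser, pass `window` itself.
type WindowLike = { originAgentCluster?: boolean };

type SetterStatus = "no-op" | "effective" | "unknown";

function documentDomainSetterStatus(win: WindowLike): SetterStatus {
  if (win.originAgentCluster === undefined) {
    // Browser does not expose the property; nothing can be inferred.
    return "unknown";
  }
  // In an origin-keyed agent cluster, setting document.domain has no effect.
  return win.originAgentCluster ? "no-op" : "effective";
}
```

A page that still depends on the setter and sees `"no-op"` would need to send `Origin-Agent-Cluster: ?0`, as discussed earlier in this thread, to keep the old behavior.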

Reference: https://groups.google.com/a/chromium.org/g/blink-dev/c/_oRc19PjpFo