design-reviews#1130: Incubation: An `Origin` Object

Opened Aug 5, 2025

Explainer

https://github.com/mikewest/origin-api/

The explainer

Includes the information requested by the Explainer Explainer.
Follows the Web Platform Design Principles.
Includes or links to answers to the Security/Privacy Questionnaire.
Describes user research you did to validate the problem and/or design.

Where and by whom is the work being done?

GitHub repo: https://github.com/mikewest/origin-api/
Primary contacts:
- @mikewest, Google, Chrome
Organization/project driving the design: Chrome.
This work is being funded by: Google.
Incubation and standards groups that have discussed the design:
- Nada.
Standards group(s) that you expect to discuss and/or adopt this work when it's ready: HTML @ WHATWG

Feedback so far

Multi-stakeholder feedback:
- Chromium comments: I like it. @domenic didn't hate it.
- Mozilla comments: https://github.com/mozilla/standards-positions/issues/1280
- WebKit comments: https://github.com/WebKit/standards-positions/issues/538
- Some conversation around https://github.com/whatwg/urlpattern/issues/275
Major unresolved issues with or opposition to this design:
- @annevk noted in the URLPattern thread linked directly above that the specific case of postMessage() validation could be satisfied with a narrower matching API that encouraged developers to think about more than the origin, which is a reasonable suggestion.

You should also know that...

There's some relationship to @annevk's https://github.com/whatwg/url/pull/288, though I think that aims to solve a distinct problem.
This would be, I think, the first place we'd directly expose the "same-site" concept in a way that enabled comparison.
This proposal derives a "site" from an origin (a la HTML's "obtain a site" and "same site" definitions), and exposes it as a property of that concept. It could also be reasonable to expose it through the aforementioned URLHost proposal, or more directly on a URL. IMO, none of those are mutually exclusive, and I can see reasonable arguments for several of them (URLHost, for instance, seems particularly well-suited to explain the "schemelessly same site" concept,

Track conversations at https://tag-github-bot.w3.org/gh/w3ctag/design-reviews/1130

Discussions

Discussed Aug 11, 2025 (See Github)

Hadley: Sarven and Dan not here so skip it.

Discussed Aug 18, 2025 (See Github)

Dan: Some of the motivation is to handle opaque URLs. They should flesh out that motivation, and possibly add a targeted API for that.

Matthew: API shape suggestion is interesting. Christian said it's not useful enough to add to platform.

Dan to draft a comment with Christian.

Discussed Aug 25, 2025 (See Github)

Lola: In the docs CG, we experimented with explainer reviews. Consensus on another proposal for Origin objects was "why?"

Dan: I reviewed this, and asked why. Also comments on considered alternatives.

Christian: I like the response.

Lola: Do we need to wait for Sarven?

Dan: Think it's just Christian and me.

Lola: If Christian's happy, I think that's fine.

Dan: Ok, I'll post.

Comment by @dandclark Aug 27, 2025 (See Github)

Thanks @mikewest for sharing this proposal! The TAG looked at this and had a few questions.

In motivating the proposal, is there more that you can share about the practical developer need for UAs supporting this natively, vs encouraging use of a library? The explainer cites PMForce: Systematically Analyzing postMessage Handlers at Scale as evidence of need, but in that paper the authors suggest that this is perhaps less of an issue than prior research had suggested:

While an investigation of how we can adapt the postMessage API to make it secure and usable for developers in this work would not do it justice, we nonetheless want to highlight two observations that we hope to be influential for developers and standard authorities alike. First, most of the handlers that protect sensitive behavior implement origin checks correctly. Second, postMessage relays can undermine the security of correct origin checks. ... Contrary to previous analyses, we show that the majority of origin checks protecting sensitive behavior are implemented correctly; thus, no longer allow an attacker to bypass them.

That said, they did find instances of flawed Origin checks, so it'd be reasonable to say there's still an issue to be solved here. Is the argument that by baking it into the platform, developers are more likely to reach for the built-in solution rather than manually writing the check, even if libraries are available?
As an alternative to consider, what about adding isSameOrigin(), isSameSite(), and maybe originToString() to URL instead of introducing a new Origin type? This does not preclude adding OriginPattern, which could take URLs, if there's a need for it.
- Something that alternative approach might leave out is handling for opaque origins. But if this is a critical part of the proposal, it would be helpful to have the explainer dig into that a bit more, discussing scenarios where it would be useful to compare opaque origins, and what that might look like in example code.
The section An Origin Object mentions Origin will have "a stringifier named serialization" but that isn't shown -- should that instead say toString()?

Comment by @mikewest Aug 27, 2025 (See Github)

Thanks for your feedback!

Thanks @mikewest for sharing this proposal! The TAG looked at this and had a few questions.

In motivating the proposal, is there more that you can share about the practical developer need for UAs supporting this natively, vs encouraging use of a library? The explainer cites PMForce: Systematically Analyzing postMessage Handlers at Scale as evidence of need, but in that paper the authors suggest that this is perhaps less of an issue than prior research had suggested:

...

That said, they did find instances of flawed Origin checks, so it'd be reasonable to say there's still an issue to be solved here. Is the argument that by baking it into the platform, developers are more likely to reach for the built-in solution rather than manually writing the check, even if libraries are available?

The "same-origin" check is trivial to implement in userland (modulo opaque origins which are noted below). Folks make mistakes, but I don't think developers require assistance, as the string serialization is non-lossy for tuple origins. Having an object representing the actual comparison (e.g. isSameOrigin()) is philosophically appealing, but not necessary.
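A minimal sketch of the userland same-origin check described above, assuming both inputs are absolute URL strings. The helper name `sameOriginByString` is hypothetical, and refusing to match when either side serializes to "null" is an assumption about how a cautious developer might treat opaque origins, not something the proposal specifies.

```javascript
// Hypothetical helper: compares two URLs by their serialized origins.
function sameOriginByString(a, b) {
  const originA = new URL(a).origin;
  const originB = new URL(b).origin;
  // Every opaque origin serializes to "null", so the string cannot tell two
  // opaque origins apart. Assumption: treat them as never matching.
  if (originA === "null" || originB === "null") return false;
  return originA === originB;
}

// Default ports are dropped during serialization, so these match.
console.log(sameOriginByString("https://example.com/a", "https://example.com:443/b")); // true
// Different schemes mean different origins.
console.log(sameOriginByString("https://example.com/", "http://example.com/")); // false
```

For tuple origins the string comparison loses nothing, which is why this check is easy to get right without platform help.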

isSameSite(), on the other hand, is ~impossible for something other than the browser to correctly represent to developers, given the delta between the PSL shipping in the user agent in which code is executing and the PSL shipping in the library the developer updated a year ago (or, if they're totally on top of things: today!).
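To make the Public Suffix List point concrete, here is a rough sketch of two userland "site" guesses. Everything in it is illustrative: `naiveSite`, `snapshotSite`, and the three-entry `SNAPSHOT_SUFFIXES` set are hypothetical names, and the set is a deliberately tiny stand-in for whatever PSL copy a library happened to bundle. The result depends entirely on that copy, which can differ from the list in the user's browser.

```javascript
// Naive guess: scheme plus the last two host labels, with no suffix list at all.
function naiveSite(url) {
  const { protocol, hostname } = new URL(url);
  return `${protocol}//${hostname.split(".").slice(-2).join(".")}`;
}

// Hypothetical, deliberately tiny stand-in for a bundled suffix-list snapshot.
const SNAPSHOT_SUFFIXES = new Set(["com", "io", "github.io"]);

// Guess using the snapshot: longest matching suffix plus one label to its left.
function snapshotSite(url) {
  const { protocol, hostname } = new URL(url);
  const labels = hostname.split(".");
  for (let i = 0; i < labels.length; i++) {
    if (SNAPSHOT_SUFFIXES.has(labels.slice(i).join("."))) {
      const start = Math.max(i - 1, 0);
      return `${protocol}//${labels.slice(start).join(".")}`;
    }
  }
  // No known suffix: fall back to the full host.
  return `${protocol}//${hostname}`;
}

const a = "https://danclark.github.io/";
const b = "https://lola.github.io/";
console.log(naiveSite(a) === naiveSite(b)); // true (wrongly treated as one site)
console.log(snapshotSite(a) === snapshotSite(b)); // false
```

Removing "github.io" from the snapshot flips the second answer, which is the kind of drift between a library's list and the browser's list that the comment above describes.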

These marginally more complex checks are also the places in which developers tend to make the clearest errors (e.g. const regex = /^https:\/\/.*.example.com$/;, which looks right but matches "https://maliciousexample.com"). It's very easy to get these checks wrong. It would be ideal to pave a path for developers' common needs that was easier to hold correctly.
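The regex pitfall can be reproduced directly. The sketch below runs the pattern from the comment and then shows one possible stricter check based on parsing; `isExampleComOrSubdomain` is a hypothetical helper, and it is a plain domain-suffix match, not a same-site check.

```javascript
// The pattern from the comment: the unescaped "." before "example" matches any character.
const looseRegex = /^https:\/\/.*.example.com$/;
console.log(looseRegex.test("https://www.example.com")); // true
console.log(looseRegex.test("https://maliciousexample.com")); // true

// Hypothetical stricter check: parse first, then compare scheme and host labels.
// Note that it ignores the port, so "https://example.com:8443" also passes.
function isExampleComOrSubdomain(origin) {
  let url;
  try {
    url = new URL(origin);
  } catch {
    return false; // Not a parseable URL (this includes the string "null").
  }
  if (url.protocol !== "https:") return false;
  return url.hostname === "example.com" || url.hostname.endsWith(".example.com");
}

console.log(isExampleComOrSubdomain("https://www.example.com")); // true
console.log(isExampleComOrSubdomain("https://maliciousexample.com")); // false
```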

As an alternative to consider, what about adding isSameOrigin(), isSameSite(), and maybe originToString() to URL instead of introducing a new Origin type? This does not preclude adding OriginPattern, which could take URLs, if there's a need for it.

I think adding URL.isSameOrigin() and URL.isSameSite() are quite reasonable to consider, perhaps also as part of @annevk's URLHost linked above for "schemelessly same-site" checks. I'm not sure I understand what originToString() would do, given URL.origin. That would handle a pretty substantial subset of what developers need.

Something that alternative approach might leave out is handling for opaque origins. But if this is a critical part of the proposal, it would be helpful to have the explainer dig into that a bit more, discussing scenarios where it would be useful to compare opaque origins, and what that might look like in example code.

The second paragraph of https://github.com/mikewest/origin-api/blob/main/README.md#what-about-existing-origin-getters outlines the value here. I can add more detail there if it's helpful, but the core point is that all <iframe sandbox> documents (and file: and data: and etc.) look the same with regard to a MessageEvent.origin check because the serialization to "null" is lossy. The sharp edge is a frame that's navigated between calls to postMessage(); it's still an opaque origin, so it still serializes to null, but it's not the same opaque origin.

Similarly, <iframe srcdoc="<iframe sandbox>" sandbox> and various other permutations of opened windows and frames and etc. will have trouble ensuring that the child frame and the parent frame are actually "same origin" if they're only working with a string.
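The lossy serialization is easy to observe with the existing URL API. In this sketch, `URL.origin` on two unrelated data: URLs stands in for the string a handler would see in `MessageEvent.origin`. That substitution is an assumption made so the example runs outside a browser.

```javascript
// Two unrelated data: URLs, each of which would get its own opaque origin.
const first = new URL("data:text/html,<p>one</p>").origin;
const second = new URL("data:text/html,<p>two</p>").origin;

// Both serialize to the string "null".
console.log(first, second); // null null

// A string comparison therefore reports a match, even though the
// underlying opaque origins are distinct.
console.log(first === second); // true
```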

The section An Origin Object mentions Origin will have "a stringifier named serialization" but that isn't shown -- should that instead say toString()?

Thanks for pointing that out. It should read "named toJSON()." This is defined in the spec at https://mikewest.github.io/origin-api/#Origin-stringification-behavior. I updated the explainer in https://github.com/mikewest/origin-api/commit/0261efa9fc6bb77bec2b21935d08839c51ce7f88.

Discussed Sep 1, 2025 (See Github)

Dan: Been looking at this; responded about a week ago. Contention was whether this should be part of the platform, or a library. The response was same-site checks are hard. I find this somewhat persuasive. I know Christian was not as keen - want to get his thoughts. I'm starting to come around to this. The interesting part is about opaque origins, and whether they are the same or not. Mike's proposal is maybe not complete there, because in order to construct the Origin objects you can only really do that from a serialized opaque origin, so that's not telling the full story, but the explainer touches on this. It would be useful to flesh this out.

... I have the makings of a response; would like Christian's and Sarven's take. I'm leaning towards 'satisfied' though.

Lola: Same site difficulties?

Dan: The notion of same site is not baked into the platform. You can take the path out, and compare the origins and try it that way. There are pitfalls. But 'is same site' is a different kind of check that depends on public info. You could have danclark.github.io and lola.github.io - those are not same site, but in order to know that github.io is to be treated as an origin but not a site, there's some context there. Sometimes subdomains should be treated as an origin and sometimes not. There are suffix lists that compile this info. The browser has some info on this but doesn't expose it via a platform API.

Sarven: The way you're expressing this makes sense.

Lola: So Sarven is +1 and we need Christian's input.

Matthew: Are 'multi-tenanted' sites (sub-paths) out of scope?

Dan: Yes. Main use case is, I receive a message from an iframe, is it from an origin I trust? URL object and URLPattern would be used for those.

Discussed Sep 8, 2025 (See Github)

Sarven: If an additional review is needed, I can do one; otherwise Dan seems to have covered a lot of ground already.

Lola: Dan's last comment was yesterday.

Comment by @mikewest Sep 28, 2025 (See Github)

Friendly ping. Did y'all have additional feedback or thoughts?

Discussed Sep 29, 2025 (See Github)

Jeffrey: That looks good. Dan to post?

Comment by @dandclark Sep 30, 2025 (See Github)

Thanks @mikewest for the ping. The TAG is in favor of the direction here, with the following additional thoughts to consider.

The point is well-taken that same-site is hard/impossible to get exactly right outside of the browser, so it seems reasonable to expose it. We'd strongly recommend considering the addition of isSameOrigin() and isSameSite() to URL as a simpler possible alternative approach that still achieves that goal. The main question is whether this alternative gives up too much by not improving handling of opaque URLs.

To that end, as a next step in developing the proposal, following through on the line of thought explored in https://github.com/mikewest/origin-api/blob/main/README.md#what-about-existing-origin-getters seems important. We understand the problem with the status quo of all opaque origins serializing to "null", but as the explainer points out, it's important that the platform be able to give out Origin objects directly. If the only way they can be created is from the serialized origin strings currently available from the platform, their usefulness is limited:

let previousSender = null;
window.addEventListener('message', e => {
  const sender = Origin.parse(e.origin);

  // Since we're always minting new Origin objects, this check will
  // fail if two messages in a row are received from an opaque origin even if it's
  // the same one. Information about whether they're the same was lost since
  // the string passed to `Origin.parse` is always just "null" if opaque.
  if (previousSender && sender.isSameOrigin(previousSender)) {
    console.log("Got two messages in a row from the same sender");
  }

  previousSender = sender;
});

This is much safer than resolving to a "null" === "null" check that'd always be true, but the goal of "allow[ing] for isSameOrigin() comparisons that developers could use to establish a sender's consistency over time" is not fully realized here.
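The difference between the two behaviours can be sketched with a stand-in class. `SketchOrigin` below is hypothetical and is not the proposed API. It only assumes, as the comment above does, that an opaque origin is same-origin solely with itself and that parsing the string "null" always mints a fresh opaque origin.

```javascript
// Hypothetical stand-in for the proposed Origin object; not the real API.
class SketchOrigin {
  constructor(serialized) {
    // A tuple origin keeps its serialization. An opaque origin keeps nothing
    // comparable, only its own object identity.
    this.tuple = serialized === "null" ? null : new URL(serialized).origin;
  }

  static parse(serialized) {
    return new SketchOrigin(serialized);
  }

  isSameOrigin(other) {
    if (this.tuple === null || other.tuple === null) {
      // Assumption: opaque origins are same-origin only with themselves.
      return this === other;
    }
    return this.tuple === other.tuple;
  }
}

const firstSender = SketchOrigin.parse("null");
const secondSender = SketchOrigin.parse("null");

console.log("null" === "null"); // true (the string check always passes)
console.log(firstSender.isSameOrigin(secondSender)); // false (two fresh opaque origins)
console.log(firstSender.isSameOrigin(firstSender)); // true (same object)
```

Under these assumptions the object-based check fails closed, but it can never report that two messages came from the same opaque sender unless the platform hands out the same object both times.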

So it'd be good to flesh out the details of what else would be needed. At a minimum it seems like we'd need an Origin getter on the message event object, and a way to get the page's current Origin, equivalent to window.origin but returning the object. Are there other places that a getter for the new Origin object would be needed? How will these be named to distinguish them from the string-based getters?

Comment by @marcoscaceres Oct 2, 2025 (See Github)

Just noting that URL.parse(e.origin) is a thing.

Probably redundant, but the method could take a URL object too, so as to avoid serialization to string when the methods are called.

Comment by @marcoscaceres Oct 2, 2025 (See Github)

And generally, just speaking out loud, I wonder if the API should be part of the URL API?

Comment by @mikewest Oct 2, 2025 (See Github)

Thanks @dandclark and @marcoscaceres!

After some productive discussion with @annevk yesterday (see mikewest/origin-api#8), I think we have a good approach to obviating existing origin getters via Origin.from(...). That should indeed allow us to avoid serialization generally and address some of the discussion above.

Regarding the distinction between URL and Origin, I see opaque origins as creating a pretty sharp distinction between what those objects can reasonably represent.

Comment by @marcoscaceres Oct 24, 2025 (See Github)

Oh! I like where that is going. That’s looking great.