Design Review #198

#198: Trusted Types

Opened Sep 18, 2017

Hello TAG!

This is earlier than I'd usually ask for y'all's feedback, but since I'm talking about a type system, it seems reasonable to get some sort of directional feedback from y'all before getting to far ahead of things.

Name: Trusted Types for DOM Manipulation
Specification URL: None
Explainer, Requirements Doc, or Example code: https://github.com/WICG/trusted-types
Tests: Aiming to eventually land the tests in https://chromium-review.googlesource.com/c/chromium/src/+/559061
Primary contacts: @mikewest

We'd prefer the TAG provide feedback as (please select one):

open issues in our Github repo for each point of feedback
open a single issue in our Github repo for the entire review
leave review feedback as a comment in this issue and @-notify @mikewest

Again, there's not much concrete in the explainer, and we're pretty far away from having a spec. So now's a great time to give fundamental feedback on the notion of adding this kind of type system to DOM manipulations. We'd love to hear it. :)

Discussions

Comment by @hadleybeeman Sep 27, 2017 (See Github)

Hi @mikewest! Thanks for sending this! We picked it up at our face-to-face in Nice.

A few comments and questions:

Why is trustedURL not a subclass of URL? It would be good if these trusted types should fit meaningfully into the type hierarchy. Example code would help us here.
Overall, we'd like to encourage you to not do this in IDL but instead do it in example code.
It would be good to see integration with es6 template strings. So that it's possible to come up with a typed output.
The name implies trusted. Can we name it something a bit more functional, like maybe something like "unserialised types for DOM manipulation"? (yes, we know naming is hard :-) )
This may result in escaping everything — too much work (also including potential risks of jumping back and forth between escape and override escapes seem risk to be error prone)
Issues with multiple concatenation, e.g., "mystring" + TrustedHTML.escape('unsafe') + "otherstring" - results in JS string, not typed object.

Comment by @koto Oct 12, 2017 (See Github)

Hey @hadleybeeman, I'm koto, working with Mike on this.

To answer your questions:

Why is trustedURL not a subclass of URL?

No strong reason, these types originate from Google internal implementation, where that was the case. I'm not opposed to subclassing.

Example code vs IDL

In fact, we have a polyfill implementation now in the repository, it can be built with: $ npm install $ npm run build or you could check the sources.

Template strings

Yes! I proposed something like that in https://github.com/mikewest/trusted-types/pull/26.

Naming

We have reasons why this is named trusted as opposed to e.g. safe, but at this stage names are not that important, though "unserializedTypeForDOMManipulation" will probably not fly ;)

Escaping

We're hoping to come up with an API with safe builders that may make the migration easier, and offering security benefits. This is just an early draft.

Concatenation

That is actually expected, by concatenating safe HTMLs with raw strings you may come up with an unsafe construction (if the strings are user-controlled). I hope to address that in https://github.com/mikewest/trusted-types/pull/26, where interpolating user data gets slightly easier, but in general we hope such literal + TrustedHTML contatenations could be automatically refactored, rather than silently accepted at runtime.

Comment by @slightlyoff Feb 1, 2018 (See Github)

Hey @koto:

Thanks for getting back to us! @cynthia noted that the proposal has changed a bit. Are you looking for more feedback now? If not, feel free to close this issue and re-open when you get closer to shipping.

Regards

Comment by @koto Feb 1, 2018 (See Github)

Hey,

There are active changes being planned, and we'll be undergoing some API changes, so I'll close this now as you advised, and reopen when needed. Thanks!

Comment by @koto Feb 1, 2018 (See Github)

@mikewest, can you close this issue? I don't have the permissions to.

Discussed Jan 1, 2019 (See Github)

Peter: mwest asked to re-open and take a look--there have been changes...

Travis: Haven't looked yet. Can keep it for the f2f.

Peter: bumping to F2F.

Comment by @mikewest Jan 11, 2019 (See Github)

@torgo / @ylafon: Would you mind reopening this review? @koto, et al. have moved the design forward from where we left it in February, and I think it's ready for a more detailed analysis from the platform perspective.

Discussed Feb 1, 2019 (See Github)

Some feedback from 19 days ago.

Dan: Travis left feedback 20 days ago and there was a reply.

Sangwhan: i had one piece of feedback. David also had feedback. My main issue was this was only exposed to window.

Peter: they gave feedback on that point.

Sanghan: not entirely convinced so will have to write back on this one.

Peter: how about the rest of it?

Sangwhan: i think we wantt to get david's feeedback as well.

Dan: Agreed.

Sangwhan: everyone needs to comment back who have raised issues here.

Peter: actions on us to continue to leave feedback -

Sangwhan: let's bump to next week.

Comment by @dbaron Feb 7, 2019 (See Github)

So one concern I had while the TAG is looking at this today:

Assuming I'm interpreting the explainer correctly... it seems a little bit odd the way policies are identified by URLs, but there isn't really any validation that the URL given has any hard association to the chunk of script that's registered as being the policy identified at that URL. It both:

seems a little bit like an odd use of URLs, and
seems to introduce a risk that a policy URL really represents whichever chunk of script wins the race to register that policy (which in turn seems to make the CSP part slightly less useful)

I'm wondering if there's something better here, although I don't immediately see something that doesn't involve a bunch of additional resource fetching (to, say, fetch separate policy scripts each in their own file).

Comment by @torgo Feb 7, 2019 (See Github)

Some discussion again on the use of the term "trusted" and how over-loaded this term is. @dbaron suggested that "policy-checked types" might be better. @lknik suggsted "safe-to-consume". @slightlyoff suggested "validated". In any case, even if you stick with "trusted" I feel the document should explain the scope of the use of the term "trusted" - trusted in what sense - as a specific sub-section of the explainer.

Comment by @cynthia Feb 7, 2019 (See Github)

The functionality is currently exposed only to Window on the draft spec, is there is a reason for this? I would imagine this could be really useful in a service worker, which would require it to under WindowOrWorkerGlobalScope. (Seems like a simple fix, but if there is a clear reason for this decision, we'd like to know.)

Additionally, this feels like it would be really useful for server side JS runtimes as well, so making it implementable by node.js (for example) would be really useful.

Comment by @travisleithead Feb 7, 2019 (See Github)

I'm not convinced of the need for the 'exposed' and exposed-enumeration APIs on the TrustedTypePolicyFactory. If the application wants to expose the policy globally, this is really easy to do in an application specific way. For example, just stash a policy on the global object under a well-known name. If these extra features are dropped, the policy parts of the proposal start to look much more like JavaScript's Proxy which is a pattern that is known and could use the constructor pattern.

There also seems to be some potential alignment between the new 'policy' object planned to be exposed by the Feature policy spec (for DOM access to a document's feature policy), and the policy-related objects exposed by this spec. It would be nice to rationalize the names so that both policy things make sense.

Comment by @koto Feb 7, 2019 (See Github)

Thanks for the review! Let me comment on the issues raised inline below:

URLs as policy identifiers

In general, it's just strings I agree. We were trying to make them somewhat represent the policy author (e.g. which dependency creates it), but it so far is not very practical to use URLs to mandate that. We'll update our docs to clarify that.

Trusted as a name

So far we were not able to find a more compelling name, "Validated" seems close, but it not that precise either. Full agreement that we should definitely explain the scope of the 'trustedness' we are aiming for in the spec.

TT exposed in Window only

TT policies are meant to be created in the main window of the application. I'm not sure they would work in a service worker, given that we don't want them to be Transferable (to avoid other, e.g. cross-origin windows sending us the objects). The intention here is to create TT (locally, through policies) in a Window that will later use them with the related document. Service worker does not have the access to the document, so there would be no way for it to return the TT instance back. I agree that SW are compelling in that they might be allow us to centralize the policies implementation, but I don't see how that might work with our constraints, can you explain a bit what you had in mind, @cynthia?

server-side TT

I see that mechanism being useful to limit access to sensitive functions in other environments. I'm not sure TT in the current form would satisfy that, given that they are written to guard DOM sinks only (for practical reasons, we opted to solve the "narrow" problem of DOM XSS first). That said, this approach should in principle work for any DOM-like emulation in nodejs.

expose not needed

Indeed :) The idea was to provide a global registry of (presumably safe for all inputs) policy implementations to facilitate automatic refactoring of the code:

a.innerHTML = foo + bar

a.innerHTML = TrustedTypes.getExposedPolicy('sanitize-me-global', foo + bar);

but that's a fair point - we already have to rely on a configuration setting for this (the policy name), so may just as well fetch the policy from a global, configurable location instead.

use constructor pattern instead of createXYZ

@travisleithead, could you clarify this? Do you mean to use the constructor pattern for creating the policies, or the Trusted Type objects from them? We do not want the constructors for the latter to be callable.

Feature Policy

IIUC, Feature Policy aims to limit a window, together with all child frames, regardless of their origin. TT are expected to apply for the single window only, and definitely should not be propagated cross origin.

Comment by @mikewest Jul 16, 2019 (See Github)

Hello, friendly TAGgers!

It's unclear to me what the status is on y'all's end. :) We think we're converging on something that's reasonable, and it would be a great time to give us feedback if you have an opinion about the shape of the API we're ending up with. :)

Thanks!

Comment by @mikewest Aug 28, 2019 (See Github)

Dearest friends and neighbors,

We still think we're converging on something reasonable, and are getting reasonable feedback from developers who are trying out Chrome's current implementation. If y'all have thoughts or concerns, it would still be a lovely time to express them!

Thanks!

/cc @torgo

Comment by @annevk Aug 28, 2019 (See Github)

As was a problem with CSP, I'm worried with trying to address this problem based on callsites, rather than the more fundamental algorithms that are responsible for performing a certain action that we want to avoid exposing to non-privileged code.

That is, I think a more principled model would modify "navigate" to take a "trusted" value and based on some policy error on non-"trusted" values, and then adapt callsites as appropriate to be able to pass it "trusted" values. (I'm also a little worried that the current approach leaves certain attack vectors unexplored. Are malicious target values problematic? Is changing <form method> not problematic? I.e., should all parameters to "navigate" effectively be "trusted" values? Why only the URLs?

(Also, window.location is protected, but MathML links are not? With a system that protects navigation (and similar such algorithms) at source, you don't really have to worry about this, at least not until someone wants to use MathML links with it.)

Comment by @koto Aug 29, 2019 (See Github)

Thanks, Anne!

As was a problem with CSP, I'm worried with trying to address this problem based on callsites, rather than the more fundamental algorithms that are responsible for performing a certain action that we want to avoid exposing to non-privileged code.

That is, I think a more principled model would modify "navigate" to take a "trusted" value and based on some policy error on non-"trusted" values, and then adapt callsites as appropriate to be able to pass it "trusted" values.

We're not opposed to adding additional assertions at navigation time (or when a script is prepared), but what we feel really strongly about is throwing the errors (also?) at assignment time. For one - this can be polyfilled, but more importantly - this allows for statically (as much as it's possible with JS) identifying places in the code where sinks are used in a non-compliant way. Security bug becomes a programming error, and one can debug it early where it happens, instead of only being left with SecurityPolicyViolationEvents and CSP reports later with not enough context to identify the issue.

I'll identify which alorithms would be a good target for the additional checks and update the spec with the proposed alterations.

I'm also a little worried that the current approach leaves certain attack vectors unexplored. Are malicious target values problematic? Is changing <form method> not problematic? I.e., should all parameters to "navigate" effectively be "trusted" values? Why only the URLs?

We are targetting DOM XSS, content exfiltration or confinement is explicitly not in scope (mostly because this feels like a much more difficult to achieve with the sink-based approach only due to the existence of various other side-channels for the data).

(Also, window.location is protected, but MathML links are not? With a system that protects navigation (and similar such algorithms) at source, you don't really have to worry about this, at least not until someone wants to use MathML links with it.)

All link based vectors stop being risky once we simply disable javascript: URLs (and optionally reenable them selectively at navigation time via efault polocy). This is what we're proposing in the update to the API (we'll likely discuss this at lentgh at TPAC). After those updates, the only URLs that are relevant are URLs of scripts and object/embeds.

Comment by @lknik Oct 9, 2019 (See Github)

Seems it's shipping in Chrome soon?

Discussed Jan 1, 2020 (See Github)

Alice: Lukasz said that it was going to ship in Chrome soon, let me check

Alice: Origin trial ended, it hasn't shipped yet, there are no LGTMs on the Intent thread.

Alice: Anne pointed out on the intent thread that there are a bunch of open issues

... BZ commented: To be clear, Mozilla is interested in solving some of the problems Trusted Types aim to solve, but last I checked our general feeling at this time is that the specific Trusted Types proposal at hand may not be the right solution, and has various known and unaddressed design-level problems."

... Daniel V responded: "Known and unaddressed design-level problems" is new feedback for us. We're happy to pause this intent while we figure out those areas of disagreement."

... Then, a fuller list of issues:

DOM node manipulation and the <script> tag.
Extensibility
Developer support / Support by Frameworks

Tess: maybe we should ask on the Intent thread for specific issue links for the above things, instead of spelunking in the issues & PRs ourselves?

Alice commented on the TAG design review issue asking for specific issue numbers for the above remaining issues.

Comment by @alice Jan 27, 2020 (See Github)

Hi @mikewest et al,

We're coming back to this, finally (apologies!).

We had a read through the intent to ship thread, and noted that the intent was paused in November to take a look at:

DOM node manipulation and the <script> tag.

Extensibility

Developer support / Support by Frameworks

We couldn't find issues for the first and third of those topics; could you possibly give us a pointer?

We also noticed that there was a recent update to that thread specifically calling out the Extensibility issue as having been addressed, as well as the default policy issue (was that in addition to the above three issues, or a rephrasing of one of them?)

Was the Extensibility issue resolved to the satisfaction of the Mozilla folks who raised concerns? (@annevk may know more here.)

What is the status of the other issue or two from the original list?

And, is the current version of the explainer a good place for us to go to get an overview of the current state of the proposal?

Comment by @otherdaniel Jan 30, 2020 (See Github)

Thanks for taking another look at TT! :)

Regarding intent-to-ship & 3 issues: It's certainly our intent to have addressed the concerns.

Extensibility: I think this is done. The original concern was correct, in that extending the design (e.g. to cover CSS) was risky. With the require-trusted-types-for directive it's straightforward for the page author to indicate what scope they intended for the current version of their page.
DOM node manipulation & friends + "default policy": This is the same issue. Or actually, it's an extended version of the original issue. When we presented & discussed our first attempt at fixing this, we discovered that the issue runs a bit deeper than we had initially understood. We think we have now fixed this (in the spec; the implementation is still being finished), by checking the <script> text against a shadow slot, and by more precisely spec-ing and implementing when the "default policy" runs. (The last comment explicitly lists the 4 items that address the original DOM node manipulation issue.)
Dev support: We've run an origin trial and are experimenting with TT within Google. We have also received feedback from external parties, although they don't seem too keen on making public statements in this phase. So this is arguably still open. However it seems that we have largely exhausted the feedback potential for experimental features: Developers will put only modest amount of work into an unreleased feature. We will get more & more qualified feedback only after making this more widely available.

Explainer: The broad strokes of the explainer are correct, but it doesn't reflect recent changes to the spec. We need to update it & add more detail.

Comment by @mikewest Feb 14, 2020 (See Github)

Hey folks! As an FYI: there is an intent to ship thread ongoing on blink-dev@. IMO, the right thing to do is to get this API in front of developers, and iterate on it as we learn from their feedback. If y'all believe there are things we really ought to change before shipping it, please do weigh in. We value your input.

Comment by @alice Mar 2, 2020 (See Github)

@plinss and I are looking at this at our face to face.

Firstly: is the explainer up to date now?

Secondly: I would like to encourage @dbaron and @cynthia to take another look before we close this - perhaps if the explainer needs another round of edits, you could let us know when it is up to date?

Comment by @koto Mar 2, 2020 (See Github)

Firstly: is the explainer up to date now?

Yes.

There's some minor details we're working on w.r.t. the shape of the API. The most relevant ones would be:

We're also discussing the exact integration with other specs, e.g.:

That said, the explainer correctly describes the problem we want to address and the solution we propose. Further details are in the spec draft.

Comment by @plinss Mar 3, 2020 (See Github)

I'm also wondering why assign trusted type objects to the existing API surface that expects strings? Would it make more sense to introduce new API surface for content injection that only takes objects, whether they are trusted type like things or other higher-level objects, and then deprecate the string based apis (or turn them off in given contexts). Seems like this would make feature detection and polyfills simpler as well as giving extra options for more control over content injection.

Comment by @mikewest Mar 4, 2020 (See Github)

Hey, Peter! Thanks for the feedback!

The main reason we ended up reusing .innerHTML rather than introducing .innerTrustedHTML or similar is deployment cost. As currently specified, Trusted Types can be rolled out to an existing application in a piecemeal fashion in a way that works in both browsers that support the mechanism and those that don't. Changing the DOM entry points themselves would require substantial rewriting and branching at every point at which the DOM is modified; @koto will have more detail here, but my understanding is that most(?) application have significantly fewer points of string creation and/or sanitization than points of usage. Auditing and adjusting the former is a tractable task, branching for the latter would be more difficult.

Comment by @otherdaniel Mar 4, 2020 (See Github)

On separate 'trusted' entry points: It's a big thing for library support: In the current model, every trusted-types valid program is also non-trusted-types valid program. (With exception to actual policy creation, which can be guarded by a single if-statement).That makes it possible for programs, libraries, and frameworks to transition to Trusted Types support without leaving their non-TT users behind.

With separate entry points, libraries would be forced to maintain two versions (either two actual versions, or effectively two versions by branching on every affected DOM call). And every program that wishes to use TT would be pushed deeper into dependency hell, since now they'd have to transition every dependency and dependency-of-dependency to TT first, or to somehow manually enable them with via default policies or patch them elsehow. In a world were 100% self-contained, dependency-less programs have become a rarity, that's a pretty big deal.

Comment by @dbaron Mar 5, 2020 (See Github)

@sangwhan and I are looking at this in a breakout.

One comment is that after reading the explainer and parts of the spec, it's not clear to me why require-trusted-types-for 'script' is named what it is. It seems like an odd name to me for a directive that's about requiring something for javascript URLs. Though the Adopting Trusted Types section seems to imply that it's about something much broader than just the handling of javascript URLs.

Beyond that, I don't have anything to say immediately, but it really took me half an hour to load this whole thing into my head again, and I'd probably need another read-through to come up with more interesting thoughts on it.

Comment by @dbaron Mar 5, 2020 (See Github)

So I think given that @alice and @torgo were suggesting we consider closing it, and I think we think we're OK with closing this as well. I'm not sure that we've gone through everything in detail, but I also think we're reasonably comfortable with this approach at a high level even if we haven't dug into all of the details.

I'd note that I found the explainer a bit confusing, particularly for understanding how the CSP integration works (see previous comment). But this seems like it's probably straightforward to fix, and it's certainly understandable that an explainer for something that's changed so much over time might get a little confused at points.

Comment by @koto Mar 5, 2020 (See Github)

Thanks! For the record, require-trusted-types-for 'script' controls all enforcement for DOM XSS prevention (the only sink group we're tagetting right now), not just javacript: navigation. Point taken though, I'll try to clarify that in the explainer and review the spec to make that clear.