design-reviews#347: User Activation Delegation through postMessages

Opened Feb 22, 2019

Hello TAG!

I'm requesting a TAG review of:

Name: User Activation Delegation through postMessages
Specification URL: https://github.com/whatwg/html/pull/4369
Explainer: mustaqahmed.github.io/user-activation-delegation
Design doc: link
Tests: Work in progress in html/user-activation/.
Primary contacts: @mustaqahmed

Further details (optional):

Relevant time constraints or deadlines: We would like to have the review done by next month because we prefer to have this feature in Chrome 75, in order to fix some regressions caused by UAv2 (#295). But note that this API is conceptually orthogonal to UAv2.
I have read and filled out the Self-Review Questionnaire on Security and Privacy. The assessment is here.
I have reviewed the TAG's API Design Principles.

You should also know that... While Chrome needs this API to fix regressions caused by UAv2 (#295), the concept of delegation here is independent of the model in UAv2. The delegation here can be used with any user activation model, including Chrome's old model (user gesture tokens).

We'd prefer the TAG provide feedback as (please select one):

open issues in our Github repo for each point of feedback
open a single issue in our Github repo for the entire review
leave review feedback as a comment in this issue and @-notify [github usernames]

Discussions

Comment by @mustaqahmed Mar 1, 2019 (See Github)

We need to review this quickly: we have got at least one regression that would be fixed by this API.

Discussed Mar 19, 2019 (See Github)

alice: This came in 25 days ago, not quite my area but seems reasonable. Useful when you want to know if the user has interacted, lets you transfer to the frame.

tess: Worried about the browser limiting some activity in iframes/subframes to a user gesture and this is a way for the main frame to hand off that user gesture. This makes sense going upward, but not downward. Media policies are different in different browsers; would this allow arbitrary hand-off of user gesture state? You have an iframe that has a video in it, user taps to make a comment and that causes the iframe to start playing a video.

alice: Wouldn't the video stop when the user interacted with the top frame again?

hober: I think that would depend on the browser's policy.

torgo: There are some security considerations in the design doc; not very privacy focused. Does seem that there's a missing privacy consideration... any time you're dealing with activation you're dealing with something that is personal to the user. That feels dangerous when it comes to privacy... doesn't seem that there's enough in this doc that deals with this issue. This is a little vague..

dbaron: I'm concerned that there are things that set the user activation state; often you'll get more than one of those in rapid succession. If you have the ability to transfer that to another frame, you can take what's conceptually one user action and "spread" it around, e.g. mousedown transfers state and then mouseup keeps the state locally.

hober: or you run into an issue where the transfer recipient loses user activation.

torgo: Let's try and solidify our feedback and come back to this next week.

Comment by @torgo Mar 19, 2019 (See Github)

Feels like there needs to be more exploration / mitigation of the potential privacy issues considering we are talking about user state here?

Comment by @dbaron Mar 19, 2019 (See Github)

Is https://github.com/mustaqahmed/user-activation-delegation/ the repo where you want github issues to be filed?

Comment by @dbaron Mar 19, 2019 (See Github)

And so I don't forget, the issue I was hoping to raise was that I'm concerned that being able to pass the state to another frame means that duplicate creation of the activation state becomes a sort of privacy vulnerability. In other words, if, say, a user agent treats both mousedown and mouseup as triggering a user activation, then a page could have a mousedown observer that transfers its own user activation state to another document, and then the page would get activated again on mouseup. So (a) this requires that implementations be both more conservative and more interoperable in how they cause a user activation state, and (b) even with that fixed, it seems to allow spreading user activation state much more broadly than before, since user activations often come in groups.
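
The scenario described above can be sketched with a toy model. This is only an illustration under stated assumptions: the `Frame` class, `notifyUserInput`, `transferActivationTo`, and `framesActivatedByOneClick` are hypothetical names invented for this sketch, and the per-frame boolean flag is a simplification, not any browser's actual activation model.

```javascript
// Toy model only: each frame holds its own activation flag.
// This is NOT how any particular browser implements user activation.
class Frame {
  constructor(name) {
    this.name = name;
    this.active = false;
  }

  // Hypothetical hook: the user agent marks the frame as activated.
  notifyUserInput() {
    this.active = true;
  }

  // Hypothetical transfer: moves this frame's activation to the target.
  transferActivationTo(target) {
    if (!this.active) return false;
    this.active = false;
    target.active = true;
    return true;
  }
}

// Simulates one click (mousedown followed by mouseup) on the top frame.
// `activatingEvents` lists the event types this imaginary user agent
// treats as activation triggers.
function framesActivatedByOneClick(activatingEvents) {
  const top = new Frame("top");
  const sub = new Frame("sub");
  for (const type of ["mousedown", "mouseup"]) {
    if (activatingEvents.includes(type)) top.notifyUserInput();
    // The page's mousedown observer immediately hands its activation away.
    if (type === "mousedown") top.transferActivationTo(sub);
  }
  return [top, sub].filter((f) => f.active).map((f) => f.name);
}

console.log(framesActivatedByOneClick(["mousedown", "mouseup"]));
console.log(framesActivatedByOneClick(["mouseup"]));
```

In this toy model, a user agent that activates on both events leaves both frames activated after a single click, while one that activates only on mouseup leaves just the top frame activated.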

Comment by @mustaqahmed Mar 20, 2019 (See Github)

@torgo: There is no privacy concern here since the only information we are transferring here is "whether the user is interacting (or has interacted) with a frame". This is trivial information any page can easily collect today through event handlers and then store/communicate using other APIs.

Thanks @dbaron for highlighting this tricky abusability scenario. Let me explain why I think this transfer API is the safest choice we have:

If this transfer API is used in conjunction with UAv2 (TAG review here, successfully shipped in Chrome 72), all user inputs (even multiple clicks) within a time limit of a few seconds already fuse into a single activation, and consuming the activation in any frame already clears the whole frame tree. So multiple consumption is impossible. See the Security Considerations section in our design doc. (This solves problem (a) in your post, and prevents (b) too.)
If this transfer API is used without UAv2, it's the job of the underlying model to guarantee single consumption with and without activation transfer. We believe that existing non-UAv2 models are too complicated to be able to provide this guarantee; for example Chrome had this serious bug with cross-process postMessage despite many years of effort (got fixed through UAv2).
For user activation, there is no interop today even with a plain postMessage: see this comparison from 2017. We can't expect interop only with the transfer option here. This has been broken for many years, and will need a long-term plan to fix. (A related note on interop: not all browsers trigger user activation through mousedown. In Chrome we have a bug to possibly drop mousedown to match Firefox. Spec discussion here.)
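
The single-consumption argument in the first point above can be sketched as a toy model. Everything here is an assumption made for illustration: the `FrameTree` class, its method names, and the 5000 ms lifetime are hypothetical (the thread only says "a few seconds"), and the sketch models only the two properties stated above, namely that repeated inputs refresh one activation and that consumption in any frame clears the whole tree.

```javascript
// Toy model of the behavior described in the thread, not Chrome's code.
class FrameTree {
  constructor(frameNames, lifetimeMs) {
    this.lifetimeMs = lifetimeMs;
    // One activation timestamp per frame; null means "not activated".
    this.activatedAt = new Map(frameNames.map((name) => [name, null]));
  }

  // Repeated inputs only refresh the timestamp, so they fuse into one activation.
  notifyUserInput(frame, now) {
    this.activatedAt.set(frame, now);
  }

  isActive(frame, now) {
    const t = this.activatedAt.get(frame);
    return t !== null && now - t < this.lifetimeMs;
  }

  // Hypothetical transfer: the target takes over the sender's timestamp.
  transfer(from, to, now) {
    if (!this.isActive(from, now)) return false;
    this.activatedAt.set(to, this.activatedAt.get(from));
    this.activatedAt.set(from, null);
    return true;
  }

  // Consuming in any frame clears activation across the whole tree.
  consume(frame, now) {
    if (!this.isActive(frame, now)) return false;
    for (const name of this.activatedAt.keys()) {
      this.activatedAt.set(name, null);
    }
    return true;
  }
}

const tree = new FrameTree(["top", "sub"], 5000); // lifetime is an assumed value
tree.notifyUserInput("top", 0);   // mousedown activates the top frame
tree.transfer("top", "sub", 10);  // page hands the activation to the subframe
tree.notifyUserInput("top", 100); // mouseup activates the top frame again
const firstConsume = tree.consume("sub", 200);
const secondConsume = tree.consume("top", 300);
console.log(firstConsume, secondConsume);
```

Under these assumptions, both frames are briefly active after the mouseup, but the first consumption succeeds and the second fails, so one click still yields at most one gated action.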

To emphasize, our long-term goal here is interop with user activation. In Chrome 72 we proved through UAv2 that a simple, token-less, easy-to-implement solution works for the Web. We encountered a few breakages, for which the transfer API proposed here is a workaround. Once we are done with both of these successfully, we will encourage other browsers to switch.

Discussed Mar 26, 2019 (See Github)

David: he responded to the substantive questions

Peter: he checked "issues in their repo" for feedback.

Tess: mark as urgent? he said we need to do it quickly.

Dan: Feels like he is dismissive of the privacy concerns. Yes, this is info that can be collected via other mechanisms, but does it make it easier to collect and share?

Peter: is this a high level feature that we even want to expose?

Tess: the case of the iframe that wants to tell its parent to resize it, but only under a user gesture, is a reasonable case for wanting to write web content that restricts itself to a user gesture. Prior to thinking of that use case I thought it could be gated in other ways. The iframe resizing case was a reasonable counter-argument. It's a bigger question than the scope of this design review.

Dan: i think it's reasonable for us to highlight this.

Tess: say you've got 2 fingers on the screen on 2 different elements...

Sangwhan: is the activation state exploitable like how you could use hsts to fingerprint people?

Tess: not obvious off-hand to me if this would make it more possible.

Sangwhan: is there some kind of cache of activation state? Not sure how persistent that is.

Tess: the persistence of that cache is concerning because...

Sangwhan: hsts state can be cached and can be used [for fingerprinting]. Is that in any way possible with this info.

David: if the user activation state is cached it's cached for a few seconds, not permanently like HSTS.

Dan: ok - other than that I think we are happy - so let's bump it a week just to come back to it and hopefully close it up then.

Alice: if they decided next week to ship [in chromium] would anyone have a strong objection?

Peter: my concerns escalate as I see more and more APIs coming in that have a similar scope and nobody is looking at the big picture. These are all being taken in different directions.

Dan: what can we do about that?

Tess: 2 parts of that - would these things be unified in some way and how, and what is the right approach? Also a process question - if we get 5 reviews that are all urgent that all have longer-term implications, how do we give effective feedback in the time requested and simultaneously do right by our chartered obligations? If we're always under pressure to not strongly object because of time pressure then what is the point of the process?

Dan: +1

Dan: what are the similar things?

Peter: issue 356 - autoplay detection - feels like it should be harmonized with some other APIs.

Tess: there's a

Peter: that was just an example of one more interesting aspect

Sangwhan: RTCquic transport - also an example of different groups going in different directions for one problem set. Spatial navigation and html focus...

Dan: I think it should be explored...

Peter: how?

Dan: meta-issue in design reviews repo?

Peter: i will do that

Comment by @torgo Mar 26, 2019 (See Github)

@mustaqahmed We take privacy extremely seriously. Even if it's obvious to you that this doesn't impact user privacy, it wasn't obvious to us. We'd like to see user privacy addressed in the explainer a bit more, even if it's just to say "we think this data can be collected in other ways ... [which ways] ... and therefore there is no additional privacy concern." But I also wonder: does this API allow privacy-sensitive information to be collected more easily, and if so, are there any mitigations against potential privacy issues that are built in? Thanks.

Comment by @alice Mar 26, 2019 (See Github)

Issues can be filed on https://github.com/mustaqahmed/user-activation-delegation

Comment by @dbaron Mar 26, 2019 (See Github)

Let me explain why I think this transfer API is the safest choice we have

That explanation sounds reasonable to me; I'd hope that dependency is at least clear to implementors from reading the spec or from reading the explainer.

Comment by @plinss Mar 27, 2019 (See Github)

I'm having serious doubts as to whether the notion of "user activation" should even be exposed to the web platform. This isn't a top-level feature in itself but rather a UA mechanism to control the availability of other features.

If we expose user activation this gives web page developers no indication of which features will be enabled or disabled, and I expect that set of features to be UA dependent and change over time.

What happens when UAs invent new mechanisms to gate features, such as eye tracking or the like? Is this an API we want to support forever?

I'd much rather see improvements to permissions APIs or other mechanisms to get access to, and be able to delegate access to, the individual features that would be gated by user activation.

Comment by @annevk Mar 28, 2019 (See Github)

I think you're correct that there is variance between implementations, but all implementations do have this notion and for some APIs there is agreement that this notion should be a requirement (e.g., Fullscreen). Given that, the typical follow-up is to refine and standardize.

Now, whether the mechanism Chrome has established, which works for them, is also the model Firefox and Safari want to match is less clear.

Comment by @mustaqahmed Mar 29, 2019 (See Github)

@torgo: Thanks for raising the privacy implications question from data collection perspective. I have created a detailed privacy consideration doc to analyze your concerns, and linked it from the explainer as well as from the security/privacy questionnaire. Please let me know if this addresses your concerns.

@dbaron: Added a para in the explainer to further clarify the dependency with UAv2.

@plinss: I agree this is a low-level feature but as @annevk commented, all browsers rely on "user activation" for many APIs already. Moreover, many specs refer to this notion. For example, check fullscreen model, also check how many times the phrase appears in the HTML spec.

We have been discussing the spec problem with "user activation" here for a while; this API here is just a follow-up proposal to Chrome's main proposal there to fix the problem. Let's continue this discussion on that issue.

Comment by @plinss Mar 29, 2019 (See Github)

@mustaqahmed @annevk I'm not arguing that "user activation" shouldn't be a thing. UAs obviously already use that as a signal to gate certain features. That's fine. It's also fine for specs to refer to that, just like specs refer to "private browsing mode" and how things behave differently in that state.

Neither of those mean that the user activation state should be exposed directly via an API (just like there's no API to determine if you're in "private browsing mode"). Frankly as a web developer, I don't care, and shouldn't care, if the page is in a user activated state, what I care about is: "can I open this popup right now?" (or other gated feature). I do not want to check the user activation state and then guess whether or not that's going to impact my ability to open a popup. What about when feature X is also gated by other things, like: is the page installed to home screen? how many times has the user visited the site? how long has the page been open? what's the battery state? what's the network connection like?

These are all things that may gate the availability of a feature today, let alone the things UAs are going to invent tomorrow. (where is the user looking right now? is the device moving at more than xx kph?)

My argument is that the information that web developers need is whether or not the individual feature they're trying to use is available or not, not the inputs that went into that decision.

I also don't have an issue with delegating those capabilities, but instead of delegating the various inputs, delegate the actual capabilities, such as the ability to open a popup, or auto-play a video with sound, etc.

Comment by @mustaqahmed Apr 2, 2019 (See Github)

I see your perspective from "input vs capabilities" now, thanks. Here are our takes on that:

Delegating user activation doesn't exclude delegating (or even suppressing) capabilities. They are in fact orthogonal. Suppose we add a "popup delegation" API in the future; then we may still have a use-case like "this subframe can open popups but only with its own user activation", right?
I see that the ability to delegate capabilities would give developers more fine-grained control on what to delegate or what not to. But the "orthogonality argument" above means activation transfer doesn't take that finer-control away. E.g. a top frame can delegate user input to a subframe, and still say "disallow fullscreen".
We have specific cases where developers want to transfer user activation from one frame to a "controller" frame. See the regression I mentioned in my first post above. Here is another example.

In summary, we already have capability delegation today (say <iframe allow=...> attributes), and it's natural to expect more from individual API owners. Activation delegation makes this notion more powerful and useful IMHO.

Comment by @ojanvafai Apr 2, 2019 (See Github)

I think we might be talking past each other a bit in the last two comments because we mean different things by capability delegation. <iframe allow=...> gives permanent delegation rather than only in response to a user activation. This is trying to give temporary delegation of specifically the things browsers allow but only after user activation. If I understand @plinss correctly, he's not suggesting that we merge these concepts, but instead change the mechanism of the transfer.

@plinss is something like this what you were suggesting? (borrowing from feature policy syntax):

// Only transfer activation for the purposes of autoplay popups:
targetWindow.postMessage("handle_click", {allow: "autoplay; popups"});
// Transfer activation for anything that blocks on it.
targetWindow.postMessage("handle_click", {allow: "*"});

I think that's good feedback, but I also think it's confusing because the set of things being allowed here is the set of things that require a gesture, not any general feature. Will think on whether we could (in the future) temporarily delegate any policy controllable by feature policy, but this is the first I've considered that idea...so...I'm not confident it's a good one. :)

Comment by @plinss Apr 3, 2019 (See Github)

@ojanvafai yes, while I'm not proposing any specific syntax, that's the idea I was trying to get across.

Rather than delegate the gesture, delegate what the gesture enables, this way we're not tying the specific feature that's being enabled to a gesture or any other mechanism, just the fact that the feature has been enabled (because the mechanism for enabling the feature can/will change or be UA specific). It also gives the author more fine-grained control about what can be done in the delegated iframe.

Also, related, rather than have an API that exposes the fact that you've been "user activated", expose the fact that the specific features controlled by the activation state have been enabled (and disabled when the gesture has been consumed or no longer applies). This can either be done by the existing permissions API or by augmenting that API as needed for the transient nature of the permission.

I understand that the features we're talking about here (so far) are the ones currently controlled by user activation, but why not have this be the same mechanism as delegating any other permission?

Comment by @mustaqahmed Apr 3, 2019 (See Github)

Capability-delegation is an orthogonal concept

@plinss: we already agreed above that user activation is an input not a capability. Now let’s look at why capability delegation is not a solution for your problem:

Rather than delegate the gesture, delegate what the gesture enables, this way we're not tying the specific feature that's being enabled to a gesture or any other mechanism, just the fact that the feature has been enabled.

From a privacy/abusability perspective, a capability-delegation mechanism should not enable the capability unconditionally. Let’s look at popup: if the user's current settings allow popups from origin A and disallow them from origin B, a capability delegation from A to B shouldn’t suddenly allow popups from B, right? A more compelling example is Geolocation: a site that has the user’s permission to silently use Geolocation (because the user chose “always allow” in the past) shouldn’t be able to transfer the silent access to any third-party (say ad) subframes.
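
The popup example above can be made concrete with a small sketch. The settings table, the origins, and both function names are hypothetical and exist only to contrast a delegation check that overrides the user's per-origin setting with one that does not; neither is a real browser API.

```javascript
// Hypothetical per-origin popup settings chosen by the user.
const userAllowsPopups = new Map([
  ["https://a.example", true],
  ["https://b.example", false],
]);

// Problematic reading: a delegation from an allowed origin is enough.
function naiveCanOpenPopup(origin, delegatedFrom) {
  return (
    userAllowsPopups.get(origin) === true ||
    userAllowsPopups.get(delegatedFrom) === true
  );
}

// Reading argued for in the comment: the receiver's own setting still applies,
// so delegation cannot turn a "disallow" into an "allow".
function conditionalCanOpenPopup(origin, delegatedFrom) {
  return userAllowsPopups.get(origin) === true;
}

console.log(naiveCanOpenPopup("https://b.example", "https://a.example"));
console.log(conditionalCanOpenPopup("https://b.example", "https://a.example"));
```

With the naive check, origin B gains popups merely because A delegated to it; with the conditional check, B stays blocked, which is the outcome the comment argues a delegation mechanism must preserve.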

Therefore, if we want developers to know if they can open a popup, we would need a separate (popup-specific or generic) API for that. Delegation is not a solution for that.

From a different perspective, each capability has a context, and the whole context can’t be transferred through a capability-delegation-mechanism. This confuses developers even with the “permanent” delegation we have today: does <iframe allow=fullscreen> mean

the subframe gets unconditional access to fullscreen (say, w/o user activation), or
it gets just the permission to try fullscreen, subject to all regular checks (including user activation)?

Any capability that needs delegation would have its own corner cases, and should address this question from its own perspective IMHO.
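
The two readings of <iframe allow=fullscreen> listed above differ only in whether the regular checks still run. A minimal sketch, assuming a hypothetical frame record with `policyAllowsFullscreen` and `hasUserActivation` fields (both names invented here, and the sketch takes no position on which reading a given browser follows):

```javascript
// Reading 1: the delegation alone grants access to fullscreen.
function fullscreenAllowedUnconditionally(frame) {
  return frame.policyAllowsFullscreen;
}

// Reading 2: the delegation only lets the frame try; user activation
// (standing in for "all regular checks") is still required.
function fullscreenAllowedSubjectToChecks(frame) {
  return frame.policyAllowsFullscreen && frame.hasUserActivation;
}

// A delegated subframe that the user has not interacted with.
const idleSubframe = { policyAllowsFullscreen: true, hasUserActivation: false };

console.log(fullscreenAllowedUnconditionally(idleSubframe));
console.log(fullscreenAllowedSubjectToChecks(idleSubframe));
```

The two functions disagree exactly for a delegated frame without user activation, which is the case a developer cannot predict from the attribute alone.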

Temporary delegation is not activation expiry

@ojanvafai: If we tie the notion of “time” in “temporary delegation of capabilities” with the expiry time in user activation, this becomes problematic because different activation-gated APIs rely on user activation state differently.

If we transfer one capability (say popup) to a target frame, would we retain other capabilities (say, fullscreen) in the sender frame? I think we meant "yes" here, right? Then every activation-gated API would need to be tracked in every frame. This is like a user activation state for every gated API, with complicated logic to sync states during trigger, expiry and consumption. This doesn’t fit our philosophy of UAv2: simplicity of the underlying model for the sake of interop. Separating input (user activation) from capability looks like a good tradeoff that maintains simplicity while providing reasonable control...

Fine-grained control on API availability is already there

It also gives the author more fine-grained control about what can be done in the delegated iframe.

I will clarify my last post: we can achieve this through existing means (iframe allow or feature-policy). Using user activation transfer on top of this allows temporary access during “unowned” user interaction (i.e. user activation on a different frame).

Comment by @mustaqahmed Apr 4, 2019 (See Github)

We are looking again into the use-cases that motivated us, in case there is still a way to hide user activation somehow.

Discussed May 15, 2019 (See Github)

Alice: The last comment ... looking into use cases. I think this one is pending feedback, or possibly even "ping to reopen". I'll set "pending external feedback" for now, and I'll ping folks internally to encourage them to get us feedback by next week.

Dan: Set milestone to f2f?

Alice: yes

Comment by @torgo May 21, 2019 (See Github)

@mustaqahmed we are discussing this issue at our f2f currently. Can you provide any update on the comment you left above? Thanks!

Comment by @mustaqahmed May 21, 2019 (See Github)

Sorry for the delay: we had several face-to-face meetings to find a unified solution covering general capability delegation. We haven't converged on a solution yet; I will share our initial findings in a week.

Comment by @mustaqahmed May 28, 2019 (See Github)

Hi TAG: we are still debating the shape of an API for transient capability delegation. While that is pending, we seem to have converged into the idea that user activation delegation is not the right solution here.

Comment by @hober Sep 10, 2019 (See Github)

We seem to have converged into the idea that user activation delegation is not the right solution here.

Okay. Closing this issue; please file a new design review request for whatever alternate solution you come up with. Thanks!