design-reviews#609: Early TAG design review for captureTab

Opened Feb 12, 2021

I'm requesting a TAG review of capture-current-tab.

Summary: in a number of scenarios (most prominently sharing Web slides in a Web teleconference), offering the possibility to video-capture the current tab (or a cropped view of the current tab) provides a smoother and easier-to-understand UX than what's possible using the Screen Capture API. The WebRTC WG has rough consensus that this is a useful scenario to address and is exploring approaches to enable it; we are seeking input from the TAG on how to do that safely and consistently with the rest of the platform.

Explainer: Two rough approaches have been discussed:
- getCurrentBrowsingContextMedia - explainer at https://docs.google.com/document/d/1CIQH2ygvw7eTGO__Pcds_D46Gcn-iPAqESQqOsHHfkI/edit#heading=h.bj2aavkeqqud; related discussions at https://github.com/w3c/mediacapture-screen-share/pull/148 and https://github.com/w3c/mediacapture-screen-share/issues/156, with concerns around cross-origin protections (see below)
- getTabMedia() secured by COEP sketched in https://docs.google.com/presentation/d/1CeNeno5XuDhm1mpnVyE9eT14YKZgZUtgQsJfC8uqEpA/edit#slide=id.gaef31c926d_1_72 (slides 16-28)
Security and Privacy self-review: The gist of our request for input is on the privacy and security trade-off, given that in a number of cases, the content to be captured from a given tab will include cross-origin content (e.g. Web slides that integrate a video hosted on a streaming service), possibly nested several times.

In general, we recognize that an API that allows capturing a single tab reduces some privacy risk compared to the generic screen sharing API (with less risk of sharing unintended content), but creates a bigger security attack surface: a Web site could embed third-party content and social-engineer the user into capturing it, thereby gaining access to cross-origin information.

In particular, there is active discussion on whether embedded origins should opt in to or opt out of that possibility: https://github.com/w3c/mediacapture-screen-share/issues/156

The trade-off is discussed in slides 11 to 28 of https://docs.google.com/presentation/d/1CeNeno5XuDhm1mpnVyE9eT14YKZgZUtgQsJfC8uqEpA/edit#slide=id.gaef31c926d_1_37
GitHub repo (if you prefer feedback filed there): https://github.com/w3c/mediacapture-screen-share/
Primary contacts (and their relationship to the specification):
- Elad Alon (@eladalon1983, Google), Jan-Ivar Bruaroey (@jan-ivar, Mozilla) have been shepherding the discussions in this space
Organization/project driving the design: WebRTC WG

Further details:

I have reviewed the TAG's API Design Principles
The group where the incubation/design work on this is being done (or is intended to be done in the future): WebRTC WG
The group where standardization of this work is intended to be done ("unknown" if not known): WebRTC WG
Existing major pieces of multi-stakeholder review or discussion of this design: (see above)
Major unresolved issues with or opposition to this design: (as described above, the security model for cross-origin content)
This work is being funded by: N/A

We'd prefer the TAG provide feedback as (please delete all but the desired option): 🐛 open issues in our GitHub repo for each point of feedback

Discussions

Comment by @LeaVerou Feb 16, 2021 (See Github)

We discussed this in a breakout today.

We would like to see an explainer listing use cases. The current document explains that applications may want to present the user with a simpler choice, but not why. Also, we generally recommend that folks make their explainers available in plain text, Markdown or HTML; Google Docs is not globally accessible to web standards contributors.

We would also encourage filling out the security & privacy self-review questionnaire.

Comment by @torgo May 12, 2021 (See Github)

As discussed with Dom this is overtaken by #625 so I'm closing.

Comment by @eladalon1983 May 12, 2021 (See Github)

Beg pardon, but although they are related, I don't think these two issues are the same.

There are three relevant APIs, of which two are under discussion and likely require TAG review:

getDisplayMedia is an existing API. It is not in need of review.
getViewportMedia is a new API that has budding consensus between Google, Mozilla and Apple, and will likely end up being specified. TAG request #609 refers to this API.
getCurrentBrowsingContextMedia is a proposal for a (possibly temporary) hybrid to bridge the gap between the need for getViewportMedia and hurdles of its increased security requirements and their lack of current adoption. TAG request #625 refers to this API.
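As an editorial illustration of the three APIs listed above, here is a minimal feature-detection sketch. It is not taken from any of the proposals: `CaptureDevices`, `pickCaptureApi`, and the preference order are hypothetical, and only `getDisplayMedia` is an existing API. The two proposed methods are assumed, for illustration only, to hang off the same object and to share a similar call shape.

```typescript
// Hypothetical call shape; the proposed APIs' real signatures may differ.
type CaptureFn = (constraints?: object) => Promise<unknown>;

// Structural stand-in for navigator.mediaDevices. Only getDisplayMedia
// exists today; the other two members are proposals named in this thread.
interface CaptureDevices {
  getDisplayMedia?: CaptureFn;
  getViewportMedia?: CaptureFn;
  getCurrentBrowsingContextMedia?: CaptureFn;
}

type CaptureApi =
  | "getViewportMedia"
  | "getCurrentBrowsingContextMedia"
  | "getDisplayMedia";

// Illustrative preference order (an assumption, not stated in the thread):
// the opt-in API first, the interim hybrid second, generic capture last.
const PREFERENCE: readonly CaptureApi[] = [
  "getViewportMedia",
  "getCurrentBrowsingContextMedia",
  "getDisplayMedia",
];

// Returns the name of the first API the given object exposes, or null.
function pickCaptureApi(devices: CaptureDevices): CaptureApi | null {
  for (const name of PREFERENCE) {
    if (typeof devices[name] === "function") {
      return name;
    }
  }
  return null;
}
```

In a page, the object passed in would presumably be `navigator.mediaDevices` (with a cast, since the proposed methods are not in the standard typings).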

Comment by @jan-ivar May 13, 2021 (See Github)

Yes, this should be reopened. https://github.com/w3ctag/design-reviews/issues/625 is a Google-only proposal without support from the WebRTC WG, and shouldn't overtake it. @dontcallmedom @torgo

Comment by @torgo May 14, 2021 (See Github)

Ok message received - I'm re-opening and we'll have a look next week.

Comment by @dontcallmedom May 18, 2021 (See Github)

So, while my suggestion that this issue was a duplicate of #625 was very wrong, I would still entertain the motion that we close this issue - not because we have figured everything out with the proposal, but because this issue was raised as an early design review request to help with the discussion of the security model (opt-in vs opt-out), and that particular discussion has been settled in the context of the WebRTC Working Group.

I think we would want to reopen a TAG review request once we have the relevant materials to support it (including an updated explainer and a filled-out security/privacy questionnaire). @jan-ivar @eladalon1983 any objection to that approach?

Comment by @jan-ivar May 18, 2021 (See Github)

The WG settled on the safer of two options pitted against each other in the OP, so we don't need the TAG to pick for us.

But the request was to determine whether either approach is safe, so half the request still seems valid.

I'm happy to give an update on getViewportMedia (a.k.a. "getTabMedia()" in the OP). The API is now secured by site-isolation, and target documents must additionally opt into capture using Document policy.

But even this API has privacy concerns like non-script resources being unprotected, and private information inherent in rendering is still leaked.

I'll work on an explainer and fill out the questionnaire.
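As an editorial illustration of the two-part gate described in the comment above (site isolation plus an explicit opt-in via Document Policy), here is a minimal sketch. All names in it (`CaptureEnvironment`, `siteIsolated`, `documentPolicyOptIn`, `canCaptureViewport`) are hypothetical. How a user agent actually evaluates these conditions is not specified in this thread; mapping `siteIsolated` to the existing `self.crossOriginIsolated` flag would be an assumption.

```typescript
// Hypothetical snapshot of the two conditions named in the comment above.
interface CaptureEnvironment {
  // Whether the document is site-isolated (assumed to be knowable up front).
  siteIsolated: boolean;
  // Whether the target document opted into capture via Document Policy.
  documentPolicyOptIn: boolean;
}

type Verdict =
  | { allowed: true }
  | { allowed: false; reason: string };

// Both conditions must hold; the first failing one is reported as the reason.
function canCaptureViewport(env: CaptureEnvironment): Verdict {
  if (!env.siteIsolated) {
    return { allowed: false, reason: "document is not site-isolated" };
  }
  if (!env.documentPolicyOptIn) {
    return { allowed: false, reason: "document has not opted into capture" };
  }
  return { allowed: true };
}
```

Returning a reason rather than a bare boolean is a design choice of the sketch; it speaks to the concern raised later in the thread that failures may otherwise be obscure to developers and users.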

Comment by @torgo Jun 1, 2021 (See Github)

So just to clarify - is there now going to be one consolidated proposal merging #609 and #625? If so, can we agree to close one of these issues and update the other with the consolidated and agreed proposal?

Discussed Sep 1, 2021 (See Github)

Sangwhan: 609 we're supposed to provide feedback on path forward in terms of security. Breaks the origin model?

Dan: you can already share the screen? If you're sharing the screen, what's the difference with sharing the tab? If you're sharing the screen that has that tab on it, you're doing the same thing. Firefox already lets you share the screen... you're not sharing the DOM, you're sharing an image of what's in that tab. The pixels.

Sangwhan: there's a slide deck... the more secure proposal... it was quite subtle but valid, but I don't remember the details.

Dan: we could ask for additional information and with that we can close the 625.. we're not being asked to review 625..

Discussed Oct 25, 2021 (See Github)

Dan: don't know what the latest is [reaches out to Dom]

Dan: let's discuss more at the plenary.. Dom has closed it?

[a wild Dom appears]

Dan: what's the situation? Competing issues 609 and 625... it looks like there's an agreed proposal for the future but an interim thing that Chrome is doing. We closed 625 saying we have concerns with an interim thing and we encourage them to do the work in the WebRTC WG... they said they intend to.

Dom: for 609, which asked a specific set of questions on the security properties, I closed it because I think we've got convergence: the approach we need to take is one that requires opt-in by websites to be captured. That's what's going to emerge, hopefully soon, in this new repo where this getViewportMedia spec is going to be developed. I will wait to see, but my sense is that there is now convergence in the WebRTC WG in that direction. I'm not entirely clear whether the interim proposal is still going ahead. Their concern is that they don't think websites are going to adopt this new opt-in mechanism in a short timeframe, so they want to use this capture tab from Google presentation that's going to fail in a large number of cases in a way that is obscure to developers and users. That was the motivation; how they intend to deal with that problem, I'm not clear. I think the TAG pushback was extremely useful in getting more convergence in the WG, but I'm not sure where they've landed yet.

Dan: sounds positive. We shouldn't worry until a new review request comes our way?

Dom: yes

Comment by @dontcallmedom Oct 26, 2021 (See Github)

getViewportMedia is going to be developed as a separate spec by the WebRTC WG in https://github.com/w3c/mediacapture-viewport - we will reopen a TAG design request once the proposal is in shape for it.