We’re in the process of standardizing a new API - getViewportMedia - that will allow web-applications to present a simple confirmation-only prompt to the user. The security requirements of this API are under active discussion, but consensus is forming that both cross-origin isolation and a new opt-in header will be required.
Not all applications can accept these requirements - at least not in the short-term. However, by forcing such applications to use getDisplayMedia, the user is pushed towards the riskier option of sharing the entire monitor. Why is that the riskier option? Because at the moment capture starts, the entire current monitor includes the current tab. Note that the moment capture starts is sufficient for almost any attack, as all attacks we have thus far considered could be carried out using a single frame.
A hybrid API is deemed necessary in order to offer some of the benefits of getViewportMedia without its elevated security requirements. This hybrid API will allow the application to signal its preference for capturing the current tab by way of a new dictionary member for getDisplayMedia. Namely, we will extend DisplayMediaStreamConstraints by adding a dictionary member called preferCurrentTab with a default value of false. When getDisplayMedia is invoked with preferCurrentTab=true, the browser will offer the current tab as the first option to the user, but will still offer an unlimited choice of capture sources (see image below).
The unlimited choice of sources makes this new API compliant with the requirements of getDisplayMedia. Since it complies with the requirements of getDisplayMedia, the security requirements placed on getDisplayMedia are sufficient for this new hybrid API.
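As a rough illustration of the call described above, here is a minimal TypeScript sketch. The helper names (`buildCaptureOptions`, `captureWithTabHint`) and the local interfaces are hypothetical and exist only for this example; only `getDisplayMedia` and the `preferCurrentTab` member come from the proposal. The capturer is passed in as a parameter so the logic can be exercised outside a browser.

```typescript
// Local type for the proposed dictionary member. Declared here explicitly
// because built-in DOM typings may not include preferCurrentTab.
interface HybridDisplayMediaOptions {
  video: boolean;
  audio?: boolean;
  preferCurrentTab: boolean;
}

// Minimal shape of the object being called (an assumption for this sketch).
// In a page, this role would be played by navigator.mediaDevices.
interface DisplayCapturer<TStream> {
  getDisplayMedia(options: HybridDisplayMediaOptions): Promise<TStream>;
}

// Build the options. preferCurrentTab defaults to false, as in the proposal.
function buildCaptureOptions(preferCurrentTab: boolean = false): HybridDisplayMediaOptions {
  return { video: true, preferCurrentTab };
}

// Request capture while hinting that the current tab should be offered first.
// The user may still choose any other tab, window, or monitor.
async function captureWithTabHint<TStream>(
  devices: DisplayCapturer<TStream>
): Promise<TStream> {
  return devices.getDisplayMedia(buildCaptureOptions(true));
}
```

In a page, the call would presumably look like `captureWithTabHint(navigator.mediaDevices)`. Since the user keeps the full choice of sources, the application should not assume the returned stream actually shows the current tab.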
Relevant time constraints or deadlines: We aim to ship in Chrome m92 or m93.
The group where the work on this specification is currently being done: WebRTC WG works on getViewportMedia, but is not interested in this hybrid API.
The group where standardization of this work is intended to be done (if current group is a community group or other incubation venue): WICG (I will link once this is in the WICG.)
Major unresolved issues with or opposition to this specification:
Our position, on the contrary, is that this hybrid is necessary and does not degrade security when compared to getDisplayMedia.
This work is being funded by: Google
You should also know that...
A word of caution over a source of potential confusion:
The name getViewportMedia is a later conclusion. Initially, that API was offered under the name getCurrentBrowsingContextMedia. Chrome has an active origin-trial for getCurrentBrowsingContextMedia which accomplishes the same thing as preferCurrentTab, but uses a new method instead of a new dictionary member. See the explainer.
We'd prefer the TAG provide feedback as (please delete all but the desired option):
💬 leave review feedback as a comment in this issue and @-notify @eladalon1983
Why does this use browsing context in its name? Does this survive navigations somehow?
cc @jan-ivar
Comment by @eladalon1983 Apr 26, 2021 (See Github)
The capture does not survive navigation - the capturing app is unloaded on navigation.
I am open to renaming. Any thoughts on what could be a good name for this hybrid?
I'm not sure, but from the proposed UI this seems like an option (which would have a name related to viewport to stay consistent) you would pass to getDisplayMedia().
Comment by @eladalon1983 Apr 26, 2021 (See Github)
I did consider the option of an additional constraint to getDisplayMedia, but that becomes less convenient if getViewportMedia is ever extended to receive additional parameters that don't make sense for getDisplayMedia, which is something I do plan. In that case, the hybrid getCurrentBrowsingContextMedia (gCBCM) can be meaningfully extended to accept that parameter and apply it only if the user chooses the current tab. (I suggest we migrate this discussion to the WICG repo when that one is set up. I can @mention you when it's time, if you'd like - let me know.)
A hybrid API - getCurrentBrowsingContextMedia - is deemed necessary in order to offer some of the benefits of getViewportMedia without its elevated security requirements. This hybrid API will allow the application to signal its preference for capturing the current tab. The browser will then offer the current tab as the first option to the user, but will still offer unlimited choice of capture sources (see image below). The unlimited choice of sources makes this new API compliant with the requirements of getDisplayMedia.
An application signal does not alleviate the "elevated security requirements" if the application is malicious; it defeats them.
The getDisplayMedia API deters social engineering: "User Agents are encouraged to warn users against sharing browser display devices as well as monitor display devices where browser windows are visible, or otherwise try to discourage their selection on the basis that these represent a significantly higher risk when shared." ¹
Providing malicious applications with a method that does exactly what they need seems like a bad idea.
<sub>1. See the questionaire.md and subsequent links for details of these non-obvious threats to the same-origin policy from sharing web surfaces under attacker control.</sub>
Comment by @dontcallmedom May 18, 2021 (See Github)
Since I was confused, and created confusion, about the relationship with #609, I thought I would summarize what I understand about this particular design review (at the request of @LeaVerou and @kenchris, with whom I was chatting this morning):
the proposal in this issue hasn't been discussed (let alone endorsed) by the WebRTC Working Group
the proposal in this issue addresses similar needs as the ones identified for the getViewportMedia API (on which #609 focuses) but proposes a different solution
the proposal in this issue is essentially equivalent to getDisplayMedia, the API defined by the WebRTC Working Group, but with a specific hint suggesting that the current browser tab should be captured. As @jan-ivar commented, this probably reduces the effectiveness of the mitigation built into getDisplayMedia() to avoid giving the calling page too much control over what is being captured.
The motivation I understand behind the proposal in this issue is that getting the security model being developed for getViewportMedia (which requires any embedded resources to adopt & deploy new HTTP headers) is likely to be very challenging. I'm mentioning this in case the TAG would like to chime in more generally on other approaches that might make it easier to deploy getViewportMedia.
The Chrome decision on the "need for elevated permission" for getDisplayMedia (which presents all the capture surfaces without calling out special considerations about their risks) was based on the understanding that the most common use cases would be displaying a tab or displaying the screen, so it did not make much sense to increase the cognitive overload by calling out cases that had lower risk than the common ones.
It is logical from this standpoint that presenting the current tab as a capture option doesn't need any more elevated warning; the warning is already elevated.
Lea: understand it's not, but they're working together to consolidate them. Ken and I were in this call with Dom where he explained.
Ken: Dom said he wants to reopen a TAG review with updated materials, updated explainer and S&P. Don't need the TAG to pick. Wanted to determine whether either request was safe. That was two weeks ago.
Lea: still pending external feedback? What about the other one?
Dan: 625, also 14 days ago Dom sent something. Multiple arguments going on with different people in these discussions. TAG feedback should be for them to sort it out.
Lea: not the TAG's job to consolidate
Dan: right, shouldn't be two issues open for us. But got pushback. Posting on both issues.
Yves: another option to close both and open another one if there is a dispute they want feedback on.
Lea: last resort to use us to resolve dispute
Hadley: encourage them to work together
Dan: we need more guidance to figure out what we should be reviewing
So just to clarify - is there now going to be one consolidated proposal merging #609 and #625? If so, can we agree to close one of these issues and update the other with the consolidated and agreed proposal?
There is not going to be a consolidated proposal.
(Btw, the current proposal - #625 - is going to be amended today/tomorrow, so if it's possible to hold off on reviewing it for 2 days, that'd be better.)
Hi @eladalon1983 can you please clarify this. It's highly unlikely that the TAG is going to endorse a single proposal when there are multiple competing proposals from different vendors and lack of consensus. Happy to postpone until our next design review week - which will be the 14th of June. Hoping we can have better news by then.
Dan: recent comment lays out a position from Google ... we have to consider both of these issues in parallel. Elad is saying that the thing they're proposing in 625 is a temporary measure, maybe we should be saying that's great okay but why won't that then become the de facto thing that people use? The web is full of short term measures that became the long term unfortunate thing.
Sangwhan: a lot of people are using this feature without realising it's Chrome only - it's not standardised
Dan: [shares tab using the API described in 625]
Sangwhan: a lot of people are using this - though it's only implemented in Chrome. I think the trade-off given the situation we're in is something we're going to have to live with.
Dan: Why don't we say we appreciate .. thank you for clarifying that #609 is the long term proposal, we will focus our efforts on reviewing #609. Can you provide a roadmap for how you see transitioning people from use of this API to what is described in 609 once the issues are resolved? Eg. would 609 be layered on top of 625? Based on that we could propose closed.
Dan: why is it called preferCurrentTab? comment left
The long-term path (getViewportMedia) has a standard consensus track, and that is what is tracked in TAG issue #609. But this solution has multiple complexities and non-trivial security aspects that we still need to iron out. Therefore -
preferCurrentTab is a short-term measure that solves some use cases to some degree, and doesn't have the security problems associated with getViewportMedia.
After months of discussion, there is no consensus on getViewportMedia with Mozilla, so Chrome gave up and shipped preferCurrentTab.
Thanks for the update @eladalon1983. We are going to review this at our "f2f" coming up on the 13th. I hope we can resolve and close the review by then.
Just discussed in our virtual f2f breakout. Thank you for clarifying that getViewportMedia is the long term proposal, we will focus our efforts on reviewing that. Can you provide a roadmap for how you see transitioning people from use of preferCurrentTab to getViewportMedia once the issues are resolved? The concern we have is that the web is full of technologies that were designed as short term stop gaps until a longer term thing could be worked out. We'd rather not see another one added to that list.
Comment by @eladalon1983 Sep 14, 2021 (See Github)
Once the security measures getViewportMedia requires are sufficiently rolled out, applications will naturally migrate from preferCurrentTab to getViewportMedia, because the latter offers a superior UX; namely, the user is presented with a clearer choice, and cannot choose anything but the current tab.
Chrome has UMA tracking calls to getDisplayMedia with/without preferCurrentTab (and the API invocation's result). getViewportMedia will be associated with similar UMA.
When we feel that adoption is sufficient, or that the challenges to it are no longer as significant, we can (a) communicate publicly that preferCurrentTab is about to be deprecated and (b) start printing deprecation warnings to the dev-console whenever it is used.
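To make the roadmap above more concrete, here is a hedged TypeScript sketch of two pieces: how an application might pick between the two APIs, and how usage counting plus a deprecation warning could behave. All names here (`CaptureEnvironment`, `chooseCaptureStrategy`, `createUsageTracker`) are hypothetical. getViewportMedia and its opt-in header are not finalized in this thread, and the tracker is not a description of Chrome's UMA implementation.

```typescript
// Inputs to the decision, passed in explicitly so the logic can be checked
// outside a browser. All three fields are assumptions for this sketch.
interface CaptureEnvironment {
  hasGetViewportMedia: boolean;  // whether the (not yet finalized) method exists
  crossOriginIsolated: boolean;  // mirrors the real `crossOriginIsolated` global
  optedInViaHeader: boolean;     // the new opt-in header; its name is undecided
}

type CaptureStrategy = "getViewportMedia" | "getDisplayMedia+preferCurrentTab";

// Prefer getViewportMedia only when it exists and its requirements are met;
// otherwise fall back to the hybrid.
function chooseCaptureStrategy(env: CaptureEnvironment): CaptureStrategy {
  const meetsRequirements = env.crossOriginIsolated && env.optedInViaHeader;
  return env.hasGetViewportMedia && meetsRequirements
    ? "getViewportMedia"
    : "getDisplayMedia+preferCurrentTab";
}

interface UsageCounts {
  withPreferCurrentTab: number;
  withoutPreferCurrentTab: number;
}

// Count getDisplayMedia calls with and without the hint, and warn on hinted
// calls once the feature is flagged as deprecated.
function createUsageTracker(warn: (message: string) => void, deprecated: boolean) {
  const counts: UsageCounts = { withPreferCurrentTab: 0, withoutPreferCurrentTab: 0 };
  return {
    record(options: { preferCurrentTab?: boolean }): void {
      if (options.preferCurrentTab === true) {
        counts.withPreferCurrentTab += 1;
        if (deprecated) {
          warn("preferCurrentTab is deprecated; consider getViewportMedia.");
        }
      } else {
        counts.withoutPreferCurrentTab += 1;
      }
    },
    // Return a copy so callers cannot mutate the internal counters.
    snapshot(): UsageCounts {
      return { ...counts };
    },
  };
}
```

In a page, the warning callback would most likely be `console.warn`, and the environment fields would be filled in through feature detection.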
Ok this sounds good. We still have concerns about interoperability and strongly encourage convergence on one consensus-based solution as you have laid out above.
Opened Apr 23, 2021
Ya ya yawm TAG!
I'm requesting a TAG review of getCurrentBrowsingContextMedia.
Overview
Consider the existing navigator.mediaDevices.getDisplayMedia(). It allows a user unlimited choice of sources - any monitor, window or tab.
Links and Details
Further details: