#954: Element Capture
Discussions
2024-08-26
This sits on a list of features that Google has deployed without standardization (this is still in Origin Trial, I believe).
Problems here are the same as those inherent to screen sharing. Some fingerprinting risk, but the primary risk is that this enables capture of content that a site might not otherwise be authorized to obtain.
Dan: what's the permissions story? do they document abuse cases / mitigations?
Martin: The explainer ... they are relying on the screen capture permissions model etc... they are just looking to double down on that.
Dan: so we are not asking the user to share...
Martin: there was a debate on what influence the application should have over the shape and scope of the permissions window... targeting screen sharing. One potential outcome is that the web site can share itself... there is a spec for that.
Martin: 625 was an API for capturing the current tab... the key problem with screen capture - allows the web site to see what is on the page. It can frame in content from another site and screen capture the current page including that content... this is why we have permission prompts that put user interests ahead ... element capture gives the ability of not just capturing what's on screen but what is off-screen, including cross-origin...
Dan: in what cases cross-origin?
Martin: if the element you identify is or contains an iframe for example...
Dan: what if the spec said "no cross-origin"? would that mitigate?
Martin: that was one of the mitigations discussed... we also have viewport media function...
Martin: getViewportMedia has a site isolation requirement... which uses the same COEP for SharedArrayBuffer access... so from that perspective it's probably OK. Element capture in that case would be fine... getDisplayMedia gets you some piece of screen real estate - getViewportMedia gets the current tab...
Dan: is that feedback we could provide?
Martin: it looks like they required the target to opt in... that changes things a little bit. The cross-tab... you select a different tab and within that tab you want to focus on a specific chunk of that tab. and it looks like the target tab chooses what is captured...
Dan: I have a presentation running in one tab and I want to share it - but only the presentation, not the controls, to another tab that is webrtc session...
Martin: that looks like a good use case... this does look like it's cooperative...
Dan: Seems brittle. Needs coordination. Is it specific to the capturing or captured application or specific origins. Seems to lack detail.
Martin: Might ask for more detail.
Dan: Sympathetic to the use case. Either present the whole tab or the whole screen, but then you can't see people. Moving away from the tab can destroy the stream. Need to understand how this changes the permission flow. If you are locked in to a specific target... Are you allowed to override what a site says you can share? Could people be railroaded into a specific selection?
https://w3c.github.io/mediacapture-viewport/
Martin: let's say that you've identified that element and you can render that element to a stream - and not the elements "on top" of it - what happens with transparency?
<blockquote> Hi - some pieces of feedback from our TAG breakout this morning where we reviewed this: It seems like the explainer is very lean. We think that there are a number of issues that need to be more fully explored before we can be more sure about this proposal.
In the use case that you're sharing a specific content area to an embedded iframe (the use case in the explainer) what is the permissions flow for this scenario? For example - in current screen sharing scenarios, the user may be prompted to share a tab, a window, or the whole screen. What would the user be prompted for in this case? Would they be able to choose an alternative sharing target such as another tab or the screen, or is it envisioned that in this case they would be constrained to only share content from the designated application?
Can this be treated like an extension to Viewport Capture? We note that this sort of sharing carries similar security risks as that API, and the additional constraints on capture in that API might be better suited to this use case than the more general getDisplayMedia.
The proposed API starts by preparing to share the whole of the content, and then restricting it to a particular part - have you considered ways to start with the specific part to be shared instead? (How would this affect occlusion?)
You have a goal of avoiding occlusion, but what about elements that are partially transparent? Would this capture what is rendered behind an element?
</blockquote>
2024-09-09
Matthew: thoughts... first of all they responded positively - that's good. They posted a blog and some of this should have been in the explainer... would have answered some of our questions. One thing clarified is that they are handling occlusion - you will only get a video stream of that element (not occlusions). They also address the issue whether you should be able to share a particular element... didn't address something that might be off-screen. They do address the transparency... Also with respect to the API shape, we had a question - starting with the tab and then drilling down - the reason they started that way - (1) it leans on existing infra for permissions and (2) that it sets expectations about what could be shared. I'm torn on it because I see what they're saying but it feels like extra work.
Dan: notes lack of multi-implementer support.
Matthew: one other thing - they said they did consider the alternative we suggested but there are no alternatives suggested in the explainer...
Martin: this is better than the original version but didn't address the concerns that Mozilla had... And I would like to have that discussion (about viewportcapture). I think the right thing would be to ask for the explainer to be updated... then we can move on to the next step.
Matthew to ask them to update the explainer and ask about off-screen elements
2024-10-07
Matthew: we had a load of questions and it turned out that most of the questions were addressed in the article. We asked them to put it in the explainer. A couple of the q's we had were not answered... we asked them. No feedback. I think most of our concerns were addressed. It feels a little weird - but they are doing compositing properly - removing transparency... but it would be good to have this alternatives considered section... ball's in their court.
we ping the issue to get feedback
Opened May 10, 2024
Hej TAG!
I'm requesting a TAG review of Element Capture.
A combination of pre-existing mechanisms (getDisplayMedia, Region Capture) already allows Web applications to capture a portion of the current tab as a video MediaStreamTrack, robustly cropping away irrelevant pixels. Such videos can then be transmitted remotely; removing pixels not intended for sharing helps the sharing user's privacy, and prevents distraction by the receiving users. It also helps conserve compute and network resources.
Our new API, Element Capture, takes this a step further, allowing Web applications to remove unwanted occlusions. For example, if a private message notification appears over the shared region, it is possible to avoid capturing that message, which also avoids transmitting it remotely, and therefore helps uphold privacy guarantees implicitly made to the user, who had only intended to share the target-region, and not whatever happened to be drawn over it.
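For illustration, here is a minimal TypeScript sketch of the flow described above: capture the current tab through the usual getDisplayMedia prompt, then restrict the resulting video track to a single element. It assumes the API shape of Chrome's implementation as I understand it (`RestrictionTarget.fromElement()` and `restrictTo()` on the video track, plus the non-standard `preferCurrentTab` option). The `CaptureEnv` wrapper, the `*Like` interfaces, and the helper names `supportsElementCapture` and `captureElement` are hypothetical and exist only so the sketch type-checks and can be exercised outside a browser.

```typescript
// Minimal structural types (hypothetical) mirroring only the members used below.
interface VideoTrackLike {
  // Absent on tracks from browsers without Element Capture support.
  restrictTo?: (target: unknown) => Promise<void>;
  stop(): void;
}

interface StreamLike {
  getVideoTracks(): VideoTrackLike[];
}

// Hypothetical wrapper around the browser globals, so the flow can be tested with fakes.
// In a supporting browser this could be wired up roughly as:
//   getDisplayMedia: (o) => navigator.mediaDevices.getDisplayMedia(o)
//   restrictionTargetFromElement: (el) => RestrictionTarget.fromElement(el)
interface CaptureEnv {
  getDisplayMedia(options: { video: boolean; preferCurrentTab: boolean }): Promise<StreamLike>;
  restrictionTargetFromElement?: (element: unknown) => Promise<unknown>;
}

// Feature detection: only attempt Element Capture when a target factory is present.
function supportsElementCapture(env: CaptureEnv): boolean {
  return typeof env.restrictionTargetFromElement === "function";
}

// Capture the current tab, then restrict the video track to one element.
async function captureElement(env: CaptureEnv, element: unknown): Promise<VideoTrackLike> {
  const makeTarget = env.restrictionTargetFromElement;
  if (!makeTarget) {
    throw new Error("Element Capture is not available in this environment");
  }

  // Step 1: the user goes through the existing screen-sharing permission prompt.
  const stream = await env.getDisplayMedia({ video: true, preferCurrentTab: true });
  const [track] = stream.getVideoTracks();
  const restrict = track?.restrictTo;
  if (!track || !restrict) {
    track?.stop();
    throw new Error("Captured track cannot be restricted to an element");
  }

  try {
    // Step 2: derive a restriction target from the element and apply it to the track.
    const target = await makeTarget(element);
    await restrict.call(track, target);
  } catch (err) {
    // Do not leave an unrestricted capture of the whole tab running on failure.
    track.stop();
    throw err;
  }
  return track;
}
```

Stopping the track when restriction fails is a design choice in this sketch rather than something the API mandates: without it, the application would be left holding a capture of the whole tab, which is the outcome the feature is meant to avoid.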
Further details:
You should also know that...
Strong positive Web developer feedback for this feature was expressed on https://github.com/screen-share/element-capture/issues/3 and during Screen Capture CG meetings.