#550: WebXR Depth API

Visit on Github.

Opened Aug 19, 2020

Hello TAG!

I'm requesting a TAG review of WebXR Depth API.

As Augmented Reality becomes more common on smartphones, native APIs are introducing features that give AR experiences more detailed information about the user's surroundings. The Depth API is one such feature: it would allow authors of WebXR-powered experiences to obtain the distance from the user's device to the real-world geometry in the user's environment.
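For concreteness, here is a minimal sketch of how a page might opt into depth sensing when requesting a session. The option names (`depthSensing`, `usagePreference`, `dataFormatPreference`) follow the shape the proposal later converged on and should be read as illustrative of the proposal at this stage, not a frozen API.

```ts
// Sketch: requesting an immersive AR session with depth sensing enabled.
// The exact dictionary shape is illustrative of the proposal, not final.
async function startDepthSensingSession(): Promise<XRSession> {
  if (!navigator.xr) {
    throw new Error("WebXR is not available in this browser");
  }
  return navigator.xr.requestSession("immersive-ar", {
    requiredFeatures: ["depth-sensing"],
    // Preferences let the UA pick the cheapest supported configuration.
    depthSensing: {
      usagePreference: ["cpu-optimized"],
      dataFormatPreference: ["luminance-alpha"],
    },
  });
}
```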

Further details:

  • I have reviewed the TAG's API Design Principles
  • The group where the incubation/design work on this is being done (or is intended to be done in the future): Immersive Web CG
  • The group where standardization of this work is intended to be done ("unknown" if not known): Immersive Web WG (planned)
  • Existing major pieces of multi-stakeholder review or discussion of this design: https://github.com/immersive-web/depth-sensing/issues
  • Major unresolved issues with or opposition to this design: No issues exist yet; this is an early-stage proposal that has not been fully reviewed.
  • This work is being funded by: N/A.

You should also know that...

N/A.

We'd prefer the TAG provide feedback as:

🐛 open issues in our GitHub repo for each point of feedback

Discussions

Comment by @bialpio Sep 18, 2020 (See Github)

Heads-up: the repository was accepted by the Immersive Web CG and I have moved all the content there. I have edited the links to account for the change.

Comment by @alice Sep 23, 2020 (See Github)

One initial bit of feedback: it would be good to see some non-visual use cases, such as realistic sound effects that take the room geometry into account. This might be useful for low-vision and blind people in particular. In general, it would be good to see more use cases which take disabilities into account.

Comment by @alice Sep 23, 2020 (See Github)

Aside from that: what would you suggest I look at to understand, firstly, how the coordinate system works (I see it's two unsigned dimensions...), and secondly, what normTextureFromNormView is doing?

Comment by @bialpio Sep 23, 2020 (See Github)

One initial bit of feedback: it would be good to see some non-visual use cases, such as realistic sound effects that take the room geometry into account. (...)

This is a great idea! I'll add it to the explainer - it should be possible to implement this with the API, but it may require performing scene reconstruction from the returned data.

I think it'd be easier to initially ignore the normTextureFromNormView matrix (i.e. assume it's the identity for a while) and focus on what data is being returned - I tried to explain that in the "Interpreting the data" section.

normTextureFromNormView is there to allow devices to return the data in the same format irrespective of the current device orientation and the coordinate system of the underlying platform - this way we don't have to adjust the data in the implementation (that would be costly!), but it does mean the API is a bit more complicated to use. :( The explainer focuses on "what does the API provide?" instead of "why does the API provide it like this?", but if you think it'd be helpful, I can add some text around the design choices here.

Example: WebGL textures have their origin in the bottom-left corner, but Android assumes that the origin of the screen is in the top-left corner - if we were to use the texture that we got from ARCore on Android as-is, everything would be flipped. Example 2: ARCore on Android always returns data assuming that the device is in landscape orientation - if the user happened to enter the AR session in portrait mode and we tried to use the texture as-is, everything would be rotated.
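To make the matrix's role concrete, here is a hedged sketch of reading a depth value at a normalized view-space point by first pushing it through normTextureFromNormView. The DepthData shape is a stand-in loosely modeled on the explainer (a row-major uint16 buffer plus a scale factor into meters); only normTextureFromNormView is taken from the thread, the other names are illustrative.

```ts
// Stand-in for the explainer's depth data; only normTextureFromNormView
// is taken from the discussion above, other field names are illustrative.
interface DepthData {
  width: number;                // entries per row of the depth buffer
  height: number;               // number of rows
  data: ArrayBuffer;            // row-major uint16 depth values
  rawValueToMeters: number;     // scale converting raw values to meters
  normTextureFromNormView: { matrix: Float32Array }; // column-major 4x4
}

// Apply the column-major 4x4 matrix to (x, y, 0, 1), keeping x and y.
function transformPoint(m: Float32Array, x: number, y: number): [number, number] {
  return [m[0] * x + m[4] * y + m[12], m[1] * x + m[5] * y + m[13]];
}

// Depth in meters at a point given in normalized view coordinates.
function depthAtNormView(depth: DepthData, x: number, y: number): number {
  // Re-express the point in normalized depth-buffer space, so the lookup
  // is correct regardless of screen orientation or texture origin.
  const [u, v] = transformPoint(depth.normTextureFromNormView.matrix, x, y);
  const col = Math.min(Math.floor(u * depth.width), depth.width - 1);
  const row = Math.min(Math.floor(v * depth.height), depth.height - 1);
  const raw = new Uint16Array(depth.data)[row * depth.width + col];
  return raw * depth.rawValueToMeters;
}
```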

Comment by @torgo May 11, 2021 (See Github)

Hi @bialpio. We're just considering this issue in our virtual f2f and seeing if it's possible to close it. Can you let us know the current status of this API? It looks like it's in Chrome 90, but there are no signals from other implementers? Is this currently marked as an "unofficial draft" in the spec? Is this headed for the working group, or does this remain a community group "report"? Thanks for any info you can provide.

Comment by @ylafon May 11, 2021 (See Github)

Hi, we reviewed this with @torgo during our F2F and, as with https://github.com/w3ctag/design-reviews/issues/545#issuecomment-837946515, we were wondering about detectability and its use for purposes other than the VR experience, i.e. fingerprinting and/or precise gear detection. It would be good to let the user "degrade" the experience to something more basic to mitigate that.

Comment by @bialpio Aug 26, 2021 (See Github)

Hi @bialpio. We're just considering this issue in our virtual f2f and seeing if it's possible to close it. Can you let us know the current status of this API? It looks like it's in Chrome 90, but there are no signals from other implementers? Is this currently marked as an "unofficial draft" in the spec? Is this headed for the working group, or does this remain a community group "report"? Thanks for any info you can provide.

The spec was moved to the WG in mid-March, but the Bikeshed change that reflects this landed only two days ago. There are no signals from other implementers that I know of (see Firefox, Safari).

(...) ie: fingerprinting and/or precise gear detection. It would be good to let the user "degrade" the experience to something more basic to mitigate that.

I think it could theoretically be used for fingerprinting (e.g. an app could record the data, try to create a map of the environment, and use it to distinguish one user from another when they visit the page again and enter another AR session). This is something that I touched upon in the response to Q12 in the Security & Privacy Questionnaire. Note that hit-testing could be used in a similar way, but the Depth API exposes this information at a higher resolution.

I like the idea of letting the user "degrade" the experience! I think the specification currently leaves space for that - we allow UAs to limit the resolution of the depth buffer, and this step is now exposed directly in the algorithms - this could definitely be something that UAs expose a UI around. The current implementation in Chrome limits the resolution to 240x180 - if more entries are available in the depth buffer, we will not return anything to the app, as we do not currently attempt to reduce the resolution.
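To illustrate the kind of limiting being discussed, here is a sketch of how an implementation might downsample and quantize a depth buffer before exposing it. This is purely illustrative: the spec permits such limiting but does not prescribe any particular algorithm, and the 240x180 cap below simply mirrors the Chrome figure quoted above.

```ts
// Sketch of a UA-side mitigation: reduce both the resolution and the
// precision of a raw depth buffer before exposing it to content.
// Purely illustrative; the spec allows limiting but mandates no scheme.
function limitDepthBuffer(
  raw: Uint16Array,
  width: number,
  height: number,
  maxWidth = 240,   // mirrors the Chrome cap mentioned above
  maxHeight = 180,
  quantStep = 64,   // coarsen raw values to reduce fingerprintable detail
): { data: Uint16Array; width: number; height: number } {
  const scale = Math.max(1, Math.ceil(width / maxWidth), Math.ceil(height / maxHeight));
  const outW = Math.floor(width / scale);
  const outH = Math.floor(height / scale);
  const out = new Uint16Array(outW * outH);
  for (let row = 0; row < outH; row++) {
    for (let col = 0; col < outW; col++) {
      // Nearest-neighbor downsample, then snap to a coarser value grid.
      const v = raw[row * scale * width + col * scale];
      out[row * outW + col] = Math.floor(v / quantStep) * quantStep;
    }
  }
  return { data: out, width: outW, height: outH };
}
```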

Discussed Sep 1, 2021 (See Github)

Dan/Rossen: writes proposed comment

Comment by @atanassov Sep 15, 2021 (See Github)

Hi @bialpio – @torgo, @kenchris and I looked at the issue during our Gethen vf2f and generally like the direction it is going.

The solution/mitigation you describe above (degrading/fuzzing to reduce the possibility of use for fingerprinting) sounds reasonable. Is there an issue in the working group that you can point to where this is being discussed? If there's active work going on here, then we can resolve our review.

Comment by @bialpio Sep 15, 2021 (See Github)

Hi @bialpio – @torgo, @kenchris and I looked at the issue during our Gethen vf2f and generally like the direction it is going.

The solution/mitigation you describe above (degrading/fuzzing to reduce the possibility of use for fingerprinting) sounds reasonable. Is there an issue in the working group that you can point to where this is being discussed? If there's active work going on here, then we can resolve our review.

There is no issue in the repo for this particular aspect of the spec (I remember exposing the "limit the resolution" / "block access" parts more prominently in the algorithms in response to feedback, but I cannot find a written note about it anywhere now). Regarding active work, do you think the current phrasing around the allowed behavior is insufficient? If so, I can expand the existing text a bit to make it clearer that user agents are allowed to build these kinds of controls into the UX around the API. The existing relevant paragraph is:

"In order to mitigate privacy risks to the users, user agents should seek user consent prior to enabling the depth sensing API on a session. In addition, as the depth sensing technologies & hardware improve, the user agents should consider limiting the amount of information exposed through the API, or blocking access to the data returned from the API if it is not feasible to introduce such limitations. To limit the amount the information, the user agents could for example reduce the resolution of the resulting depth buffer, or reduce the precision of values present in the depth buffer (for example by quantization). User agents that decide to limit the amount of data in such way will still be considered as implementing this specification."

I don't think it'll be controversial to add more text around this, so I'll just go ahead with a PR.

Discussed Sep 20, 2021 (See Github)

Rossen: drafts closing comment

CLOSED

Discussed Sep 20, 2021 (See Github)

Dan: A response from Piotr... doesn't think it'd be controversial to add more text.. similar conversation around raw camera access, he's getting a lot of feedback about fingerprinting and privacy issues.. I want to say we'll review it when there's a PR

Rossen: yeah.. a lot of squishy language, UAs 'could'.. prefer to see more normative

Dan: yeah

Rossen: if this is a note that'd be fine. if this is a spec they need to work more on the language, but right direction

Dan: in your note about fingerprinting he did talk about the S&P questionnaire. I'm wondering if there's anything else we can point him to, the unsanctioned web tracking finding or something, that would underscore the issue. Great feedback that there needs to be more normative text.

Rossen: can leave a comment, schedule for next week. Outside of fuzzing, once this is done, we should consider closing it?

Dan: I think so.

Comment by @atanassov Sep 20, 2021 (See Github)

Thank you for pointing us to the existing language. This is a great start, and I would advise making the normative text stronger in its recommendations. For example, when you say user agents should consider limiting the amount of information exposed through the API, that recommendation would be better as a must, etc.

Comment by @atanassov Sep 22, 2021 (See Github)

After another read-through of this review during our plenary today, we are happy to close the issue as satisfied. We hope that you will take our advice into account and work with us again in the future. Thank you, and good luck with further progress on WebXR.