design-reviews#414: Trust Token API

#414: Trust Token API

Opened Sep 3, 2019

Hello TAG!

I'm requesting a TAG review of:

Name: Trust Token API
Specification URL: N/A
Explainer (containing user needs and example code)¹: https://github.com/WICG/trust-token-api
GitHub issues (if you prefer feedback filed there): https://github.com/dvorak42/trust-token-api/issues
Tests: N/A
Primary contacts (and their relationship to the specification): @dvorak42, @csharrison

Further details:

Relevant time constraints or deadlines: We’d like to discuss this at TPAC (Sept 16, 2019). Other than that no hard time constraints.
I have read and filled out the Self-Review Questionnaire on Security and Privacy. The assessment is here.
I have reviewed the TAG's API Design Principles
The group where the work on this specification is: No group yet

We recommend the explainer to be in Markdown. On top of the usual information expected in the explainer, it is strongly recommended to add:

Links to major pieces of multi-stakeholder review or discussion of this specification:
Links to major unresolved issues or opposition with this specification:
- See the Future Extensions section in the explainer for a few open problems that would be good to resolve.

You should also know that...

We’re still very early stage here, just looking to get TAG review earlier rather than later.

We'd prefer the TAG provide feedback as (please select one):

open issues in our GitHub repo for each point of feedback
open a single issue in our GitHub repo for the entire review
leave review feedback as a comment in this issue and @-notify [github usernames]

¹ For background, see our explanation of how to write a good explainer.

Discussions

Comment by @hadleybeeman Dec 3, 2019 (See Github)

Hello! @hober and I discussed this at our face to face in Cupertino.

Two main points from us:

What happens if the issuer is a bad actor?

This design only works in the way you've intended if the issuer is properly anonymising and randomising the tokens. What happens if the issuer isn't a trustworthy organisation?

And since the user has no role in selecting the issuer, the user then gets no say in who that might be. If we end up with an ecosystem of dodgy issuers, can the user protect themselves?

It seems like this could be mitigated by an approach like the one in Web Payments, where the browser keeps a set of payment methods that the user is happy with. The shopping site has a list of payment methods it supports. At purchase time the site supplies its options; the browser picks from those. This is a nice quality: the user agent has a role in choosing. This is the role of the user agent.

We recognise that users can't express their preferences on advertisers at all. Could a similar approach work here?
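As a rough illustration of the selection step suggested above, and not something taken from the explainer: the site offers the issuers it supports, and the user agent picks one that is also in a set the user accepts. The function name and the issuer origins below are hypothetical.

```python
from typing import Optional, Sequence, Set


def pick_issuer(site_supported: Sequence[str], user_accepted: Set[str]) -> Optional[str]:
    """Return the first issuer the site supports that the user agent also accepts.

    The site's list order is treated as its preference order. None means there
    is no overlap, in which case the user agent would not use any token.
    """
    for issuer in site_supported:
        if issuer in user_accepted:
            return issuer
    return None


# Hypothetical issuer origins, for illustration only.
site_supported = ["https://issuer-a.example", "https://issuer-b.example"]
user_accepted = {"https://issuer-b.example", "https://issuer-c.example"}

print(pick_issuer(site_supported, user_accepted))  # https://issuer-b.example
```

The point of the sketch is only that the final choice sits with the user agent rather than with the site.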

We're concerned about the potential for trust tokens to be used as categories to identify or describe the users.

You've written in the explainer:

The issuer can store a limited amount of metadata in the signature of a nonce by choosing one of a set of keys to use to sign the nonce and providing a zero-knowledge proof that it signed the nonce using a particular key or set of keys.

You say it's a limited amount of metadata: how many bits? Even a small number of bits could be risky with certain bad issuers.

We'll open issues in your github repo; these notes are here so that we have them.

Comment by @dvorak42 Dec 3, 2019 (See Github)

(Continuing thread on the trust-token-api issues).

The crypto in this scheme is resilient against a bad actor on either side (preventing token forgery from the client and preventing loss of anonymisation from the issuer). The issuer would only be able to subdivide the users of that issuer based on the presence or absence of the token (and in the private metadata case, the value of that bit of information).

There are some issues that can occur if a large number of malicious issuers are operating, where each issuer uses the bit of information it has to divide its userbase via different non-trust-related metrics. Having an allow/block list that the UA supports would help mitigate this issue.

Depending on the use case, different numbers of bits may be reasonable. For the web anti-fraud use case, there are compelling arguments for having one bit (to avoid the presence of a token telling a malicious actor they've successfully passed the fraud system/captcha/etc). Beyond that, each UA would need to consider the privacy/use-case tradeoffs carefully. This may interact with ideas such as a privacy budget.
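To make the multi-issuer concern concrete, here is a back-of-the-envelope sketch. It assumes the worst case: every issuer colludes, and each one contributes three observable states per user (no token, token with private bit 0, token with private bit 1), following the description above. The function names and the 8 billion population figure are illustrative assumptions, and the result is an upper bound rather than a measurement.

```python
def distinguishable_states(num_issuers: int, states_per_issuer: int = 3) -> int:
    """Upper bound on user buckets if all issuers pool what they observe."""
    return states_per_issuer ** num_issuers


def issuers_needed(population: int, states_per_issuer: int = 3) -> int:
    """Smallest number of colluding issuers whose combined states cover `population`."""
    issuers = 0
    states = 1
    while states < population:
        states *= states_per_issuer
        issuers += 1
    return issuers


print(distinguishable_states(1))       # 3
print(distinguishable_states(10))      # 59049
print(issuers_needed(8_000_000_000))   # 21
```

Under these assumptions a couple of dozen colluding issuers would be enough in principle to give every user a unique combination, which is why a UA-side limit or allow/block list matters even when each issuer carries only one bit.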

Comment by @hadleybeeman Dec 3, 2019 (See Github)

Also, it looks like your use case might be similar to the Verifiable Credentials work. It would be useful to talk to them and determine if you think this is a competing proposal, or where the overlaps/differences are. @burnburn @stonematt are the chairs.

Comment by @dvorak42 Dec 3, 2019 (See Github)

There's a bit of overlap, but for Trust Tokens, we are only looking to propagate a tiny amount of trust information (1 or 2 bits) and the protocol needs to be resilient against bad actors where the issuer tries injecting more information into the token/claim or do other forms of watermarking/fingerprinting of the token issuance/redemption. Given the breadth of scope of information that a verifiable credentials claim contains and the trust in the credential/claim issuer there, these probably are reasonable to remain as separate proposals with different threat models.

It's possible that the redemption attestation portion of Trust Tokens might be adaptable to look like the Verifiable Credentials, though a simple public key signature scheme works fine for that.

Comment by @torgo Mar 3, 2020 (See Github)

@dvorak42 @csharrison we're just trying to make some progress on this issue. While we're doing that, can you let us know if there have been developments recently on the spec, and especially if there is any information on implementations and use of this? Also it looks to us like there is currently no requirement for asking for user permissions. If this is the case, can you expand on the rationale here? It looks to us like this is a very powerful API that cuts across origins, and that potentially violates the same origin principle. We are concerned that users would not be expecting information from one domain to be available to another domain.

Comment by @dvorak42 Mar 3, 2020 (See Github)

There has been some work on the spec side for the underlying protocol (Privacy Pass) which is going through the IETF standardization process, and as that updates we'll be updating the Trust Token API design. Initial work has begun on implementing this API in Chromium and we hope to run small-scale experiments with it soon to verify the feasibility of this API and whether it is sufficient for use cases that might need it. We currently don't require user permissions, as the capabilities of this API are currently substantially less than for ordinary 3P content within a page which don't require permissions, we'll likely need a new model if we try to move a lot of these 3P-esque capabilities behind permissions as prompting on every new page visit (even just the CAPTCHA case where you'd need to accept a user permission before using the CAPTCHA or be forced through a longer flow) that uses these capabilities would cause user fatigue.

Comment by @hober Mar 3, 2020 (See Github)

Comment by @hober Mar 5, 2020 (See Github)

the capabilities of this API are currently substantially less than for ordinary 3P content within a page which don't require permissions, we'll likely need a new model if we try to move a lot of these 3P-esque capabilities behind permissions

@atanassov and I took another look at this during our Wellington F2F. We had a bit of trouble parsing your comment; could you try to clarify this bit for us? Specifically, when you say "ordinary 3P content within a page which don't require permissions," could you give us a concrete example? You say that "we'll likely need a new model if we try to move a lot of these 3P-esque capabilities behind permissions". If you look at the current browser landscape, do you think it's reasonable to expect "a lot of these 3P-esque capabilities [to move] behind permissions"? That is, maybe the time to look into finding a new model is now?

Comment by @hober May 27, 2020 (See Github)

Hi @dvorak42!

@plinss and I took another look at this in this week's TAG F2F, and we're hoping you could answer some of the questions I asked in my last comment.

Comment by @dvorak42 May 28, 2020 (See Github)

Sorry, missed the original response.

3P cookies/storage are the current type of content that isn't primarily behind active user permissions. I agree that as the browser landscape moves towards limiting 3P content we need some sort of model, but I'm not sure that using permissions as they currently exist is the right approach here. Requiring the user to click through permissions on every page that wants to mitigate fraud/DoS/etc would end up with user fatigue. There's also the question of whether a new model for these sorts of 3P-esque capabilities should be worked out on an API-by-API basis or whether there should be a more holistic approach to handling these types of capabilities.

Comment by @davidvancleve Sep 15, 2020 (See Github)

Guten TAG,

Motivated by a likely paucity of tokens available on mobile, we're thinking through ways to expand trust token coverage by supporting on-device token issuance; we'd appreciate expanding the scope of this TAG review to include any more concrete subsequent design for on-device token issuance, too. (Updates to follow in the linked bug, and in edits to docs in the Trust Tokens repository.)

Thanks!

Comment by @torgo Sep 22, 2020 (See Github)

Hi @davidvancleve @csharrison - We're just coming back to this in our virtual f2f. @plinss will potentially open up an issue with you about the tracking potential we see. In the mean time, could you give us a brief update on where things are at with your experimentation using this technology?

Comment by @dvorak42 Sep 22, 2020 (See Github)

Currently we're running an origin trial in Chrome to see whether the signal in a token is strong enough to be suitable for anti-fraud purposes. We've been reaching out to folks to try to get more participants in the origin trial and to see what use cases the API can be useful for, but due to the complexity of spinning up an issuer/redeemer setup, we haven't gotten many external participants running their own code. We're in the process of rolling out demo sites to test the redemption side of the API, and a library to support folks running their own issuer during the OT, which will hopefully allow more people to experiment with the API.

Discussed Jan 1, 2021 (See Github)

Hadley: I would like to ping the requestor again and see where they are now...

[discussion on trust tokens]

Hadley: possibility for misuse ... totalitarian govts...

Peter: concern of being able to categorize people based on metadata from the tokens... De-anonymization ... e.g. sign in to cloudflare sign-in with google... Cloudflare knows you're using google. Trust tokens could be used to allow anonymous access.

Hadley: it looks like after that initial discussion they are working on it...

Peter: as I recall there are mitigations...

Hadley: Mozilla supportive of the ideals and goals but needs more security analysis...

Amy: user activation at issuance of the token - I can't picture what that would look like. https://github.com/WICG/trust-token-api#mitigation-dynamic-issuance--redemption-limits

[we left more feedback on the issue and are waiting for response...]

Comment by @torgo Jan 26, 2021 (See Github)

Hi @martinthomson - we are just reviewing this in the TAG f2f this week and we were wondering if there was any updated research or position from Mozilla beyond what Tess pointed to from March of last year?

Comment by @hadleybeeman Jan 26, 2021 (See Github)

Hi @dvorak42 @csharrison. We're just wondering how the origin trial is going? Have you learned anything that is changing your approach?

Comment by @rhiaro Jan 26, 2021 (See Github)

I see in the privacy considerations:

At issuance, we require user activation with the issuing site.

and was wondering if you can go into more detail about what this looks like from the user's perspective?

Comment by @dvorak42 Jan 26, 2021 (See Github)

While we don't have concrete numeric data yet on how effective the API is, the OT and external feedback has indicated that some parts of the API need to have a few more toggles to support various use cases. The largest change has been making the redemption record be a free-form blob that issuers can structure however it most makes sense for specific issuers. This change also introduces the possibility of merging the various redemption flows into one API (the issuer can decide whether or not to return a redemption record, which either matches with the previous 'raw-token-redemption' or 'srr-token-redemption' flows).

We also need to add more explicit support for issuers not necessarily being the first party on an issuing site. For the CAPTCHA use case, the CAPTCHA issuance logic might be embedded in sites as 3P content, and not be the same as the top-level page the user is visiting. Along with potentially optimizing those paths (allow an issuance to also be a redemption in the case that you want a redemption record at the time the issuer is issuing tokens, or you want to use the presence of a redemption record to guide the decision to provide more tokens).

This ties in a bit with the user activation question. The actual mitigation is that we want to have a signal that the user is intentionally navigating to/interacting with the page, rather than this page being loaded in the background or via a long redirect chain through a ton of sites that are issuing tokens. From a user's perspective, the user activation signal is implicit in their use of a web page using the API, rather than an explicit pop up they have to click or prompt they have to interact with.

Comment by @martinthomson Jan 28, 2021 (See Github)

I don't really have a lot to add here. There has been some activity that I haven't been following closely, but I'm not seeing any concrete progress on the truly thorny pieces of this.

Many of the privacy properties of the underlying privacy pass work depend on the client having a clear understanding of what information it is propagating across privacy boundaries. As a generic mechanism, this becomes essentially impossible to validate without knowledge of the application context and the information that is being exchanged. I don't think that we are in any position to say that a generic framework like the one proposed is workable.

There are things that might be OK to enshrine in the platform with only limited safeguards (those safeguards might extend to including explicit consent, though opinions on what is appropriate here differ widely). Steven is talking here about using this for CAPTCHA, in which case the information being carried might be "X believes that this client is not a robot", which is one of the best example applications of this that we currently have. Even there, there are difficult caveats to work through. That includes those issues Steven mentions, but larger questions too.

I haven't seen progress (though, again, I'm not paying close enough attention, sorry) to suggest that the embedding of information through the choice of token issuer keys has been adequately addressed, nor the corresponding issue of centralization that the solutions to that problem generally lead to. These are really difficult problems, even for the relatively narrow space of making assertions about the difference between natural and artificial intelligence.

I don't know if the TAG has any established policy with respect to research projects. The IETF is generally careful to identify and avoid projects that include a significant exposure to questions unanswered in science. This is one of those cases where you might be best deferring any concrete resolution until those central questions are answered.

My intent here is not to discourage the research (this could be a really useful technology), but to ensure that it is better understood. Again, if there have been results regarding these questions and I simply missed them, I apologize and hope that Steven or Charlie can enlighten us all. (I will read that work with great interest.)

Discussed May 1, 2021 (See Github)

Dan: is this in privacyCG?

Tess: no. WICG.

Dan: explainer updated.. rereview?

Hi @csharrison @dvorak42 - we're picking this up again at our virtual f2f this week. It looks like this work is ongoing in WICG. Can you provide any further updates? Any response to @martinthomson's message above? Should we be re-reviewing? If so can you let us know what's recently changed in your design?

Comment by @torgo May 13, 2021 (See Github)

Hi @csharrison @dvorak42 - we're picking this up again at our virtual f2f this week. It looks like this work is ongoing in WICG. Can you provide any further updates? Any response to @martinthomson's message above? Should we be re-reviewing? If so can you let us know what's recently changed in your design?

Comment by @dvorak42 May 13, 2021 (See Github)

We're doing some work in the Privacy Pass IETF working group to try to more explicitly handle some of these issues (being more explicit about the boundaries/contexts operations are being done in, trying to pull in and articulate the centralization concerns to try mitigating them in the solutions/protocol changes).

Generally I agree: I think the API will need more explicit mitigations/safeguards around the use of issuance/redemption in different contexts/origins/etc to protect against cross-site tracking/fingerprinting, rather than relying on an understanding of the sort of information being embedded.

I can write up a doc gathering safeguards and boundaries included to try mitigating some of the cross-site tracking concerns to get a review over that model/framework and related concerns that have come up from the Privacy Pass side.

Comment by @torgo May 13, 2021 (See Github)

That sounds great! Let us know when that doc is ready and we can have a look at that point.

Comment by @hadleybeeman Jun 14, 2021 (See Github)

Hi @dvorak42 @csharrison! We're just checking in on this. Any progress on that safeguards and boundaries document? Or is there anything else we can do to be helpful here?

Discussed Jul 12, 2021 (See Github)

Dan: sending message to Chris Harrelson - will report back at plenary

Comment by @dvorak42 Jul 19, 2021 (See Github)

Sorry, missed the message. Not a ton of progress yet. Some of the requisite framing has been merged into the Privacy Pass draft (https://github.com/ietf-wg-privacypass/base-drafts/blob/master/draft-ietf-privacypass-architecture.md#redemption-contexts) and we'll have a quick update with the WG at IETF next week for that. Hopefully we can get the Trust Token side document out by mid-August. We're also currently trying to finish up another update to the explainer to help articulate some of the ecosystem/deployment shapes that have turned up; that should hopefully land in the next couple of weeks.

Discussed Aug 16, 2021 (See Github)

Peter: looks like we're waiting for input from them

Tess: update a month ago about an update to the explainer that should land.. they did make three changes in august but not to the explainer

Peter: Let's check in in a couple of weeks, and if not, face-to-face

Comment by @dvorak42 Aug 17, 2021 (See Github)

As a quick update, we've landed the ecosystem/deployment doc at https://github.com/WICG/trust-token-api/blob/main/DEPLOYMENTS.md. We expect to have the framework doc published in the next few weeks (taking a bit longer to get everything worked out), and will ping this thread once that's landed.

Discussed Sep 1, 2021 (See Github)

Tess: waiting for an update from them

Comment by @hober Sep 14, 2021 (See Github)

Hi,

As we wait for the updated framework doc, I wanted to make sure we mentioned in this thread that (assuming all of the underlying issues can be worked through) we're interested in declarative integration of trust tokens into HTML forms, see #558 for more.

Comment by @dvorak42 Oct 13, 2021 (See Github)

We've finally landed the privacy framework document (https://github.com/WICG/trust-token-api/blob/main/PRIVACY_FRAMEWORK.md). There are a number of parameters that UAs will need to set based on their privacy model/principles. At some future point they could be tied into a site's privacy budget, to allow more issuances/redemptions to happen on a site if it's not using many other features that draw on the privacy budget.
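As a minimal sketch of what one such UA-set parameter could look like in practice, the snippet below caps how many distinct issuers a single top-level site may use. The class name, the parameter name, and the limit of 2 are assumptions made for illustration; the actual parameters and their values are whatever the framework document and each UA define.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class IssuerLimiter:
    """Caps the number of distinct issuers per top-level site (hypothetical policy)."""

    def __init__(self, max_issuers_per_site: int) -> None:
        self.max_issuers_per_site = max_issuers_per_site
        self._issuers_by_site: DefaultDict[str, Set[str]] = defaultdict(set)

    def allow(self, top_level_site: str, issuer: str) -> bool:
        """Return True if this site may use this issuer, recording it if newly admitted."""
        seen = self._issuers_by_site[top_level_site]
        if issuer in seen:
            return True  # already associated with this site
        if len(seen) >= self.max_issuers_per_site:
            return False  # site has used up its issuer allowance
        seen.add(issuer)
        return True


# Hypothetical limit and origins, for illustration only.
limiter = IssuerLimiter(max_issuers_per_site=2)
print(limiter.allow("https://news.example", "https://issuer-a.example"))  # True
print(limiter.allow("https://news.example", "https://issuer-b.example"))  # True
print(limiter.allow("https://news.example", "https://issuer-c.example"))  # False
```

A privacy-budget integration, as floated above, would amount to making the limit depend on the site's other behaviour instead of being a constant.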

I've opened up a couple of issues for additional ways to trigger the Trust Token API (form-based triggering could help deal with most of the requirements of issue #558). We're also looking at ways that this can be triggered through HTTP headers, potentially via HTTP Auth requirements for visiting a resource.

Discussed Oct 25, 2021 (See Github)

Dan: something new for us to review. Should bring this to attention of privacy tf

Hadley: confused about how site and ua are interacting. Is limit imposed by the spec or by the ua? How does the ua get to change the site's limit?

Dan: reference to privacy principles that is in a private repo that I haven't seen before..

Hadley: this goes through first party identity which ties into FPS.. but says third parties can be allowed access to a first party identity.. first party gets to decide with which third parties to share the identities of the user.. user should be in control of that rather than the first party.. concerned there may be some assumptions underpinning the broader issue that we would like to discuss.

Dan: i want to understand the standing of that document, the people working on trust token obviously think it has some standing. I think it may be superseded by the work Jeffrey did on the privacy threat model which has now become part of the privacy principles doc in the task force. If that's the case we should be getting those folks in trust token to reference our document. Even though it's hardly done it feels more comprehensive.

Discussed Nov 8, 2021 (See Github)

Ken: new doc called privacy framework

Dan: I asked whether the privacy framework doc could point to the privacy principles doc. Jeffrey Yasskin made a PR to do that. Good sign. I was unclear what the other privacy principles doc was that they were pointing to, apparently a Chrome one, better to refer to a more community driven one.

Ken: no spec yet?

Dan: we closed the captchas are horrible issue on the basis this is being worked on

Tess: one of the things is that it depends on the privacy pass stuff at ietf. I can't evaluate the crypto properties of that stuff. I would feel better if I saw some independent analysis of it by someone who does understand cryptography. I assume that's already happened and they can just link us to it.

Peter: i recall this was based on zero knowledge proof crypto

Tess: mozilla's position on privacy pass is 'defer', states they defer until the protocol and novel crypto principles have had more thorough security analysis. I'll quote that in a comment.

Dan: where are they talking about doing this work? raises comment

Peter: explainer is in wicg

Dan: then where does it go?

Peter: I see Hadley opened a couple of issues in their repo... two years ago. Which have been responded to but still open

Hadley: feels like there's still a situation where the user ends up having to choose a token issuer that isn't reputable.. I take that back, if the site has narrowed down the list of token issuers they will trust the worst thing it can do is track the user across other properties that are controlled by the same site.. I guess that is more of a privacy compromise than is possible just from the issuing site. I don't feel like the answer has been fully bottomed out. Did they flesh out the use cases? I'd like more time to dig into this.

Comment by @plinss Nov 8, 2021 (See Github)

For tracking purposes: https://github.com/WICG/trust-token-api/issues/88 https://github.com/WICG/trust-token-api/issues/89

Comment by @hober Nov 8, 2021 (See Github)

Hi,

Mozilla's position on Privacy Pass says:

[W]e will defer making a firm position until the protocol and the novel cryptographic primitives it relies on have had more thorough security analysis.

Has there been review from independent cryptography experts? Could you point us to it, if so? If not, are you making any effort to get independent review of the cryptography?

Comment by @torgo Nov 8, 2021 (See Github)

@dvorak42 further question: Where does this work go after WICG?

Comment by @dvorak42 Nov 8, 2021 (See Github)

The crypto and protocols are in the process of getting standardized in the IETF and getting reviewed via CFRG (for the OPRF crypto primitives).

I'm not sure we know what the best home is post-WICG. Laterally, for ecosystem/broader discussion, the antifraud CG might be a good venue. For moving down the standardization path, I'm not sure what the best home would be.

Discussed Nov 15, 2021 (See Github)

Dan: there was a response to us - left comment and we'll talk about it next week.

Comment by @torgo Nov 17, 2021 (See Github)

Ok thanks @dvorak42 – can you address the question raised by Tess regarding take-up and reception of the IETF specs (and the general issue of multi-stakeholder reception)? Also can I encourage you to have a discussion about where this will go after incubation? WebAppSec maybe? I'm just trying to get an idea.

Discussed Nov 22, 2021 (See Github)

Dan: one of the questions was to do with trusttoken being based on top of privacypass, an ietf spec. It was not clear what mozilla's disposition toward privacypass is. I marked it as a potential multistakeholder issue. They responded - looks on track... not clear which bits will get standardised... maybe webappsec after incubation. My sense is that addresses the issues we've raised. Maybe mark as satisfied with caveats and close?

Amy: worth asking to open a new review when they have a concrete spec

Peter: agree

Hi - thanks for the chance to give this important work an early review. We're largely happy with the design and approach. We're still concerned about the multi-stakeholder issue and the dependency on PrivacyPass. We'd like the opportunity to review again when the spec is more concrete. Can you please either open a new issue or ping us and we can re-open this one. In the mean time we're closing this.

Comment by @dvorak42 Nov 22, 2021 (See Github)

The CFRG spec is on track and seems to have positive support, though the exact parameterization and knobs that will end up getting standardized are still shifting. The PrivacyPass spec recently updated its charter timeline and its focus on a specific instantiation of the protocol; we're hoping that the focused approach there might simplify the scope of the work and draw more positive signals. On the browser side we've mostly seen experimentation and analysis happening in Chromium and Edge.

Starting up the discussions and seeking advice, but yeah given the nature of the API WebAppSec seems like a potentially good home.

Comment by @torgo Nov 22, 2021 (See Github)

Hi @dvorak42 thanks for the chance to give this important work an early review. We're largely happy with the design and approach. We're still concerned about the multi-stakeholder issue and the dependency on PrivacyPass. We'd like the opportunity to review again when the spec is more concrete. Can you please either open a new issue or ping us and we can re-open this one. In the mean time we're closing this.