design-reviews#297: HTTP State Tokens

Opened Aug 14, 2018

Guten TAG,

I'm requesting a TAG review of:

Name: HTTP State Tokens
Specification URL: N/A
Explainer, Requirements Doc, or Example code: https://github.com/mikewest/http-state-tokens
Tests: N/A
Primary contacts: @mikewest

Further details (optional):

Relevant time constraints or deadlines: None. This is a super-sketchy idea that's just baked enough to ask people to start poking at it with toothpicks to see how runny it is inside.
I have read and filled out the Self-Review Questionnaire on Security and Privacy. The assessment is here.
I have reviewed the TAG's API Design Principles

You should also know that...

I'm only sending this to y'all at this point because I alluded to it in our last meeting. I'd appreciate y'all's feedback, but this isn't something folks have solidly bought into yet, and there's no risk of it shipping (or being implemented) anytime soon. It's a thought experiment I'd like y'all to participate in, which hopefully will lead to a reasonable design in the future.

We'd prefer the TAG provide feedback as (please select one):

open issues in our Github repo for each point of feedback
open a single issue in our Github repo for the entire review
leave review feedback as a comment in this issue and @-notify [github usernames]

Discussions

Comment by @mikewest Oct 11, 2018 (See Github)

Maybe we can talk about this at TPAC, since we'll all(?) be there?

Comment by @lknik Oct 12, 2018 (See Github)

Would you be able to compile security & privacy considerations?

Comment by @mikewest Oct 12, 2018 (See Github)

That's basically the whole explainer? What would you like to see?

Comment by @dbaron Oct 31, 2018 (See Github)

So it seems like this turns something that currently requires sites to actively track (i.e., set a cookie) into something that can now be done passively, since it at least looks like this is proposing that the HTTP state token be sent whether or not the site requests it. Though maybe that's not the intent, and the idea is that it would only be sent after Sec-HTTP-State-Options has been received? (But if that's the case, how will the first request be connected to the rest?)

Comment by @torgo Oct 31, 2018 (See Github)

@mikewest - we didn't get a chance to discuss this at TPAC. Is there any additional implementer interest? Also to David's point, will implementation of this proposal diminish the ability for the user to measure how they are being tracked? Also, will an origin get different tokens depending on the top-level origin (i.e. double-keying)?

Comment by @mikewest Oct 31, 2018 (See Github)

Thanks, @dbaron and @torgo!

> So it seems like this turns something that currently requires sites to actively track (i.e., set a cookie) into something that can now be done passively

Two points:

1. https://github.com/mikewest/http-state-tokens#opt-in discusses the question of whether an initial navigational request should be pre-populated with a value, or whether the capability should be advertised and opted into. My intuition is that binding the initial request to the next request is actually important, and that folks would just initiate another navigation if they didn't get the value to begin with (which means that this isn't actually a barrier, and it just creates annoyance for users). This, though, is not at all set in stone, and I may well be wrong in my analysis. If there's real value to making the token completely opt-in, I'm not at all philosophically opposed to doing so.
2. Note that the default delivery behavior of a token is same-site. This means that even if we do decide to auto-mint tokens for origins, they're not particularly useful for third-party tracking until and unless a user visits the site in a first-party context so that it has an opportunity to change the delivery option via a Sec-HTTP-State-Options header (because same-site tokens aren't delivered in third-party contexts). That is, if a user who's never been to any websites before visits https://example.com/, then:
   - The navigational request would include a newly-minted token for https://example.com.
   - Subresource requests to https://example.com/ would include https://example.com's token.
   - Subresource requests to https://sub.example.com/ would include a newly-minted token for https://sub.example.com (because its delivery would default to same-site).
   - Subresource requests to https://not-example.com/ would not include a token for https://not-example.com (because its delivery had not been changed from same-site).
   In other words, I think this proposal actually makes third-party tracking more opt-in than it is today, insofar as the third party which wishes to track must have been visited in a first-party context at some point in the past to have changed its delivery option.
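
The walk-through above can be sketched as a small delivery check. This is only an illustration under stated assumptions, not the explainer's algorithm: `naiveSite`, `relation`, and `shouldSendToken` are hypothetical names, the three delivery values are the ones named elsewhere in this thread (cross-site, same-site, same-origin), and the "site" computation simply keeps the last two host labels where a real user agent would consult the Public Suffix List.

```javascript
// Naive stand-in for a registrable-domain lookup: keeps the last two host labels.
// Assumption for illustration only; real user agents use the Public Suffix List.
function naiveSite(origin) {
  const host = new URL(origin).hostname;
  return host.split(".").slice(-2).join(".");
}

// Classifies a request relative to the top-level page.
function relation(requestOrigin, topLevelOrigin) {
  if (new URL(requestOrigin).origin === new URL(topLevelOrigin).origin) {
    return "same-origin";
  }
  if (naiveSite(requestOrigin) === naiveSite(topLevelOrigin)) {
    return "same-site";
  }
  return "cross-site";
}

// Decides whether the request origin's token accompanies the request.
// `delivery` defaults to "same-site", as described in the comment above.
function shouldSendToken(requestOrigin, topLevelOrigin, delivery = "same-site") {
  const rel = relation(requestOrigin, topLevelOrigin);
  if (delivery === "cross-site") return true;
  if (delivery === "same-site") return rel !== "cross-site";
  return rel === "same-origin";
}
```

With the default delivery, a request to https://sub.example.com/ from a page on https://example.com/ carries a token, while a request to https://not-example.com/ does not.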

> Is there any additional implementer interest?

None that I know of.

> Also to David's point, will implementation of this proposal diminish the ability for the user to measure how they are being tracked?

I don't see how it would. Users would remain in control of these tokens in the same way they're in control of cookies.

> Also, will an origin get different tokens depending on the top-level origin (i.e. double-keying)?

That seems like a choice specific user agents could make. I'd note that Safari is the only browser to double-key storage, and they've just removed cookie partitioning from tip-of-tree WebKit. It seems more likely to me that folks will follow Firefox's approach of gating access completely for particular origins rather than attempting to shard identity across top-level pages, but I think this proposal would give user agents flexibility to do so as they see fit.

Comment by @dbaron Oct 31, 2018 (See Github)

Assuming same-site means same origin (does it?), I think (2) above addresses part of the concern. But the point about active versus passive seems related to things like the Lightbeam (formerly Collusion) Firefox extension that aims to show how sites are collaborating with each other in tracking the user. If the tracking happens automatically without any opt-in from the site, then it becomes harder to show what tracking is happening.

Comment by @mikewest Oct 31, 2018 (See Github)

> Assuming same-site means same origin (does it?)

same-site means same-site (the enum in the explainer is cross-site, same-site, or same-origin; the default is same-site for delivery, as that enables the SSO pattern of sso.site.tld that we see all over the place, which seems like a reasonable kind of thing to encourage as the default behavior).

> If the tracking happens automatically without any opt-in from the site

The proposal suggests that we mint tokens proactively for things that the user navigates to as first-parties. It does not suggest that we do the same for things that the user does not navigate to as a first-party, even if they really want it. Can you help me understand the scenario in which Lightbeam would show users bad information, or somehow misunderstand/underestimate the tracking potential a user's navigations expose?

Comment by @dbaron Oct 31, 2018 (See Github)

There's an enum in the explainer? (I don't see one.)

So perhaps not Lightbeam exactly -- but tools that help users understand what's happening with their information. So just as we try to minimize passive fingerprinting opportunities while worrying less about active fingerprinting, it seems like we ought to be concerned about the distinction between passive tracking versus active tracking -- probably even more so in tools designed for tracking than for the fingerprinting case.

Comment by @mikewest Oct 31, 2018 (See Github)

> There's an enum in the explainer? (I don't see one.)

You're right. It's much more implicit than I thought it was in the delivery section of https://github.com/mikewest/http-state-tokens#a-proposal. Sorry about that, it'll be more clear if/when I ever get around to writing a spec.

> So perhaps not Lightbeam exactly -- but tools that help users understand what's happening with their information. So just as we try to minimize passive fingerprinting opportunities while worrying less about active fingerprinting, it seems like we ought to be concerned about the distinction between passive tracking versus active tracking -- probably even more so in tools designed for tracking than for the fingerprinting case.

I agree that we need to be cognizant of the opportunities we're creating, and mindful of the ways in which they'll be abused.

That said, this proposal is strictly narrower than status quo insofar as it does not allow tokens to be minted in third-party contexts, but only in first-party contexts. That is, the site which wishes to track users will need to "actively" deliver a Sec-HTTP-State-Options: delivery=cross-site header to the user in a first-party context in order to receive tokens in a third-party context. That seems to me to satisfy your concerns with regard to the passively-available characteristics of the token. Does it not?
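
The opt-in flow described above can be sketched as a tiny user-agent-side store. This is a rough illustration under stated assumptions: `TokenStore` and its methods are hypothetical names, the token generator is a non-cryptographic placeholder, and the only header syntax taken from this thread is `delivery=cross-site`; anything beyond that is guesswork.

```javascript
// Placeholder token generator; a real user agent would use a CSPRNG.
function mintToken() {
  return Math.random().toString(16).slice(2);
}

const DELIVERY_VALUES = ["cross-site", "same-site", "same-origin"];

class TokenStore {
  constructor() {
    this.entries = new Map(); // origin -> { token, delivery }
  }

  // The user navigates to `origin` as a first party: mint a token if none exists.
  navigate(origin) {
    if (!this.entries.has(origin)) {
      this.entries.set(origin, { token: mintToken(), delivery: "same-site" });
    }
    return this.entries.get(origin).token;
  }

  // A Sec-HTTP-State-Options value arrives in a first-party context.
  applyOptions(origin, headerValue) {
    const entry = this.entries.get(origin);
    if (!entry) return;
    const match = /\bdelivery=([a-z-]+)/.exec(headerValue);
    if (match && DELIVERY_VALUES.includes(match[1])) {
      entry.delivery = match[1];
    }
  }

  // Token (or null) for a request to `origin` embedded in an unrelated site.
  tokenForThirdPartyRequest(origin) {
    const entry = this.entries.get(origin);
    return entry && entry.delivery === "cross-site" ? entry.token : null;
  }
}
```

In this sketch an origin that was never visited as a first party has no token at all, and a visited origin only receives its token in third-party contexts after it has sent `delivery=cross-site`.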

Comment by @dbaron Nov 1, 2018 (See Github)

It still seems like there is a reduction in the ability to study/monitor first-party tracking. I'm not sure how big a deal that reduction is -- but I'd think some people are interested in how much they're tracked by first parties.

Comment by @mikewest Nov 1, 2018 (See Github)

> It still seems like there is a reduction in the ability to study/monitor first-party tracking.

I don't understand how. But I'd love to chat about it more! :)

Comment by @michael-oneill Nov 1, 2018 (See Github)

Third-party tracking would still be possible, even without access to third-party cookies. A first-party browsing context could create an element, or execute an XHR, with a URL formed from the session token.

var img = new Image();

// `token` is the session token available to the first-party page
img.src = "//www.third-party-tracker.com?token=" + token;

www.third-party-tracker.com could concatenate the token with the Referer header to create a cross-origin unique identifier, or the first-party origin could be in another URL parameter.
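
Both halves of that mechanism can be sketched roughly as follows. The function names and the `fp` parameter are hypothetical, chosen only for illustration; the thread does not specify how the first-party origin would be encoded.

```javascript
// Page side: build the request URL carrying the token and, optionally,
// the first-party origin in a second (hypothetically named) parameter.
function buildTrackerUrl(token, firstPartyOrigin) {
  const url = new URL("https://www.third-party-tracker.com/");
  url.searchParams.set("token", token);
  if (firstPartyOrigin) {
    url.searchParams.set("fp", firstPartyOrigin);
  }
  return url.toString();
}

// Tracker side: combine the token with the first-party origin, taken from
// the Referer header when present and from the `fp` parameter otherwise.
function crossOriginId(requestUrl, refererHeader) {
  const url = new URL(requestUrl);
  const token = url.searchParams.get("token");
  const firstParty = refererHeader
    ? new URL(refererHeader).origin
    : url.searchParams.get("fp");
  return firstParty + "|" + token;
}
```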

Comment by @mikewest Nov 5, 2018 (See Github)

> Third-party tracking would still be possible, even without access to third-party cookies.

The mechanism you're sketching below uses first-party context in order to enable tracking by specific third-parties in a given first-party's context. That's certainly a thing that can happen! It seems non-unique to this proposal, however, as any local storage mechanism (localStorage, for instance) can be used in the same way. I'm not sure that there's any technical mechanism we can provide that would allow websites to keep track of user state on the one hand, but technically disable them from sharing it with third-parties on the other.

One benefit of this proposal is that it forces that sharing mechanism to rely on explicit server-side cooperation, as the token isn't exposed to JavaScript. That does basically nothing to address the issue, given the number of alternative storage mechanisms, but at least it takes cookies off the table as a trivial mechanism for storing arbitrary state.

Comment by @michael-oneill Nov 6, 2018 (See Github)

Users should have control over third-party tracking, so more needs to be done than just blocking the session token in nested contexts. One way is to clear all user state for an origin after a reasonable time-out, say 24 hours. Perhaps an extension to Clear-Site-Data so that the state gets purged after a user-configurable duration, defaulting to 24 hours (say) if the header is not in the response. If the primary content provider has user consent, they could supply a CSD header with a longer duration, and the user would be made aware of that by suitable UA UI.
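
The time-out idea can be sketched as a simple deadline calculation. No such Clear-Site-Data extension exists today; `purgeDeadline`, `shouldPurge`, and the `retentionSeconds` input are hypothetical names standing in for whatever duration a header like that might carry.

```javascript
// Default retention floated in the comment above: 24 hours.
const DEFAULT_RETENTION_MS = 24 * 60 * 60 * 1000;

// lastWriteMs: when the origin's state was last stored (epoch milliseconds).
// retentionSeconds: duration from the hypothetical header, or undefined.
function purgeDeadline(lastWriteMs, retentionSeconds) {
  const hasHeader = Number.isFinite(retentionSeconds) && retentionSeconds >= 0;
  const retentionMs = hasHeader ? retentionSeconds * 1000 : DEFAULT_RETENTION_MS;
  return lastWriteMs + retentionMs;
}

// True once the origin's state is due to be cleared.
function shouldPurge(nowMs, lastWriteMs, retentionSeconds) {
  return nowMs >= purgeDeadline(lastWriteMs, retentionSeconds);
}
```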

Comment by @mikewest Nov 6, 2018 (See Github)

These seem like ideas that are worth discussing. I feel like they are somewhat beyond the scope of this proposal, which is aiming at replacing the underlying infrastructure of HTTP state maintenance. User-facing controls on top of that infrastructure are very important indeed, but tangential.

Discussed Nov 28, 2018 (See Github)

Hadley: Looks like we are discussing tangents. Mike said 28 days ago that it will be more clear when he writes a spec. Suggest we close it and say "great, see you then."

Dan: I agree.

Hadley: I will close

Comment by @hadleybeeman Nov 28, 2018 (See Github)

Thanks, all! And especially to @mikewest. It looks like we've explored our initial thoughts on this subject and the next step would be to review a spec when you get to writing one. (Note that we are still very interested in better state management and are cheering you on! Let us know if we can help.)

In the meantime, we'll close this issue.

Comment by @dbaron Nov 28, 2018 (See Github)

Oops, never replied about my concerns about first-party tracking from https://github.com/w3ctag/design-reviews/issues/297#issuecomment-434996368

My concern was that today, if a user visits example.com and it sets a cookie or uses any other storage API, there's something that the page does actively to store data. So it's detectable that the page is doing some form of data storage (which could be used for tracking) in a way that not all pages do -- and in a way that something like a browser extension could indicate to a user who's interested in knowing about that and monitoring which pages do some form of tracking or data storage.

If instead the page gets a token automatically, then such UI would essentially flag every page on the web as doing tracking, and would become completely useless. (It's not clear that it would be all that useful today, but that's another debate...)

Comment by @lknik Nov 28, 2018 (See Github)

@dbaron your concern is valid. In this light the change would look like a regression with respect to the current mechanism. One way to address it would be to bake in some negotiation process to actually request the token. Then again, most sites would probably end up using this, so it ends in an 'all are tracking' status.

That's a consequence of the token-based state management working as a reliable identifier/fingerprint.

Comment by @michael-oneill Nov 28, 2018 (See Github)

The user needs to be asked for consent, and it has to be "freely given, informed, specific and unambiguous". Perhaps a Permissions prompt triggered by a FP header? The information presented should also be recorded somewhere so the user can be reminded (this is relevant to all Permissions).

Comment by @mikewest Mar 28, 2019 (See Github)

> the next step would be to review a spec when you get to writing one.

I got to it: https://tools.ietf.org/html/draft-west-http-state-tokens-00. Would y'all (@plinss? @torgo?) like to reopen this issue, or shall I file another?

Comment by @mikewest Apr 8, 2019 (See Github)

Y'all might be interested in skimming through https://speakerdeck.com/mikewest/cookies-are-bad-at-http-workshop-2019, which walks through the proposal at a very high level. The spec linked above should be detailed enough to pick at, but the high-level direction is what I'm most interested in at this point: is this the right shape for a state management primitive? Should some of the constraints be loosened, tightened, etc?

Comment by @annevk Apr 8, 2019 (See Github)

It doesn't really seem to me that cookies are beyond fixing. And web developers would be well served by incremental evolution of them (e.g., easier ways of clearing, truly origin-bound cookies, etc.). It's not really clear that reduction in size and no possibility for user-information in (these) headers is sufficient motivation for replacing them. And there are also massive infrastructure changes required that make adoption seem somewhat unlikely.

Comment by @mikewest Apr 8, 2019 (See Github)

I agree with @annevk: we should make today's cookies better incrementally and create a forward-looking, aspirational replacement in parallel so that the one approach doesn't block the other.

Comment by @michael-oneill Apr 9, 2019 (See Github)

This is a great idea, and I agree the short duration default of 1 hour is in the right ballpark.

One thing to consider: text that the user agent SHOULD alert the user when the response header calls for a longer duration, or raises scope from same-origin. This could be a prompt as in ITP or some kind of visible indication in the chrome, and the user should be able to interact with it to deny or restrict the change.

MikeO

Comment by @michael-oneill Apr 9, 2019 (See Github)

Another thought: why not allow the expiry to be delayed by resetting it on user activity, keystrokes, etc.? That way the token stays alive while there is an active session, then times out after an hour. The default duration could then also be shorter (<1hr).
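
A sliding expiry of that kind might look roughly like the sketch below. `SlidingToken` is a hypothetical name and the one-hour idle timeout is just the figure mentioned in this thread; timestamps are passed in explicitly so the behavior is easy to follow.

```javascript
const ONE_HOUR_MS = 60 * 60 * 1000;

class SlidingToken {
  constructor(nowMs, idleTimeoutMs = ONE_HOUR_MS) {
    this.idleTimeoutMs = idleTimeoutMs;
    this.lastActivityMs = nowMs;
  }

  // The token is dead once a full idle period passes without activity.
  isExpired(nowMs) {
    return nowMs - this.lastActivityMs >= this.idleTimeoutMs;
  }

  // User activity (keystrokes, etc.) pushes the deadline out, but cannot
  // revive a token that has already expired.
  recordActivity(nowMs) {
    if (!this.isExpired(nowMs)) {
      this.lastActivityMs = nowMs;
    }
  }
}
```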

Comment by @annevk Apr 9, 2019 (See Github)

@mikewest to be clear, I don't think the case for the replacement has been sufficiently made.

Comment by @torgo Sep 10, 2019 (See Github)

Picking this up at our f2f meeting today...

Comment by @torgo Sep 10, 2019 (See Github)

After reviewing your RFC and especially point 1.2, I remain puzzled as to why developers would switch to http state tokens at all if cookies still exist? Why not "fix" cookies so that they have the security behaviour you're describing? @dbaron points out that this would "break" some content. Maybe it's time to break some content?

Having said that, our proposal is that this should progress in the http working group rather than here and so we think this issue should probably be closed for now and then reopened when and if the http working group requests TAG review.

@mikewest is it ok for us to close the comment for now?

Comment by @mikewest Sep 11, 2019 (See Github)

> Picking this up at our f2f meeting today...

Will you be posting minutes? Stalking through GitHub and the usual places didn't turn anything up for me...

> After reviewing your RFC and especially point 1.2, I remain puzzled as to why developers would switch to http state tokens at all if cookies still exist?

Technically, this proposal does some things that cookies can't (cookies don't do ports, for instance, nor can they directly assert provenance), which creates some security advantage to adoption. I know of ~3 places in Google that want some of these things, and I can imagine marginal adoption amongst the particularly savvy and beautiful developers.
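
The point about ports can be illustrated with a small comparison. The helper names are hypothetical, and comparing hostnames is a deliberate simplification of cookie domain matching; the only claim relied on is that cookies are not isolated by port, whereas origins are.

```javascript
// Origins include scheme, host, and port.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

// Simplified stand-in for cookie scoping: the port is ignored entirely.
function sameCookieHost(a, b) {
  return new URL(a).hostname === new URL(b).hostname;
}

const main = "https://example.com/";
const admin = "https://example.com:8443/";

console.log(sameOrigin(main, admin));     // different origins
console.log(sameCookieHost(main, admin)); // same host as far as cookies care
```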

Non-technically, resetting expectations is, IMO, best done with a new thing that has a new name. I tried to make that case in the section of the ID that you mentioned. It appears you didn't find that work compelling, but I'd like to understand why?

> Why not "fix" cookies so that they have the security behaviour you're describing?

We briefly touched on this in https://github.com/w3ctag/design-reviews/issues/297#issuecomment-480911736, and https://speakerdeck.com/mikewest/cookies-are-bad-at-http-workshop-2019:

1. We should fix cookies when we can. For example, Chrome's current push to default to SameSite=Lax and require Secure along with SameSite=None (which y'all have in your queue at https://github.com/w3ctag/design-reviews/issues/373) can be seen as a step towards aligning cookies with this proposal. You could imagine more shifts in defaults that would bring us even closer.
2. We should propose a replacement that folks can transition to because new names reset expectations and give us a clear migration path over time.
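
The two cookie defaults mentioned in the first point can be sketched as a small classification function. `effectiveSameSite` and the shape of its argument are hypothetical; the sketch only models the two rules named above (a missing SameSite attribute is treated as Lax, and SameSite=None is rejected without Secure) and ignores every other aspect of cookie processing.

```javascript
// attrs: { sameSite?: "Strict" | "Lax" | "None", secure?: boolean }
function effectiveSameSite(attrs) {
  // Rule 1: a cookie without a SameSite attribute is treated as Lax.
  const sameSite = attrs.sameSite || "Lax";

  // Rule 2: SameSite=None is only accepted together with Secure.
  if (sameSite === "None" && !attrs.secure) {
    return { accepted: false, sameSite: sameSite };
  }
  return { accepted: true, sameSite: sameSite };
}
```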

> Having said that, our proposal is that this should progress in the http working group rather than here and so we think this issue should probably be closed for now and then reopened when and if the http working group requests TAG review.

I'll defer to y'all on this, but it surprises me.

My impression has been that the TAG has generally been happier when looped into discussions that affect the underlying framework on which other things are built. I'd expect y'all to have somewhat authoritative opinions about core concepts like authentication and state management. HTTPWG is clearly where standardization would happen, but it's not clear to me that there's as much of a liaison relationship there as you're suggesting (friendly folks like @mnot notwithstanding!). I don't think the HTTPWG is in the habit of poking at the TAG for feedback.

For example, https://httpwg.org/http-extensions/draft-ietf-httpbis-header-structure.html seems like something that y'all could weigh in on, as it has pretty clear implications for the design of APIs at the application layer. So far, I haven't seen that happen. Perhaps @mnot was planning on asking y'all to weigh in as the document moves to last call?

Comment by @mnot Sep 11, 2019 (See Github)

W3C and IETF have a liaison, of course (ping @wseltzer), and we do try to keep the TAG in the loop on major developments where we can.

IIRC I've brought SH up in person during our occasional syncs, but I don't think we've had a formal TAG review. I'll be at TPAC next week (well, Tuesday to Thursday); if folks want to talk about it or just an overview, happy to oblige.

Discussed Jan 27, 2020 (See Github)

Yves: the issue is that it's a high cost to replace cookies with something new. It might be better for incremental changes to cookies to achieve that goal. I don't think just changing to http state tokens would be easy because of all of the different parts of [the ecosystem] which rely on cookies.

Yves: we did discuss this feedback previously. Not a new thing. So - should we review technical points on the latest spec or close it because we're waiting for a clear upgrading path from cookies...

Dan: should we provide some clear TAG feedback that we would like to see this couched as an incremental upgrade from cookies rather than a replacement - like what is the gradual approach?

Yves: there are experiments ongoing on cookies - lifetime, etc...

Dan: it would be good to understand the outcomes of those experiments.

Yves: not sure replacing cookies with something new will help - people will find other ways to track.

Dan: I feel like we should aim to close this issue with some definitive feedback at our f2f.

Discussed Feb 10, 2020 (See Github)

(reading issue to try to figure out what the state of it is)

Yves: maybe try asking Mike to figure out what the plans in this space (including for cookies) are? Get clarification on upgrade path? Perhaps we should ask the chairs if we want to try to make further progress on the issue -- we can discuss that in the plenary

Comment by @ylafon Feb 12, 2020 (See Github)

@mikewest what is the status of this issue? Main concerns are the transition plan from Cookies to state tokens, and if smooth transition cannot be achieved, how can Cookies be modified to inherit the new properties of State Tokens? Your point 2 above (changing to a new name to reset expectation) comes with a huge change in implementations, UAs, libraries, server components, etc... It would be sad to see a 'cookie users' web and a 'state token' web.

Comment by @mikewest Feb 17, 2020 (See Github)

Hey, @ylafon!

> what is the status of this issue? Main concerns are the transition plan from Cookies to state tokens, and if smooth transition cannot be achieved, how can Cookies be modified to inherit the new properties of State Tokens?

It's on the back burner at the moment; Chrome has been focusing on the less-radical approach initiated in https://mikewest.github.io/cookie-incrementalism/draft-west-cookie-incrementalism.html for the near term, and I expect more incremental proposals to follow.

I do intend to come back to this in light of some of the other cookie-related proposals Chrome is pushing for in the mid-term. That said, there's no action for y'all to take at the moment, and closing this out until I've found time to rewrite the proposal is probably reasonable.

Comment by @hadleybeeman Mar 2, 2020 (See Github)

Thanks, @mikewest! We're just coming back to this at our Wellington face-to-face. We're happy to close it, and will look forward to seeing your new incremental proposals whenever you'd like us to. Cheers!