#562: Storage Buckets API

Opened Oct 7, 2020

HIQaH! QaH! TAG!

I'm requesting a TAG review of Storage Buckets API.

The core of the proposal is granting sites the ability to create multiple storage buckets, where the user agent may choose to delete each bucket independently of other buckets. By contrast, today's user agents have a binary choice of either persisting or deleting all the data stored by a site.

Further details:

  • I have reviewed the TAG's API Design Principles
  • The group where the incubation/design work on this is being done (or is intended to be done in the future): WICG
  • The group where standardization of this work is intended to be done ("unknown" if not known): WHATWG
  • Existing major pieces of multi-stakeholder review or discussion of this design: https://github.com/WICG/storage-buckets/issues
  • Major unresolved issues with or opposition to this design: none
  • This work is being funded by: Google Inc.

We'd prefer the TAG provide feedback as (please delete all but the desired option):

🐛 open issues in our GitHub repo for each point of feedback

Discussions

2020-11-09

Minutes

Tess: Storage standard... very abstract. Defines a conceptual framework that other storage mechanisms can hook onto, e.g. as other specs are built on top of Fetch. Similar idea here. Also specifies what happens when you press the "clear storage data" button, etc.

... key value store. Key is what origin and what kind and value is what the backing store is. Not really something that's exposed to authors. Definition of something called a storage bucket. All the storage for an origin fits into the same bucket. API allows for programmatic access to storage buckets. We now live in a world of partitioned storage. E.g. facebook.com is one bucket, facebook.com in an iframe on NYTimes is a different bucket. API helps websites understand when the bucket under them is changing. Can allow sites to create an ephemeral storage bucket. Adding an API to it might require more work.
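The key-value framing above can be sketched as a toy model. This is an illustration only: the key shape (top-level site, origin, kind) and every name below are assumptions made for the example, not definitions taken from the Storage standard or the proposal.

```typescript
// Toy model only: the key shape and all names here are hypothetical.
type StorageKind = "indexedDB" | "caches" | "localStorage";

interface StorageKey {
  topLevelSite: string; // site of the top-level page
  origin: string;       // origin of the document doing the storing
  kind: StorageKind;    // which storage mechanism is asking
}

type BackingStore = Map<string, string>;

class StorageShed {
  private stores = new Map<string, BackingStore>();

  private static serialize(key: StorageKey): string {
    return JSON.stringify([key.topLevelSite, key.origin, key.kind]);
  }

  // Returns the backing store for a key, creating it on first use.
  obtain(key: StorageKey): BackingStore {
    const id = StorageShed.serialize(key);
    let store = this.stores.get(id);
    if (store === undefined) {
      store = new Map<string, string>();
      this.stores.set(id, store);
    }
    return store;
  }
}

const shed = new StorageShed();

// facebook.com as the top-level page.
const firstParty = shed.obtain({
  topLevelSite: "https://facebook.com",
  origin: "https://facebook.com",
  kind: "localStorage",
});

// facebook.com embedded in an iframe on another site.
const embedded = shed.obtain({
  topLevelSite: "https://nytimes.com",
  origin: "https://facebook.com",
  kind: "localStorage",
});

firstParty.set("session", "abc");
const leaked = embedded.has("session"); // false: the two contexts do not share a store
```

In this model the partitioning falls out of the key alone: the same origin under a different top-level site simply resolves to a different backing store.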

Ken: I went to TPAC breakout. You can delete some of these buckets if there is a right to be forgotten request. For login info you really don't want that to be gone - or for a boarding pass - giving developers more control or due to regulatory requirements.

Tess: Dedicated proposals for login or regulatory consents... Site wouldn't have to use its own storage for that.

Ken: normally you have to write to people "we're going to store this for xyz timeframe" and after that it's gone [under GDPR].

Dan: is there a privacy concern - data leakage between buckets?

Tess: example of 2 tabs open to facebook - it can already know...

Dan: container tabs?

Tess: it would have no way of knowing. This doesn't change that. Useful for things like storage access API - and storage access changes - getting notified that the bucket is invalid. "What happens to temp storage" - how do you swap it out in a way that's safe? Uncommitted IndexedDB transaction... This API helps with defining that precisely. Super unclear how this should work right now.

Rossen: biggest worries?

Tess: I deeply understand why we need to do this but haven't reviewed if this is a good design.

Rossen: Reading the privacy assessment - on the question of distinguishing between 3rd and 1st party context - not proposing any distinction between 1st and 3rd. Down the road it may restrict access to 3rd party. That is a warning flag to me.

Tess: what's unclear from the explainer - defining partitioned storage is an ongoing effort. That's "in flight". I agree with the worry but since it's in flight it's probably appropriate.

Ken: this is early days.

Tess: I think we should take a closer look at the API itself.

Dan: I'll do a stronger review of the privacy & security response.

Ken: I'll look at the API.

Rossen: Storage buckets - the API is hanging off of navigator. That's for the current origin?

[bumped 2 weeks]

2020-11-23

Minutes

[]

2020-11-30

Minutes

[discussion on the API]

Ken: there is no API spec yet...

Ken: [leaves comment]

Dan: I will look at the privacy & security response - from a brief look it seems reasonable.

[bumped a week]

2020-12-07

Minutes

Tess: we wanted them to provide us a summary of the API - and they have done so - in IDL. We could look at it now.

Ken: on API I'm not a big fan of APIs with Open or Create... I don't understand - all storage buckets have a name and also a title - metadata you can set. You can go into settings and see a descriptive name. You run into issues of a11y, i18n..

Tess: we've been reluctant to expose author-provided strings in security UI like that...

Ken: I think it should just be called open. Also - I don't know about the expiration thing - takes a DOM timestamp. Will this work with summer time, etc...

Tess: DOM timestamps are always UTC.

Ken: ok.
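The point about summer time can be illustrated with a short sketch. A DOM timestamp is a count of milliseconds since 1970-01-01T00:00:00Z, so arithmetic on it never sees a local clock change. The helper name `expiresAfterDays` is hypothetical and is not part of the proposal.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: an expiry `days` days after `from`, as epoch milliseconds.
function expiresAfterDays(from: number, days: number): number {
  return from + days * MS_PER_DAY;
}

// Many European time zones switched to summer time on 2021-03-28.
// Date.UTC takes a zero-based month, so 2 means March.
const beforeSwitch = Date.UTC(2021, 2, 27, 12, 0, 0);
const expiry = expiresAfterDays(beforeSwitch, 2);

const expiryIso = new Date(expiry).toISOString(); // "2021-03-29T12:00:00.000Z"
const elapsed = expiry - beforeSwitch;            // exactly 2 * MS_PER_DAY
```

Only the conversion to a local wall-clock string for display would be affected by the clock change; the stored expiration is not.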

Tess: the rest of it is "what hangs off of a storage bucket"

Ken: ...name of persist...

Tess: our naming advice would suggest a simpler name - like "save". Depends on what persist does.

... durability concept - durability "relaxed" - is the UA free to delete things? But even in "strict", if the UA is under storage pressure it's gonna get rid of your stuff. So - I don't like APIs that imply to the author that this is durable.

Ken: maybe "something i care about" / "something I don't care about"

Tess: The service worker integration is complex but it needs to be...

Ken: file API thing...

Tess: these APIs are straight from the file api... so consistency.

Ken: set expires -- doesn't it make sense to set expiration date? set expires sounds like a boolean.

Tess: to summarize: we have some naming concerns and other similar concerns... but overall it seems fine.

Dan: why is this better for user privacy?

Tess: the explainer covers this reasonably well.. Currently browsers have an all-or-nothing choice about storage. This API would allow browsers to be a little more intelligent. For example, an email client could put mailboxes in 1 bucket - and say that the durability is relaxed - and a 2nd bucket for draft messages you haven't sent yet - and put that under "strict" durability. So UA can delete the "relaxed" stuff first. This doesn't change partitioning of storage - it just makes it more convenient about how to selectively purge storage. Theoretically the privacy story is "no change" - there could be some fingerprintability - you could create different buckets but it's really hard to do...
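The email client example can be sketched as a toy eviction model. It follows the reading of durability given in this discussion (relaxed data goes first, strict data goes only if that is not enough) and is not the proposal's actual API or semantics; `Bucket`, `chooseEvictions` and the byte counts are all made up for illustration.

```typescript
// Toy model of the discussion above; every name here is hypothetical.
type Durability = "strict" | "relaxed";

interface Bucket {
  name: string;
  durability: Durability;
  bytes: number;
}

// "relaxed" sorts before "strict", so it is evicted first.
function evictionRank(durability: Durability): number {
  return durability === "relaxed" ? 0 : 1;
}

// Picks buckets to evict until at least `bytesNeeded` bytes are freed.
function chooseEvictions(buckets: Bucket[], bytesNeeded: number): string[] {
  const ordered = buckets
    .slice()
    .sort((a, b) => evictionRank(a.durability) - evictionRank(b.durability));
  const evicted: string[] = [];
  let freed = 0;
  for (const bucket of ordered) {
    if (freed >= bytesNeeded) break;
    evicted.push(bucket.name);
    freed += bucket.bytes;
  }
  return evicted;
}

const mailClient: Bucket[] = [
  { name: "drafts", durability: "strict", bytes: 2000 },
  { name: "mailboxes", durability: "relaxed", bytes: 50000 },
];

// Light pressure: only the re-downloadable mailboxes are dropped.
const lightPressure = chooseEvictions(mailClient, 10000); // ["mailboxes"]

// Heavy pressure: even the strict bucket is not safe.
const heavyPressure = chooseEvictions(mailClient, 51000); // ["mailboxes", "drafts"]
```

The second call mirrors the earlier caveat that strict durability is a priority hint under storage pressure, not a guarantee.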

Dan: expiration date could allow the developer to set expiration on data that isn't needed anymore - aids in data minimization. Good for sustainability.

Tess: Yes.

... you could imagine user agents wanting a default expiration. That would be a huge win compared to existing APIs.

... controlling the expiration of the default bucket...
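A minimal sketch of how expiration and a user agent default might combine, under stated assumptions: the 30-day default, the `ExpiringBucket` shape and both function names are invented for this example and do not come from the proposal.

```typescript
// Toy model; the names and the 30-day default are hypothetical.
interface ExpiringBucket {
  name: string;
  createdAt: number; // epoch milliseconds (UTC)
  expires?: number;  // epoch milliseconds (UTC); absent if the site set none
}

const DAY_MS = 24 * 60 * 60 * 1000;
const DEFAULT_LIFETIME_MS = 30 * DAY_MS;

// A site-provided expiration wins; otherwise the user agent default applies.
function effectiveExpiry(bucket: ExpiringBucket): number {
  return bucket.expires !== undefined
    ? bucket.expires
    : bucket.createdAt + DEFAULT_LIFETIME_MS;
}

// Returns the buckets that survive a purge at time `now`.
function purgeExpired(buckets: ExpiringBucket[], now: number): ExpiringBucket[] {
  return buckets.filter((bucket) => effectiveExpiry(bucket) > now);
}

const t0 = Date.UTC(2021, 0, 1);
const siteBuckets: ExpiringBucket[] = [
  { name: "boarding-pass", createdAt: t0, expires: t0 + 2 * DAY_MS },
  { name: "news-cache", createdAt: t0 }, // no expiration set by the site
];

const afterAWeek = purgeExpired(siteBuckets, t0 + 7 * DAY_MS).map((b) => b.name);     // ["news-cache"]
const afterTwoMonths = purgeExpired(siteBuckets, t0 + 60 * DAY_MS).map((b) => b.name); // []
```

With a default in place, data whose site never set an expiration still ages out eventually, which is the data minimization benefit raised above.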

Ken: you can't get access...

Tess: in the S&P they talk about exposing the default bucket... could be good because it encourages sites to set a

[discussed possibly closing in the plenary call this week - otherwise 2nd week of jan]

2021-01-11

Minutes

Ken: I read the feedback, it's really good. Some of the comments about names, it's consistent with existing things so I think that's okay. Consistency is very important even though the names might not be perfect.

Dan: propose close?

Ken: yeah this is early review. We like the feature, they are listening to our feedback, we have given some comments and we look forward to a late review.

Dan: did they say what the venue is?

Sangwhan: feels like webapps

Ken: they haven't said. Probably same as cookie store API. I'll write that the feedback sounds sensible.

Sangwhan: use cases and goals I'm happy with, detailed API I'm not up to date

Dan: and multi stakeholder, I don't see

Ken: done

Sangwhan: the multi stakeholder thing is at the bottom of the explainer

Ken: it's definitely there, I'll remove that

Dan: there are some words here.. when they say Gecko positive, what does that mean? I would like to see a link to the Mozilla standards position. Web developers - positive? I'd like to see more.. be good to see some evidence. Some blog posts?

Ken: I'm a web developer and I like it

Sangwhan: it takes away the problem of namespacing so you can scope storage from the developer to the browser, so that's useful

Dan: I'm in no way making the argument that it's not useful. I'd just like to see more evidence when they make these assertions.

Ken: maybe we should change our issue template to point out please send the standards position

Dan: I might add an additional comment asking for more detail on the developer feedback.

2021-01-11

Minutes

Tess: Quick summary: Ken and I looked at it, liked the general idea, wanted some kind of summary IDL to understand what the API is.

... they provided that, Ken and I haven't managed to do a second review on it since then.

... We talked about doing a 1:1 breakout to do that, but we haven't yet scheduled that. Other folks are also assigned to this... but doing a breakout between Tess, Ken and Sangwhan is near-impossible timezone-wise.

Peter: Should we push this to the f2f?

Alice: still overconstrained.

Peter: I would simply have multiple breakouts...

2021-02-15

Minutes

Yves: is there an issue with partitioning?

Ken: we discussed this but don't recall the answer.

Yves: [leaves comment asking this]

2021-04-26

Minutes

Hadley: I'm not sure they answered Yves' question. diff between storage buckets and partitioning?

Yves: organise buckets by yourself rather than by each file and resource..

Ken: control your own storage inside your own application. Like caching. You can say which is privacy sensitive..

Yves: or a bucket you pin in your cache.

Ken: per user per site. Site is in control. UI that says you can delete cache but have a bucket you don't want to delete. Site has more control over how caches are stored. Good with regulation, some data needs to be stored and treated in a specific way. Eg. right to be forgotten.. want to remove private info but not necessarily cached images.

Dan: positive from firefox, no signal from edge or safari

Yves: fine to close