#205: Web Lifecycle for system initiated Discarding & Stopping

Opened Oct 13, 2017

Hello! I'm requesting a TAG review of:

You should also know that... This is an early stage proposal, which we are just starting to implement. This proposal has been informally discussed with other browser vendors (Mozilla, Apple, Microsoft) and their high level feedback has been incorporated.

We'd prefer the TAG provide feedback as (please select one):

  • open a single issue in our Github repo for the entire review

Discussions

Comment by @cynthia Oct 17, 2017 (See Github)

Minor suggestion: StopReason {..."awwsnap"} comes to mind.

Comment by @torgo Nov 29, 2017 (See Github)

Discussed in telcon 29 Nov. There was a TPAC session. @slightlyoff : people who are excited about this should share their support.

Discussed Jan 10, 2018 (See Github)

Travis: I intend to be brief. I think this is amazing. It describes the typical lifecycle of a web page and it introduces 2 new states that a web page can go into and associated signals. The 2 new states are "stopped" (memory resident but not getting time on CPU) and "discarded" (forcefully terminated, gone) - when you see the stop signal you will be allowed to run a callback to unload some resources or prepare. After you've been stopped you may be terminated without further notice.
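
The two new states Travis describes can be modeled as a small state machine. The following is a minimal sketch for illustration only, not the proposal's API: the class name `PageLifecycleModel`, its methods, and the `log` field are all hypothetical. It also assumes, as a simplification, that a page is only discarded after it has been stopped.

```typescript
// Hypothetical model of the states described in the minutes; not the proposal's API.
type LifecycleState = "active" | "stopped" | "discarded";

class PageLifecycleModel {
  private state: LifecycleState = "active";
  private readonly onStop: () => void;
  readonly log: string[] = [];

  constructor(onStop: () => void) {
    this.onStop = onStop;
  }

  getState(): LifecycleState {
    return this.state;
  }

  // The user agent decides to stop the page: the callback runs first,
  // then the page stays in memory but no longer gets CPU time.
  stop(): void {
    if (this.state !== "active") return;
    this.onStop();
    this.state = "stopped";
    this.log.push("stopped");
  }

  // A stopped page may be terminated without any further signal,
  // so no callback runs here.
  discard(): void {
    if (this.state !== "stopped") return;
    this.state = "discarded";
    this.log.push("discarded");
  }

  // Resuming only works while the page is still memory resident.
  resume(): boolean {
    if (this.state !== "stopped") return false;
    this.state = "active";
    this.log.push("resumed");
    return true;
  }
}
```

The point of the sketch is that the stop callback is the last chance to release resources or save state, because `discard()` gives the page no opportunity to run code.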

Dan: This could possibly be used to subvert user action.

Alex: The design point is that a lot of browsers are silently doing tab discarding, e.g. Chrome on Android and Safari on iOS. They will keep snapshots (pixels) but the content is gone. Meanwhile a lot of content does want to live in the background - users will keep a lot of tabs open. What makes sense on mobile makes less sense on desktop, where those tabs might still be visible. That's one motivating use case. It's not designed to handle pop-up or pop-under detection on unload.

Travis: it seems the focus of the lifecycle is where the agent makes the decision to terminate. If the user closes the tab that's out of scope. Preventing ransomware pop-ups is high priority. I think it's a valid concern.

[ dbaron's machine spontaneously reboots and he disappears for a minute or two ]<sup>Citation needed</sup>

Travis: will continue writing up feedback.

--

Comment by @travisleithead Jan 19, 2018 (See Github)

A most excellent set of docs! CC-ing @toddreifsteck since he participated in much of this review, and is in contact with the authors...

Some feedback/questions:

  1. Hmm... how does pagehide fire after transitioning to the STOPPED state? Aren't you CPU-starved at that point?
  2. State model doesn't account for crashes, except to say that probably all state transitions are skipped... on reload should wasDiscarded be set to true? Same if a site crashes during a lifecycle callback? Might be nice to provide a bit of data that they crashed (vs. were discarded).
  3. Consider: two site instances, both DISCARDED, when one gets re-loaded, how does the site logic know which state to resume (assuming one site instance didn't overwrite the other's data)? (Might need a GUID or some unique identifier to distinguish state?) Is this a problem that can be solved in the API, or is it the web author's responsibility to handle?
  4. Consider: two frames, same domain (running on the same HTML5 event loop), if one is STOPPED, must the event loop be able to distinguish which contexts are stopped vs. not? Same question for which Render steps become optional.
    • Does STOPPED mean no more processing of the event loop (for all related browsing contexts?)
  5. What happens when one of the callbacks hangs (not returning control)? Should this situation be conveyed (similar to the crash flag noted earlier)?

Some nits:

  • In the table: pagevisibility should be visibilitystate (for the state) or visibilitychange for the event :-) It was a little unclear which you mean--I originally assumed the state, but later think you meant the event.
  • "onpagevisibilitychange" should be "onvisibilitychange"
  • API sketch: not necessary to have separate interfaces for FreezeEvent and ResumeEvent, since they provide the same state. :-)

Comment by @spanicker Mar 15, 2018 (See Github)

Major apologies for the delay, I somehow totally missed that this round of review has happened. (I discussed these points with @toddreifsteck offline already)

  1. Hmm... how does pagehide fire after transitioning to the STOPPED state? Aren't you CPU-starved at that point?

It is NOT that pagehide is fired after STOPPED -- but rather that onfreeze is fired on the way to BFCACHE. If the page is moving to bfcache (user navigation), both onfreeze and pagehide will fire. Suggestions for how to make that clear in the diagram are very welcome!

  2. State model doesn't account for crashes, except to say that probably all state transitions are skipped... on reload should wasDiscarded be set to true? Same if a site crashes during a lifecycle callback? Might be nice to provide a bit of data that they crashed (vs. were discarded).

That is indeed the intention: to set the bit when possible, even for crashes and unexpected termination cases (e.g. user undoes tab close). We want to set wasDiscarded, for instance, when the renderer crashes; however, it's not possible to provide this bit in all cases, e.g. if the browser process itself crashes.

  3. Consider: two site instances, both DISCARDED, when one gets re-loaded, how does the site logic know which state to resume (assuming one site instance didn't overwrite the other's data)? (Might need a GUID or some unique identifier to distinguish state?) Is this a problem that can be solved in the API, or is it the web author's responsibility to handle?

Great point, this is captured here: https://github.com/WICG/web-lifecycle/issues/4 and https://github.com/whatwg/html/issues/3378. The plan is to expose clientId and lastClientId on Document. clientId is the environment id that is currently only available on the service worker.
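
To illustrate how a site might use the two ids to pick the right saved state, here is a minimal sketch. Only the names `clientId` and `lastClientId` come from the plan above; the `DocumentIds` interface, the helper functions, and the in-memory `Map` (standing in for persistent storage) are assumptions made for the example. It also assumes that a reloaded document gets a fresh `clientId` while `lastClientId` carries the id of the discarded instance.

```typescript
// Stand-in for persistent storage; a Map keeps the sketch self-contained.
const savedState = new Map<string, string>();

// Hypothetical shape: only the two field names come from the thread.
interface DocumentIds {
  clientId: string;
  lastClientId: string | null;
}

// Called from the stop/freeze callback: key the state by this instance's id
// so two instances of the same site never overwrite each other.
function saveBeforeStop(doc: DocumentIds, state: string): void {
  savedState.set(doc.clientId, state);
}

// Called on reload: look up the state saved by the instance this document replaces.
function restoreAfterDiscard(doc: DocumentIds): string | undefined {
  if (doc.lastClientId === null) return undefined;
  return savedState.get(doc.lastClientId);
}
```

With two discarded instances that saved under ids `"a"` and `"b"`, a reload whose `lastClientId` is `"b"` retrieves only the second instance's state.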

  4. Consider: two frames, same domain (running on the same HTML5 event loop), if one is STOPPED, must the event loop be able to distinguish which contexts are stopped vs. not? Same question for which Render steps become optional. Does STOPPED mean no more processing of the event loop (for all related browsing contexts?)

IIUC the case you mean is multiple tabs -- with their own top level pages on the same origin -- sharing the renderer process and event loop? Is that right? If so -- then one (meta) issue is that the HTML spec is disconnected from our (and likely other browsers too?) implementation. In our implementation we have page-level and frame-schedulers in addition to renderer-level scheduler. So it is not true that they are "sharing the event loop" per se. And yes there is a need to distinguish between which contexts are stopped. Please advise on how to clarify this -- should we update the HTML spec to match the reality of implementations? \cc @domenic

  5. What happens when one of the callbacks hangs (not returning control)? Should this situation be conveyed (similar to the crash flag noted earlier)?

The callbacks have an upper time limit. Once the time limit has been reached, the browser has two options: a. bail and discard the tab (and tear down the page), since freezing has failed and the page could be in an inconsistent state; b. "pause execution", i.e. suspend JS without tearing down the page. There are tradeoffs, and we are currently investigating option b with option a as the fallback.
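
The policy under investigation (option b with option a as the fallback) can be written as a small decision function. This is only a sketch of the reasoning: the function name, the outcome labels, and the `pauseSucceeds` flag are hypothetical, and no actual time limit is implied.

```typescript
// Hypothetical outcome labels for a freeze attempt.
type FreezeOutcome = "frozen" | "paused" | "discarded";

// callbackMs: how long the page's callback ran (or would run).
// limitMs: the browser's upper time limit for callbacks.
// pauseSucceeds: whether the browser manages to suspend JS without teardown.
function settleFreeze(callbackMs: number, limitMs: number, pauseSucceeds: boolean): FreezeOutcome {
  if (callbackMs <= limitMs) {
    return "frozen"; // callback finished in time, freezing succeeded
  }
  // Limit reached: try option b first, fall back to option a.
  return pauseSucceeds ? "paused" : "discarded";
}
```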

Comment by @spanicker Mar 15, 2018 (See Github)

I have addressed the nits and updated the explainer. Thank you for the detailed review!

Comment by @travisleithead Apr 6, 2018 (See Github)

I would like to follow-up on whether I have any more feedback on the diagram (as requested above). Will try to gather that for an upcoming meeting.

Comment by @spanicker Apr 6, 2018 (See Github)

An update here: we have an initial version of the spec, and will add it to the explainer soon. Would be useful to get feedback from upcoming TAG meeting if possible, thanks!

Comment by @spanicker Apr 9, 2018 (See Github)

Here's a link to the initial draft of the spec: https://docs.google.com/document/d/1jEaYK7w-jbwdP31RR1egznxoh6c_1mtPJX6c-EUKaY8/edit# Feedback is very appreciated!

I will update this thread when the spec is moved to the github repo.

Comment by @spanicker Apr 12, 2018 (See Github)

I updated the diagrams in the repo to help address the confusion around frozen and bfcache. Feedback is very welcome, thanks!

Comment by @travisleithead Apr 17, 2018 (See Github)

The updated diagrams work for me! (It looks like events may fire on either side of the Frozen state, so that while in the Frozen state, nothing happens, but transitions into and out of that state are naturally not Frozen, so events can fire. That's my new understanding anyway.) Thanks for the clarifications!

Comment by @travisleithead Apr 17, 2018 (See Github)

This initial review looks great! We'll close this issue for now, but would love to be asked to review again once the official spec is more baked and ready.

We did note a concern about seeing a spec draft in Google docs ;-) But it looks like the Bikeshed-edition is underway (https://github.com/WICG/web-lifecycle/blob/master/web-lifecycle.bs).

Thanks for flying TAG; hope to see you again soon!