#594: Early design review of modal close signals/ModalCloseWatcher

Opened Jan 7, 2021

HIQaH! QaH! TAG!

I'm requesting a TAG review of "Modal Close Signals".

A common feature of modals (dialogs, context menus, pickers, etc.) is that they are designed to be easy to close, with a uniform platform-wide interaction mechanism for doing so. Typically, this is the <kbd>Esc</kbd> key on desktop platforms, and the back button on some mobile platforms (notably Android). Not all platforms have such close signals, but for those that do, reacting to them is challenging to do correctly with current web APIs.

The explainer for modal close signals outlines the problem space, and goes into depth on one specific solution: a ModalCloseWatcher class, which provides a platform-agnostic way for developers to intercept these close signals. However, it also notes an alternative of bundling close signal handling into higher-level modal/popup APIs.

We'd appreciate any early feedback you have on both the problem space and the solution space. What do you think of our analysis of the problem space? Do you think ModalCloseWatcher is a good idea, or should we bundle into a higher-level API, or is there a third path we're not considering? Do you agree with our analysis that, even if we don't expose a ModalCloseWatcher class as a web API, the spec and implementation infrastructure could build off of something like it under the covers?

Further details:

  • I have reviewed the TAG's API Design Principles
  • The group where the incubation/design work on this is being done (or is intended to be done in the future): WICG, most likely, or maybe as an HTML Standard pull request
  • The group where standardization of this work is intended to be done ("unknown" if not known): WHATWG (HTML Standard)
  • Existing major pieces of multi-stakeholder review or discussion of this design: none yet
  • Major unresolved issues with or opposition to this design:
    • As mentioned above, we're unsure whether to expose a ModalCloseWatcher API vs. bundle into a higher-level API
    • We've recently discovered the situation with <dialog> is a bit more complicated than we thought, so that might impact the proposed <dialog> integration, and may cause other changes to the API out of a desire to keep symmetry with <dialog>: https://github.com/slightlyoff/history_api/issues/13
  • This work is being funded by: Google

We'd prefer the TAG provide feedback as (please delete all but the desired option):

💬 leave review feedback as a comment in this issue

Discussions

Discussed Jan 1, 2021 (See Github)

Tess: I've been meaning to talk to jcraig about this; I'm reminded of IndieUI. This looks to be solving a narrower problem, though in a more imperative & complicated way.

Action: Tess to talk to jcraig

Peter: Thoughts on the API surface:

  • Activating a modal watcher via construction seems odd. It would be nice to be able to construct one and activate it later, maybe with something closer to the AbortController API.
  • No way to tell if a given watcher is the currently active one.
  • What happens when a watcher is destroyed or receives a signalClose, but it's not the active watcher? For example, a modal dialog opens another modal dialog, then the parent times out or gets a network event.
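
These questions can be made concrete with a toy model of a watcher stack. The sketch below is not the proposal's API: `ToyWatcherStack`, `ToyWatcher`, `isActive`, the separate `activate` step, and the choice that destroying a non-active watcher simply removes it from the stack are all assumptions made for illustration.

```typescript
// Toy model only: every name here is hypothetical, not part of the proposal.
class ToyWatcher {
  closed = false;
  readonly label: string;
  private readonly stack: ToyWatcherStack;

  constructor(label: string, stack: ToyWatcherStack) {
    this.label = label;
    this.stack = stack;
  }

  // Lets a caller ask whether this watcher is currently the topmost one.
  get isActive(): boolean {
    return this.stack.active === this;
  }

  // Assumed behavior: destroying any watcher, active or not, just removes it.
  destroy(): void {
    this.stack.remove(this);
  }
}

class ToyWatcherStack {
  private readonly watchers: ToyWatcher[] = [];

  // Construction and activation are separate steps in this model.
  create(label: string): ToyWatcher {
    return new ToyWatcher(label, this);
  }

  activate(watcher: ToyWatcher): void {
    if (this.watchers.indexOf(watcher) === -1) this.watchers.push(watcher);
  }

  get active(): ToyWatcher | undefined {
    return this.watchers[this.watchers.length - 1];
  }

  remove(watcher: ToyWatcher): void {
    const index = this.watchers.indexOf(watcher);
    if (index !== -1) this.watchers.splice(index, 1);
  }

  // A platform close signal (Esc, back button) closes only the active watcher.
  signalClose(): ToyWatcher | undefined {
    const top = this.watchers.pop();
    if (top) top.closed = true;
    return top;
  }
}

const stack = new ToyWatcherStack();
const parentDialog = stack.create("parent");
const childDialog = stack.create("child");
stack.activate(parentDialog);
stack.activate(childDialog);

// The parent times out while the child is still open.
parentDialog.destroy();
const closedBySignal = stack.signalClose();
```

In this model the child stays active after the parent is destroyed, and the next close signal reaches the child. Other answers are possible (for example, destroying the parent could also tear down the child); the sketch only shows that the API has to pick one.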

Tess recapped a bit about the accessibility events in AOM that got replaced with synthetic keyboard events, & some about the Indie UI "dismiss" event.

We contemplated the apparent fact that the web, for better or worse, has ended up with a 1990s desktop computer (monitor, keyboard, mouse) as its basic abstraction. We agreed that having to add a bespoke XWatcher object to the platform for every niche X seems suboptimal, and we suspect that a simple, higher-level approach (like Indie UI's "dismiss" event) would enable browsers to solve the Android back button problem without all this complexity.

We also discussed the possibility of a higher-level modal controller that also captures events, etc., i.e., everything you need to do when starting a modal session, rather than the small pieces you'd assemble one from. Modal elements like <dialog> could simply have one of those.

[Tess notes from rollout discussion, to turn into comment] We would prefer sending synthetic ESC key presses if/when the modal is either a real <dialog> or is marked up with ARIA. This problem seems the same as the increment/decrement synthetic arrow events thing in AOM—authors simply do what they should be doing anyway (provide aria attributes to say what their div soup is doing) and then the browser sends the events as needed.

Comment by @annevk Jan 8, 2021 (See Github)

I don't understand why we would not start with "Specify uniform-per-platform behavior across existing platform modals" to find out how desirable this is and if it turns out to work great, potentially give developers some control as well? It's much easier to provide an API on top of a well-baked primitive.

Discussed Feb 15, 2021 (See Github)

Tess to write up a comment on the issue summarizing our previous discussion on this.

Comment by @domenic Apr 9, 2021 (See Github)

Hi TAG! I did another pass on this proposal to address some feedback folks have given so far:

  • I renamed from ModalCloseWatcher to CloseWatcher since I don't want to canonicalize the term "modal" here. There's a lot of current thinking going on about how to name the various dialog/popup/toast/picker/menu/modal/etc. patterns in Open UI and it seems better to just be generic.

  • In response to @annevk's feedback, I elevated the unification of existing platform close signals to a top-level goal. I agree we should work on such unification as part of this work. (I know @annevk proposed doing so ahead of time, but I don't know if there'd be appetite for doing so if it's not driven by the prospect of actually fulfilling the expressed web developer need.)

  • I unified with <dialog> further. In particular this meant renaming beforeclose to cancel and talking more about how to reconcile with <dialog>'s existing behaviors.

  • I did some more research on iOS and accessibility technology so I could talk in an informed way about their close signals.

  • After some discussion with web developers we decided that this is worth pursuing alongside, and integrated with, the <popup> proposal. So the idea of just doing <popup>, and not doing CloseWatcher, makes less sense to us now.

I'd love to get your further thoughts on the revised proposal! One particular API issue that would benefit from TAG feedback is https://github.com/slightlyoff/history_api/issues/34.

Discussed May 1, 2021 (See Github)

Peter & Tess talked about this at our last F2F but never posted public comments after that discussion; we've now done so.

Comment by @hober May 12, 2021 (See Github)

Hi @domenic!

Sorry it's taken so long to get back to you on this one. @plinss and I looked at this today during the TAG's F2F, and also some months ago at the previous TAG F2F, after which we failed to follow up here.

If we take a step back from the actual specific problem you have, adding a bespoke <code><var>X</var>Watcher</code> object to the platform for every <var>X</var> in the future wouldn't be great.

While a simple, higher-level approach (like Indie UI's "dismiss" event) would enable browsers to solve the Android back button problem without all this complexity, we're reminded of the experience the AOM folks had (when they defined specific accessibility events but then ripped them out and replaced them with synthetic keypresses and the like).

Given that, we find the simplicity of going with synthetic Esc presses instead of anything more complicated here really appealing. This would be just as easy for authors to handle as the increment/decrement synthetic arrow events thing in AOM—authors simply do what they should be doing anyway (provide aria attributes to say what their div soup is doing, and handlers for arrow key presses) and then the browser sends the events as needed. Note that the browser can often target the synthetic Esc correctly, if there's a real open <dialog> on the page or an element marked up with the appropriate ARIA. Maybe the thing to do for more "exotic" closable things is to add ARIA roles for them?

Comment by @domenic May 19, 2021 (See Github)

Hi Tess! Thanks to you and Peter for taking a look! Your point about the connection with IndieUI/AOM is well-taken, and I'll reach out to those folks to learn more about the history there. Also, some of the specific API feedback pieces from the minutes are great and we'll definitely take those into account.

I worry that one big thing that was missed in the review was our section on why synthesizing an event doesn't work that well. As that section mentions, to do such a translation, the browser needs to know whether a given gesture/keypress/back button/etc. is a close signal, or is something else (like a navigation on Android, or a scroll on iOS).

To do this, the web developer needs to provide some signal to the browser that they're in a situation where close signals make sense. We've provided that via new CloseWatcher(). You could imagine other routes for giving the browser this signal, but fundamentally such a signal is needed. (Indeed, I think your suggestion about detecting elements with ARIA roles is in this category, but that'd be rather unprecedented: ARIA roles currently only affect accessibility technology, and do not change behavior, which is something I believe the ARIA WG has worked hard to preserve so that people don't abuse accessibility annotations to achieve a given user experience for non-AT users.)
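
As a rough illustration of why some opt-in is needed, the routing decision can be modeled as a small function. This is an illustrative model, not browser or spec behavior: `PlatformGesture`, `GestureOutcome`, `routeGesture`, and the `openWatcherCount` parameter are hypothetical names, and a real browser would consider far more than a single counter.

```typescript
// Hypothetical model of how a browser might interpret one platform gesture.
type PlatformGesture = "escape-key" | "android-back-button";
type GestureOutcome = "close-signal" | "history-navigation" | "plain-keypress";

function routeGesture(gesture: PlatformGesture, openWatcherCount: number): GestureOutcome {
  // A registered watcher is the page saying "closing makes sense right now".
  if (openWatcherCount > 0) return "close-signal";
  // Without that signal, the gesture keeps its ordinary meaning.
  return gesture === "android-back-button" ? "history-navigation" : "plain-keypress";
}

const backWithNoWatcher = routeGesture("android-back-button", 0);
const backWithWatcher = routeGesture("android-back-button", 1);
const escapeWithNoWatcher = routeGesture("escape-key", 0);
```

The same back-button press maps to a navigation in one case and to a close signal in the other; the only difference is whether the page has told the browser that something closable is open.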

Another thing to consider is the abuse prevention discussion. Abuse prevention is relatively easy with an imperative API like this, where there's a clear time at which the "trap" could be installed (i.e., upon the construction of the CloseWatcher object). It gets harder with other APIs, such as ones based on the presence or absence of elements in the DOM. For example, we don't want to modify generic DOM insertion code to throw if someone inserts too many elements with a given ARIA role.
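
A minimal sketch of that point, under stated assumptions: the cap of one watcher created without user activation, the `GuardedWatcher` class, the `tryCreate` helper, and the `hasUserActivation` flag are all hypothetical and are not taken from the explainer. The sketch only shows that an imperative constructor offers a single, obvious place to enforce whatever policy is chosen.

```typescript
// Hypothetical policy, for illustration only.
const MAX_WATCHERS_WITHOUT_ACTIVATION = 1;

class GuardedWatcher {
  private static createdWithoutActivation = 0;

  constructor(hasUserActivation: boolean) {
    if (!hasUserActivation) {
      // The check runs at the one moment the "trap" could be installed.
      if (GuardedWatcher.createdWithoutActivation >= MAX_WATCHERS_WITHOUT_ACTIVATION) {
        throw new Error("Too many watchers created without user activation");
      }
      GuardedWatcher.createdWithoutActivation += 1;
    }
  }
}

// Helper that reports whether construction was allowed.
function tryCreate(hasUserActivation: boolean): boolean {
  try {
    new GuardedWatcher(hasUserActivation);
    return true;
  } catch {
    return false;
  }
}

const firstWithoutActivation = tryCreate(false);
const secondWithoutActivation = tryCreate(false);
const afterUserClick = tryCreate(true);
```

A DOM-based design would need an equivalent check somewhere in element insertion or attribute mutation, which is the cost the comment above is pointing at.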

Finally, I'm curious about the implication that this approach is very complex/complicated. Would you be able to say more? From my point of view and that of framework authors I've discussed this with, this is basically "as simple as possible": a constructor to signal that you want to watch, and an event to get notified of the thing you're watching for. Framework authors generally preferred this to the complexity they'd have to introduce into their application to differentiate between different types of <kbd>Esc</kbd> keypresses, or coordinate which of several active components a given document-targeted <kbd>Esc</kbd> is meant to close, or insert elements like <dialog> into the DOM which might not fit with their application structure. I wonder if we shot ourselves in the foot here by writing too long of an explainer...

Indeed, although I recognize thinking on the extensible web manifesto has evolved over time, I think this is one of the cases where it really shines: excavating simple primitives that underlie elements like <dialog>, or the proposed <popup> light dismiss behavior, and making them first-class so that everyone can use them without needing to rely on browser magic such as synthetic <kbd>Esc</kbd> keys or elements with special privileges.

Let us know what you think!

Comment by @juandopazo Jul 30, 2021 (See Github)

We were just chatting with a friend at another big Silicon Valley company about this. Apparently some developers there thought it was a good idea to try to replicate this behavior with the history API. And it's a mess.

Ship this please! It looks great! Very simple and independent from appHistory. No brainer.

Discussed Aug 30, 2021 (See Github)

Punt to plenary to get Tess' feedback

Discussed Sep 1, 2021 (See Github)

The two of us looked at this in a previous F2F and left a bunch of comments that Domenic addressed in a follow-up. Overall we both think this is basically fine, though we're still worried that "adding a bespoke <var>X</var>Watcher object to the platform for every <var>X</var> in the future" wouldn't be great. Closed.

Comment by @hober Sep 13, 2021 (See Github)

Hi @domenic! Thanks for your detailed reply and your patience. @plinss & I took another look at this today and we're pretty happy with where you've ended up. We're still worried about "adding a bespoke <var>X</var>Watcher object to the platform for every <var>X</var> in the future," but maybe there won't be that many <var>X</var>s in the long run.

Comment by @hober Nov 16, 2021 (See Github)

Note that the AOM people settled on generating an escape keyboard event. Should we reopen this? @domenic, are you working with the AOM people on this?

https://github.com/WICG/aom/blob/gh-pages/explainer.md#user-action-events-from-assistive-technology

Comment by @domenic Nov 16, 2021 (See Github)

Unfortunately I have not yet; I got confused when I realized 3/4 of the editors were no longer working on the project and that was enough to throw me off of my initial attempt. I will attempt further outreach via the repository, although it looks like the most recent answered question on the repository was one filed October 2020 :(

Comment by @hober Nov 16, 2021 (See Github)

@domenic They have a weekly call. Maybe reach out to @cyns?

Comment by @slightlyoff Oct 18, 2023 (See Github)

Sorry for the late feedback here, and from the sidelines. I've raised design issues over in the Intent-to-Ship thread and would like the TAG to weigh in on the API style concerns:

https://groups.google.com/a/chromium.org/g/blink-dev/c/jM5au7yYzHM/m/5SWuXPMdAgAJ