design-reviews#416: EditContext API

#416: EditContext API


Opened Sep 4, 2019

Hello TAG!

I'm requesting a TAG review of:

Name: EditContext API
Specification URL: https://w3c.github.io/edit-context/
Explainer (containing user needs and example code)¹: Explainer
GitHub issues (if you prefer feedback filed there): Issues
Tests: TBD
Primary contacts (and their relationship to the specification): @snianu, @dandclark, @alexkeng

Further details:

Relevant time constraints or deadlines: None.
I have read and filled out the Self-Review Questionnaire on Security and Privacy. The assessment is here.
I have reviewed the TAG's API Design Principles
The group where the work on this specification is: Editing WG

We'd prefer the TAG provide feedback as (please select one):

open issues in our GitHub repo for each point of feedback
open a single issue in our GitHub repo for the entire review
leave review feedback as a comment in this issue and @-notify [github usernames]

Please preview the issue and check that the links work before submitting. In particular, if anything links to a URL which requires authentication (e.g. Google document), please make sure anyone with the link can access the document.

¹ For background, see our explanation of how to write a good explainer.

Discussions

Discussed Feb 24, 2020 (See Github)

[discussing and analyzing the explainer]

Alice: does Rossen know about this?

Alice: a thing to capture events that may allow you to edit text...

Alice: explainer could have more images...

Ken: typing into an input field and you get feedback while typing like typing "icecream" and getting a prompt for an emoji...

Alice: is it the author telling the page about the editing ...

Ken: both ways. If it replaces "ice cream" with the emoji it needs to know where ice cream is...

Ken: names / selection...

Ken: seems sensible... the API seems sensible. [internationalization cases ...]

Dan: I'm concerned about the responses to the security & privacy questionnaire

Ken: I assume this will only work with input methods shipped with the browser, like emoji picker and accessibility helpers. This is an API for the web site author to integrate with these, and not a way for developers to create their own input methods that will then get access to sensitive information such as selected text, passwords, etc. Maybe this could be made clearer.

Dan: let's review the assignment of this in plenary

MathML Core - @alice, @torgo, @hadleybeeman

[...discussion of where we're at now...]

[alice composes comments and pastes into issue]

Comment by @torgo Feb 25, 2020 (See Github)

We are reviewing today. One question I had was about the security & privacy questionnaire. It seems like this technology would have significant access to sensitive information... 2.4 and maybe needs to elaborate on how it would mitigate against misuse / elaborate possible abuse cases?

Comment by @kenchris Feb 25, 2020 (See Github)

Dan was worried about the access to sensitive information. I assume, though, that this will only work with input methods shipped with the browser, like emoji picker and accessibility helpers.

This is an API for the web site author to integrate with these, and not a way for developers to create their own input methods that will then get access to sensitive information such as selected text, passwords, etc.

Maybe this could be made more clear early on in the explainer

Comment by @BoCupp-Microsoft Feb 25, 2020 (See Github)

My definition of sensitive data would be any information conveyed by the user to the site that the user didn't explicitly intend to provide.

In the case of EditContext, which deals with delivering text that has been input by the user to the active document, I expect that the user's intent is to provide the site with the text that has been typed on a keyboard, or composed in an IME, or spoken to an OS speech to text input mechanism, etc. EditContext doesn't expose any of the details of how the text was provided directly, only the resulting text and some direction as to how the text should be decorated during the process of composition.

The text data input by the user is already available through alternative means, e.g. the beforeinput event; the EditContext is just providing it in an alternate way that is decoupled from the DOM. The decoration information seen in the IDL as part of the TextFormatUpdateEvent is new, but we don't consider it sensitive.

If there is a threat to be considered for formatting information, it would be that an author may differentiate one input method from another based on the conventions that the input method has adopted for formatting its text during composition. For example, speech input may use a dotted gray underline while a Japanese IME would use a solid black underline. This formatting data is necessary for sites like Google Docs and Office Online to meet user expectations during text input, and we believe it's acceptable to expose this information to web sites so that the user can input text in a way that is consistent with their experience.

Note that there are other mechanisms already that may reveal similar information about the user. One example is the heuristics used by IMEs to suggest the candidates most frequently selected by the user for some phonetic input. The first candidate after typing the phonetic input will be inserted into the DOM and visible to the author's script. While this may provide some new bit of information to fuel fingerprinting, it allows the user to have a fast and consistent input experience, which IMO outweighs the minor privacy concern.

I hope this helps. If this is the information you're looking for we're happy to include it in the explainer. If you need more or disagree with any of the points we've raised please let us know.

Thanks!

Comment by @cynthia May 27, 2020 (See Github)

@atanassov and I discussed this during the VF2F.

From a high-level perspective, we are happy with this proposal. It's a complex, previously unaddressed problem and we are glad to see someone working on this.

I personally would like to see how this is expected to behave in corner-case scenarios, and what chain of events come out in different user scenarios across different languages (e.g. non-compositing, compositing without candidates, and compositing with candidates) to better understand if we are actually solving this problem once and for all.

As for the privacy issue noted above, we discussed this at length and concluded that this is probably a non-issue.

Comment by @cynthia Sep 24, 2020 (See Github)

Nit discovered by @hober during F2F review: There seems to be a typo/error in Example 1 and Example 3, the former references EditView but the implementation is named EditableView.

Comment by @hober Sep 24, 2020 (See Github)

Also, are this.computeSelectionBoundingBox() and computeSelectionBoundingRect() supposed to be the same thing? If so, what is it? Is it the bounds of the current selection, or is it the bounds of the editable area within which selection can occur?

Comment by @alice Sep 24, 2020 (See Github)

This looks promising, but it's a very complex API and the explainer is quite terse and seems to omit a lot of detail, so it's hard to review the details of the API.

For example:

Additionally, the layout bounds of selection and conceptual location of the EditContext in the view should be provided by calling updateLayout.

window.requestAnimationFrame(() => {
    editContext.updateLayout(editContainer.getBoundingClientRect(), 
                             computeSelectionBoundingRect());
});

I'm not quite sure what this is doing. Update the layout of what? Why does it take these two rectangles? Why does it need to be done asynchronously (via rAF)?

In general, it would be helpful at least if the IDL had extensive comments explaining the purpose of each enum, object and method. The code examples could also use more extensive comments to explain each call into the proposed API.

Also, it would be great to illustrate via targeted (minimal) code examples how the API solves each of the problems listed in the Real-world Examples section. That section is extremely helpful to understand the context of the API, but it's not made explicit how the API solves those problems.

Comment by @cynthia Sep 24, 2020 (See Github)

Additionally, it's a bit unclear how this would work with RTL languages, since it involves selection and selection behaves differently there. Do you have any thoughts on this?

Comment by @snianu Mar 4, 2021 (See Github)

Apologies for not responding to these questions earlier. We have been working with web devs and other browser vendors to figure out all the intricacies related to selection (e.g. RTL selection movement, word breaks, etc.), editing commands, accessibility, etc. We are working on updating the explainer with more detailed text and examples. We will update the status here once we have addressed all the concerns. Thanks for all the feedback so far! Tagging @alexkeng who is actively working in this space.

Discussed May 1, 2021 (See Github)

Sangwhan: Doesn't look like there has been much progress.

Rossen: Let's leave a comment that we will punt this until they update us.

Comment by @atanassov May 11, 2021 (See Github)

@cynthia and myself took another look at this proposal during our May 2021 vf2f and we are still waiting for more context from the authors. Please let us know when you have updates and are ready to discuss further.

Discussed Sep 1, 2021 (See Github)

Rossen: Still no progress from filers. Left comment

Comment by @atanassov Sep 16, 2021 (See Github)

@LeaVerou and I refreshed this again during our Gethen vf2f. We still don't see observable progress. @snianu, if this work is still ongoing we would like to know; otherwise we will close the review as stalled.

Comment by @alexkeng Sep 17, 2021 (See Github)

Sorry for the late response. This work is still ongoing and here is the latest explainer.

@cynthia

I personally would like to see how this is expected to behave in corner-case scenarios, and what chain of events come out in different user scenarios across different languages (e.g. non-compositing, compositing without candidates, and compositing with candidates) to better understand if we are actually solving this problem once and for all.

We have updated the EditContext Event Sequence section in the explainer and added a new section Difference between Contenteditable element and the EditContext element with a summary table for events in various editing scenarios. Please see the explainer for more details about the event behaviors.

There seems to be a typo/error in Example 1 and Example 3, the former references EditView but the implementation is named EditableView.

The old examples are obsolete and replaced with 4 new code snippets and 2 sample pages.

Additionally, it's a bit unclear how this would work with RTL languages, since it involves selection and selection behaves differently there. Do you have any thoughts on this?

We haven't really tested RTL languages. We will get back to you when we have more info.

@hober

Also, are this.computeSelectionBoundingBox() and computeSelectionBoundingRect() supposed to be the same thing? If so, what is it? Is it the bounds of the current selection, or is it the bounds of the editable area within which selection can occur?

yes, they are the same, and it's the bounds of the current selection (The old examples are deleted, but the sample of updateLayout can be found in Example 4)

@alice (though she is unassigned)

I'm not quite sure what this is doing. Update the layout of what? Why does it take these two rectangles? Why does it need to be done asynchronously (via rAF)?

It doesn't have to be done asynchronously, please see the comment in Example 4 for more details.

In general, it would be helpful at least if the IDL had extensive comments explaining the purpose of each enum, object and method. The code examples could also use more extensive comments to explain each call into the proposed API. Also, it would be great to illustrate via targeted (minimal) code examples how the API solves each of the problems listed in the Real-world Examples section. That section is extremely helpful to understand the context of the API, but it's not made explicit how the API solves those problems.

Thanks for the suggestion! We have added more comments in the new examples, which should help explain things a little better. As for adding code examples to show how the API solves each of the real-world problems, we do have some prototypes that show how EditContext can address these scenarios, e.g. using EditContext for composing across page boundaries, and collaborating while composing text, but some of them are quite big samples and involve some open source projects. We'll evaluate it and update the explainer with appropriate ones.

Comment by @plinss Nov 17, 2021 (See Github)

I'm questioning why this is built around a plain text buffer model. What happens when the element that an EditContext is attached to has children? This API seems to fall down hard in that case.

More specifically, rather than using text offsets and a string, why not use DOMRange and DocumentFragments? I accept it adds some complexity, but it also serves the use case of actually editing a web page.

Comment by @snianu Nov 17, 2021 (See Github)

@plinss I think it's because the OS text input services only understand plain text. When interacting with IMEs, browsers serialize the DOM into plain text view and send it to the IMEs via system APIs provided by the OS text input services. EditContext is a web API that lets authors have control over what part of the content to serialize and when to communicate that to the IMEs. This model is described here in the explainer.
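
As an illustration of the "serialize to a plain text view" step described above, here is a minimal sketch. It does not reproduce what any browser actually does: `serializeToPlainText` and the `{ text }` / `{ children }` node shape are hypothetical stand-ins for DOM nodes, chosen so the snippet runs without a browser.

```javascript
// Sketch only: `node` is a hypothetical plain object standing in for a DOM node.
// A text node is { text: "..." }; an element is { children: [...] }.
function serializeToPlainText(node) {
  if (typeof node.text === "string") {
    return node.text;
  }
  // Elements contribute only the concatenated text of their descendants.
  return (node.children || []).map(serializeToPlainText).join("");
}

// A paragraph whose second child is an element wrapping one text node
// (for example a <b> around "cream").
const paragraph = {
  children: [
    { text: "ice " },
    { children: [{ text: "cream" }] },
  ],
};

console.log(serializeToPlainText(paragraph)); // prints "ice cream"
```

With EditContext, an author would presumably decide which part of the content goes through a step like this and then hand the resulting string to the EditContext (the spec draft exposes `updateText(rangeStart, rangeEnd, text)` for that purpose).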

Discussed Apr 25, 2022 (See Github)

left some comments

Comment by @plinss Apr 26, 2022 (See Github)

I understand that the OS services have a text based API, but it still seems that the behavior with elements that have children is underspecified. Regardless of what the behavior is, if it's not specified it's not going to be interoperable.

Also, any answer to the question about using DOMRange? (You might also want to look at pending improvements there.)

Discussed Jul 18, 2022 (See Github)

Dan: still pending answers to questions Peter raised. Sangwhan said let's close it. Got a response that they're updating the explainer and then update the thread. Sounds like we should wait for the update.


Discussed Jul 18, 2022 (See Github)

Rossen: status?

Dan: a comment from Peter in April - no response yet.

Rossen: proposed API has a very strict scope on text. We can debate that further .. if the topic needs to expand .. to elements .. reinvent entire DOM inside... Not interested.

Peter: I just think there's under-specified behaviour... Within the scope of this proposal...

Peter: if you call this on an element that has children what happens to the children? The answer could be "throw them out" - just want to have consistent implementations..

Rossen: ok that's fair.

Peter: there's a question of why not use DOMrange... just an API shape question... just about making it more consistent with other APIs already in the platform. No answer to that either.

Dan: we just need an answer to the questions being asked here...

Sangwhan [via chat]: we should close.

Rossen: I will follow up.


Comment by @torgo Jul 18, 2022 (See Github)

We're just picking this up today to try to see if we can close. It looks like we're still pending an answer to the questions @plinss raised above.

Comment by @snianu Jul 18, 2022 (See Github)

@torgo Apologies for the delayed response. @alexkeng is working on updating the explainer and spec with all the details. We will update this thread once we have addressed all the concerns.

but it still seems that the behavior with elements that have children is underspecified. Regardless of what the behavior is, if it's not specified it's not going to be interoperable.

Yes, this is something that we've discussed internally and added more details about it in the explainer and the spec that we are working on right now. Once we have addressed all the feedback from the developers, we can post the updated docs here. Thanks for raising this concern!

Comment by @torgo Jul 19, 2022 (See Github)

Ok thanks for that! We'll re-review and close when we get a chance to look at those updates.

Discussed Aug 29, 2022 (See Github)

Rossen: I feel everything I wanted to put forward - how I see this API and how it could be used for editing - is already stated in the minutes... Peter also stated his points. At this point what is the benefit?

Peter: The main overall shape is OK ... the last questions are they need to specify what happens with content that has children.. needs to be specified. Questions on use of DOM range. They said they are going to update the explainer and spec... they added details to the explainer and working on changes to the spec. So I think we're waiting for them to update the docs.

Discussed Nov 28, 2022 (See Github)

Dan: concern to do with underspecified behaviour of elements with children.. There were updates in November.. pinging again

Comment by @torgo Nov 28, 2022 (See Github)

Hi @snianu I notice no updates since 8-November. Does that mean you're ready for us to re-review? Thx!

Comment by @alexkeng Nov 30, 2022 (See Github)

Hi @torgo, no, EditContext is not ready for re-review, in fact, the project is on hold at the moment due to resource constraints. We'll update the ticket when we re-start the project, thanks!

Comment by @torgo Dec 1, 2022 (See Github)

Ok - in that case I think we're going to go ahead and close this one. Please ping the issue or a TAG member directly when you think it's ready to be re-opened. Thanks!

Comment by @snianu Jun 6, 2023 (See Github)

@torgo @dandclark is now actively working on EditContext and has been updating the spec as well. Requesting to re-open this issue to continue TAG review for this feature. Thanks!

Comment by @dandclark Jun 6, 2023 (See Github)

Hello TAG!

Apologies for the extended delay here. We've got capacity to pick up this spec again, and I'm going to take point on driving it.

Can this issue be reopened, or should I file a new issue for the review?

The spec draft for the feature has evolved quite a bit from when this was initially opened, and some of the materials have moved around. @snianu's top post has been edited with the new links, or you can find the spec draft directly here.

To start addressing some of the last questions on this thread from @plinss, the intended behavior when an EditContext has children is that there will be a mix of behavior borrowed from contenteditable and some behavior that diverges. This section overviews these similarities and differences: https://w3c.github.io/edit-context/#edit-context-differences.

Essentially, the browser will handle caret navigation, and child content of the EditContext will inherit editability such that the user can click/arrow-key into it and move the caret around in it. Where EditContext primarily differs is that when the user tries to add or delete content via key or IME input, the browser will not modify the content automatically. Instead it will fire events against EditContext so that the page author can perform the modification as they see fit.

A primary goal of this API is to give the author a primitive to interact with OS text input services more directly. This is done via a plain text buffer since that's what the text input services use. The authors are then responsible for translating this into the view that will be presented to the user, either building it with DOM nodes or painting to a <canvas>.
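
To make the "browser fires events, author performs the modification" model above concrete, here is a minimal sketch. The field names on the update (`updateRangeStart`, `updateRangeEnd`, `text`, `selectionStart`, `selectionEnd`) follow the `textupdate` event in the current spec draft; `applyTextUpdate`, `attachEditContext`, `host` and `render` are hypothetical names introduced only for illustration.

```javascript
// Sketch only: a pure helper that applies one text update to the author's own
// plain-text model. It replaces [updateRangeStart, updateRangeEnd) with `text`.
function applyTextUpdate(buffer, update) {
  const before = buffer.slice(0, update.updateRangeStart);
  const after = buffer.slice(update.updateRangeEnd);
  return before + update.text + after;
}

// Hypothetical wiring for a browser that implements EditContext.
// `host` is the element the author wants to make editable; `render` is
// author-supplied code that redraws the view (DOM nodes or a <canvas>).
function attachEditContext(host, render) {
  const editContext = new EditContext();
  host.editContext = editContext;
  let buffer = "";
  editContext.addEventListener("textupdate", (event) => {
    // The browser does not touch the content; the author applies the change.
    buffer = applyTextUpdate(buffer, event);
    render(buffer, event.selectionStart, event.selectionEnd);
  });
  return editContext;
}

// The pure helper can be exercised without a browser:
console.log(
  applyTextUpdate("helo", { updateRangeStart: 3, updateRangeEnd: 3, text: "l" })
); // prints "hello"
```

Keeping the buffer update in a pure function is a design choice of this sketch, not something the API requires; it just makes the author-side logic testable outside the browser.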

Comment by @dandclark Jun 8, 2023 (See Github)

@torgo tagging you directly about the request to reopen since I don't know if there's much visibility for closed threads. Thanks!

Comment by @chrishtr Jun 8, 2023 (See Github)

(I went ahead and reopened, hope that's ok.)

Discussed Jul 1, 2023 (See Github)

Rossen to read & generate comments.

Discussed Aug 21, 2023 (See Github)

- https://github.com/w3c/edit-context/issues/38
- https://github.com/w3c/edit-context/issues/53
- https://github.com/w3c/edit-context/issues/54

Discussed Aug 28, 2023 (See Github)

Rossen: quite a face-lift. Most of the points were addressed...

Dan: relevant issues:
https://github.com/w3c/edit-context/issues/38
https://github.com/w3c/edit-context/issues/53
https://github.com/w3c/edit-context/issues/54
...examples of resolved issues.

Rossen: Peter I think the one you were concerned with is 53.

Peter: I was concerned that it was underspecified... My only other feedback was ... why aren't these in the DOM range?

Lea: being underspecified would be fine for an early review...

Peter: but this was from 2019...

Peter: do we know the status? There was work happening on a DOM range API... wonder if anything ever became of that? I would suggest they incorporate that...

Lea: this seems to require a lot of manual syncing between this model and the DOM... and usually that can introduce error. So I'm a bit worried that this is designed for the needs of big apps like google docs but not really usable by the average developer...

Peter: original model for building your own editor...

Lea: there's a lot of projects that require that that are not maintained by large companies...

Dan: e.g. cryptpad is an example that's not...

Peter: original API design a simplistic model - editor from scratch - that may not be the only use case...

Lea: this seems to make complex things possible but simple things are not easy.

Dan: reasonable to challenge them on the developer complexity issue.

Lea: on the other hand - as a developer it looks really tedious to do but it's better than contentEditable... Maybe that's what we need.

Dan: could we ask them "what is your story for developer complexity?"

Rossen: I was pretty close to it when it started -- model made sense to me when it was text editing. Certainly needed as a base construct. Having editors generally speaking ... the web has outpaced the capabilities of editors... giving web developers a way to build the rich editors they want to... is the idea. If you're building an editor, you're already an advanced developer... So... I stand behind the fact that this is a great step forward. I don't see it as isolating any developer segment more than any other API...

Lea: Trying to think it through - one thing that gives me pause - APIs where if you want to do something custom then you have to do it yourself - APIs that have a cliff. This could be alleviated by providing a shortcut to create an editContext that already has the correct plumbing. For example if you want to have a text editor that's a different color, or a markdown editor, these are simple use cases... but still very painful in the web platform today. It would be quite nice to not have to write a ton of code to provide an editor that has an improvement... of what the web platform has today.

Rossen: If you assume steady state of a document then you don't need anything... Editing is all about handling and reacting to the actions of the users... changing parts.. Reacting to actions... EditContext is all about reacting... When it comes to text, range API is not great... What you're talking about are higher level edit commands that are built on top of constructs like the EditContext?

Lea: from what I'm seeing it seems you have to write code to sync up basic things - like undoing... all of these things you would normally get for free - you have to write code to sync with the DOM... I'm worried this is quite error prone. Example 3 for example: https://github.com/w3c/edit-context/blob/gh-pages/explainer.md#example-3-mapping-the-selection-from-dom-space-to-editcontext-plain-text-space
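
[Illustration of the kind of syncing code being discussed: a minimal sketch, not taken from the explainer. The view is modelled as an ordered list of text runs, one per DOM text node; `runs`, `toPlainTextOffset` and `fromPlainTextOffset` are hypothetical names.]

```javascript
// Sketch only: convert a position inside one text run to an offset in the
// concatenated plain-text string.
function toPlainTextOffset(runs, runIndex, offsetInRun) {
  let offset = 0;
  for (let i = 0; i < runIndex; i++) {
    offset += runs[i].length;
  }
  return offset + offsetInRun;
}

// Inverse mapping: find the run that contains a given plain-text offset.
// An offset on a run boundary resolves to the end of the earlier run.
function fromPlainTextOffset(runs, plainOffset) {
  if (runs.length === 0) {
    return { runIndex: 0, offsetInRun: 0 };
  }
  let remaining = plainOffset;
  for (let i = 0; i < runs.length; i++) {
    if (remaining <= runs[i].length) {
      return { runIndex: i, offsetInRun: remaining };
    }
    remaining -= runs[i].length;
  }
  // Past the end: clamp to the end of the last run.
  const last = runs.length - 1;
  return { runIndex: last, offsetInRun: runs[last].length };
}

const runs = ["Hello, ", "world", "!"];
console.log(toPlainTextOffset(runs, 1, 2)); // prints 9
console.log(fromPlainTextOffset(runs, 9)); // prints { runIndex: 1, offsetInRun: 2 }
```

[In a browser, an author would presumably feed offsets like these to `editContext.updateSelection(start, end)` whenever the DOM selection changes; every such mapping is code the author has to write and keep correct.]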

Peter: yes this gets to my point - my concern is that it's modelled around a simple text buffer - not structured content... that's what I was asking for in the first place...

Lea: also this https://github.com/w3c/edit-context/blob/gh-pages/explainer.md#undo web based editors rarely use the DOM undo stack.. there are many benefits from not having to maintain an undo stack of their own... for certain editors having an undo stack maintained by the browser is not workable but for something like a code editor, as long as it can be undone reasonably there are no reasons for the developers to want their own undo stack... I would much prefer to see something that allows you to tweak the existing undo stack or replace it if you want but it shouldn't be a requirement to write an undo stack to use this...

Dan: can we provide feedback that encompasses both positives and negative points?

Lea: I think it's about developer experience... for some use cases this is the right developer experience but for some cases it's not right... We have a design principle... https://w3ctag.github.io/design-principles/#simplicity

Rossen: fwiw I think the undo stack is not mentioned in the spec... Is the explainer maybe stale?

Lea: the explainer has a bunch of things it doesn't do... https://github.com/w3c/edit-context/blob/gh-pages/explainer.md#interaction-with-other-browser-editing-features

Dan: explainer updated 3 months ago...

Lea: also: when you have an API of this kind... can end up with buggy code. buggy editors ... all over the web.

Peter: 2 models - (1) you're building an editor - e.g. Google Docs - this API is focused on solving that problem .. in that regard, positive step (2) the browser already has some rich capabilities - but no way to extend / augment those - which is what we'd like to see. My feedback is that they are trying to solve (1) - that's fine - my concern is that I don't see how this feature ties into explaining the editor functions of the browser... Imagine if we opened up the browser's editor...

Lea: meta-comment - this type of thing is exactly why we need to have this initiative [task force] to identify gaps in the web platform...

Rossen: EditContext is part of the editing task force that's focusing on editing...

Lea: why is it so focused on the specific use case?

Rossen: focusing on the base primitives...

Peter: seems like a side thing of people trying to replace the browser editor... would love to see this evolve towards a way to augment the browser editor...

First, it's great to see work in this direction, and we definitely recognize the need for it!
  
While many editing use cases do require re-implementing capabilities like undo or caret management, not all use cases share that complexity. Right now, it appears that this API primarily caters to the most complex of cases that require re-implementing many fundamental aspects of text editing, but introduces [sharp cliffs](https://w3ctag.github.io/design-principles/#high-level-low-level) for the simpler use cases, those that merely need to augment regular text editing (case in point: the Markdown editor right here on GitHub). As mentioned in our design principles, there's value in designing APIs that keep [simple things simple, and complex things possible](https://w3ctag.github.io/design-principles/#simplicity).
  
Could this be the beginning of an effort to support *augmenting* the browser editor capabilities, when you don't need to entirely reimplement them? This would not necessarily require a different design: perhaps it can be done with shortcuts (e.g. a way to get an `EditContext` that already has the right plumbing based on a given editable element in the DOM (`<input>`, `<textarea>`, `<div contenteditable>` etc)).

Comment by @torgo Aug 28, 2023 (See Github)

Relevant issues:
https://github.com/w3c/edit-context/issues/38
https://github.com/w3c/edit-context/issues/53
https://github.com/w3c/edit-context/issues/54

Discussed Oct 9, 2023 (See Github)

Rossen: last time we talked about it you had an action item to look through the issues that were being worked out.

Peter: I don't see how the three posted are related to what I was talking about

Discussed Oct 16, 2023 (See Github)

Peter: the listed issues don't seem to be addressing my concerns.. i will follow up with this and I will leave a comment.

Comment by @plinss Oct 19, 2023 (See Github)

Thanks for the updates, at this point we're going to close as satisfied. I have a few personal concerns about usability, but I think experience working with the API will be the determining factor there. My primary concern was avoiding under-specified behavior and that seems to be addressed now. Thanks for flying TAG.