#716: I18N String-Meta and WebIDL


Opened Mar 8, 2022

Good morning, TAG!

I'm requesting the TAG express an opinion on a "dispute" related to:

  • Name: String-Meta (language and direction metadata)
  • Specification URL: String-Meta
  • Explainer (containing user needs and example code)¹: here
  • User research: N/A
  • GitHub issues (if you prefer feedback filed there): i18n-discuss#23
  • Primary contacts (and their relationship to the specification): @aphillips @r12a @xfq
  • The group where the work on this specification is: I18N
  • Links to major pieces of multi-stakeholder review or discussion of this specification:
  • Links to major unresolved issues or opposition with this specification:
  • Relevant time constraints or deadlines: "soon" (we want to approach ECMA-402 and ECMA TC39 and/or discuss with WebIDL)

Explanation of the issue that we'd like the TAG's opinion on:

This isn't quite a "normal" technical dispute, but we do seek a conversation with the TAG about the technical approach we are taking. We believe that interoperability of natural language strings between different Web APIs is strongly desirable.

Quoting our explainer:

We would like the TAG to review our approach to this problem and discuss what the right long-term approach should be in the Web platform. We believe that this is an important gap for natural language support on the Web, but we are concerned that our current approach and comments generate churn or distract Working Groups attempting to complete work on specifications.

Our immediate request has to do with webidl#1025 wherein we requested that WebIDL add a Localizable type to IDL. This would allow specifications to reference this string type and save them creating a local dictionary representation. The WebIDL folks do not want to do this because it is at odds with their normal practice of providing only JavaScript primitives and types. They also don't want to become a registry of random dictionary entries.

One way to solve this would be for W3C and ECMA-402 to propose a natural language string type with these attributes to ECMA TC39. If that proposal were ultimately successful (and it will take at least one complete JavaScript release cycle to be accepted and reach the specification), then WebIDL could encode the type in their specification. This would be the most durable and platform-wide solution. On the downside, it would probably take 1-3 years before specifications had a ready reference, and it is unclear whether such a type would be accepted or implemented by TC39.

Another alternative, possibly acting as a shim for eventual standardization by ECMA TC39, would be for I18N to define a dictionary and ask specifications to adopt it generally for natural language string values.
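As a rough illustration of the dictionary alternative described above, here is a minimal TypeScript sketch of the shape a consuming API might expose. The names `LocalizableString`, `value`, `lang`, `dir`, and `withDefaults` are assumptions made for this example; they are not quoted from String-Meta or WebIDL.

```typescript
// Hypothetical shapes for illustration only; not taken from any specification.
type TextDirection = "ltr" | "rtl" | "auto";

interface LocalizableString {
  value: string;       // the natural language text itself
  lang?: string;       // BCP 47 language tag, e.g. "ar" or "en-GB"
  dir?: TextDirection; // base direction; treated as "auto" when absent
}

// Hypothetical helper: fill in the direction default so consumers
// never have to handle an undefined direction.
function withDefaults(
  s: LocalizableString
): LocalizableString & { dir: TextDirection } {
  return { ...s, dir: s.dir ?? "auto" };
}

// A string that carries its language and direction with it.
const greeting: LocalizableString = { value: "صباح الخير", lang: "ar", dir: "rtl" };

// A string with no declared direction falls back to "auto".
const label = withDefaults({ value: "Good morning", lang: "en" });
```

If specifications shared one such shape, a value produced by one API could be passed to another without re-deriving the language or direction.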

Links to the positions of each side in the dispute (e.g., specific github comments):

webidl#1025

What steps have already been taken to come to an agreement:

We don't actually disagree with WebIDL. Some working groups, such as WebAuthn and Web Payments, have pushed back on our comments asking for direction metadata because the Web lacks a standardized representation for it.

We'd prefer the TAG provide feedback as (please select one):

  • leave a comment in the following GitHub issue: i18n-discuss#23

  • leave review feedback as a comment in this issue and @-notify [github usernames]

  • open a new issue in our GitHub repo with the feedback

  • we would like to have a joint call with representatives of the TAG if appropriate

For our own housekeeping: [I18N-ACTION-1103]


¹ For background, see our explanation of how to write a good explainer.

Discussions

2022-05-09

Minutes

Sangwhan: why is it a type, shouldn't it be a trait? oh, WebIDL doesn't have traits? An object that can have a string representation should be localizable. For example if you have an object that has a toString... you want toString to return a string that has a localizable trait.

Yves: in unicode there were specific codes to identify the direction.. currently it's deprecated... about to be removed... [from unicode]...

Sangwhan: RLM? https://en.wikipedia.org/wiki/Right-to-left_mark

Yves: https://w3c.github.io/string-meta/#unicode_enough - the problem is what happens when you want to do string equality... this would be a good way of doing it because from the API point of view there is no change but from everything that needs to process strings there is an issue. I don't think there is a "right solution" - it's just trade-offs.
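The string-equality concern can be sketched in TypeScript as follows. This is an illustration only, assuming direction is signalled by embedding a control character such as RLM in the string; `stripBidiControls` is a hypothetical helper, not an existing API.

```typescript
// RIGHT-TO-LEFT MARK: an invisible directional control character.
const RLM = "\u200F";

const plain = "abc";
const marked = RLM + "abc"; // renders the same, but is a different string

// Plain comparison treats the control character as content.
const equalAsIs = plain === marked;
const lengths = [plain.length, marked.length];

// Hypothetical helper: remove common bidi controls (ALM, LRM, RLM,
// embeddings, overrides, isolates) before comparing.
function stripBidiControls(s: string): string {
  return s.replace(/[\u061C\u200E\u200F\u202A-\u202E\u2066-\u2069]/g, "");
}

const equalAfterStrip = plain === stripBidiControls(marked);
```

Here `equalAsIs` is false and `equalAfterStrip` is true, so every consumer that compares, searches, or measures strings would need to know about the embedded metadata.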

Dan: can the priority of constituencies be helpful?

Sangwhan: we have to think about it from a developer perspective... something similar from the I18N wg that we reviewed... it was possible to decide which was better based on ergonomics. I don't see an immediate difference here.

Dan: easier for developers to use -> more use of I18N -> better I18N of the web overall?

Yves: the balance is the need for I18N on one side and... inconvenience for developers... The issue is that localizable strings are mixing data and metadata ... more difficult than an object that represents some data...

Yves: if it's a new type in the language [js] then it's easy to use it in WebIDL and everywhere else. but you could say the same for mandating the use of unicode description characters... if it's present in the way JS is handling strings for instance then it solves the issue.. not sure what the impact would be for the browsers.. if info [meta] would be kept along with the data. It's easy to get the data to process it and forget about the metadata... either having a specific type or mandating the use of the description ... which was deprecated because nobody was using it. To me it's more of an ecmascript thing for them to decide first. Either add a new type or processing of unicode characters...

Dan: we could recommend that ecmascript community makes this decision

Yves: ensuring that the data and metadata are as closely tied together as possible so the metadata is not lost.

Hadley: given that they are already talking to the ecmascript community, what is it they want us to do?

Sangwhan: weigh in on which tradeoff the web should go for?

Max: agree with what Yves said - it should be coupled - centralized way for metadata and text to run together to make it easier for the developer to use... otherwise difficult to be consistent.

Sangwhan: they want to have a proposal into TC39 that also is adopted by WebIDL...

Yves: i think it's better to start with the native type - ecmascript.

Dan: since the webidl people follow TC39 ... TC39 is the right place...

Yves: it would be good to get Apple's and MS's opinion... people working on I18N.

[discussion of DOM string and TC39 string]

Dan: draft TAG view might be - "be guided by developer ergonomics; data and metadata closely bound; should follow from TC39 consensus on what the right approach is and then adjust WebIDL accordingly..."

2022-06-06

Minutes

Max: we requested them to provide some code example... he has promised to add a code example in the explainer but it hasn't been done yet.

Dan: Can we close then?

Max: that's one comment... from TAG. We need another round of review after they provide the example. They have multiple choices - one option is to do the standardization in WebIDL but WebIDL people don't like it - another option is to do it in ECMA - that will take too long. Another alternative, possibly acting as a shim for eventual standardization by ECMA TC39, would be for I18N to define a dictionary and ask specifications to adopt it generally for natural language string values.

Yves: last time we talked about this - the data and meta-data close together so it's not lost... To me the easiest option for implementers - either in unicode encoding (currently deprecated because of lack of usage - which means all string libraries) - or have ecma work on a specific type that can be a regular string plus indication. If we don't have that then it means we would always use structures that would be lost at some point.
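The risk of metadata being lost from a separate structure can be sketched in TypeScript. This is an illustration only; `TaggedText` and `mapText` are hypothetical names, not part of any specification discussed here.

```typescript
// Hypothetical structure pairing text with its metadata.
interface TaggedText {
  value: string;
  lang: string;
  dir: "ltr" | "rtl" | "auto";
}

const title: TaggedText = { value: "مرحبا", lang: "ar", dir: "rtl" };

// An ordinary string operation returns a bare string:
// the language and direction do not travel with the result.
const truncatedBare: string = title.value.slice(0, 3);

// Keeping the pair together needs an explicit wrapper at every step.
function mapText(t: TaggedText, f: (s: string) => string): TaggedText {
  return { ...t, value: f(t.value) };
}

const truncatedTagged = mapText(title, (s) => s.slice(0, 3));
```

`truncatedBare` is just a string, while `truncatedTagged` still carries `lang` and `dir`; a native type would make the second behaviour the default rather than something each caller must remember.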

Hadley: we were going to circle back to them...

Yves: https://github.com/w3c/mediacapture-main/issues/665 and https://github.com/w3c/miniapp-lifecycle/issues/25 are discussions ongoing...

Dan: ecma solution possibly best?

Hadley: found in previous minutes: "draft TAG view might be - 'be guided by developer ergonomics; data and metadata closely bound; should follow from TC39 consensus on what the right approach is and then adjust WebIDL accordingly...'"

Dan: so we should post this.

Hadley: rewrites in full sentences

Dan: and then?

Hadley: ...ask them if there is anything we can do to help.

Yves: keep it open until we find a place for this...

Dan: we should revisit at the plenary and then push it out a few weeks to revisit again.

Hadley: posts comment

2022-07-11

Minutes

Hadley: we gave feedback to Addison P - guided by developer ergonomics... TC39 consensus and then to WebIDL. And we left with a question: is there anything we should do? No reply. So will nudge. I think our work is done.

Yves: I already talked to Richard Ishida (@r12a) and @xfq... it was done already last week. Also ping on the issue.

Hadley: [leaves comment]

2022-07-18

Minutes

Hadley: waiting for a response

Dan: we got a response

Hadley: we're not against it... but compared to what? Who do we know at TC39?

Sangwhan: Dan E. is on TC39...

[Sangwhan & Hadley leave comments]

Dan: set milestone for next week and we can close then.

2022-07-London

Minutes

Hadley: we did our piece on this - I think we're done.

Dan: I agree.

Yves: especially last comment - the clarification we think this is an important issue.

agreed to close

Collection of Screensharing-related UX Hints

Max: I sent a comment but no response...

Hadley: another way to get their attn.

Max: I also sent email to them.

agreed to close