#716: I18N String-Meta and WebIDL
Discussions
Comment by @domenic Mar 8, 2022 (See Github)
(The links to https://github.com/whatwg/webidl/issues/1025 are broken in the OP.)
For the TAG's consideration, I think the discussion here is a bit deeper than the procedural question of where a specific dictionary should live. The Web IDL thread brings up an important question which seems more foundational to me:
Is it appropriate for specifications to aspirationally include direction/language metadata in user-facing string APIs, even if no implementer has any plans to use that metadata? See discussion starting https://github.com/whatwg/webidl/issues/1025#issuecomment-934150460 .
I don't have strong feelings about the specific shape of localized-string APIs, whether based on ECMAScript, a shared dictionary defined somewhere, or spec-specific dictionaries. But I do have strong feelings that adding aspirational features without any implementation commitment is not good, even if the aspiration is in the direction of a good cause like proper i18n support.
I've also found it persistently frustrating how the i18n folks have been unable to supply many concrete examples of the APIs they'd like to use this infrastructure. In particular, I think the category is: JavaScript APIs, which accept strings for presentation to the user (not developer), and do not involve HTML markup at all. We intentionally try to avoid such APIs on the web platform, preferring markup for user-presentation, but sometimes it's unavoidable as we need to present strings in "browser chrome" or similar.
My reading of the thread has been that we eventually settled on there being one or maybe two such APIs on the web platform (PaymentRequest and WebAuthentication), plus Notifications which already has its own solution. The thread also had lots of confusion about geolocation and developer-facing error message localization.
I would love to see the explainer include example code (as the "good explainers" document and the template in the OP suggests), of each of these specific APIs, before/after the proposal.
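For context on the "own solution" mentioned above: the Notifications API takes `lang` and `dir` members in its options bag alongside `body`. The sketch below only builds the constructor arguments so that it runs outside a browser. The `NotificationOptionsSketch` interface and `buildNotificationArgs` helper are hypothetical names introduced for illustration, and this is not the before/after example requested from the explainer.

```typescript
// Subset of the Notifications API options bag (dir, lang, body),
// declared locally so the sketch runs outside a browser.
interface NotificationOptionsSketch {
  dir?: "auto" | "ltr" | "rtl";
  lang?: string;
  body?: string;
}

// Hypothetical helper: pair a title with options carrying language and direction.
function buildNotificationArgs(
  title: string,
  body: string,
  lang: string,
  dir: "auto" | "ltr" | "rtl"
): [string, NotificationOptionsSketch] {
  return [title, { body, lang, dir }];
}

const [notifTitle, notifOptions] = buildNotificationArgs(
  "تم استلام الدفعة",
  "شكرا لك",
  "ar",
  "rtl"
);

// In a browser, with permission granted, this would be passed on as:
//   new Notification(notifTitle, notifOptions);
console.log(notifTitle, notifOptions);
```

Note that a single `lang`/`dir` pair covers both the title and the body here, which is one of the design choices a shared string type would need to revisit.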
Comment by @plehegar Mar 25, 2022 (See Github)
fyi, the i18n are tracking several direction and language metadata issues.
Comment by @aphillips Mar 25, 2022 (See Github)
I18N has been requesting metadata for some time and there are different dispositions depending on the specification and its needs. There are 18 specifications in the list below. Separately I am reviewing specs that preceded our efforts in this area as well as the list of potential new specifications.
First, JSON-LD added features to the specification allowing document and item level metadata.
The following specifications either added support using JSON-LD or are in the process of doing so:
- WOT Thing Description
- pub-manifest
- activity-pub
- web-annotation
Some specifications added locally defined metadata (i.e. their own language and direction fields in natural language values and which are similar to the Localizable type described elsewhere in this request):
- appmanifest
- miniapp-manifest
- TTML
- activitystreams (based on WebAppManifest)
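A locally defined shape of the kind these specifications use might look roughly like the sketch below. The `LocalizableString` interface and the `toMarkup` helper are hypothetical names chosen for illustration and are not taken from any of the listed specifications; only the idea of pairing a value with language and direction fields comes from the discussion above.

```typescript
// Hypothetical shape: a natural language value plus its language and direction.
type TextDirection = "ltr" | "rtl" | "auto";

interface LocalizableString {
  value: string;       // the natural language text itself
  lang?: string;       // BCP 47 language tag, e.g. "ar" or "en-GB"
  dir?: TextDirection; // base direction; "auto" means "detect from content"
}

// Escape the characters that are significant in HTML text and attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Illustrative consumer: carry the metadata into markup instead of dropping it.
function toMarkup(s: LocalizableString): string {
  const lang = s.lang ? ` lang="${escapeHtml(s.lang)}"` : "";
  const dir = ` dir="${s.dir ?? "auto"}"`;
  return `<span${lang}${dir}>${escapeHtml(s.value)}</span>`;
}

const greeting: LocalizableString = { value: "مرحبا", lang: "ar", dir: "rtl" };
console.log(toMarkup(greeting));
```

Because the metadata travels in the same object as the text, a consumer that understands the shape can apply it at display time without guessing from the content.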
One specification (WebAuthn) defined its own serialization scheme; we are still working with them on the details of that scheme.
The following specs are in one way or another waiting on this discussion or have proceeded:
- secure-payment-confirmation
- payment-request
- vocab-duv
- mediacapture
- web-share
- micro-pub
- SHACL
- screen-capture
SHACL probably should have adopted the JSON-LD approach. micro-pub entangles metadata with the question of localizable error messages and so might not apply.
Comment by @domenic Mar 25, 2022 (See Github)
Thanks for assembling the list. Just to reemphasize:
I would love to see the explainer include example code (as the "good explainers" document and the template in the OP suggests), of each of these specific APIs, before/after the proposal.
Comment by @aphillips Mar 25, 2022 (See Github)
@domenic thanks: I'm also working on addressing that part of your comment.
Comment by @alvestrand Mar 27, 2022 (See Github)
FWIW, WebRTC (device labels) is also in the group of "having pushed back on the request because of no developer interest and no clear pattern to follow". https://github.com/w3c/mediacapture-main/issues/665
Discussed
May 1, 2022 (See Github)
Sangwhan: why is it a type? Shouldn't it be a trait? Oh, WebIDL doesn't have traits? An object that can have a string representation should be localizable. For example, if you have an object that has a toString... you want toString to return a string that has a localizable trait.
Yves: in unicode there were specific codes to identify the direction.. currently it's deprecated... about to be removed... [from unicode]...
Sangwhan: RLM? https://en.wikipedia.org/wiki/Right-to-left_mark
Yves: https://w3c.github.io/string-meta/#unicode_enough - the problem is what happens when you want to do string equality... this would be a good way of doing it because from the API point of view there is no change but from everything that needs to process strings there is an issue. I don't think there is a "right solution" - it's just trade-offs.
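The string equality trade-off can be seen in a small sketch. It wraps a string in Unicode isolate controls (RLI U+2067 and PDI U+2069) and compares it with the unwrapped text. The `stripBidiControls` helper is a hypothetical name for illustration; it stands in for the extra step every string-processing consumer would need.

```typescript
// Unicode bidi controls: RIGHT-TO-LEFT ISOLATE and POP DIRECTIONAL ISOLATE.
const RLI = "\u2067";
const PDI = "\u2069";

const plainText = "שלום";
// Direction is now encoded inside the string data itself.
const wrappedText = RLI + plainText + PDI;

console.log(wrappedText === plainText);             // false: the controls are part of the data
console.log(wrappedText.length - plainText.length); // 2 extra UTF-16 code units

// Hypothetical helper: remove LRM, RLM, ALM, the embedding and override
// controls, and the isolate controls before comparing.
function stripBidiControls(s: string): string {
  return s.replace(/[\u200E\u200F\u061C\u202A-\u202E\u2066-\u2069]/g, "");
}

console.log(stripBidiControls(wrappedText) === plainText); // true
```

The API surface is unchanged, since the value is still just a string, but comparison, length checks, and truncation all have to account for the invisible characters.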
Dan: can the priority of constituencies be helpful?
Sangwhan: we have to think about it from a developer perspective... something similar from the I18N wg that we reviewed... it was possible to decide which was better based on ergonomics. I don't see an immediate difference here.
Dan: easier for developers to use -> more use of I18N -> better I18N of the web overall?
Yves: the balance is the need for I18N on one side and... inconvenience for developers... The issue is that localizable strings are mixing data and metadata ... more difficult than an object that represents some data...
Yves: if it's a new type in the language [js] then it's easy to use it in WebIDL and everywhere else. but you could say the same for mandating the use of unicode description characters... if it's present in the way JS is handling strings for instance then it solves the issue.. not sure what the impact would be for the browsers.. if info [meta] would be kept along with the data. It's easy to get the data to process it and forget about the metadata... either having a specific type or mandating the use of the description ... which was deprecated because nobody was using it. To me it's more of an ecmascript thing for them to decide first. Either add a new type or processing of unicode characters...
Dan: we could recommend that ecmascript community makes this decision
Yves: ensuring that the data and metadata are as closely tied together as possible so the metadata is not lost.
Hadley: given that they are already talking to ecmascript community what is they want us to do?
Sangwhan: weigh in on which tradeoff the web should go for?
Max: agree with what Yves said - it should be coupled - a centralized way for metadata and text to run together to make it easier for the developer to use... otherwise difficult to be consistent.
Sangwhan: they want to have a proposal into TC39 that is also adopted by WebIDL...
Yves: i think it's better to start with the native type - ecmascript.
Dan: since the webidl people follow TC39 ... TC39 is the right place...
Yves: it would be good to get Apple's and MS's opinion... people working on I18N.
[discussion of DOM string and TC39 string]
Dan: draft TAG view might be - "be guided by developer ergonomics; data and metadata closely bound; should follow from TC39 consensus on what the right approach is and then adjust WebIDL accordingly..."
Discussed
Jun 1, 2022 (See Github)
Max: we requested them to provide some code example... he has promised to add a code example in the explainer but it hasn't been done yet.
Dan: Can we close then?
Max: that's one comment... from TAG. We need another round of review after they provide the example. They have multiple choices - one option is to do the standardization in WebIDL but the WebIDL people don't like it - another option is to do it in ECMA - that will take too long. Another alternative, possibly acting as a shim for eventual standardization by ECMA TC39, would be for I18N to define a dictionary and ask specifications to adopt it generally for natural language string values.
Yves: last time we talked about this - the data and meta-data close together so it's not lost... To me the easiest option for implementers - either in unicode encoding (currently deprecated because of lack of usage - which means all string libraries) - or have ecma work on a specific type that can be a regular string plus indication. If we don't have that then it means we would always use structures that would be lost at some point.
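The concern about structures being "lost at some point" can be sketched as follows. The `LocalizableString` shape and both helper functions are hypothetical and only illustrate the failure mode; they are not part of any proposal discussed here.

```typescript
// Hypothetical dictionary-style shape: text plus language and direction.
interface LocalizableString {
  value: string;
  lang?: string;
  dir?: "ltr" | "rtl" | "auto";
}

// A typical string-only helper: it has nowhere to put the metadata.
// (It slices by UTF-16 code units, which is enough for this sketch.)
function truncate(text: string, max: number): string {
  return text.length > max ? text.slice(0, max) + "…" : text;
}

// Metadata-preserving variant: every helper needs a wrapper like this.
function truncateLocalizable(s: LocalizableString, max: number): LocalizableString {
  return { ...s, value: truncate(s.value, max) };
}

const label: LocalizableString = { value: "Bluetooth headset", lang: "en", dir: "ltr" };

// Passing only the text through existing code silently drops lang and dir.
const lossy: string = truncate(label.value, 9);

// The wrapper keeps them, at the cost of duplicating each string helper.
const preserved: LocalizableString = truncateLocalizable(label, 9);

console.log(lossy);     // "Bluetooth…"
console.log(preserved); // value "Bluetooth…" with lang "en" and dir "ltr"
```

A native string type that carries the metadata, or in-band Unicode controls, would avoid the wrapper, which is the trade-off being weighed in these minutes.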
Hadley: we were going to circle back to them...
Yves: https://github.com/w3c/mediacapture-main/issues/665 and https://github.com/w3c/miniapp-lifecycle/issues/25 are discussions ongoing...
Dan: ecma solution possibly best?
Hadley: found in previous minutes: "draft TAG view might be - 'be guided by developer ergonomics; data and metadata closely bound; should follow from TC39 consensus on what the right approach is and then adjust WebIDL accordingly...'"
Dan: so we should post this.
Hadley: rewrites in full sentences
Dan: and then?
Hadley: ...ask them if there is anything we can do to help.
Yves: keep it open until we find a place for this...
Dan: we should revisit at the plenary and then push it out a few weeks to revisit again.
Hadley: posts comment
Comment by @hadleybeeman Jun 7, 2022 (See Github)
Thanks for this question, @aphillips! We've discussed it in our W3C TAG breakout today.
We think that it's important that the approach be guided by developer ergonomics, who will be the primary users of this. It's important to note that the data and metadata are closely bound and the shape of this should reflect that. And, ideally, we think that it should follow from TC39 consensus on what the right approach is and then adjust WebIDL accordingly.
Is there anything we can do to help with this, or do you have what you need to crack on? Let us know.
Discussed
Jul 1, 2022 (See Github)
Hadley: we gave feedback to Addison P - guided by developer ergonomics... TC39 consensus and then to WebIDL. And we left with a question: is there anything we should do? No reply. So will nudge. I think our work is done.
Yves: i already talked to Richard Ishida @r12a and @Xfq... it was done already last week. Also ping on the issue.
Hadley: [leaves comment]
Discussed
Jul 1, 2022 (See Github)
Hadley: waiting for a response
Dan: we got a response
Hadley: we're not against it... but compared to what? Who do we know at TC39?
Sangwhan: Dan E. is on TC39...
[Sangwhan & Hadley leave comments]
Dan: set milestone for next week and we can close then.
Discussed
Jul 1, 2022 (See Github)
Hadley: we did our piece on this - I think we're done.
Dan: I agree -
Yves: especially last comment - the clarification we think this is an important issue.
agreed to close
Collection of Screensharing-related UX Hints
Max: I sent a comment but no response...
Hadley: another way to get their attn.
Max: I also sent email to them.
agreed to close
Comment by @hadleybeeman Jul 12, 2022 (See Github)
Hi, we're just revisiting this. @xfq @r12a, I know that @ylafon has spoken to you about this over the past week. If you or @aphillips don't have any final comments in the next week, we are minded to close this. Let us know if we can do anything else to help.
Comment by @aphillips Jul 15, 2022 (See Github)
@hadleybeeman Thanks for the update. We discussed this in our teleconference yesterday (2022-07-14).
Unsurprisingly our next step is to further engage ECMA-402 (the I18N part of TC39) and together engage TC39 about encoding natural language strings with appropriate metadata. One of our goals in opening this issue was to get support (where appropriate) from TAG--probably in the form of "hey, we think this issue is worth paying attention to". Is it reasonable to expect such support? It's hard to tell what TAG's position is from the comments.
Also, does TAG have any recommendations for who to approach at TC39? We can just use the I18N folks, but perhaps a more direct engagement would be better. What can you suggest?
Comment by @cynthia Jul 19, 2022 (See Github)
@littledan is this something you could potentially help with?
Comment by @hadleybeeman Jul 19, 2022 (See Github)
To clarify, we the TAG do think this issue is worth paying attention to and hope it gets the focus it needs.
Comment by @hadleybeeman Jul 26, 2022 (See Github)
We are closing this, since it seems resolved. Please do leave a comment or send us an email when you've established contact with TC39. We are hopeful that the intro to @littledan (above) would help.
Comment by @alvestrand Aug 1, 2022 (See Github)
So is the resolution of this issue a recommendation that WEBRTC and other WGs that are awaiting guidance should do nothing until ECMA-402 has finished engaging with TC39 to find a language-appropriate solution?
Comment by @aphillips Aug 17, 2022 (See Github)
Per an action item, updating this issue.
I had a meeting with ECMA-402 (the I18N subcommittee of TC39) on 2022-08-11 and we plan to have a follow up at their next call. In addition, I am reaching out to TC39 in order to get the ball rolling there as well. Generally speaking the 402 folks are supportive, but needed some time to digest our proposal.
@alvestrand RTC and other groups can make cautious progress: in some cases guidance in String-Meta can be followed now. However, in the main, the folks who were waiting before are still waiting. I hope to have at least initial progress to report with TC39 before TPAC and I will add links here for those who need to follow along or who wish to engage in that conversation as they develop.
Comment by @domenic Aug 17, 2022 (See Github)
How is progress on https://github.com/w3ctag/design-reviews/issues/716#issuecomment-1079145578 ?
Comment by @littledan Oct 27, 2022 (See Github)
The motivation for associating a string with a language seems clear, but what's less clear to me is which APIs this should be modified to accept (or produce) such a value. So, I'm looking forward to the answer to the question @domenic asked.
In general, if we have use cases in JavaScript that would benefit from this feature (whether through direct use or as input to another JS built-in method), I'm not opposed to adding it as a built-in class in ECMA-402.
I get the feeling that the motivation for putting the string-with-metadata in WebIDL has to do with ensuring that it's maximally available for use in other specifications. Would this not also be met by putting this dictionary definition in a third specification, and making normative references to it?
(Apologies for my delay on responding to this thread. Addison and I are in touch by email and I hope to have a call with him soon.)
Opened Mar 8, 2022
صباح الخير (Good morning) TAG!
I'm requesting the TAG express an opinion on a "dispute" related to:
Explanation of the issue that we'd like the TAG's opinion on:
This isn't quite a "normal" technical dispute, but we do seek a conversation with TAG about the technical approach we are taking. We believe that interoperability of natural language strings between different Web APIs is strongly desirable.
Quoting our explainer:
Links to the positions of each side in the dispute (e.g., specific github comments):
webidl#1025
What steps have already been taken to come to an agreement:
We don't actually disagree with WebIDL. Some working groups have pushed back on our comments asking for direction metadata because of the lack of a standardized representation on the Web, such as Webauthn and Web Payments.
We'd prefer the TAG provide feedback as (please select one):
leave a comment in the following GitHub issue: i18n-discuss#23
leave review feedback as a comment in this issue and @-notify [github usernames]
open a new issue in our GitHub repo with the feedback
we would like to have a joint call with representatives of the TAG if appropriate
For our own housekeeping: [I18N-ACTION-1103]