#948: Web Translation API


Opened Apr 24, 2024

Hello, TAG!

I'm requesting a TAG review of Web Translation API.

Browsers are increasingly offering language translation to their users. Such translation capabilities can also be useful to web developers. This is especially the case when the browser's built-in translation abilities cannot help, such as:

  • translating user input or other interactive features;
  • pages with complicated DOMs which trip up browser translation;
  • providing in-page UI to start the translation; or
  • translating content that is not in the DOM, e.g. spoken content.

To perform translation in such cases, web sites currently have to either call out to cloud APIs, or bring their own translation models and run them using technologies like WebAssembly and WebGPU. This proposal introduces a new JavaScript API for exposing a browser's existing language translation abilities to web pages, so that if present, they can serve as a simpler and less resource-intensive alternative.
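
The "if present" fallback described above can be pictured as follows. This is a minimal TypeScript sketch, not the proposal's API: `TranslatorLike`, `TranslatorFactory`, `chooseTranslatorFactory`, and `cloudFallback` are all hypothetical names introduced for illustration.

```typescript
// Hypothetical interfaces for illustration; the proposal's real names may differ.
interface TranslatorLike {
  translate(text: string): Promise<string>;
}

type TranslatorFactory = (
  sourceLanguage: string,
  targetLanguage: string
) => Promise<TranslatorLike>;

// Prefer the browser's built-in translator when the page can find one;
// otherwise use whatever the site already ships (cloud API, WebAssembly model).
function chooseTranslatorFactory(
  builtIn: TranslatorFactory | undefined,
  fallback: TranslatorFactory
): TranslatorFactory {
  return builtIn ?? fallback;
}

// Stand-in fallback that only tags the text, so the sketch runs anywhere.
const cloudFallback: TranslatorFactory = async (source, target) => ({
  translate: async (text) => `[cloud ${source}>${target}] ${text}`,
});
```

A page would pass `undefined` as `builtIn` when feature detection fails, and would then get `cloudFallback` back unchanged.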

Further details:

Discussions

2024-05-13

Minutes

We didn't get to this one.

2024-05-13

Minutes

This will come up in Breakout C later.

Matthew: concern about how users will understand the use of automated translation - will it be obvious to users that it's automated, so they know there are risks? Also don't want to encourage the repeated use of ML instead of professionally done translations, but understand that is not always possible. (Similar example: captions on videos for accessibility - helpful, but the ideal is always going to be to produce captions, rather than have ML work on them, for accuracy, inclusion, and sustainability.)

Max: If the device doesn't have the requisite hardware (e.g. GPU) the UX could be downgraded - important to consider this.

2024-07-29

Minutes

Matthew: we need Lea's input on API shape... however I have several thoughts...

  • First one is they did talk about a potential extension - running on device only or going to the cloud. That could be important. I think that's important. They haven't done research with developers - would be good for them to see what developers think. It strikes me in some use cases that would need to be controlled.
  • Don't know if they've talked with I18N WG - specifically about their issue 11 on specificity - BCP codes, etc... I18N would have thoughts on that. Also streaming input support.
  • They're namespacing all the functions under ai
  • I have a general concern around efficiency and accuracy. Concerned that developers would rely on this rather than doing translation for static assets as part of their development process. Some non-normative encouragement to not do that maybe?
  • Download progress reporting ... downloading the language support ... they think the web app should display the progress for that. Shouldn't that be part of the UA's UI?
  • Also the API used signals. Using signals to see if download has been aborted, etc...

Peter: I think the signal - abort signal - is ok.

Lea: yes.
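
The abort-signal pattern mentioned here can be sketched with the standard `AbortController`. This is a minimal TypeScript sketch, assuming a hypothetical `simulateDownload` helper that stands in for a language-pack download; it is not the proposal's API, and the chunked progress model is an assumption.

```typescript
interface DownloadResult {
  loaded: number;   // chunks received before finishing or aborting
  aborted: boolean; // true if the signal stopped the download early
}

// Hypothetical stand-in for a language-pack download. It checks the
// signal before each chunk and reports progress after each chunk.
function simulateDownload(
  totalChunks: number,
  signal: AbortSignal,
  onProgress: (loaded: number, total: number) => void
): DownloadResult {
  let loaded = 0;
  while (loaded < totalChunks) {
    if (signal.aborted) {
      return { loaded, aborted: true };
    }
    loaded += 1;
    onProgress(loaded, totalChunks);
  }
  return { loaded, aborted: false };
}

// Usage: the page gives up once two of five chunks have arrived.
const controller = new AbortController();
const result = simulateDownload(5, controller.signal, (loaded) => {
  if (loaded === 2) controller.abort();
});
```

Whether the progress callback belongs to the page or to the UA's own UI is the open question raised in the bullet list above; the sketch only shows how a signal lets the caller cancel.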

Dan: cloud vs on device...

Matthew: at the moment it's how the browser wants to implement it. so it could be cloud based or on-device.

Lea: the author should be able to say "if you can translate this without going out to the network, fine"

Tess: it's about fetching language models... The translation still might be fast or slow...

Matthew: i think in some cases they are talking about it going out to the cloud to do the translation... I think we need to check up on that. So we could have: fast-local; slow-local; or remote.

Tess: the slow-local case could also be a privacy issue...

Lea: what's to prevent a web page from iterating over all the languages?

Tess: throttling in the implementation...

Lea: also to be clear: the need is there and I'm excited to see this. would be nice if there were an actual list of use cases...

Dan: UI elements or content or both?

Lea: both...

  • Why a whole different object to construct instances? A static method on AITranslator would also have the advantage of making it clear what objects are constructed. Same for capabilities()
  • I wonder if it would make sense to have a single object, rather than a language detector and a translator, as the functionality seems quite related and often language detection is the first step.
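
The static-factory shape suggested in the bullets above might look like the following. This is a minimal TypeScript sketch under stated assumptions: `SketchTranslator`, `LanguagePair`, `canTranslate`, and the toy list of supported pairs are hypothetical, and the `translate` method only tags its input rather than translating it.

```typescript
type LanguagePair = { sourceLanguage: string; targetLanguage: string };

class SketchTranslator {
  // Hypothetical language pairs this toy implementation pretends to support.
  private static readonly supported = new Set(["en>ja", "ja>en"]);

  // Private constructor: instances only come from the async static factory,
  // so it is obvious which class the factory produces.
  private constructor(readonly pair: LanguagePair) {}

  // Capability check lives on the same class as the factory.
  static canTranslate(pair: LanguagePair): boolean {
    return SketchTranslator.supported.has(
      `${pair.sourceLanguage}>${pair.targetLanguage}`
    );
  }

  static async create(pair: LanguagePair): Promise<SketchTranslator> {
    if (!SketchTranslator.canTranslate(pair)) {
      throw new Error(
        `Unsupported pair: ${pair.sourceLanguage}>${pair.targetLanguage}`
      );
    }
    // Stand-in for asynchronous setup such as loading a language pack.
    await Promise.resolve();
    return new SketchTranslator(pair);
  }

  // Placeholder: tags the text instead of translating it.
  async translate(text: string): Promise<string> {
    return `[${this.pair.sourceLanguage}>${this.pair.targetLanguage}] ${text}`;
  }
}
```

Keeping the factory and the capability check on one class means authors only need to learn a single entry point.
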
<blockquote>

We agree with and support the user need. Here are our thoughts...

  1. It would be good to have a list of use cases. We could think of some from our own experience, but they may be different than the ones you had in mind. Having an explicit list of use cases ensures that everyone is on the same page.

  2. Please continue chatting with the I18N WG folks about issue 11, and streaming input support.

  3. We're concerned about the use of the network. Specifically, use of the network to download a model, or use of the network to actually perform the translation, could introduce both delay and privacy issues. Is it possible for the developer to specify: "only do this if network access is not required"? We feel that differentiation between fast-local, slow-local (i.e. with download), and remote/cloud-based cases is important for MVP.

  4. We loved the approach you propose to partitioning, and using a fake download, to mitigate fingerprinting!

  5. We recommend a translation-specific namespace instead of ai.

  6. Why is a separate namespace needed at all? We understand these objects are not constructible due to the asynchronicity, but since they are creating instances of the same class, making this obvious by adding the factory as a static method of this class seems more consistent with precedent. Same for the capabilities() method, we don't understand why this needs to live in a different namespace, and we think that the more objects this API is spread across, the harder it will be for authors to understand how the different parts fit together.

  7. We think there should be a prominent note encouraging developers to make use of professional translation of pre-existing content rather than doing automatic translation wherever they can - for both accuracy and efficiency reasons.

  8. It seems to make more sense, and help simplify the API and alleviate some privacy concerns if the UA renders the download progress bar.

  9. We did wonder if it would make sense to have a single object for the detection and translation, since they are so related (and often detection is the first step to translation). Was this direction explored?

</blockquote>
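
One way to picture the fast-local / slow-local / remote distinction from point 3 is a small policy check. This is a minimal TypeScript sketch: the three tier names come from the feedback above, while `NetworkPolicy`, the `"unavailable"` tier, and `isAllowed` are hypothetical and not part of the proposal.

```typescript
// Tier names from the feedback, plus a hypothetical "unavailable" case.
type Availability = "fast-local" | "slow-local" | "remote" | "unavailable";

// Hypothetical author-facing setting:
//   "no-network"  = only translate if nothing has to be fetched
//   "download-ok" = a one-time language-pack download is acceptable
//   "any"         = sending text to a remote service is acceptable
type NetworkPolicy = "no-network" | "download-ok" | "any";

function isAllowed(availability: Availability, policy: NetworkPolicy): boolean {
  switch (availability) {
    case "fast-local":
      return true; // already on device, no network involved
    case "slow-local":
      return policy !== "no-network"; // needs a download first
    case "remote":
      return policy === "any"; // text leaves the device
    case "unavailable":
      return false;
  }
}
```

Under this sketch, an author who asks for `"no-network"` only ever gets the fast-local case, which matches the "only do this if network access is not required" request.
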
2024-08-05

Minutes

<blockquote>

Hi Domenic - Our feedback that we sent above stands, particularly regarding the name space of this API. We don't think it belongs in a .ai namespace. We think that the name spaces should be function-led rather than technology-used-to-deliver-the-function-led. Your proposal under point 6 (AITranslatorCapabilities) is a step forward but we would prefer Translator.capabilities(). We also think your response to point 3 of our feedback is fine. We also would strongly encourage that the spec include a non-normative note encouraging the use of professional translation services where possible (point 7 of our original feedback). Please consider this. We're going to close as satisfied with concerns due to these concerns. Given the current state of the spec, and lack of venue, we understand this to be an early review. We expect that addressing the points above will be important for further standardization. ✨

</blockquote>

Issue closed with sparkles.

2024-08-05

Minutes

Matthew: some additional feedback I've been catching up with... especially about the AI namespace... But there is other stuff they want to put in this AI namespace... the concern is : if they're going to put something in a namespace shouldn't it be problem oriented rather than technology used to solve the problem...

Dan: I'm not anti-LLMs in the browser. Say a browser used a different technology besides AI based to do translation? Then it wouldn't make sense anymore?

Peter: if they want to expose an LLM at a low level then call it an LLM...

Peter: if they are making a new common API pattern then let's make sure it makes sense for everyone... and put it in our Design Principles document...

Dan: propose we say something like

<blockquote>

Hi Domenic - Our feedback that we sent above stands, particularly regarding the name space of this API. We don't think it belongs in a .ai namespace. We think that the name spaces should be function-led rather than technology-used-to-deliver-the-function-led.

</blockquote>

Matthew: a little more discussion to go... I'm really concerned about point 7 from our original feedback. But I'd like to see if we have consensus about this. It's related to what I've seen in Accessibility. The one thing we can do is set the tone for the discussion here... W3C is also working on this AI and the web document...

Dan: I agree.

Dan: should we bring up the AI and web document in this discussion?

Matthew: they have been requesting feedback since the AC meeting...

Matthew: we should keep an eye on it and see if Lea can respond...