#1189: Incubation: Web Speech API: On-Device Recognition Quality

Opened Feb 2, 2026

Explainer

https://github.com/WebAudio/web-speech-api/blob/main/explainers/quality-levels.md

Where and by whom is the work being done?

  • GitHub repo: https://github.com/WebAudio/web-speech-api/
  • Primary contacts: Evan Liu (evliu@google.com), Google, Author
  • Organization/project driving the design: Google Chrome
  • This work is being funded by: Google
  • Incubation and standards groups that have discussed the design: Audio WG, TPAC 2025
  • Standards group(s) that you expect to discuss and/or adopt this work when it's ready: Audio WG

Feedback so far

You should also know that...

Summary: Extends the SpeechRecognition interface by adding a 'quality' property to SpeechRecognitionOptions. This allows developers to specify the semantic capability required for on-device recognition (via processLocally: true). The proposed quality enum supports three levels: 'command', 'dictation', and 'conversation'.
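As a rough illustration of the shape the summary describes, the sketch below models the options as plain TypeScript types. Only `quality`, `processLocally`, and the three level names come from the summary. The type names, the `langs` field, and the helper functions are assumptions made for illustration, and the rule that `quality` only matters when `processLocally` is true is one reading of the summary, not spec text.

```typescript
// The three levels named in the summary. The type name is an assumption.
type SpeechRecognitionQuality = "command" | "dictation" | "conversation";

// Hypothetical model of the options dictionary; not copied from the spec.
interface SpeechRecognitionOptionsSketch {
  langs: string[];                     // assumed field: requested languages
  processLocally?: boolean;            // on-device recognition when true
  quality?: SpeechRecognitionQuality;  // proposed addition
}

const QUALITY_LEVELS: readonly SpeechRecognitionQuality[] = [
  "command",
  "dictation",
  "conversation",
];

// Type guard: true only for one of the three proposed level names.
function isQuality(value: string): value is SpeechRecognitionQuality {
  return (QUALITY_LEVELS as readonly string[]).indexOf(value) !== -1;
}

// Assumption: the summary ties `quality` to on-device recognition, so this
// sketch treats it as meaningful only when `processLocally` is true.
function qualityApplies(options: SpeechRecognitionOptionsSketch): boolean {
  return options.processLocally === true && options.quality !== undefined;
}
```

Under these assumptions, `qualityApplies({ langs: ["en-US"], processLocally: true, quality: "dictation" })` is true, while the same options without `processLocally` give false.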

Specification URL: https://webaudio.github.io/web-speech-api

<!-- Content below this is maintained by @w3c-tag-bot -->

Track conversations at https://tag-github-bot.w3.org/gh/w3ctag/design-reviews/1189

Discussions

Discussed Feb 2, 2026 (See Github)

Christian and Matthew are assigned.

Discussed Feb 9, 2026 (See Github)

Christian: Marcos mentioned this has the same issue as the Prompt API: you can query whether a model suitable for a certain level/language is present on the device, which could be a fingerprinting vector. I want to recommend they look at the Prompt API review.

Ehsan: Isn't this the same problem with all language based models? Maybe we can have a consistent answer.

Lola: Do you suggest we should have a document on language-based models?

Ehsan: That would be my suggestion. Come up with a document that describes all of that. Should be done at the WebML groups. Think this is coming up more often.

Marcos: It’s a more general problem of downloading system components which then you can query, because then they become global.

Lola: To make this even more general, should we have a position on downloading system components? Or is this restricted to this use case?

Marcos: No, could be related to everything. Codecs, etc. Should be a design principle.

Lola: Who would be willing to write that? We also have another plenary before the F2F.

Christian: Could offer to do that, would be my first design principle, and it's a topic I’m interested in.

Ehsan: Same here. Would be good to have a more experienced TAG member on that as well.

Lola: Design principles is owned by Jeffrey, so we can talk to him about that.

Discussed Feb 16, 2026 (See Github)

Christian: We asked a question here. There is the fingerprintability concern. We also talked about this when we reviewed the Prompt API: you can basically download AI models to the system as a global component, and through the choice of languages and other options, it becomes fingerprintable.

That was our first reaction. The Web Speech API with local processing is already there, it exists. It can already download AI models, and the previous TAG was satisfied with concerns on that. So now they are just adding the quality level, which is a minor change but adds to the fingerprintability.

We are now waiting for a response.

Hadley: Maybe if we don't hear back next week, let's nudge?

Christian: Sure.

Comment by @christianliebel Feb 19, 2026 (See Github)

Hi @evanbliu, thank you for your proposal.

We have one question regarding privacy: Could an attacker fingerprint the user’s browsing history by installing certain or rare languages along with model qualities on site A, and checking for the availability of those permutations on site B? And if so, how is that fingerprinting concern mitigated?

It would be great if you could add a security & privacy questionnaire and answer that question.
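The concern in the question above can be sketched as a toy simulation. It assumes a single device-wide model store visible to every site, with no mitigation applied. The `GlobalModelStore` class, its `install` and `isAvailable` methods, and the probe list are hypothetical stand-ins for illustration, not Web Speech API calls.

```typescript
type Quality = "command" | "dictation" | "conversation";

// Hypothetical device-wide store: every site sees the same installed set.
class GlobalModelStore {
  private installed = new Set<string>();

  install(lang: string, quality: Quality): void {
    this.installed.add(`${lang}/${quality}`);
  }

  isAvailable(lang: string, quality: Quality): boolean {
    return this.installed.has(`${lang}/${quality}`);
  }
}

// Rare language/quality permutations two colluding sites agree on (illustrative).
const PROBES: Array<[string, Quality]> = [
  ["is-IS", "command"],
  ["mt-MT", "dictation"],
  ["cy-GB", "conversation"],
];

// Site A encodes an identifier: bit i set means "install PROBES[i]".
function siteAWrite(store: GlobalModelStore, id: number): void {
  PROBES.forEach(([lang, quality], i) => {
    if ((id >> i) & 1) store.install(lang, quality);
  });
}

// Site B recovers the identifier by checking which permutations are present.
function siteBRead(store: GlobalModelStore): number {
  return PROBES.reduce(
    (id, [lang, quality], i) =>
      store.isAvailable(lang, quality) ? id | (1 << i) : id,
    0,
  );
}
```

In this model each extra language/quality permutation adds one bit that site B can read back, which is why an added quality dimension enlarges the space an unmitigated store would expose.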

Discussed Feb 23, 2026 (See Github)

(Christian/added before the meeting: This is pending external feedback.)

Comment by @evanbliu Mar 4, 2026 (See Github)

Hi @christianliebel,

Thanks for raising this! It's worth noting that this proposal doesn't actually introduce on-device speech recognition, as it's already part of the existing spec. The same fingerprinting concerns you brought up are already present for current on-device speech recognition and are mitigated by the countermeasures detailed in this PR: https://github.com/WebAudio/web-speech-api/pull/165. These mitigations are based on those developed for the Writing Assistance APIs (https://webmachinelearning.github.io/writing-assistance-apis/).

Let me know if you have any concerns or questions!

Discussed Mar 16, 2026 (See Github)

Christian: This is related to the Global Browser Component topic we discussed at our F2F. The previous TAG said satisfied, or satisfied with concerns, but not for the basic local Web Speech API. Now they want to extend that local API with a quality property, the level of speech recognition quality. Marcos looked at the pull request. It's hard to say anything other than satisfied with concerns because we already said that.

Marcos: Perfect summary.

Christian: Shall we close it as satisfied with concerns, with the concern being fingerprinting?

Matthew and Marcos: Yes.

Discussed Mar 30, 2026 (See Github)

Christian: We feel this will be satisfied with concerns. We asked them to add fingerprintability information, but this was ignored. They said it's been covered already, but adding the quality dimension adds a substantial fingerprinting vector.

Marcos: We can see the problem again where we're closing issues but then have no mechanism to follow up - e.g. Christian said it would be nice to see certain things. Maybe we should file bugs relating to those concerns? Then they can't avoid the feedback we give them.

Lola: We've spoken about this in relation to other issues.

Matt: We have a process for what you just described, Marcos. We can see in one place when groups have closed issues, if they've addressed resolutions, etc. We have it and we could use it.

Christian: Should we try out that process here?

Matt: I'm working on guidance for how to do this. I'm happy to take that and make it applicable to the TAG, but docs are needed. Happy to take it on and work with chairs.

Lola: Let's write the closing comment for this, and then figure out with chairs, Matthew, and anyone else, how to track things.

Christian: Let me know your thoughts on the proposed comment.