design-reviews#771: Delta review (to CR) of Web Neural Network API

Opened Sep 9, 2022

Hi TAG!

I'm requesting a delta TAG review of the Web Neural Network API.

<details> <summary><h4>More details about this review request</h4></summary> The Web Neural Network API (WebNN API for short) is a specification for constructing and executing computational graphs of neural networks. It provides web applications with the ability to create, compile, and run machine learning networks in web browsers. The WebNN API may be implemented in web browsers using the available native operating system machine learning APIs for the best performance and reliability of results.

Explainer: https://github.com/webmachinelearning/webnn/blob/master/explainer.md
Specification URL: https://www.w3.org/TR/webnn/
Tests: mocha tests, migrating to wpt/webnn
Security and Privacy self-review: completed, see wide review tracker
GitHub repo: https://github.com/webmachinelearning/webnn
Primary contacts:
- Ningxin Hu (@huningxin), Intel, Editor
- Chai Chaoweeraprasit (@wchao1115), Microsoft, Editor
- Anssi Kostiainen (@anssiko), Intel, Chair
Organization(s)/project(s) driving the specification: participants of the Web Machine Learning Working Group
Key pieces of existing multi-stakeholder review or discussion of this specification: Web and Machine Learning workshop report and spec GH issues
External status/issue trackers for this specification:

Further details:

I have reviewed the TAG's API Design Principles
Relevant time constraints or deadlines: CR publication slated Q4 2022
The group where the work on this specification is currently being done: Web Machine Learning Working Group
The group where standardization of this work is intended to be done:
Major unresolved issues with or opposition to this specification: N/A
This work is being funded by: N/A

</details>

For the full review template, please unfold the above section ⤴️

The initial TAG review completed Oct 2021. This delta request focuses your attention on the following architectural changes and issues since the previous review:

- Naming of the sync and async methods: createContext, build and compute. The WG has considered two API naming conventions (x() + xSync() or xAsync() + x()) but was unable to reach consensus and resolved to seek a TAG recommendation. See https://github.com/webmachinelearning/webnn/issues/272
- Related to the naming issue, the WG decided to restrict the sync API to worker context only. This API complements the async API. The key use case for the sync API is to support Wasm code generators. The async API is the recommended path for mainstream use cases. We would like to hear the TAG perspective on this API split. We are aware that the worker-only sync API design is a rare exception on the web platform.
- The WG resolved to drop support for WebGL and focus on WebGPU interoperability.
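
To make the naming question concrete, here is a minimal TypeScript sketch of the two conventions side by side. Everything in it is a hypothetical stand-in rather than the WebNN IDL: `Graph`, `runGraph`, `conventionA`, `conventionB` and `double` are invented for illustration, and a "graph" is reduced to a plain function.

```typescript
// Hypothetical stand-in, not the WebNN IDL: a "graph" is just a function here.
type Graph = (input: number[]) => number[];

// Shared work used by both flavours of the API.
function runGraph(graph: Graph, input: number[]): number[] {
  return graph(input);
}

// Convention A: x() + xSync(). The unsuffixed name is asynchronous.
const conventionA = {
  compute(graph: Graph, input: number[]): Promise<number[]> {
    return Promise.resolve().then(() => runGraph(graph, input));
  },
  computeSync(graph: Graph, input: number[]): number[] {
    return runGraph(graph, input);
  },
};

// Convention B: xAsync() + x(). The unsuffixed name is synchronous.
const conventionB = {
  computeAsync(graph: Graph, input: number[]): Promise<number[]> {
    return Promise.resolve().then(() => runGraph(graph, input));
  },
  compute(graph: Graph, input: number[]): number[] {
    return runGraph(graph, input);
  },
};

// Toy graph used only to exercise the two shapes.
const double: Graph = (xs) => xs.map((x) => x * 2);
```

Under convention A a caller writes `await conventionA.compute(double, [1, 2])` on the mainstream path and opts into blocking with `computeSync`; under convention B the short name blocks and the async path carries the suffix.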

The CR publication is slated for Q4 2022, so your feedback will be most impactful if it arrives by the end of October 2022 at the latest.

We'd prefer the TAG provide feedback as:

💬 leave review feedback as a comment in this issue and @-notify @anssiko

For context, these are the related issues in the WebNN GH repo:

Discussions

Comment by @anssiko Sep 26, 2022 (See Github)

@cynthia, the first bullet "naming of the sync and async methods" is the most time critical for the WG. This is to avoid breaking changes to the mainstream parts of the API as we approach Dev Trial.

To ensure you use your review time effectively, I want to clarify we are not asking for a full (re-)review of this spec, but a recommendation on this one question. The latter two bullets are mainly for your information and to provide supporting information and context. Thank you!

Discussed Oct 17, 2022 (See Github)

Max: first, the naming of the sync and async methods. They had a discussion in their WG and it seemed they were unable to reach consensus, that's why they asked us. There are two naming proposals. The second part is to ask the TAG to review the design choice of restricting the sync API to worker only. It is a rare exception on the web platform, that's why they want to confirm with us.

Dan: they're really looking for a response on naming in particular.

Max: after looking at WG discussions, maybe first naming is more suitable. They summarized: in the future the api will be async so it's not necessary to .. the default behaviour should be async in the future. So the first..

Dan: Sangwhan should also weigh in. Could you have that discussion with Sangwhan to get feedback, then if you have consensus you could leave that comment.

[come back to this at plenary]

Comment by @torgo Oct 18, 2022 (See Github)

Hi @anssiko just to let you know we're working on getting you feedback here this week.

Discussed Oct 24, 2022 (See Github)

Max: Sangwhan says he has comments but hasn't responded yet. Will follow up.

Hadley: hopefully Sangwhan can respond before plenary

Sangwhan: they want us to choose option 1 or 2, and i'm not sure about either. 1 is all async, and 2 is sync and add async as a suffix. both are ugly. if it's inevitable in the platform, we should have a principle for it. Seems there is an implementation detail here being unnecessarily exposed to the user. imagine you locked everything down to be sync or async.. you'd have to change the paradigm. I wanted to ask Domenic, but I feel like both approaches are not great. If you think about making an async message sync, there is a semantic for that already. I'm sure they considered it.

what do others think about putting sync or async at the end of every API call?

Amy: why is the wg asking the TAG?

Sangwhan: they couldn't come to consensus

Amy: have they shared their minutes where they show their thinking?

Sangwhan: I read some....

Amy: I suppose a TAG perspective would be about consistency on the web platform?

Sangwhan: there is nothing to be consistent with.

Amy: so it would be setting a precedent

Yves: we usually try to move away from sync calls. Better for performance.

Sangwhan: yes. if we had to choose, it would be async by default and sync by exception. but do we want these patterns?

Yves: TC39 might have a voice?

hadley: what's the best way to take this forward? a call with them? including tc39?

sangwhan: we lost our TC39 liaison, so i'm going to message Dan Ehrenberg to see what we can do.

hadley: okay, then do you want to pick a time with breakout C for that call?

sangwhan: may be us/APAC. if it's the same liaison. Let me check with times and see what we can do.

But given that there is an await semantic, i feel like putting it into the API naming is not great. if we're going to do it, we should write a principle about it.

Both seem to be ugly. Not sure what others think.

Yves: it's not nice, but it may be the lesser evil.

sangwhan: there is a gap, in await, that we're missing that should probably be addressed. rather than stick sync on everything.

yves: that's why it's better to bring in TC39.

hadley: so when we do have this call with TC39 and the authors, that will be the same question?

sangwhan: xSync() vs await x() should semantically be similar things, but there is clearly a gap here.

Sangwhan: I'll report back on who should be involved, and will make this call happen.

Comment by @anssiko Oct 27, 2022 (See Github)

@torgo thank you for keeping us updated. The WG is slated to discuss this on 2022-11-03.

Comment by @littledan Oct 27, 2022 (See Github)

I'm wondering, what's the motivation for the sync API? I think async APIs should be friendly to WebAssembly in the near future, given that WebAssembly JavaScript Promise Integration reached Phase 3 in the Wasm CG.

Comment by @anssiko Oct 27, 2022 (See Github)

@littledan this was also brought up by @domenic in https://github.com/webmachinelearning/webnn/issues/272#issuecomment-1231249752, more context there.

Do we have an estimate when this integration might ship in multiple engines?

Comment by @littledan Oct 27, 2022 (See Github)

Sorry, I don't have any information about timelines here; maybe @fgmccabe does

Comment by @fgmccabe Oct 27, 2022 (See Github)

Probably the next engine to implement JSPI will be Firefox. However, that will likely not be before sometime next year; they have a lot of other things to work on.

Discussed Oct 31, 2022 (See Github)

Sangwhan: there is work in ECMA to make this less terrible, but it is a year out until we can use it. If we ask them to follow that pattern, they will have to wait a year to ship. Async by default and sync as explicit will be the less terrible default. That should be the recommendation we push for. It would be nice if things could be done in an idiomatic way. I will also draft a principles section on how to deal with this - and also note that once there is an idiomatic pattern in place, they should consider following that pattern.

Sangwhan: files issue in design principles

Dan: would this be a temporary design principle until TC39 bakes this into js?

Sangwhan: correct. Might be longer than temporary. So feedback on this issue is to recommend the less terrible option.

Discussed Nov 14, 2022 (See Github)

Dan: bump to plenary

Amy: they have this issue about naming

Discussed Nov 28, 2022 (See Github)

Dan: no new information. Will ask Sangwhan.

Comment by @cynthia Nov 30, 2022 (See Github)

Apologies this took so long, especially given the small change. Given that the implementation support to do this nicely is currently not ready, we discussed this at length and came to a conclusion (albeit somewhat distant from unanimous) that the following tradeoff would probably be acceptable (while not ideal).

- x() + xSync() as conventions - we will add a design principle on this.
- Assuming xSync() would block the main thread for longer than we would likely be comfortable with, keeping it limited to worker should at least prevent inappropriate use degrading user experience.
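
As a rough illustration of that tradeoff, the TypeScript sketch below exposes the blocking method only when the API object is created for a worker. All names are hypothetical (`AsyncApi`, `WorkerApi`, `infer`, `createApi`), and the `inWorker` flag is an assumption standing in for however the platform actually decides exposure; in a real specification this would be expressed in the interface definitions, not as a runtime parameter.

```typescript
// Hypothetical shapes for illustration; these are not the WebNN interfaces.
interface AsyncApi {
  compute(input: number[]): Promise<number[]>;
}
interface WorkerApi extends AsyncApi {
  computeSync(input: number[]): number[];
}

// Placeholder for real inference work.
function infer(input: number[]): number[] {
  return input.map((x) => x + 1);
}

// `inWorker` is an assumed flag; it stands in for the platform's exposure rules.
function createApi(inWorker: boolean): AsyncApi | WorkerApi {
  const base: AsyncApi = {
    compute: (input) => Promise.resolve().then(() => infer(input)),
  };
  if (!inWorker) {
    // Main thread: the blocking entry point does not exist at all.
    return base;
  }
  // Worker: the async method plus the explicitly suffixed blocking one.
  return { ...base, computeSync: (input: number[]) => infer(input) };
}
```

With this shape, main-thread code cannot call `computeSync` by accident because the property is absent, while worker code (for example, output of a Wasm code generator) can feature-test with `"computeSync" in api` before taking the blocking path.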

That said, we would like to see a migration path forward once WASM Promise integration lands in multiple implementations. Thank you for your patience, and we're excited to see this work move forward.

Comment by @anssiko Nov 30, 2022 (See Github)

Thank you @cynthia & TAG on behalf of the WG. Both the spec and the WIP implementation now follow this convention.

Comment by @anssiko Jan 26, 2024 (See Github)

Hi again TAG!

NB: I'm piggybacking on this issue to retain context, but please let me know if I should file a new issue instead. On behalf of the WG I hope the TAG is happy to see these changes and look forward to your comments.

We're looking to publish a new CR Snapshot of the Web Neural Network API in Q1'24 and wanted to give you a heads up with the following high-level summary of changes for your information and review:

Since the initial Candidate Recommendation Snapshot the Working Group has gathered further implementation experience and added new operations and data types needed for well-known transformers to support generative AI use cases. In addition, the group has removed select features informed by this implementation experience: higher-level operations that can be expressed in terms of lower-level primitives in a performant manner, and support for synchronous execution. The group has also updated the specification to use modern authoring conventions to improve interoperability and precision of normative definitions and is developing a new feature, a backend-agnostic storage type, to improve performance and interoperability between the WebNN and WebGPU APIs and purpose-built hardware for ML.

You are probably happy to see we're removing support for synchronous execution per your guidance (removal discussed in https://github.com/webmachinelearning/webnn/issues/531, we expect to land this change ahead of this publication) and moving toward JSPI, which is finally coming.

Edit: fix link to https://github.com/webmachinelearning/webnn/issues/531