#514: Client-side video editing (MediaBlob)


Opened May 19, 2020

Hello TAG!

I'm requesting a TAG review of Client-side video editing (MediaBlob).

MediaBlob is a high-level API that enables video editing in the browser. It is meant to replace two workarounds that web developers currently rely on for editing media, each of which carries its own overhead:

  1. Use external JavaScript/WebAssembly libraries, often several MB in size (costs bandwidth and may not be efficient)
  2. Server roundtrip (affects both time and bandwidth)

The API introduces trim, concatenation, and split operations on media files.

Further details:

  • I have reviewed the TAG's API Design Principles

  • The group where the incubation/design work on this is being done (or is intended to be done in the future): WICG

  • The group where standardization of this work is intended to be done ("unknown" if not known): unknown

  • Major unresolved issues with or opposition to this design: None yet

We'd prefer the TAG provide feedback as:

🐛 open issues in our GitHub repo for each point of feedback

Discussions

Comment by @kenchris May 27, 2020 (See Github)

What is the relationship to WebCodecs? Would it be possible to implement this API client-side on top of it, for instance as a polyfill/library, or do you foresee performance issues with that approach?

The API does seem quite user friendly, but I wonder if it is flexible enough, as some of the use cases seem to overlap with WebCodecs.

Information about the above would be great to add to the explainer.

Discussed Jun 1, 2020 (See Github)

Ken: There is overlap between this and WebCodecs. I have been following the discussions on blink-dev. Looking at this API, it seems nice to me; it seems like a good API. But it was pointed out on blink-dev that these are not the most common use cases, and another comment people have made is that it doesn't scale.

Tess: The lesson from that is layering. This captures common use cases, and those should be easy. If it can be defined in terms of the lower-level stuff, then that's good. If the convenient stuff works for you, then great.

Rossen: It does hit and address the most common cases, which are ~80% of what people need. Where the struggle comes in (it was one of the issues that was raised) is that one of the main weaknesses of the API is splicing differently encoded and different types of media... The API becomes lossy for that: it has to renormalize to something that loses the original quality. On the flip side, the [MS] team has run experiments with real media and a number of partner use case verifications, and the results were overwhelmingly positive. Today this is very painful. In the most common use cases, e.g. trimming a video, it helps.

Ken: Would it be possible to show a slider with frames?

Rossen: Scrubbing? I think that is possible through the video element and its API. The point is what happens next. Today you send it back to the server. There are some JS libraries, but they have performance cliffs. As a general principle, I'm aligned with what Tess said about reducing developer pain.

Ken: Some examples are missing for how to tie this together with playback. It seems like something very separate from playback. As it's supposed to be simple, some examples of how to tie it together with playback would be useful.

Rossen: Yes, the main selling point is ergonomics. Showcasing an end-to-end authoring case would be good. I will add the comment. Push a couple of weeks?

Ken: O

Comment by @kenchris Jun 9, 2020 (See Github)

Related to my comment above, there is a similar discussion on the blink-dev mailing list:

https://groups.google.com/a/chromium.org/forum/#!searchin/blink-dev/mediablob%7Csort:date/blink-dev/3eac-HVygFY/_BBjNckXBQAJ

Comment by @palemieux Jun 9, 2020 (See Github)

As discussed at https://github.com/WICG/video-editing/issues/13 and as currently specified, the API introduces inefficiencies and generational loss in even simple cases. Refactoring the API to be playlist-centric would address these issues and simplify future extensions.

Comment by @atanassov Jun 15, 2020 (See Github)

One of the strengths of the API is its ergonomics for developers. Further, one of the common scenarios it addresses well can be thought of as: playback > seek > trim > playback. Can you add a more complete example in the explainer that shows how developers will achieve this?

@ykh015, there is an active discussion on blink-dev https://github.com/w3ctag/design-reviews/issues/514#issuecomment-641124322, can you summarize what the next steps are for this API?

Comment by @kenchris Jun 15, 2020 (See Github)

Can you add a more complete example in the explainer that shows how developers will be achieving this?

For instance, if I am trimming a video stream, I want to be able to seek through the stream (playback) and then, when I trim, be able to easily update the UI.
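One way the "trim, then update the UI" step might look, without any MediaBlob API at all, is to re-point the player at the same source with a temporal media fragment (`#t=start,end`). This is only a sketch: `trimPreviewUrl` is a hypothetical helper, not part of the proposal, and it assumes the browser honours temporal media fragments for the source in question. It previews the trimmed range; it does not produce a new file.

```typescript
// Hypothetical helper (not part of MediaBlob or any browser API).
// Builds a URL with a temporal media fragment so that a <video> element
// plays only the range [startSec, endSec] of the source.
function trimPreviewUrl(src: string, startSec: number, endSec: number): string {
  const valid =
    Number.isFinite(startSec) &&
    Number.isFinite(endSec) &&
    startSec >= 0 &&
    endSec > startSec;
  if (!valid) {
    throw new RangeError("expected finite values with 0 <= startSec < endSec");
  }
  // Drop any fragment already on the URL before appending the new one.
  const base = src.split("#")[0];
  return `${base}#t=${startSec},${endSec}`;
}

// Assumed usage in a page, where `video` is an HTMLVideoElement:
//   video.src = trimPreviewUrl(video.currentSrc, 12, 30);
//   video.play();
```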

Comment by @ykh015 Jun 15, 2020 (See Github)

After the discussion on the Intent to Prototype (i2p) thread, it was concluded that demuxing and muxing can be handled efficiently by JS libraries and need not be provided by the browser. WebCodecs already provides transcoding, and when it is combined with external JS libraries, media editing scenarios can be addressed.

On providing a more ergonomic and direct API for video editing, we are re-evaluating the currently proposed model of using MediaBlobOperation and looking into a playlist-style model, as discussed in this issue.
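To make the playlist idea concrete, here is a minimal sketch of what such a model could look like. All names (`Clip`, `Playlist`, `concat`, `trim`, `split`) are hypothetical and are not taken from the MediaBlob explainer or any browser API. The assumption is that an edit only rewrites a list of time ranges into the original sources, so nothing is decoded or re-encoded until (and unless) the result is exported.

```typescript
// Hypothetical data model for illustration only.
interface Clip {
  source: string; // identifier of the original media (e.g. a URL)
  start: number;  // range start within the source, in seconds
  end: number;    // range end within the source, in seconds
}
type Playlist = readonly Clip[];

// Total playing time of a playlist.
function duration(p: Playlist): number {
  return p.reduce((sum, c) => sum + (c.end - c.start), 0);
}

// Concatenation is just list concatenation; sources may differ in codec.
function concat(a: Playlist, b: Playlist): Playlist {
  return [...a, ...b];
}

// Keep only the part of the playlist timeline between `from` and `to`.
function trim(p: Playlist, from: number, to: number): Playlist {
  const out: Clip[] = [];
  let offset = 0; // playlist time at which the current clip begins
  for (const c of p) {
    const len = c.end - c.start;
    const lo = Math.max(from, offset);
    const hi = Math.min(to, offset + len);
    if (hi > lo) {
      out.push({
        source: c.source,
        start: c.start + (lo - offset),
        end: c.start + (hi - offset),
      });
    }
    offset += len;
  }
  return out;
}

// Split the playlist timeline at `at` into two playlists.
function split(p: Playlist, at: number): [Playlist, Playlist] {
  return [trim(p, 0, at), trim(p, at, duration(p))];
}
```

Because every operation returns another list of ranges, edits can be chained or undone without touching the media data, which is the property the playlist-centric suggestion relies on to avoid generational loss.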

Discussed Jul 1, 2020 (See Github)

Ken: This touches on the same area as WebCodecs. They are looking at how to build this on top of WebCodecs, so there is a hard dependency. This API would be a more user-friendly API running on top of WebCodecs.

Ken: Propose close.

Sangwhan: Seconded.

[closed]

Comment by @palemieux Jul 28, 2020 (See Github)

@kenchris What does the propose closing label mean and why was it applied?

Discussed Aug 1, 2020 (See Github)

Alice: I see this is marked propose close...

Ken: Yeah, until they propose something new with WebCodecs as the base primitive. Now this is about providing convenience APIs, which they say they're discussing in another issue.

Yves: Could this be related to Insertable Streams..?

Ken: I don't know.

Yves: There is no requirement to have the streams in real time... you could stop, do some editing, and continue. One of the goals of Insertable Streams was to allow things like adding backgrounds.

Ken: Seems anyway like this won't progress until there is this new thing...

Yves: Should we ask if they still need the review...?

Ken: Yeah, I'll write something.

Comment by @cynthia Aug 18, 2020 (See Github)

What does the propose closing label mean and why was it applied?

Propose closing generally means "we are done-ish with the review and we'll discuss at our plenary meeting and close if everyone is happy with the outcome of the review". This isn't always the case, but covers about 80% of the times we use that label.

In this case, it seems like the label was used because the proposal will need to be re-worked and we'd like to review when the new proposal is out.

Comment by @kenchris Aug 18, 2020 (See Github)

We propose closing as this doesn't seem to be moving forward in its current form and we need to clean our backlog. You can always reopen when you have something new or file another review issue.