#541: jxl Content-Encoding

Visit on Github.

Opened Jul 30, 2020

Saluton TAG!

I'm requesting a TAG review of "jxl Content-Encoding".

One of the features of JPEG XL is byte-wise lossless JPEG image repacking. On average, an encoded image is 22% smaller. We propose it as a new HTTP "Content-Encoding".

Further details:

  • I have reviewed the TAG's API Design Principles
  • Relevant time constraints or deadlines: 2020Q3
  • The group where the work on this specification is currently being done: ?
  • The group where standardization of this work is intended to be done: ?
  • Major unresolved issues with or opposition to this specification: none
  • This work is being funded by: ?

We'd prefer the TAG provide feedback as: 🐛 open issues in our GitHub repo for each point of feedback

Discussions

Discussed Aug 1, 2020 (See Github)

Yves: I made a comment on the issue... It's not ideal to have a content encoding for only one media type, but it has been done in the past. Proposed close: satisfied.

Dan: propose close and we can close at the plenary

Comment by @annevk Aug 4, 2020 (See Github)

Is this not specific to JPEG despite the name? If it is specific to JPEG it seems a bit weird to use Content-Encoding for this.

Comment by @eustas Aug 4, 2020 (See Github)

The format could be used for encoding any type of data, so it fits the definition of an HTTP Content-Encoding. It is unlikely that its compression ratio will surpass that of gzip / brotli unless the content is a JPEG file.

The volume of JPEG traffic is larger than the total volume of HTML, CSS, and JS traffic. The jxl Content-Encoding would cover the weaknesses of gzip and brotli for that kind of traffic.

Comment by @ylafon Aug 13, 2020 (See Github)

In the specification, I see this: If the codestream starts with bytes {0xFF, 0xD8, 0xFF, 0xE0}, the decoder shall decode a JPEG1 as specified in ISO/IEC 10918-1:1993(E). Otherwise, if the codestream starts with bytes {0x0A, 0x04, 0x42, 0xD2, 0xD5, 0x4E}, the decoder shall reconstruct the original JPEG1 codestream as specified in Annex M. Otherwise, the codestream shall be structured as shown in Table 1; the syntax is described in Annex A.

So it seems heavily linked to JPEG processing, at least for the {0x0A, 0x04, 0x42, 0xD2, 0xD5, 0x4E} case.
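As an illustration (not text from the spec), the dispatch quoted above amounts to a check on the first bytes of the codestream, roughly:

```python
# Illustrative sketch of the signature dispatch quoted above; the names are
# ours, the byte values come from the quoted specification text.
JPEG1_SIGNATURE = bytes([0xFF, 0xD8, 0xFF, 0xE0])                # plain JPEG1 (SOI + APP0)
JPEG_RECONSTRUCTION_SIGNATURE = bytes([0x0A, 0x04, 0x42, 0xD2, 0xD5, 0x4E])  # Annex M repacking

def classify_codestream(codestream: bytes) -> str:
    if codestream.startswith(JPEG1_SIGNATURE):
        return "decode as JPEG1 (ISO/IEC 10918-1)"
    if codestream.startswith(JPEG_RECONSTRUCTION_SIGNATURE):
        return "reconstruct the original JPEG1 codestream (Annex M)"
    return "parse as a regular JPEG XL codestream (Table 1 / Annex A)"
```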

Is there a way to recognise the generic encoding, if this applies to other formats, or is only the signature above used, and only for JPEG? If it is only for JPEG, then a new media type would be easier to deploy than going through the IETF Review needed to update https://www.iana.org/assignments/http-parameters/http-parameters.xhtml , although the content coding option is not wrong.

@annevk MIME sniffing applies only after Content-Encoding is processed, so it shouldn't be an issue if the content coding option is used (i.e. no need to add a new signature for a repacking of JPEG).

Comment by @eustas Aug 13, 2020 (See Github)

It is proposed to process only the {0x0A, 0x04, 0x42, 0xD2, 0xD5, 0x4E} case as Content-Encoding. So you are correct: only JPEG1 streams can be expressed in compressed form, though arbitrary data can be "framed" (see M.5.1) as well.

image/jxl is already registered as a provisional standard media type (see https://www.iana.org/assignments/provisional-standard-media-types/provisional-standard-media-types.txt). This media type corresponds to the fully-fledged JPEG XL encoding.

Perhaps the explainer lacks the motivation for why we want to add the lossless part of JPEG XL as a Content-Encoding. We are going to update the explainer with the following information soon:

Pros
 + faster JPEG rendering of cached content: JPEG decoding uses less CPU than modern image formats
 + compatibility: there are a lot of expensive cameras / smartphones that produce JPEGs; it will take at least 10 years to transition to a new format
 + seamless experience: when the user saves the image (a JPEG), it is guaranteed to be supported by existing software
 + lower deployment risks: the Content-Encoding story is similar to Brotli's, so it would not introduce new risks
 + risk-free serving: no re-review is required for transcoded images, as they are guaranteed to be exactly the same
 + easier for webmasters: no image optimization stack is involved; nginx / Apache plugins will support the new encoding with zero effort
 + exactly the same pixels for legacy clients and new clients (both decode the same JPEG)
 + same behaviour: same progression

Contras
 - systems for image optimization / re-coding already exist (Cloudinary, Cloudflare, Akamai, etc.): adding a new image format looks easier for them than adding a new Content-Encoding
 - adding "save-as" conversion in browsers would address the "seamless experience" point (however, it would unexpectedly lose transparency / animation / etc. for users)

Comment by @ylafon Aug 17, 2020 (See Github)

Note that Content-Encoding is a property of the payload, so saving-as will save the jxl version, as opposed to Transfer-Encoding, which has the drawback of being hop-by-hop. So you will probably need a specific 'save-as after resolving content-encoding' until brunsli is available everywhere.

Also, conneg done via plugins like that often doesn't do a proper job, as the plugins don't have access to all the axes available for content negotiation, so listing those plugins as "pros" is debatable.

The real "pro" is that the jpeg/jxl/jpeg conversion is lossless, and that is the only thing that matters when asking for a new content coding, even if it is not generic but tied to a specific media type (there are some examples in the IANA listing of content codings).
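To make the end-to-end property concrete, a minimal client-side sketch (hypothetical host and path, assuming a server that offers the proposed coding) might look like:

```python
# Minimal sketch: negotiating the proposed "jxl" content coding end-to-end.
# http.client does not decode Content-Encoding, so the raw coded bytes are
# returned; a jxl-aware client would reconstruct the JPEG before "save-as".
import http.client

conn = http.client.HTTPSConnection("example.com")  # hypothetical host
conn.request("GET", "/photo.jpg", headers={
    # Accept-Encoding negotiates content codings end-to-end, unlike
    # TE / Transfer-Encoding, which is hop-by-hop.
    "Accept-Encoding": "gzip, br, jxl",
})
resp = conn.getresponse()
print(resp.getheader("Content-Type"))      # still image/jpeg
print(resp.getheader("Content-Encoding"))  # "jxl" if the server applied it
coded_bytes = resp.read()                  # coded payload; decode before saving
```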

Discussed Jan 1, 2021 (See Github)

Peter: My first reaction is - why isn't this just a mime type and Transfer-Encoding?

Rossen: Would that work in all scenarios? For example, going through proxies etc. and coming back with the original encoding?

Peter: It should be possible to send a jpg from the server, encode it into jxl along the way, transport it (assuming all proxies support it), and finally have it decoded back to jpg in the browser.

Peter: It seems like very much an edge case for which we're trying to add a lot of overhead for very little benefit.

Lea: It's not clear whether, and how large, the savings would be compared to other encodings that are already widely used (gzip, etc.).

Peter: The computational cost of encoding/decoding isn't clear. Would they need to keep both copies on the server and rely on content negotiation?

Rossen: Let's leave a comment and move on.

Comment by @plinss Jan 26, 2021 (See Github)

We looked at this in our VF2F with @rossen and @LeaVerou. We're concerned that, while this is perfectly valid, it adds a lot of complexity relative to the value. The same end goal could be achieved by the server keeping both jpeg and jxl versions of the file and using content negotiation rather than encoding negotiation. Doing so would prevent the server from constantly re-encoding the source file and spending CPU cycles and energy.
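A rough server-side sketch of the two alternatives (hypothetical file names and helper, assuming both a pre-encoded photo.jxl and a jxl-recompressed photo.jpg.jxl already exist on disk, so no per-request transcoding happens):

```python
# Sketch contrasting media-type negotiation (Option A, the TAG suggestion) with
# encoding negotiation (Option B, this proposal). All names are hypothetical.
def choose_representation(accept: str, accept_encoding: str) -> tuple[str, dict]:
    # Option A: content negotiation on the media type; the client receives jxl.
    if "image/jxl" in accept:
        return "photo.jxl", {"Content-Type": "image/jxl", "Vary": "Accept"}

    # Option B: encoding negotiation; the representation stays image/jpeg and
    # only the byte coding changes, so "save-as" yields a plain JPEG.
    if "jxl" in accept_encoding:
        return "photo.jpg.jxl", {
            "Content-Type": "image/jpeg",
            "Content-Encoding": "jxl",
            "Vary": "Accept-Encoding",
        }

    return "photo.jpg", {"Content-Type": "image/jpeg"}
```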

Comment by @eustas Jan 27, 2021 (See Github)

Note that Content-Encoding is a property of the payload, so saving-as will save the jxl version

'Content-Encoding' is a property of the network payload. Most browsers decode the network payload first and then provide it to users. Example: load a text file with Content-Encoding: br (or gzip); "Save" will save the plain text. With the jxl Content-Encoding the output is a normal JPEG file, so users don't need support for the new format to use the saved images.

Comment by @eustas Jan 27, 2021 (See Github)

@plinss The benefit of the jxl Content-Encoding is that users will be able to "save" images right away, and those will be normal JPEGs, not jxl. Also, with "Content-Encoding" we make a promise that users will receive the same content without any possible conversion loss.

Discussed May 1, 2021 (See Github)

Reviewed their feedback; propose closing satisfied. Still not sure the result is worth the effort, but it's not harmful. It remains to be seen whether this will be more popular than simply switching to a jxl image format and mime type.

Comment by @plinss May 13, 2021 (See Github)

Thanks for the responses. @ylafon and I took a look at this in today's breakout session and we don't see any issues with this proceeding. We think it remains to be seen how much this gets taken up in the wild vs. just switching to jxl content types. At some point this may become a target for deprecation if usage numbers are low, but given the amount of JPEG usage it may have some demonstrable short-term benefits.