#350: Async Clipboard - image/delayed content
Discussions
Comment by @annevk Mar 11, 2019 (See Github)
As I realized a little late during the TPAC meeting, I don't think we should move ahead with this without figuring out the drag & drop counterpart. These APIs share a lot of infrastructure and should be co-evolved.
Comment by @dway123 Mar 12, 2019 (See Github)
We choose to defer decisions around drag-and-drop, even though it does share common infrastructure, so that we can decrease the scope and time of this explainer/review/implementation. That said, this spec'ed feature is compatible with the current web drag-and-drop API, and could easily expand to cover drag-and-drop in a future explainer/spec. I've noted that we have explored this (and other alternatives) in the explainer on the last page. Thank you for catching that we forgot to note the alternatives we had explored.
Are there any specific concerns or risks you're concerned about? I can try to address them either here or in the explainer.
Comment by @annevk Mar 14, 2019 (See Github)
Another thing here, during the F2F where this was discussed https://www.w3.org/2018/10/23-WebPlat-minutes.html#item02 there were objections from Apple and Mozilla representatives to doing this without standardizing some kind of pickling format. Perhaps the image re-encoding can be standardized somehow, but the explainer doesn't go in much detail and https://w3c.github.io/clipboard-apis/#dom-clipboard-write even less.
It's also not clear to me that it's a good idea to do away with the current API entirely, that's shared between drag & drop and copy & paste. Perhaps if both APIs were redone in a shared model it'd be easier to judge that nothing problematic is happening and there is indeed a natural extension path.
Comment by @garykac Mar 18, 2019 (See Github)
Re: Drag and Drop
There isn't any real shared infrastructure other than the need for a dictionary of { mimetype -> data }. Beyond that, the two APIs are very different and touch different parts of the UA implementation.
-
Clipboard: access system clipboard + related permission and security issues (re: decoding malicious content) + dict(mimetype->data)
-
DnD: lots of UI events relating to the dragging + permission/security(?) + dict(mimetype->data)
Note that any permission/security issues for DnD (to address malicious content) will be different than for Clipboard (since there is a guaranteed user gesture, but can't easily show a prompt).
Re: DataTransfer
DataTransfer
is only used by the Clipboard Event API to get the aforementioned dictionary of { mimetype -> data }. Beyond that it contains a collection of DnD-specific attributes (dropEffect
, effectAllowed
, dragImage
) that are not useful for Clipboard.
Our approach is to have a ClipboardItem
type that contains a standard dictionary. This allows the clipboard data to be created more easily (esp. when compared to a series of DataTransferItemList.add()
calls).
I've updated the explainer with a section about DataTransfer
.
Re: Pickling
The pickling conversation was in the context of making custom mimetypes and raw (non-transcoded) image data available on the native clipboard. We intend to document this format, but that is out-of-scope for this current phase.
Sanitization/transcoding
We want to leave the exact details of the transcoding up to the UA, but I need to make that explicit in the spec. Thanks for pointing that out.
Re: "do away with the current API"
The only clipboard API that actually uses the DataTransfer object is the Event-based API, and no changes are being made to that. The Async Clipboard API has never used DataTransfer (although we did try to make it work), so we're not removing any functional API.
Comment by @annevk Mar 19, 2019 (See Github)
other than the need for a dictionary
What about data sanitization? That data structure is quite a bit more complex than a map. And sure, there are some differences in the interaction with that data structure, but it's the shared primitive that is being touched here.
We want to leave the exact details of the transcoding up to the UA
I don't think that's acceptable.
Comment by @dway123 Apr 16, 2019 (See Github)
While data sanitization is a shared need, Clipboard and Drag-and-Drop do have many more unrelated needs, as garykac@ mentioned, such as differing UI events, controllers, and security models.
The specification mentions transcoding as a way to improve security. We don't plan to specify exact algorithms for the following reasons:
Doing so would effectively force us to unnecessarily enclose large parts of the relevant image format specifications. Having a precise transcoding algorithm may preclude hardware-accelerated implementations. However, we will clarify that the transcoding process will output image data conformant with the format's specification.
Comment by @annevk Apr 16, 2019 (See Github)
Such an approach forces other implementers to reverse engineer, right?
Comment by @pwnall Apr 16, 2019 (See Github)
@annevk Could you explain what & why other implementers would need to reverse engineer?
I'm going to lay out my assumptions below. Please react to anything that seems incorrect.
The idea behind data sanitization is prevent remote code execution vulnerabilities in native decoders. These vulnerabilities generally take advantage of insufficient input validation in a native decoder, so the sanitization would make sure that the browser only writes data that conforms to the expected input format.
The transcoding (should it be called re-encoding?) idea comes down to taking advantage of the fact that browsers already ship media processing libraries like libpng. So, instead of writing new sanitization code for each format to be written to the clipboard, a browser could use its existing libraries to decode the content produced by the page, then encode it again into the same format. The browser's libraries need to decode arbitrary content from the Web, so the decoders should already be secured against malicious input. Since the encoder ships with the browser, the encoder can be trusted to produce valid bytes.
So, other browsers shouldn't need to reverse-engineer anything. Chrome's transcoding / re-encoding logic is (modulo thread-hops):
const inputBytes = await ReadBytesFromBlob(inputBlob);
const decodedImage = PngDecode(inputBytes);
const safeBytes = PngEncode(decodedImage);
WriteImageToSystemClipboard(safeBytes);
Chrome currently calls into Skia for encoding/decoding images. The expectation is that other browsers would use the logic above, and have it call into their own media processing stack.
WDYT?
Discussed
May 1, 2019 (See Github)
[bumped as Sangwhan is not here
Discussed
May 1, 2019 (See Github)
Sangwhan not present.
Comment by @slightlyoff May 7, 2019 (See Github)
There seem to be 2 separate questions around transcoding in my mind, and neither seem to have been answered to my satisfaction:
- Why do we need to do this at all given that the same risks are exposed by file download, which we also allow on arbitrary user activation?
- What controls (if any) does the developer have over transcode quality?
The first one seems to be the more urgent to answer. If we don't need to transcode, @annevk's concern doesn't apply (not withstanding its overstatement; we've moved out of the era of closed-source browsers; you don't need to reverse engineer anything; just read some code).
As to the second, this gets to a question of layering. A well-layered web platform feature builds on an ever-shrinking set of primitive operations. We have some image transcoding operations in <canvas>
today, and the API should either be expressed in terms of them (although, again, these APIs are poorly layered and don't expose near enough developer control over encoding parameters) or, better, an off-thread-friendly, streams-based codec API whose parameters would enable developers to configure the various quality knobs and dials that are essential to make this API more than a toy for users of professional creative apps.
Comment by @dway123 May 8, 2019 (See Github)
-
We need transcoding to prevent remote code execution (RCE) vulnerabilities in native decoders from being exercised. File download has slightly different threats to be concerned about. For example, a user will be able to see that the file was downloaded (on the bottom downloads bar), and to see the name of the file before executing it (which is what may lead to these RCE vulnerability being exercised). In contrast, the Async Clipboard API may access data fairly silently after permission is granted, so a user may be caught off guard when pasting data outside Chrome if the data placed on the clipboard was not expected. With transcoding, this data is still guaranteed to be safe, even with user permission.
-
The developer doesn't have controls over transcode quality, which means this has parity with the existing clipboard API (and many other web platform APIs regarding images). As transcoding doesn't reduce/affect image fidelity, this shouldn't be a concern.
Regarding layering and use with related parts of the web platform, <canvas>
, <HTMLElement>
, streams
, DOMString
, and many other types were recommended as potential ClipboardItemDataType
s. Blobs are simply just the first implemented ClipboardItem type, as they’re able to represent a large variety of different Clipboard Types, and browsers are free to implement other representations as resources allow.
Additionally, as image/png (through blob) is the first type being added, this means that, as shown in the web platform tests, the API, when manipulating image/png, can interact with other APIs supporting image/png, such as ImageBitmap. However, this API isn’t dependent on ImageBitmap, so that browsers don’t need to implement ImageBitmap (or similarly, Streams, which also only has partial support across browsers), as a prerequisite for implementing this API.
Comment by @dway123 May 8, 2019 (See Github)
Also, to clarify, this spec/implementation should not preclude future implementations from using <canvas>
or ReadableStream
directly. The design and inclusion of ClipboardItemDataType
is intentional, to make this API fairly flexible, and to allow future implementations to build upon it as necessary. However, we considered these alternative types out of scope of the current spec/implementation.
Comment by @cynthia May 8, 2019 (See Github)
The story for transcoding sounds reasonable; what level of preservation should the user expect in terms of color space, metadata, and other non-raster information?
Comment by @dway123 May 8, 2019 (See Github)
For images, because the encoder/decoder that Chrome uses (Skia), and existing sanitization, can only guarantee the sanitization of raster information, only raster information is preserved. Non-raster information, therefore, is tossed out. This is consistent with the level of preservation for image metadata offered in other Clipboard APIs, and the ImageBitmap API.
However, the specification should not prevent implementers from deciding whether they choose to preserve non-raster information, as long as they can ensure it's secure.
Comment by @yoavweiss May 16, 2019 (See Github)
For images, because the encoder/decoder that Chrome uses (Skia), and existing sanitization, can only guarantee the sanitization of raster information, only raster information is preserved
Just to make sure, the transcoding is then entirely lossless? Does the same apply to lossy formats, such as JPEG?
Comment by @garykac May 16, 2019 (See Github)
No, the transcoding is not lossless. It cannot be since it needs to (in some cases) cleanup encoding exploits.
For images, because the encoder/decoder that Chrome uses (Skia), and existing sanitization, can only guarantee the sanitization of raster information, only raster information is preserved.
Read this as "the pixels come through (sanitized) but the image metadata is not copied". Note that this is just a description of Chrome's current implementation. UAs can choose preserve whatever metadata from the image that they feel is appropriate.
The main takeaway from the transcoding issue is that users cannot rely on the image being identical to what they put on the clipboard. The UA may need to make changes to it to address security concerns. This applies to both the image data and to the metadata.
Ideally, UAs would preserve as much information as they can in a safe matter.
Comment by @dway123 May 17, 2019 (See Github)
As garykac mentioned, users cannot rely on transcoding to be lossless.
It happens to be lossless for image/png's raster information, but this is only for Chrome's current implementation, and should not be expected/wouldn't preserve non-raster (metadata) anyways. For lossy formats, it may very well be lossy, or not, depending on UA implementation, but this is up to UA and there shouldn't be any expectation for lossless transcoding.
Comment by @slightlyoff May 30, 2019 (See Github)
Sorry for the slow reply. Guess we don't have updates from the last meeting.
On the transcoding front, I don't understand how this will work for professional productivity apps. Lossy transcoding is surely going to not fly for, e.g., image editing apps. What is their recourse? A separate permission requesting lossless access? Custom datatypes? Something else?
Comment by @pwnall May 30, 2019 (See Github)
Thank you for your feedback! We will be shipping image support for the async clipboard API in Chrome 76, and we are confident that we can address any feedback on transcoding without changing the API shape.
We are working on a different proposal targeting professional editing applications. We expect they will need complete raw clipboard access, which includes reading/writing OS-specific data types, as well as bypassing any transcoding. We're in the very early stages of that work, and will engage with TAG once we have a better idea of what we'd like to cover.
Comment by @cynthia May 30, 2019 (See Github)
UAs can choose preserve whatever metadata from the image that they feel is appropriate.
I think content authors would some level of consistency across implementations, leaving it completely open-ended has the risk of promoting preference of specific implementation over another - which is something we should try to avoid. (This browser does something vastly different from what we need for our tool to do what it is supposed to do, so let’s sniff and display a “please use a different UA” dialog is a mistake the web has made in the past.)
Comment by @annevk Jun 12, 2019 (See Github)
(not withstanding its overstatement; we've moved out of the era of closed-source browsers; you don't need to reverse engineer anything; just read some code)
Please remind me why we're here again?
Comment by @alice Mar 5, 2020 (See Github)
Since this has shipped, and we have no major outstanding concerns, happy to close this.
Thanks for the discussion!
OpenedMar 9, 2019
Góðan TAG !
I'm requesting a TAG review of:
DataTransfer
.Further details (optional):
You should also know that...
read()
andwrite()
methods of the new Async API (although comments on other parts are welcome).We'd prefer the TAG provide feedback as (please select one):
Please preview the issue and check that the links work before submitting
For background, see our explanation of how to write a good explainer.