#580: Limit allowed "accepted" extensions in File System Access API file pickers.


Opened Dec 2, 2020

HIQaH! QaH! TAG!

I'm requesting a TAG review of a small tweak to the File System Access API.

Initially the File System Access API (previously known as Native File System API) had no limitations on what strings were allowed to be used as accepted file extensions in the showOpenFilePicker and showSaveFilePicker methods.
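For context, a minimal sketch of where these extensions are supplied. The `types` / `accept` option shape follows the File System Access API; the interface name `PickerAcceptType`, the helper `buildPickerOptions`, and the example MIME type and extensions are illustrative assumptions, not taken from this issue.

```typescript
// Shape of one entry in the `types` option passed to
// showOpenFilePicker / showSaveFilePicker. The interface name is hypothetical.
interface PickerAcceptType {
  description?: string;
  // Maps a MIME type to one or more accepted file extensions (suffixes).
  accept: Record<string, string | string[]>;
}

// Hypothetical helper that builds the options object for a picker call.
function buildPickerOptions(): { types: PickerAcceptType[] } {
  return {
    types: [
      {
        description: "Text files",
        accept: { "text/plain": [".txt", ".text"] },
      },
    ],
  };
}

const pickerOptions = buildPickerOptions();

// In a supporting browser the options would be passed along these lines:
//   const [handle] = await window.showOpenFilePicker(pickerOptions);
// The strings in `accept` are the extensions this issue proposes to restrict.
```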

Since the file picker (on most platforms) appends these extensions to the filename the user enters, this can result in filenames with characters we don’t want to allow or that are otherwise problematic. In particular, we don't want to allow control characters or whitespace in suffixes, or filenames that end in a '.'. As such, we add restrictions on what characters are allowed in accepted file extensions/suffixes and limit their length to 16.

Limiting extensions to only contain alphanumeric characters, + or . still allows all extensions in the shared-mime-info database as well as nearly all extensions in Wikipedia's List of filename extensions.
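The restrictions described above could be checked roughly as follows. This is a sketch, not the specification's algorithm: the function name `isAllowedSuffix` is hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits, that a suffix must start with a '.', and that the limit of 16 counts the leading '.'.

```typescript
// Assumption: the length limit of 16 includes the leading ".".
const MAX_SUFFIX_LENGTH = 16;

// Assumption: "alphanumeric" means ASCII letters and digits; "+" and "." are also allowed.
const ALLOWED_SUFFIX_CHARS = /^[A-Za-z0-9+.]+$/;

// Hypothetical helper: returns true if the suffix passes the restrictions sketched here.
function isAllowedSuffix(suffix: string): boolean {
  // Assumption: suffixes are written with a leading ".", e.g. ".txt".
  if (!suffix.startsWith(".")) return false;
  if (suffix.length > MAX_SUFFIX_LENGTH) return false;
  // Rejects whitespace, control characters, and any other disallowed character.
  if (!ALLOWED_SUFFIX_CHARS.test(suffix)) return false;
  // A filename must not end in ".".
  if (suffix.endsWith(".")) return false;
  return true;
}

const examples = [".txt", ".tar.gz", ".c++", ".c#", ".txt.", ". txt"];
for (const suffix of examples) {
  console.log(JSON.stringify(suffix), isAllowedSuffix(suffix));
}
```

Under these assumptions `.txt`, `.tar.gz` and `.c++` pass, while `.c#`, `.txt.` and `. txt` are rejected.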

Further details:

  • I have reviewed the TAG's API Design Principles
  • Relevant time constraints or deadlines: As this fixes potential security issues we will be shipping these changes as soon as possible. We will try to address any feedback that comes in afterwards.
  • The group where the work on this specification is currently being done: WICG
  • The group where standardization of this work is intended to be done (if current group is a community group or other incubation venue): WebAppsWG
  • Major unresolved issues with or opposition to this specification:
  • This work is being funded by: Google


We'd prefer the TAG provide feedback as (please delete all but the desired option):

💬 leave review feedback as a comment in this issue and @-notify @mkruisselbrink

Discussions

Comment by @annevk Dec 3, 2020 (See Github)

Should these restrictions be aligned with <input type=file accept>?

Comment by @mkruisselbrink Dec 3, 2020 (See Github)

I don't see any reason why we couldn't align these restrictions, but it also matters less for <input type=file accept>. The security concerns are mostly about writing to files with dangerous extensions, so for security purposes I don't think we need to align them, but I don't see a reason why aligning would hurt either.

Discussed Jan 1, 2021 (See Github)

Looks sensible, but none of us have ever seen a .c++ file extension.

Comment by @cynthia Jan 27, 2021 (See Github)

@kenchris and I looked at this today.

First of all, neither of us thinks we have ever seen a main.c++ file in the wild. Is that particular bit really needed, aside from this really weird case? (Unless this was a brown M&M, in which case well done!)

Otherwise, this looks like a sensible change.

Comment by @kenchris Jan 27, 2021 (See Github)

I think that restriction to ASCII + "." makes a lot of sense, though you can actually use emoji and control chars and the like in file extensions on Windows

image

Does any OS actually associate .c++ with an IDE? I have never in my life as a C++ developer seen ".c++" extension and maybe we can avoid whitelisting "+"

Comment by @kenchris Jan 27, 2021 (See Github)

I don't think we should support "+" because then you could argue that we should also support "#", as some people and tools use that for C#

image

image

image

Comment by @kenchris Jan 27, 2021 (See Github)

There is of course also ".c--":

image

Maybe we should just add "+", "-" and "#". @cynthia found another example of # used in the wild

Comment by @kenchris Jan 27, 2021 (See Github)

And "$" and "!" :-)

image image

Luckily I don't find anything else weird here:

https://en.wikipedia.org/wiki/List_of_filename_extensions_(A%E2%80%93E) (and other sections)

Almost forgot that ~ is also used in Linux for temporary files and also seen elsewhere:

image

So ["+", "-", "#", "$", "!", "~"] - I am fine with adding those, but just adding "+" and not the others seems strange

Discussed Mar 1, 2021 (See Github)

Ken: no feedback. About what file extensions you can use, they added .c++ ... if you're adding that you need .c# and others commonly in use that are weird. I researched and added screenshots and we haven't heard anything back.

Dan: will ping Chris H about it. Re-review in plenary and see if there's anything new

Discussed Mar 1, 2021 (See Github)

Sangwhan: no response to our feedback

Dan: I'll ping the requester and Chris

Comment by @torgo Mar 23, 2021 (See Github)

Hi @mkruisselbrink any feedback you can share based on Ken's comments above?

Comment by @mkruisselbrink Mar 23, 2021 (See Github)

Sorry for the slow reply. The current list of characters is indeed fairly arbitrary. I think I originally did include most of what you're asking about (going off the Wikipedia list), but got some pushback that without clear use cases it was better to be as minimal as possible (because who knows what characters might have special meaning on particular file systems/operating systems). I agree that + is a bit of an odd one out in that regard. I believe I mostly went by what was in the freedesktop.org shared-mime-info database, which does include .c++, but none of the other characters (so yes, most Linux installations will treat .c++ files as the same mime type as .cpp and .cc, and thus have the same applications associated with them).

I'd be fine with adding the rest, although so far nobody has asked for them yet.

Comment by @kenchris Mar 24, 2021 (See Github)

Ok sounds good, we are good to close this then. Thanks for consulting the TAG.