design-reviews#37: http-problem



Opened Oct 14, 2014

IETF spec in the APPSAWG - http://datatracker.ietf.org/doc/draft-ietf-appsawg-http-problem/

May be interesting for TAG review as it effectively defines an object to convey an (HTTP) API error.

Discussions

Comment by @mnot Jan 8, 2015 (See Github)

Discussed in NYC; @domenic to review. @slightlyoff might also have feedback.

Comment by @domenic Jan 8, 2015 (See Github)

Stuff to not forget when reviewing:

@slightlyoff's initial feedback that there's no namespacing of extension properties vs. blessed ones
If we have a standard error format, can we leverage it in e.g. Fetch or similar places throughout the platform?
Align with ES errors (potentially better ones in the future)?

Comment by @mnot Mar 17, 2015 (See Github)

Ping? Being discussed at IETF Dallas on Monday.

Comment by @domenic Mar 17, 2015 (See Github)

Some urgency is exactly what this needed, thanks. Will do this week.

Comment by @domenic Mar 20, 2015 (See Github)

Here's what I got. Should I deliver it to apps-discuss, or pull request it here, or should we workshop it first, or...?

A proposal along these lines is sorely needed on the web, as explained very well by its introduction. My main concerns are around the confusing nature of type, title, and message fields, and how they might not be suited for common use cases.

My first concern is the recommendation that "type" be a dereferenceable URL. I think many APIs will simply want to have a unique-within-their-API ID per problem type, but that seems to be implicitly discouraged by this document, which wants them to mint new HTTP-accessible HTML pages describing the problem. I understand how it's very nice to have this kind of self-describing type, similar to link relations. But I worry that the additional burden will discourage people from using this format, or will end up with them putting non-URLs in the type field.

I'm also unsure how to make sense of the combination of "title" plus "detail". Given that both are meant to be human-readable, when is the separation helpful? What is the recommended way to display this to humans: title, two newlines, then detail, perhaps? Most error-communication systems in programming languages have a machine-readable type (e.g. TypeError, IOError, etc.) and a single human-readable string (often message). I would venture a guess that many developers model their conception of errors similarly. Introducing two human-readable fields makes it harder to conceptualize how this is supposed to map to existing systems, or how to display the resulting error to the user. Ideally, I think, there would be a single "message" field, which could be mapped to that same field in most programming languages.

The "instance" field seems to be of unclear utility. In which scenario would this be useful? Is it worth making it part of the spec, instead of letting it be a per-scenario extension? How do you envision it being generically processed?

The optional "status" field seems fragile, as mentioned in section 5. What is the purpose of including it, especially since it is optional?

Finally, I'm unsure how comfortable I am with the lack of constraints on extension members. It might be more future-proof, or perhaps just cleaner, to separate them into a sub-object ({ "type": ..., "title": ..., "other": { "customer_id": ... } } or similar). Or, simply mandate that they must begin with a given prefix (an underscore, "other_", "details-", ...).
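
To make the two options concrete, here is a minimal sketch of the same problem written once with a sub-object and once with a prefix. The member name `other`, the `other_` prefix, the sample values, and the helper `extensionsOf` are illustrative assumptions drawn from the suggestions above; the draft defines none of them.

```javascript
// Option 1: extension members live in a dedicated sub-object
// ("other" is a hypothetical member name).
const nested = {
  type: "not_enough_money",
  title: "Not enough money",
  other: { customer_id: "abc123" },
};

// Option 2: extension members stay at the top level but carry a prefix
// ("other_" is a hypothetical prefix).
const prefixed = {
  type: "not_enough_money",
  title: "Not enough money",
  other_customer_id: "abc123",
};

// Collect the extension members from either shape into one plain object.
function extensionsOf(problem, prefix = "other_") {
  const extensions = { ...(problem.other || {}) };
  for (const [key, value] of Object.entries(problem)) {
    if (key.startsWith(prefix)) {
      extensions[key.slice(prefix.length)] = value;
    }
  }
  return extensions;
}
```

Either shape lets a generic client tell blessed members from extensions without knowing the full list of blessed names in advance, which is the future-proofing concern raised here.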

All in all though, this is really exciting. If it takes off, you could imagine it having ripple effects across web architecture. For just one example, you could imagine a built-in option to fetch that will use an application/problem+json response to construct a JavaScript Error object with its fields appropriately pre-filled, and reject the resulting promise instead of relying on a status-code-then-parse-error path in the fulfillment handler. Or, if the URL-based error identifier approach really takes off, you could imagine it helping solve our problem in JavaScript of identifying errors more precisely.
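
The fetch idea can be approximated in user code today. The following is only a sketch under stated assumptions: `errorFromProblem` and `fetchRejectingProblems` are hypothetical names, no such option exists in fetch, and mapping "detail" (falling back to "title") onto the Error message is a guess at what "appropriately pre-filled" might mean.

```javascript
// Build a JavaScript Error from a parsed problem object.
// Assumption: "detail", then "title", is the closest match for Error#message.
function errorFromProblem(problem) {
  const error = new Error(problem.detail || problem.title || "Unknown problem");
  error.name = "ProblemError"; // hypothetical name
  error.type = problem.type;
  error.title = problem.title;
  error.status = problem.status;
  error.instance = problem.instance;
  return error;
}

// Wrap a fetch-like function so that problem responses reject the promise.
// `fetchImpl` is passed in so the sketch does not depend on a global fetch.
async function fetchRejectingProblems(fetchImpl, url, options) {
  const response = await fetchImpl(url, options);
  const contentType = (response.headers.get("content-type") || "").toLowerCase();
  if (contentType.startsWith("application/problem+json")) {
    throw errorFromProblem(await response.json());
  }
  return response;
}
```

With this in place, a caller handles API problems in the rejection path and can branch on `error.type` rather than inspecting the status code and then parsing the body.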

Comment by @mnot Mar 23, 2015 (See Github)

Thanks for the review. A few responses:

Here's what I got. Should I deliver it to apps-discuss, or pull request it here, or should we workshop it first, or...?

I'll send a link to this to apps-discuss for their info.

A proposal along these lines is sorely needed on the web, as explained very well by its introduction. My main concerns are around the confusing nature of type, title, and message fields, and how they might not be suited for common use cases.

Great!

My first concern is the recommendation that "type" be a dereferenceable URL. I think many APIs will simply want to have a unique-within-their-API ID per problem type, but that seems to be implicitly discouraged by this document, which wants them to mint new HTTP-accessible HTML pages describing the problem. I understand how it's very nice to have this kind of self-describing type, similar to link relations. But I worry that the additional burden will discourage people from using this format, or will end up with them putting non-URLs in the type field.

Well, it's a SHOULD, so you can violate it if you have reason to. It's a requirement because it seemed important to document the error type, but I agree that it might be too high a bar -- it's more of a strong suggestion than a requirement.

I'm happy to downgrade it to something more suggestive.

BTW, just to be clear, you're not saying it's a problem that it's a URI (possibly relative), correct?

I'm also unsure how to make sense of the combination of "title" plus "detail". Given that both are meant to be human-readable, when is the separation helpful?

"title" is defined by the error type, and therefore static between instances of the error. "detail" is specific to a particular instance.

What is the recommended way to display this to humans: title, two newlines, then detail, perhaps?

SGTM.
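
The display convention agreed on here (title, a blank line, then detail) could look like the sketch below. `formatProblem` is a hypothetical helper, and silently skipping a missing member is an assumption, since the thread does not say what to do when one of the two is absent.

```javascript
// Join the type-level "title" and the instance-level "detail" with a blank line.
// Members that are missing or not strings are skipped.
function formatProblem(problem) {
  return [problem.title, problem.detail]
    .filter((part) => typeof part === "string" && part.length > 0)
    .join("\n\n");
}
```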

Most error-communication systems in programming languages have a machine-readable type (e.g. TypeError, IOError, etc.) and a single human-readable string (often message). I would venture a guess that many developers model their conception of errors similarly. Introducing two human-readable fields makes it harder to conceptualize how this is supposed to map to existing systems, or how to display the resulting error to the user. Ideally, I think, there would be a single "message" field, which could be mapped to that same field in most programming languages.

I think the mapping is:

"type" - error type
"title" - human readable description of error type
"detail" - stack trace (except we don't want actual stack traces exposed via HTTP APIs, usually, because that's a security issue)

The "instance" field seems to be of unclear utility. In which scenario would this be useful? Is it worth making it part of the spec, instead of letting it be a per-scenario extension? How do you envision it being generically processed?

For example, you might use it to correlate a particular instance of an error in logs, identify it in support calls, etc.

Make sense, or need more convincing?

The optional "status" field seems fragile, as mentioned in section 5. What is the purpose of including it, especially since it is optional?

Many people complain that the HTTP status code isn't available in their API, or isn't persisted in their systems, so putting it in the body preserves that information. Personally, I'm ambivalent about this, and happy to drop it; they can infer the intended status from the type's documentation.

Finally, I'm unsure how comfortable I am with the lack of constraints on extension members. It might be more future-proof, or perhaps just cleaner, to separate them into a sub-object ({ "type": ..., "title": ..., "other": { "customer_id": ... } } or similar). Or, simply mandate that they must begin with a given prefix (an underscore, "other_", "details-", ...).

I like the sound of that. Any preference in terms of style?

All in all though, this is really exciting. If it takes off, you could imagine it having ripple effects across web architecture. For just one example, you could imagine a built-in option to fetch that will use an application/problem+json response to construct a JavaScript Error object with its fields appropriately pre-filled, and reject the resulting promise instead of relying on a status-code-then-parse-error path in the fulfillment handler. Or, if the URL-based error identifier approach really takes off, you could imagine it helping solve our problem in JavaScript of identifying errors more precisely.

In-ter-esting!

Comment by @domenic Mar 23, 2015 (See Github)

I'll send a link to this to apps-discuss for their info.

OK, thanks :). Hi apps-discuss, and thanks for your willingness to come engage in our issue tracker.

Well, it's a SHOULD, so you can violate it if you have reason to. It's a requirement because it seemed important to document the error type, but I agree that it might be too high a bar -- it's more of a strong suggestion than a requirement.

I'm happy to downgrade it to something more suggestive.

BTW, just to be clear, you're not saying it's a problem that it's a URI (possibly relative), correct?

I think the core of my issue is envisioning someone designing an HTTP API, perhaps someone who hasn't bought into REST all that much. They might naively expect something like { "type": "not_enough_money" }. Being told that they have to convert this to { "type": "https://api.example.com/error-types/not_enough_money" } could cause them to abandon this spec on first read-through as not suited for their server. If they could simply do { "type": "not_enough_money" }, that would be great. If they instead had to do { "type": "about:not_enough_money" } or something, that might also be OK. Remember that we don't need to be globally unique here---we just need to be unique per-API.

What are your thoughts on this perspective? When you talk about possibly-relative URLs, do you mean that { "type": "not_enough_money" } is fine, since technically it fits the grammar of a relative URL?
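
For reference, this is how the three spellings behave if a client resolves "type" like any other URL reference. It is a sketch using the standard WHATWG `URL` constructor; the base URL is an assumption, since the question of what a relative "type" resolves against is exactly what is being asked here.

```javascript
// Hypothetical base for an API's error types.
const base = "https://api.example.com/error-types/";

// A bare token parses as a relative reference and resolves against the base.
const relative = new URL("not_enough_money", base).href;
// → "https://api.example.com/error-types/not_enough_money"

// An "about:" URL is already absolute, so the base is ignored.
const about = new URL("about:not_enough_money", base).href;
// → "about:not_enough_money"

// The fully spelled-out form is likewise left unchanged.
const absolute = new URL("https://api.example.com/error-types/not_enough_money", base).href;
// → "https://api.example.com/error-types/not_enough_money"
```

Under that assumption the short form and the long form identify the same type, so the short form costs the API author nothing beyond picking a base.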

"title" is defined by the error type, and therefore static between instances of the error. "detail" is specific to a particular instance.

Sure, I got that. But when is that helpful?

I think the mapping is:

OK, this is a bit worrying. So given new TypeError("custom ID must be a string"), your mapping is:

"type": "TypeError"
"title": "Type Error"
"detail": "<...gory stack details here ...>"

So there isn't even any place for message to go?

For example, you might use it to correlate a particular instance of an error in logs, identify it in support calls, etc.

Make sense, or need more convincing?

Makes sense indeed. Error logs is a good point to bring up. Although, again, suggesting that it should be a URL seems unlikely to be something that people will be willing to do. A simple ID would be something I could envision actually implementing. Whereas, I'm definitely not going to spin up the HTML page production machinery (templating, etc.), in order to write out a new page and start serving it, in the middle of my error handler. A GUID or incrementing "problemID" seems more likely what I would program into my server.
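
A minimal sketch of the "incrementing problemID" alternative follows, to show how little machinery it needs. The in-memory `errorLog`, the member name `problemID`, the helper `reportProblem`, and the sample values are all hypothetical; a real server would write to its own log sink.

```javascript
// In-memory stand-ins for a real log sink and ID source (both hypothetical).
const errorLog = [];
let nextProblemID = 1;

// Record the failure server-side and return a response body carrying the same ID.
function reportProblem(type, title, detail) {
  const problemID = String(nextProblemID++);
  errorLog.push({ problemID, type, detail, at: new Date().toISOString() });
  return { type, title, detail, problemID };
}

const body = reportProblem(
  "not_enough_money",
  "Not enough money",
  "The balance does not cover this charge."
);

// A support request quoting body.problemID can be matched to its log entry.
const logged = errorLog.find((entry) => entry.problemID === body.problemID);
```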

I like the sound of that. Any preference in terms of style?

Sub-object would be my weak preference. @slightlyoff might have more opinions.

Comment by @slightlyoff Mar 23, 2015 (See Github)

I'm very sorry for not getting this done in a more timely way. Sincerest apologies to the IETF WG in question.

In no particular order:

Reserving an extension syntax is a good idea: Having something like "-" to delimit extensions from built-in items, or defining your extension object (where anything goes) is a good idea early on. Don't really have a preference, but "-" in the name seems fine.
"instance" is odd: it doesn't define a schema for the problem definition or the information to be de-ref'd. Is this common enough to need to be part of a spec? Is there prior art justifying it?
How is the pre-defined problem-type registry going to be maintained?: It isn't clear how additions will get made. Who "owns" it?

In general, I think this is great. Love that it's happening. Is there someplace I can look to understand the discussion that led to each of the fields?

Comment by @mnot Mar 23, 2015 (See Github)

Hey Alex,

No worries. I'm going to open separate issues on my repo and link back here to make sure we don't lose anything (both for your comments and @domenic's).

Reserving an extension syntax is a good idea: Having something like "-" to delimit extensions from built-in items, or defining your extension object (where anything goes) is a good idea early on. Don't really have a preference, but "-" in the name seems fine.

I got some pushback from people on using '-' in names, as there's a preference for camelCase in many existing APIs. Given @domenic's preference above, an extension object might be better.

"instance" is odd: it doesn't define a schema for the problem definition or the information to be de-ref'd. Is this common enough to need to be part of a spec? Is there prior art justifying it?

See explanation above - does that make sense? Obviously need to improve docs if both you and @domenic didn't get this.

How is the pre-defined problem-type registry going to be maintained?: It isn't clear how additions will get made. Who "owns" it?

There isn't a registry; you just use the URI. If a common set of useful types develops, people can collect them together in informal resources (e.g. a wiki). Good enough?

In general, I think this is great. Love that it's happening. Is there someplace I can look to understand the discussion that led to each of the fields?

Hmm, it was discussed a bit on-list, and in my repo before that. See: https://github.com/mnot/I-D/issues?q=is%3Aissue+label%3Ahttp-problem

Comment by @mnot Mar 23, 2015 (See Github)

@domenic -

OK, this is a bit worrying. So given new TypeError("custom ID must be a string"), your mapping is:

"type": "TypeError"
"title": "Type Error"
"detail": "<...gory stack details here ...>"

So there isn't even any place for message to go?

No, that would be something like

"type": "customIdType"
"title": "custom ID must be a string"
"detail": <any specifics>

or, more likely something like:

{
    "type": "dataValidationError",
    "title": "Data Validation Error",
    "detail": [
        {
            "fieldName": "customID",
            "expectedType": "foo"
        }
    ]
}

Comment by @domenic Mar 23, 2015 (See Github)

Yeah, we're starting to get at the source of the issue here. Which is that, I think most programming languages don't encourage you to create an entirely new error class for each situation. Instead, you reuse a few error types (e.g. ArgumentException or TypeError or similar) in many different ways.

The larger point I'm arguing is to think about how server-side programmers are meant to be using this and integrating it with their existing programming paradigms, most of which will be based around things like new TypeError("customerID field must be a string") (JavaScript) or new ArgumentException("customer ID must be a string of alphanumeric characters") (C#). If this draft were written by one of them, I think they would likely have reduced it to either one or two fields (message, and maybe type).

As-is, there's a pretty big mismatch between the error model implied by this draft, and the one I think most server-side software will be using already. So the implication is that server-side programmers will not be able to start producing these problem responses right away, until they go through all of the places in their current system that produce errors, and replace those lines of code with something that produces an appropriate { "type", "title", "detail" } structure. If this is the path envisioned, that's fine---maybe someone will publish an npm package that lets you do throw new Problem({ type, title, detail }) and people will start using that instead of throw new TypeError. Or maybe the next version of Rails/Spring/etc. will define similar classes built-in to their response processing. But that's a very different adoption path than would be given by a single "message" field, and implies a lot more work on the part of both framework authors and server-side developers, which I think will limit adoption.
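
As a rough idea of what such a package might provide, here is a sketch of a `Problem` class taking `{ type, title, detail }`. The class is hypothetical and is not the API of any published package; using "detail" (falling back to "title") as the Error message is an assumption, and the sample values reuse the validation example from earlier in the thread.

```javascript
// A throwable error that serializes to the draft's three core members.
class Problem extends Error {
  constructor({ type, title, detail }) {
    super(detail || title); // assumption: detail is the closest match for "message"
    this.name = "Problem";
    this.type = type;
    this.title = title;
    this.detail = detail;
  }

  // JSON.stringify calls toJSON, so the stack trace is never serialized.
  toJSON() {
    return { type: this.type, title: this.title, detail: this.detail };
  }
}

let responseBody;
try {
  throw new Problem({
    type: "dataValidationError",
    title: "Data Validation Error",
    detail: "customID must be a string",
  });
} catch (error) {
  responseBody = JSON.stringify(error);
}
```

Even with a helper like this, every existing `throw new TypeError(...)` site still has to be revisited to choose a type and title, which is the adoption cost described here.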

Comment by @mnot Apr 1, 2015 (See Github)

Sorry, got busy at IETF92.

I question whether it's good practice to just catch errors and deterministically transmute them to on-the-wire artefacts; it seems like an open invitation for security issues.

Also, the aim here is not to take over all error reporting formats -- though that would be nice. It's to avoid people defining their own formats, so they don't get trapped by the common pitfalls.

Comment by @mnot Apr 23, 2015 (See Github)

Discussed in SF; can close.

Comment by @jasnell Apr 28, 2015 (See Github)

Btw, it's far from perfect but... https://www.npmjs.com/package/http-problem

@domenic is right... it's not particularly natural to use this model on the server side.

Comment by @jasnell Apr 28, 2015 (See Github)

It would be far more natural having a pattern such as ...

throw new Problem('http://problem_id', 'the message', {foo:'bar'});

where the serialization would be...

{
  "@type": "http://problem_id",
  "message": "the message",
  "foo": "bar"
}
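
One way the pattern above could be implemented is sketched below. This is an illustration of the proposed shape only, not the code of the http-problem package linked earlier; storing the third argument on an `extensions` property is an assumption.

```javascript
// Error subclass matching the proposed (type, message, extensions) signature.
class Problem extends Error {
  constructor(type, message, extensions = {}) {
    super(message);
    this.name = "Problem";
    this.type = type;
    this.extensions = extensions; // hypothetical property name
  }

  // Flatten the extension members next to "@type" and "message".
  toJSON() {
    return { "@type": this.type, message: this.message, ...this.extensions };
  }
}

const problem = new Problem('http://problem_id', 'the message', {foo:'bar'});
const serialized = JSON.stringify(problem, null, 2);
```

Because the extension members are spread last, one named "message" or "@type" would overwrite the built-in value, which is the namespacing concern raised earlier in the thread.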