#855: RDF Canonicalization
Discussions
2023-07-mos-eisley
Hadley: use cases are making sense.
Amy: the explainer reads as 'this is the work we will do' rather than 'this is the work we have done', as the explainer was originally written for the charter (noted by phila). Would be useful to have it updated to what was actually done. But assume they would have mentioned if they'd done anything radically different. They haven't filled out S&P questionnaire, but have S&P in the spec. We should ask them to fill the questionnaire.
Hadley: using quads as a triple with the graph name. Sounds complicated and repetitive. If you're hashing you should just be able to do that once
Amy: could ask the rationale for that. There's a 'todo' in privacy considerations in the spec.
Hadley: what if the hashing algorithm is no longer secure? SHA256 is okay for now.
Amy: be good to see mention of that in security considerations
We (@hadleybeeman and I) reviewed this in our virtual face-to-face this week. We like the direction of the work, and the design is sensible.
We noticed you haven't yet filled out the privacy and security questionnaire. Understanding that not all of the questions may be relevant, please could you do this?
Also, we see that you are using quads instead of triples and adding in the graph name once? It sounds more complex — but we suspect you have considered this at length. We are just interested in your thought process here. (This is the sort of thing we normally expect to see in an [explainer](https://github.com/w3ctag/tag.w3.org/blob/main/explainers/template.md).)
Also, we'd love to see the explainer once you've updated it to bring it in line with the spec.
And finally, what happens if the hashing algorithm becomes insecure? It might be helpful to put a comment in the security considerations section to advise implementers in the future to consider that possibility.
Opened Jun 9, 2023
Hello, TAG!
I'm requesting a TAG review of RDF Data Canonicalization.
There are a variety of use cases that depend on the ability to calculate a unique and deterministic hash value of RDF Datasets, such as Verifiable Credentials, the publication of biological and pharmaceutical data, or consumption of mission critical RDF vocabularies that depend on the ability to verify the authenticity and integrity of the data being consumed. See the use cases for more examples. These use cases require a standard way to process the underlying graphs contained in RDF Datasets that is independent of the serialization itself.
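To illustrate what a serialization-independent hash means, here is a minimal Python sketch. It is not the algorithm in the spec: it assumes the dataset contains no blank nodes (canonical labeling of blank nodes is the hard part the spec addresses) and that each statement is already written in a single normalized N-Quads form. The function name `hash_nquads` and the `algorithm` parameter are hypothetical, introduced only for this example. The parameter shows one way an implementation could move to a different hash function if the default were ever considered insecure.

```python
import hashlib


def hash_nquads(nquads: str, algorithm: str = "sha256") -> str:
    """Hash a blank-node-free N-Quads document independently of statement order.

    Assumption: every statement is already in one normalized N-Quads form,
    so sorting the lines is enough to obtain a deterministic serialization.
    """
    # Drop empty lines, de-duplicate statements, and sort by code point.
    lines = sorted({line.strip() for line in nquads.splitlines() if line.strip()})
    canonical = "".join(line + "\n" for line in lines)
    # hashlib.new lets the caller choose the hash function by name.
    return hashlib.new(algorithm, canonical.encode("utf-8")).hexdigest()


# The same two statements, written in a different order.
doc_a = (
    '<http://example.org/s> <http://example.org/p> "1" <http://example.org/g> .\n'
    '<http://example.org/s> <http://example.org/p> "2" .\n'
)
doc_b = (
    '<http://example.org/s> <http://example.org/p> "2" .\n'
    '<http://example.org/s> <http://example.org/p> "1" <http://example.org/g> .\n'
)

# Both orderings produce the same digest.
print(hash_nquads(doc_a) == hash_nquads(doc_b))
# Switching the hash function is a one-argument change.
print(hash_nquads(doc_a, algorithm="sha384"))
```

Each statement here carries an optional fourth term, the graph name, which is what makes it a quad rather than a triple.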
Further details:
You should also know that...
The spec has a long history and has implementations using the original version in production software.
We'd prefer the TAG provide feedback as (please delete all but the desired option):
💬 leave review feedback as a comment in this issue and @-notify gkellogg, dlongley, yamdan, philarcher, peacekeeper.