Clarify the notion and mechanisms for server-managed information #177

csarven · 2020-04-27T22:57:07Z

What are the conceptual and implementation differences between server-managed containment triples and the triples in server-managed resources (auxiliary)?

It appears that we have two different mechanisms in the spec that are conceptually similar, if not the same. How can we reconcile them?

Containers are self-descriptive in that the containment information is part of the resource description and managed by the server. Contained resources are tied to the lifecycle of the container. Containers may include other information provided by clients. Containment triples can't be updated by clients.

Server-managed resources (auxiliary) are independent resources discovered through primary resources and are tied to their lifecycle. Server-managed resources are not writeable by clients.

We can see that although cumbersome, containment information can be found in a server-managed resource (auxiliary). We can also see that any information that can be part of a server-managed resource (auxiliary) can instead be in a self-descriptive resource.

justinwb · 2020-05-01T01:01:22Z

We can see that although cumbersome, containment information can be found in a server-managed resource (auxiliary).

It's true that you could store containment triples in a server managed auxiliary resource associated with a container, but I'm not sure that we have to.

We can also see that any information that can be part of a server-managed resource (auxiliary) can instead be in a self-descriptive resource.

It is true that you could store the same information in a regular Solid resource as in a server managed auxiliary resource, but the provenance would not be the same. This is the key value in the server managed auxiliary resource type. No agent would have the ability to directly write or modify the data in a server managed resource. A consumer of the data in a server managed resource can have confidence that the data inside it (e.g. timestamps, creator, etc.) was written by the server and no-one else.

csarven · 2020-05-01T09:07:46Z

The point of this issue is to note and justify why we have one mechanism where a server controls certain kinds of information and yet another mechanism for other information. It seems to be a bit of an arbitrary split.

but the provenance would not be the same.

Explain.

No agent would have the ability to directly write or modify the data in a server managed resource.

Holds true for containment triples in the container resource that is managed by the server:

Containment triples can't be updated by clients.

Clients affect resources directly and indirectly eg:

Client requests to append/remove a resource to/from a container. Only the server manages the containment relationships.

Client requests to create a resource. Only the server manages the creator relationship.
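As a rough illustration of the two cases above, here is a minimal sketch of a container whose containment triples are derived only from server-side state, while other statements remain client-writable. The `Container` class, its method names, and the choice to reject (rather than silently ignore) client-supplied `ldp:contains` triples are assumptions made for this example, not something the spec or any server is confirmed to do.

```python
from dataclasses import dataclass, field

LDP_CONTAINS = "http://www.w3.org/ns/ldp#contains"


@dataclass
class Container:
    """Hypothetical in-memory container; triples are (subject, predicate, object) tuples."""
    uri: str
    contained: set = field(default_factory=set)       # server-managed state
    client_triples: set = field(default_factory=set)  # client-provided statements

    def create_child(self, slug: str) -> str:
        """Server-side effect of a client's request to create a resource."""
        child = self.uri + slug
        self.contained.add(child)
        return child

    def delete_child(self, child: str) -> None:
        """Server-side effect of a client's request to delete a resource."""
        self.contained.discard(child)

    def client_update(self, triples) -> None:
        """Apply client statements, refusing any containment triple (assumed policy)."""
        triples = set(triples)
        for _, predicate, _ in triples:
            if predicate == LDP_CONTAINS:
                raise PermissionError("containment triples are server-managed")
        self.client_triples.update(triples)

    def representation(self) -> set:
        """Combine server-managed containment triples with client statements."""
        server_triples = {(self.uri, LDP_CONTAINS, c) for c in self.contained}
        return server_triples | self.client_triples
```

In this sketch the client only ever affects containment indirectly, through `create_child` and `delete_child`; the representation still describes the container in a single resource.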

It is not apparent why a container (or any other resource for that matter) can't be self-describing, especially when it comes to information like "timestamps, creator etc." It is already doing that for containments as well as other information. Aside: Some of the Solid servers exposed POSIX information in the container, and clients have nothing to do with it.

If the server-managed auxiliary resource had its own lifecycle, ie. potentially outliving the resource it is about, that'd be a clear enough reason to keep the server-managed resource decoupled from the primary resource. It would allow the server-managed resource to be more useful and preserved independently. But that's not what's intended or within the scope of this auxiliary resource. If there is no value in preserving the server-managed information beyond the primary resource's lifecycle, why isn't the information in the server-managed resource part of the primary resource - further supporting self-describing documents? I don't think the access modes are significant because the underlying information is deemed to be server protected regardless of where it resides.

It is more intuitive to find something about the resource at the resource than at another resource. Not a whole lot different than finding the Last-Modified HTTP header (which is server-controlled) where one would expect.

Just a thought: server-managed auxiliary resource seems a notch too broad / catch all. Would it help to bring it down a layer and frame it around provenance level information? We don't lose out on timestamps, creators etc. or it being server-managed (ie. read-only for clients). It could be expressed along the lines of activities, entities, agents (eg. PROV-O).

If the information that's intended for a server-managed resource is dependent on the primary resource's lifecycle, it'd be simpler for a server to manage that through a single resource instead of two. It also has a minimal Web footprint. Decoupling or perhaps loosely coupling the two resources may be preferable if different lifecycles have value.

jaxoncreed · 2020-05-04T20:45:12Z

It is more intuitive to find something about the resource at the resource than at another resource.

I disagree with this. It makes more sense to be able to filter out metadata. A "resource" is a thing and only that thing. It shouldn't need to include extra information. I'm in favour of using link headers to discover metadata.

elf-pavlik · 2020-05-04T20:46:11Z

During the call we also discussed Non-RDF Sources as well as resources with digital signatures. In both cases the server can't really add triples to those resources.

csarven · 2020-05-04T22:17:30Z

@jaxoncreed On the contrary, we are not in disagreement. I didn't claim that resources should include metadata. Containment information and server managed auxiliary information are part of resource description. I consider them as data. There can be separate resources describing provenance information, history/audit, access logs etc.

We reframe what's intended for "server managed" to specific kinds of resources. For example, the proposal for clients requesting to create Memento's URI-R and have the server create URI-M and create/update URI-T: #61 (comment). Essentially, include the header:

PUT https://csarven.ca/linked-research-decentralised-web
Link: <http://mementoweb.org/ns#OriginalResource>; rel="type"

201 Created
Location: https://csarven.ca/linked-research-decentralised-web

A client can for example discover a resource's URI-M and URI-T from the link relations:

GET https://csarven.ca/linked-research-decentralised-web
Link: <http://mementoweb.org/ns#OriginalResource>; rel="type"
Link: <https://csarven.ca/linked-research-decentralised-web.timemap>; rel="timemap"
Link: <https://csarven.ca/archives/linked-research-decentralised-web/ce36de40-64a7-4d57-a189-f47c364daa74>; rel="memento"

Or discover provenance information through eg. rel prov:has_provenance. Use specific properties to discover audit, access logs. They can all be managed by the server.
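To make the discovery step concrete, here is a minimal sketch of how a client might group `Link` header targets by relation so it can follow only the relations it understands. The function name `links_by_rel` and the simplified regular-expression parsing are assumptions for illustration; a real client would more likely use a full Web Linking parser, and this one does not handle every edge case of the header syntax.

```python
import re

# Simplified patterns: a <target> followed by zero or more ;name=value parameters.
_LINK = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def links_by_rel(header_values):
    """Map each link relation to the list of target URIs that carry it."""
    result = {}
    for value in header_values:
        for target, params in _LINK.findall(value):
            for name, quoted, bare in _PARAM.findall(params):
                if name.lower() == "rel":
                    # A rel value may hold several space-separated relations.
                    for rel in (quoted or bare).split():
                        result.setdefault(rel, []).append(target)
    return result


# Link header values taken from the response shown above.
headers = [
    '<http://mementoweb.org/ns#OriginalResource>; rel="type"',
    '<https://csarven.ca/linked-research-decentralised-web.timemap>; rel="timemap"',
    '<https://csarven.ca/archives/linked-research-decentralised-web/'
    'ce36de40-64a7-4d57-a189-f47c364daa74>; rel="memento"',
]
links = links_by_rel(headers)
timemap_uri = links["timemap"][0]
```

A client that only cares about version history can look up `timemap` or `memento` and skip every other relation without fetching the targets first.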

@elf-pavlik Makes sense. The context was RDF sources anyway. [My point was about the server's capability, not whether it can or should do so for any arbitrary resource.]

justinwb · 2020-05-05T01:25:07Z

@csarven I'm not sure I have a firm grasp of what you're looking to see changed for server managed auxiliary resources in the current proposal at #156.

From what I can tell, you support the use of link headers to discover server managed data in auxiliary resources (which is described in #156), but want those relations to be more specific, rather than one single catch-all for "server managed" data?

csarven · 2020-05-05T09:12:55Z

@justinwb That's right. Key points:

Server managed auxiliary resource is too broad (vague) for clients. In theory it can contain any information that the server deems useful, but it needs a specific shared data model for it to be actually useful for clients.
With a single relation for "server managed" information, clients will not know whether the description in the auxiliary resource is actually useful to them until they fetch and inspect.
General rule of thumb: resources should describe themselves. If the most recurring subject of the statements in the server managed auxiliary resource uses the same identity as in the primary resource, it is better to describe that identity in the primary resource, eg. "R creator x" or "R date y" should be in the primary resource. This is the same pattern as the one I've discussed about where to put resource labels - they are not "metadata"!
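The rule of thumb in the last point can be sketched as a simple check over the statements in an auxiliary resource. The function names, the tuple representation of triples, and the prefixed names in the usage are hypothetical, and "most recurring subject" is interpreted here as the subject with the highest statement count, which is only one possible reading.

```python
from collections import Counter


def dominant_subject(triples):
    """Return the subject that appears in the most (subject, predicate, object) triples."""
    counts = Counter(subject for subject, _, _ in triples)
    subject, _ = counts.most_common(1)[0]
    return subject


def belongs_in_primary(triples, primary_uri):
    """True when the statements are mostly about the primary resource itself."""
    return bool(triples) and dominant_subject(triples) == primary_uri
```

For statements such as "R creator x" and "R date y" the check returns true, suggesting they belong in the primary resource; a description dominated by activities or other entities would return false and is a candidate for a separate resource.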

One way to move forward:

Use specific link relations to indicate the kind of information that's expected at the target resource, eg. provenance, audit, access logs. However, the kind of information that the server wants to expose needs to be sliced. For instance, the notion of provenance information is sufficiently precise and used in the wild. There is a specific relation available that can be loosely coupled with the Provenance Ontology or others. Same goes for Memento, where the client can knowingly follow a relation to obtain Mementos, TimeMap and so forth.
The specific relations can still be managed by a server ie. only the server can write/append, as originally intended. But this ought not be the focus. It is just that there is no (strong) use case for an arbitrary client to modify the activities that a server observes - hence, read-only for authorized agents.

If the entities described in the auxiliary resource are significant and should have their own dereferencable identity, that's a clear enough indication to have them in independent resources instead of lumping it under the representation of the primary resource.

justinwb · 2020-05-05T12:31:40Z

@csarven Thanks for clearing that up, I think we're in agreement, and this also lines up with feedback from members of the data-interoperability panel.

kjetilk · 2021-09-02T23:11:52Z

My feeling here is that the key to resolve this issue is to forward a more general understanding of the role of auxiliary resources than is now in the spec. Do people agree to that?

To do that, I think we need to define dimensions of auxiliary resources, I opened #306 for that, and then define hypermedia-based protocol extension points, as in #270. Does that sound like a plan, or do people think we should take a different angle to resolving this?

kjetilk · 2021-09-06T20:59:54Z

Since there is silence, in the interest of fast progress towards a Protocol 1.0 release, perhaps we shouldn't take on the full extension mechanism above, but just define the term "server managed" and then add the auxiliary types that are needed right now?

csarven · 2021-09-06T21:20:28Z

At this time, I'm not sure if there is much point in talking about server managed resources if we can't refer to specific types that are part of the Protocol. Above I mentioned things like Memento resources being great candidates for this, which are not even currently required. Perhaps ont:FixedResource. #191 (comment) lists some types but nothing in particular that's strictly server managed. I don't see a strong reason to throw "placeholder" information into the spec -- if there are applications that are working with that, let's see them and document common patterns. We could come back to this issue.

There is one other possibility that's already spec'd that is a good candidate for server managed: as:CollectionPage or ldp:Page as mentioned in #230 (comment)

kjetilk · 2021-09-16T21:21:04Z

I think there are plenty of patterns that have already emerged, enough to have a general idea of how to make an extensible system of auxiliary resources that have many different properties, where server-managed is one of those properties. The question isn't that; the question is whether we should allow ourselves that time before having the first version of the protocol specification, and we should possibly not do that.

I think this issue has conflated quite a few issues, we need to drill down to the essence of server managed resources. This is apart from the containment triples, which are indeed server managed, but they are a part of the compound state of the container representation.

A server managed resource is simply a resource that a client cannot be authorized to write to. It may have all kinds of different properties in addition to that, but that is the essence.

If we state as a principle that no state change should be unauthorized, one way to think about this is that the server is also an agent, and has privileges accordingly. Since it is the server, it shouldn't need to authenticate (but it could, as a security-in-depth measure). It may also reject clients' use of control privileges, if the client attempts to gain write permission to a server-managed resource.

With that, we can define an auxiliary resource for server managed metadata about a container's children that is tied to the container's lifecycle, has its own access control but where only the server has write privilege, and rejects clients' attempts to gain write privilege by responding 403 to such requests, even in the case where the client has Control privilege. That should resolve this issue and #227.

Then, we could also define an audit log, which is an auxiliary resource that is not tied to the resource's lifecycle, where the server has append, but not write, and which rejects clients' attempts to gain write privilege.
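The authorization behaviour described in the last few paragraphs can be sketched roughly as follows. Everything here is an assumption for illustration: the `Resource` class, the `SERVER` identifier, the `.meta` and `.audit` URIs, the status codes other than 403, and the choice to treat Append as a form of write that clients cannot obtain on a server-managed resource.

```python
from dataclasses import dataclass, field

SERVER = "urn:example:server"  # hypothetical identifier for the server acting as an agent
WRITE_MODES = {"Write", "Append"}


@dataclass
class Resource:
    uri: str
    server_managed: bool = False
    # Modes the server itself holds; an audit log would omit "Write".
    server_modes: frozenset = frozenset({"Read", "Write", "Append"})
    acl: dict = field(default_factory=dict)  # agent -> set of granted modes


def authorize(resource, agent, mode):
    """Return an HTTP-like status for an agent requesting an access mode."""
    if agent == SERVER:
        return 200 if mode in resource.server_modes else 403
    if resource.server_managed and mode in WRITE_MODES:
        return 403  # clients can never write to a server-managed resource
    return 200 if mode in resource.acl.get(agent, set()) else 403


def update_acl(resource, agent, target_agent, modes):
    """A client tries to grant modes to target_agent on the resource."""
    if "Control" not in resource.acl.get(agent, set()):
        return 403
    if resource.server_managed and set(modes) & WRITE_MODES:
        return 403  # rejected even though the client has Control
    resource.acl.setdefault(target_agent, set()).update(modes)
    return 200


# Container metadata: only the server writes. Audit log: the server only appends.
meta = Resource("https://example.org/c/.meta", server_managed=True,
                acl={"alice": {"Read", "Control"}})
audit = Resource("https://example.org/c/.audit", server_managed=True,
                 server_modes=frozenset({"Read", "Append"}))
```

Modelling the server as one more agent keeps a single code path for every state change, which matches the principle that no state change should be unauthorized.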

kjetilk · 2021-11-16T14:15:52Z

Any findings, @justinwb ? We can bump it from the milestone, right?

justinwb · 2021-11-16T17:57:09Z

My feeling here is that the key to resolve this issue is to forward a more general understanding of the role of auxiliary resources than is now in the spec. Do people agree to that?

Partially, though I think it's more specifically in regards to how server-managed information is handled in regular and/or auxiliary resources.

I think this issue has conflated quite a few issues

Agree!

we need to drill down to the essence of server-managed resources. This is apart from the containment triples, which are indeed server-managed, but they are a part of the compound state of the container representation.

I think we have to start by zeroing in more on server-managed data, because (like containment triples) it may not necessarily only live in an auxiliary resource. I'm not convinced that we can be constructive when talking about server-managed data in general. I think we need to get more specific and deal with each type of server-managed data in context, because we may treat different kinds of server-managed data differently.

Any findings, @justinwb ? We can bump it from the milestone, right?

Yes - I think that this can be bumped from the milestone. I'm not sure that this is directly actionable, so I don't know that it should be moved to a later milestone. I think that it touches on actions that we need to undertake specifically related to server-managed metadata, but we may be better-served creating specific tickets for each class of metadata that needs specification.

csarven mentioned this issue Apr 29, 2020

Initial auxiliary resources draft #156

Merged

justinwb added the topic: auxiliary resources label May 1, 2020

justinwb self-assigned this May 1, 2020

csarven mentioned this issue Aug 6, 2020

Basic resource typing through HTTP Link header #191

Open

RubenVerborgh mentioned this issue Sep 16, 2020

Aligning representations of document and container resources with REST via single and compound state #198

Closed

This was referenced Feb 4, 2021

Specify container description #227

Open

Add security consideration about information exposure #228

Merged

elf-pavlik mentioned this issue May 11, 2021

Define means for Access Grant Subject to verify Access Receipt delivered in Message solid/data-interoperability-panel#107

Closed


kjetilk added doc: Protocol phase: Consensus labels Jun 24, 2021

kjetilk modified the milestones: ~First Public Working Draft, Current Month Jun 24, 2021

kjetilk removed the phase: Consensus label Jun 29, 2021

kjetilk mentioned this issue Aug 31, 2021

Dimensions of Auxiliary Resource Types #306

Open

kjetilk added the status: Nominated An issue that has been nominated for the next monthly milestone label Sep 14, 2021

csarven unassigned justinwb Sep 16, 2021

csarven self-assigned this Sep 16, 2021

csarven mentioned this issue Sep 23, 2021

Define creator #315

Open

csarven modified the milestones: Current Month, October 2021 Sep 29, 2021

justinwb removed this from the Release 0.9 milestone Nov 16, 2021

renyuneyun mentioned this issue Jul 11, 2024

Lacking specification of rules of HTTP methods for Description Resource #673

Open
