Author: Claudia Wagner | Published: 06th November 2008

As Henry Story has already pointed out in one of his blog posts, Data Portability needs Linked Data, because it is impossible to move one thing and all its relations from one site to another using just copy by value without moving everything. Copying references is therefore the only way to port and share data across application boundaries.

I think that, instead of copying references from one site to another to achieve data sharing, it would be better to interlink references (e.g. interlink the reference of an article with the reference of a site or of a container of a site). By defining such links (e.g. “has_container” or “reply_of”), the user reveals his data sharing needs, his information disclosure wishes and his general view and understanding of the resources and relations in the Social Web, and makes this knowledge machine-understandable.

It is up to each application to decide how to handle such incoming or outgoing links, which connect resources stored on it with external resources. The handling will depend heavily on the user who has defined the link and the trust placed in that user (i.e. the provenance of the link), the accessibility of the data to share (i.e. the access policies which protect the data) and the data publishing policy of the target application (e.g. such a policy can state that only comments of users with a verified email address are published).
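
The three factors named above can be sketched as a simple decision function. This is only an illustration in Python: the Link fields, the accept_link function and its parameters are hypothetical names, and a real application would derive trust and accessibility from its own data rather than from plain sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """A user-defined link between two resources (all fields are hypothetical)."""
    source: str                   # URI of the resource to share
    relation: str                 # e.g. "reply_of" or "has_container"
    target: str                   # URI of the resource on the target application
    creator: str                  # provenance: who defined the link
    creator_email_verified: bool  # input for the example publishing policy


def accept_link(link, trusted_creators, accessible_sources,
                require_verified_email=True):
    """Decide whether the target application acts on an incoming link."""
    # 1. Provenance: was the link defined by a user the application trusts?
    if link.creator not in trusted_creators:
        return False
    # 2. Accessibility: do the access policies let us read the linked resource?
    if link.source not in accessible_sources:
        return False
    # 3. Publishing policy of the target application (example from the text).
    if require_verified_email and not link.creator_email_verified:
        return False
    return True
```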

Examples:

1) Sharing public resources across applications

Imagine a user reads several online newspapers which all report on a certain event. He comments on the article about this event in one newspaper in order to express his opinion. As he wants his comment also to be displayed in one of the other newspapers he tends to read, he interlinks his comment on article A, which is stored on site A of newspaper A, with article B on site B of newspaper B (via a sioc:reply_of relation). By establishing this link he clearly states for everyone that he wants both sites to use this resource created by him, and he also tells them how or in what context to use it.
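
As a rough illustration, the links in this example can be written as (subject, predicate, object) triples. The sketch below is plain Python rather than a real RDF library; all URIs are made up, and only the sioc:reply_of property itself comes from the example.

```python
# The SIOC namespace is real; every other URI below is a made-up placeholder.
SIOC_REPLY_OF = "http://rdfs.org/sioc/ns#reply_of"

comment = "http://site-a.example/articles/A/comments/1"
article_a = "http://site-a.example/articles/A"
article_b = "http://site-b.example/articles/B"

# Each link is one (subject, predicate, object) triple.
links = {
    (comment, SIOC_REPLY_OF, article_a),  # created when the comment was posted
    (comment, SIOC_REPLY_OF, article_b),  # added by the user to share the comment
}


def replies_to(article, triples):
    """Return the URIs of all resources linked to `article` via sioc:reply_of."""
    return sorted(s for (s, p, o) in triples
                  if p == SIOC_REPLY_OF and o == article)
```

Site B can then call replies_to(article_b, links) to find the comment that lives on site A, without holding a copy of it.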

Such sharing of resources could also be done without interlinking, but interlinking has the advantage that users start exposing their understanding and knowledge about resources and the relations between them in the Social Web in a machine-understandable form. The fact that applications then directly exploit the links, for example to share or port resources, can motivate users to do the interlinking manually or semi-automatically. Users need to see a direct impact of what they are doing.

I am aware that it would also be good to have articles A and B connected via a resource representing the event or the topic they report on, because that is the main reason why the articles are related. But ultimately the two links represent different things: one represents how the two articles are related, and the other represents the data sharing wish from a user’s perspective. In any case, the automatic calculation of high-level concepts which relate resources can surely be supported by the user-defined links, and user-defined links can be proposed by applications by exploiting the existing relations annotated with concepts.

2) Sharing protected resources across user identities and applications

Imagine that a user wants to share a resource (e.g. a photo) which he owns and which is stored in his user account on a Social Web application (e.g. Flickr) with another user account (owned by him or by another person) on another Social Web application (e.g. Facebook). Let us call the site where the resource is stored the source application and the site with which the resource should be shared the target application.

By defining that the photo (sioc:Item), which is stored in a user account (sioc:ImageGallery) on Flickr (sioc:Site), should be interlinked (via a sioc:has_container relation) with a certain user account (sioc:Container) on Facebook (sioc:Site), the user expresses that he wants this photo to be shared between the selected Facebook and Flickr user accounts.

The photo the user wants to share is private, so the target application client (which can but need not act on behalf of a user who has an account on the application) normally cannot access the resource to share. Therefore the owner must model access policies which allow this user identity to access as much information about the resource as is needed to share it (but not more).

Access control can basically be modeled according to one of the following models:

1) Discretionary Access Control (DAC) model

The DAC model is often also called the Identity Based Access Control (IBAC) model and is based on Access Control Lists (ACLs), which are defined for each protected resource by the owner of the resource. The user who owns the protected resources to share must authenticate at the source application in order to prove that he is their owner. Then he is allowed to modify the ACL of each protected resource. The list stores which identities have which access rights, and access rights define which actions may be performed on which protected resource. The owner must modify the ACL of his protected resources in order to allow user identities from third party applications to access them. For each request the access control system must verify the identity of the requesting client in order to be able to grant or deny access. That means that the target application client must authenticate at the source application in order to prove that he owns the user account he claims to own. The client must prove that he really acts on behalf of the user who owns the user account (owning the user account means having write access to it). Therefore distributed authentication mechanisms are needed.
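
A minimal sketch of such a per-resource ACL in Python follows. The class and method names are hypothetical, and the sketch assumes that every identity passed in has already been authenticated, which is exactly the part that needs a distributed authentication mechanism in practice.

```python
class AclStore:
    """Per-resource access control lists, editable only by the resource owner."""

    def __init__(self):
        self._owner = {}  # resource -> owning identity
        self._acl = {}    # resource -> {identity: set of allowed actions}

    def register(self, resource, owner):
        """Record a new protected resource; the owner gets full access."""
        self._owner[resource] = owner
        self._acl[resource] = {owner: {"read", "write"}}

    def grant(self, acting_identity, resource, identity, action):
        """Let the (already authenticated) owner add an entry to the ACL."""
        if self._owner.get(resource) != acting_identity:
            raise PermissionError("only the owner may modify the ACL")
        self._acl[resource].setdefault(identity, set()).add(action)

    def is_allowed(self, identity, resource, action):
        """Check a request made by an (already authenticated) identity."""
        return action in self._acl.get(resource, {}).get(identity, set())
```

Note that is_allowed needs the identity of the requester on every single call, which is the main difference to the capability based model described further down.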

2) Role Based Access Control (RBAC) model

The Role Based Access Control model is based on the definition of roles which have certain access rights (i.e. permissions to perform a certain action on a protected resource). A user must authenticate at the access control system in order to prove that he owns the protected resources and in order to model the access policies for his resources. After authenticating he is allowed to define roles which have access permissions on his resources, and to define which identities or groups of identities with common properties belong to each role. For each request the access control system must verify the identity of the requesting client in order to find out which roles the client has. Based on these roles the access control system grants or denies access to the client. Also in this case distributed authentication mechanisms are needed.
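
The same idea as a Python sketch: permissions are attached to roles, and identities are assigned to roles. The names are hypothetical, identities are again assumed to be authenticated already, and the check that only the resource owner may define roles is left out to keep the example short.

```python
class RoleStore:
    """Roles carry permissions; identities get access through their roles."""

    def __init__(self):
        self._role_permissions = {}  # role -> set of (action, resource)
        self._role_members = {}      # role -> set of identities

    def define_role(self, role, permissions):
        """Create or redefine a role with a set of (action, resource) pairs."""
        self._role_permissions[role] = set(permissions)
        self._role_members.setdefault(role, set())

    def assign(self, identity, role):
        """Add an identity to an existing role."""
        if role not in self._role_permissions:
            raise KeyError(role)
        self._role_members[role].add(identity)

    def roles_of(self, identity):
        """Return all roles the identity belongs to."""
        return {role for role, members in self._role_members.items()
                if identity in members}

    def is_allowed(self, identity, action, resource):
        """Grant access if any role of the identity holds the permission."""
        return any((action, resource) in self._role_permissions[role]
                   for role in self.roles_of(identity))
```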

3) Capability Based or Authorization Based Access Control model

This model is not identity based. It is based on capabilities. A capability is a reference to a resource that needs some access control. The fact that a requesting client possesses a particular capability gives him or her the right to use the referenced protected resource according to the access privileges specified by the capability.

The owner of the resource to share can define which properties a requesting client must demonstrate in order to get a certain capability for a certain resource. In order to get a capability a client must prove that he has these properties. Properties can be based on the identity of a user, such as “to be the owner of a user account” or “to have an OU email address”. But properties can also be independent of a user’s identity and rely on the user’s context instead, such as “to be at a certain location” or “to know a certain secret”.

Each time a requesting client tries to access a resource he only needs to present his capability (access token). The client must prove that he has certain properties only once, when he first obtains the capability.
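
A minimal Python sketch of this two-step pattern is shown below. The class, its methods and the property dictionary are hypothetical, and the sketch assumes that the properties handed to request_capability have already been verified by some other means.

```python
import secrets


class CapabilityIssuer:
    """Issues unguessable tokens that stand for access rights on a resource."""

    def __init__(self):
        self._requirements = {}  # resource -> predicate over client properties
        self._capabilities = {}  # token -> (resource, frozenset of actions)

    def protect(self, resource, requirement):
        """The owner states which properties a client must show."""
        self._requirements[resource] = requirement

    def request_capability(self, resource, properties, actions=("read",)):
        """First contact: check the properties once and hand out a token."""
        requirement = self._requirements.get(resource)
        if requirement is None or not requirement(properties):
            return None
        token = secrets.token_urlsafe(16)
        self._capabilities[token] = (resource, frozenset(actions))
        return token

    def access(self, token, resource, action):
        """Later requests: only the token is checked, not the identity."""
        granted = self._capabilities.get(token)
        return (granted is not None
                and granted[0] == resource
                and action in granted[1])
```

The requirement can be any predicate, so identity based properties (an email domain) and context based ones (knowing a secret) are handled in the same way.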

As far as I have understood how OAuth works, it basically provides one way to implement such a capability based access control model.

In the Social Semantic Web, where applications should share resources from distributed sources without copying them, capability based access control seems to be the most appropriate model, because authentication is not needed for each request. The owner of a resource should be able to grant or deny access to a requesting application based on whatever criteria he chooses. The owner can grant or deny access to an application or to a specific user of an application.

In the example described above, Facebook is the target application, which was interlinked via a “has_container” or “container_of” relation with a resource stored on the source application Flickr. Facebook gets the links and can understand what the user wants to express with them. For each resource to share, Facebook should define and store a query which asks for all the information about the resource and its properties (e.g. creation date, name of the creator, email of the creator, comments, creation time of the comments, name of the creator of each comment, email of the creator of each comment) that Facebook needs in order to display the resource like a normal resource stored on Facebook.

Facebook stores only the query and sends it to Flickr whenever the image gallery related to the user account is loaded. Flickr checks whether the requested resources are publicly available. If there are any access restrictions, the requesting Facebook client must send a request token to Flickr in order to state that it wants to access certain resources. The Flickr user who owns the resource to share can decide whether to grant or deny access to the requesting client. If he grants temporary or permanent access, the client gets an access token. With this access token the client can access the resource in future without needing to re-authenticate, until the access token becomes invalid.
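
The exchange described above can be sketched from the point of view of the source application. The Python below is a strongly simplified illustration and not the actual OAuth protocol or the Flickr API: the class, its methods and the injected clock are hypothetical, and signatures, client registration and the user-facing approval page are all omitted.

```python
import secrets


class SourceApp:
    """Toy source application that hands out request and access tokens."""

    def __init__(self, clock):
        self._clock = clock   # function returning the current time in seconds
        self._resources = {}  # uri -> (data, is_public)
        self._pending = {}    # request token -> uri awaiting the owner's decision
        self._access = {}     # access token -> (uri, expiry time or None)

    def store(self, uri, data, public):
        self._resources[uri] = (data, public)

    def request_access(self, uri):
        """The client states that it wants access to a protected resource."""
        token = secrets.token_urlsafe(16)
        self._pending[token] = uri
        return token

    def owner_decides(self, request_token, grant, ttl_seconds=None):
        """The owner grants (temporarily or permanently) or denies the request."""
        uri = self._pending.pop(request_token)
        if not grant:
            return None
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        token = secrets.token_urlsafe(16)
        self._access[token] = (uri, expires_at)
        return token

    def fetch(self, uri, access_token=None):
        """Answer the stored query of the target application."""
        data, public = self._resources[uri]
        if public:
            return data
        entry = self._access.get(access_token)
        if entry is None or entry[0] != uri:
            raise PermissionError("a valid access token is required")
        if entry[1] is not None and self._clock() >= entry[1]:
            raise PermissionError("the access token has expired")
        return data
```

Passing the clock in as a function is only a convenience for trying out the expiry case without waiting; time.time would be the obvious choice otherwise.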

All in all, the big advantage of not copying the data but interlinking it is that both sites, and the sioc:Container belonging to each site, refer to the same instance of the photo (sioc:Item). If new resources (e.g. comments, tags, people) are related to this photo, both containers can use them, no matter on which application the resources have been created.
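
This effect can be illustrated with a few triples in plain Python. The sketch keeps all triples in one list for brevity, whereas in practice they would be spread over both applications; the SIOC property names come from the example, all other URIs and function names are made up.

```python
SIOC = "http://rdfs.org/sioc/ns#"

# Made-up URIs for the photo and the two user accounts acting as containers.
photo = "http://flickr.example/photos/42"
flickr_gallery = "http://flickr.example/users/alice/gallery"
facebook_account = "http://facebook.example/users/alice"

# One photo instance, linked to both containers instead of being copied.
triples = [
    (photo, SIOC + "has_container", flickr_gallery),
    (photo, SIOC + "has_container", facebook_account),
]


def items_in(container):
    """All items linked to the container via sioc:has_container."""
    return [s for (s, p, o) in triples
            if p == SIOC + "has_container" and o == container]


def replies_visible_in(container):
    """All replies to any item that the container refers to."""
    items = items_in(container)
    return [s for (s, p, o) in triples
            if p == SIOC + "reply_of" and o in items]


# A comment created on one application is related to the shared photo...
comment = "http://facebook.example/comments/7"
triples.append((comment, SIOC + "reply_of", photo))

# ...and can be found through both containers, because both point to the
# same photo instance.
shared = replies_visible_in(flickr_gallery) == replies_visible_in(facebook_account)
```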
