For me, the most interesting parts of the talk were webarch-related. It all started with the statement (to paraphrase) "we must build on the web architecture". Carl then pointed out how representations are essentially second-class citizens on the web.
That got me thinking. At the most basic level, repositories are all about managing bitstreams (whether they're considered data or metadata). In webarch, bitstreams seem to equate to what they call "representation data". And a representation is defined by how it relates to a resource:
"A representation is data that encodes information about resource state."
So, in W3C-speak, a repository manages representation data. Okay, that's just a terminology change. But what about this statement:
"For robustness, Web architecture promotes independence between an identifier and the state of the identified resource."
That makes a whole lot of sense for the web when you consider how often web pages change. But what does it mean for repositories? How do we manage bitstreams if we can't identify them? The answer must be one of the following:
- Indirect identification. Identify the associated "resource" in order to work with the bitstream(s).
- Reification. Elevate the bitstream to a "resource" so we can talk about it.
representationA represents urn:example:someTextFile
representationA contentType "text/plain"
representationA payloadLocation "/path/to/someTextFile.txt"
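To make the two options a bit more concrete, here is a rough Python sketch of the statements above. It is only an illustration: the `Representation` and `Repository` classes, the `rep_id` field, and the method names are all made up for this post and don't come from webarch or OAI-ORE. Looking things up by resource corresponds to indirect identification, and looking them up by `rep_id` corresponds to reification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    rep_id: str            # hypothetical name for the reified bitstream
    represents: str        # identifier of the resource it represents
    content_type: str
    payload_location: str  # where the bitstream actually lives


class Repository:
    """Toy in-memory store; not a real repository API."""

    def __init__(self):
        self._by_id = {}

    def add(self, rep):
        self._by_id[rep.rep_id] = rep

    def for_resource(self, resource_uri):
        # Indirect identification: start from the resource,
        # then work with whatever representations it has.
        return [r for r in self._by_id.values() if r.represents == resource_uri]

    def get(self, rep_id):
        # Reification: the representation has its own identifier,
        # so we can talk about it directly.
        return self._by_id[rep_id]


repo = Repository()
repo.add(Representation(
    rep_id="representationA",
    represents="urn:example:someTextFile",
    content_type="text/plain",
    payload_location="/path/to/someTextFile.txt",
))
```

Either lookup ends up at the same bitstream. The difference is whether the bitstream has a name of its own or is only reachable through the resource it represents.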
I think the OAI-ORE work is going to attempt something like the above: a model (and maybe a format?) for expressing resource-representation information in a repository-neutral way. It will be interesting to see what pops out.