Core Elements Subgroup
Responsibilities

The Core Elements Subgroup is responsible for the following elements of the charge:
Activity Reports

The Core Elements Subgroup holds conference calls most weeks and has accomplished the following:

September 2004: The group discussed the differences between files and bitstreams and how the semantic units apply to each. A new level called "filestreams" was proposed, which also related to earlier discussions about embedded files. The group continued its discussion of environment elements and whether this information depends on file format information, and it continued to define what information about the environment is needed to render objects over the long term. Two new participants joined the group, one from DSpace and another from the Walt Disney Company. A workplan was developed to finish the data dictionary by December, in anticipation of a final PREMIS report by the end of 2004.

August 2004: The group held a face-to-face meeting in Cambridge, Massachusetts, during the first week of August and continued to work through the data dictionary. Participants revised the data model, particularly how the various entities relate to each other. There was also discussion of preservation policy and business rules and how these relate to the data model. The group made considerable progress on the list of semantic units that apply to all file formats. Much discussion centered on multiple layers of file formats, i.e., embedded content objects with multiple wrappers, and what metadata is needed for them. For the rest of the month, discussion centered on the issues that arose in the meeting.

July 2004: The group revised and further discussed the data model showing entities and their relationships. There was some discussion about the rights and responsibilities inherent in preservation functions. The data model was sent to the PREMIS Advisory Committee, with comments requested preferably before the August meeting.

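The layering discussed in the August and September 2004 reports (files, bitstreams, the proposed "filestreams", and embedded objects inside wrappers) can be pictured as a small tree. The following Python sketch is illustrative only: the class `ContentObject`, its field names, the function `walk`, and the sample TAR/TIFF package are all hypothetical and are not taken from the group's data model or data dictionary.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class ContentObject:
    """Hypothetical node: a file, a proposed 'filestream', or a bitstream."""
    name: str
    level: str        # assumed labels: "file", "filestream", "bitstream"
    format_name: str  # illustrative format label, not a registry identifier
    embedded: List["ContentObject"] = field(default_factory=list)


def walk(obj: ContentObject, depth: int = 0) -> Iterator[Tuple[int, ContentObject]]:
    """Yield each object with its nesting depth, outermost wrapper first."""
    yield depth, obj
    for child in obj.embedded:
        yield from walk(child, depth + 1)


# Made-up example: a TAR wrapper holding a TIFF that contains image data.
package = ContentObject("archive.tar", "file", "TAR", [
    ContentObject("page1.tif", "filestream", "TIFF", [
        ContentObject("image-data", "bitstream", "uncompressed raster"),
    ]),
])

# Collect one (depth, level, format) row per layer, then print an indented view.
layers = [(depth, obj.level, obj.format_name) for depth, obj in walk(package)]
for depth, level, fmt in layers:
    print("  " * depth + f"{level}: {fmt}")
```

Each layer in such a tree is a place where format metadata might need to be recorded, which is one way to read the group's question about what metadata embedded objects with multiple wrappers require.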
June 2004: The group decided to move to a weekly schedule (except for one week a month) in order to speed up the work. To move things forward faster still, the group agreed to meet in the Boston area in conjunction with the early August meeting of the Society of American Archivists.

May 2004: Work during the month included developing the agent entity and determining the minimal amount of information about agents needed for preservation purposes. How agents relate to other entities (rights, events, objects) needs to be considered, and there must be enough information to document who did what and who authorized what. The level of detail in agent information may depend on the particular implementation, so the agent information defined in the data dictionary should be minimal and general. The role of agents in events is important to consider. The group is developing the data model more fully; this document will be a deliverable of the working group and will detail relationships between entities. Work continued on technical metadata that applies regardless of file format, particularly format identification.

April 2004: The group considered some example objects and how the current list of core data elements might be applied to them. In particular, it considered the Los Angeles Times text archive and Harvard's complex audio files. The conclusions were that the elements dealing with the objects themselves work fairly well, but conveying enough information about relationships between entities is problematic and needs further work. Work continued on the data dictionary, particularly on the object and event entities. There was discussion about permanence levels, mainly in terms of significant properties; further discussion will be held after the group looks at technical metadata. The group considered strategies for working on technical metadata. Scope should be limited to technical metadata that applies regardless of file format.
Work on the Global Digital Formats Registry is useful here, since an identifier might be used to point to information about a file format, but the registry is still in development and its existence cannot be assumed.

March 2004: The group worked on templates for the data dictionary, largely based on what was developed at the National Library of Australia. Slightly different templates will be used for different entity types (i.e., objects, agents, events, relationships). There was a lot of discussion about which elements are recorded at which level of granularity, i.e., at the representation, file, or bitstream level. This is important information to convey as guidance for applying the element set. In addition, the group would like institutions engaged in preservation activities to submit use cases for the kind of information needed in the preservation context. Nancy Hoebelheinrich at Harvard is developing a template and sample use case for this purpose. This will help the group determine what information is needed to establish terms and conditions for objects in a preservation context.

February 2004: There was considerable discussion about the entity relationship diagram that the group is developing. It includes entities for objects, events, agents and rights statements. The group is further considering how to show relationships between the entities. As a result, the data dictionary is being reorganized to approach the data elements from an entity point of view. Small groups are revising each section, adding examples and clarifying definitions. Participants are continuing to submit example objects to help clarify use of the element set. In addition, the group discussed the work at NLM on permanence ratings, and a small group was formed to analyze its impact on the core elements.

January 2004: The core elements group met in San Diego on Jan. 8 and made substantial progress on preparing a data dictionary of core metadata elements for Preservation Description Information. In defining "core", the group agreed that the elements to be included had to be essential for a working archive to know because they satisfy certain functions (e.g., viability, renderability, understandability, authenticity, identity). The data dictionary will include the name of the semantic unit and its components, definition, obligation (required or not), data constraints, level of entity, repeatability, examples and notes. Members of the group submitted examples of element usage with different kinds of digital objects. After the meeting it was decided to reorganize the data dictionary according to an object type model, and work began on the data model. In addition, the group continued to discuss elements for rights statements related to preservation and the need to understand use cases.

December 2003: The group came to consensus on core elements for events and fixity information. The group will use a narrower definition of fixity information than OAIS, covering validation of a document's integrity and whether it has been changed; OAIS treats fixity and authentication together. As a result of this discussion, the group decided to consider adding to its deliverables a document recording any departures from OAIS and a paper on broad guiding principles. Some discussion occurred about rights related to preservation, the work underway on developing use cases for rights statements, and the draft rights extension schema for METS.

November 2003: The group came to consensus on core elements for relationships and continued its discussion of events. As a byproduct, the group again discussed a typology of entities for digital objects so that it will be clear at which level a given metadata element applies.
The group decided to hold a face-to-face meeting in conjunction with ALA on Jan. 9, 2004 in San Diego to make progress on core elements and the data dictionary.

October 2003: The group had several discussions about relationships between digital objects in order to determine which core elements are needed to describe these relationships. In particular, discussions centered on types of relationships and how they apply to different levels of objects. Members of the group produced two very useful documents as part of this work. Other discussions centered on how events are treated in various implementations compared with their treatment in the OCLC/RLG framework document from the previous working group. By the end of the month, the group had begun discussion of most of the elements relating to Preservation Description Information, although more discussion was needed in some areas.

September 2003: The group decided that using the spreadsheet for the element comparison would be useful. OCLC provided a second spreadsheet for the elements from Content Information. Element-by-element discussions began. Much attention was given to how implementations use identifiers, and discussion began on relationships between objects.

August 2003: Experimentation began with a spreadsheet provided by OCLC that maps their metadata elements to the OCLC/RLG framework. The spreadsheet was revised, and others with implementations added theirs to it. Initially this was done only for Preservation Description Information.

July 2003: Discussions covered what "core" means, the need for a glossary so that everyone uses the same terminology, and a methodology for comparing element sets. The group began a glossary, which was then given to the full PREMIS group for comment and further discussion.
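
The data dictionary fields listed in the January 2004 report (name and components, definition, obligation, data constraints, level of entity, repeatability, examples, notes) can be pictured as a simple record, with the representation, file and bitstream levels from the March 2004 report as the allowed values for level of entity. The Python sketch below is only one possible reading: the class `SemanticUnitEntry`, the function `check_entry`, and the sample entry `exampleDigest` are hypothetical and do not reproduce the group's actual templates or any real semantic unit.

```python
from dataclasses import dataclass, field
from typing import List

# Levels of granularity named in the March 2004 report.
LEVELS = ("representation", "file", "bitstream")


@dataclass
class SemanticUnitEntry:
    """Hypothetical record holding the fields named in the January 2004 report."""
    name: str
    definition: str
    required: bool          # obligation (required or not)
    repeatable: bool        # repeatability
    levels: List[str]       # level of entity the unit applies to
    components: List[str] = field(default_factory=list)
    data_constraints: str = ""
    examples: List[str] = field(default_factory=list)
    notes: str = ""


def check_entry(entry: SemanticUnitEntry) -> List[str]:
    """Return a list of problems; an empty list means the entry looks complete."""
    problems = []
    if not entry.name.strip():
        problems.append("missing name")
    if not entry.definition.strip():
        problems.append("missing definition")
    if not entry.levels:
        problems.append("no level of entity given")
    for level in entry.levels:
        if level not in LEVELS:
            problems.append(f"unknown level: {level}")
    return problems


# Made-up entry for illustration; not an actual unit from the data dictionary.
sample = SemanticUnitEntry(
    name="exampleDigest",
    definition="A value used to check whether an object has changed.",
    required=True,
    repeatable=True,
    levels=["file", "bitstream"],
    components=["algorithm", "value"],
    examples=["algorithm=SHA-1"],
)

print(check_entry(sample))
```

Keeping the level of entity as an explicit field mirrors the group's point that implementers need guidance on where each element is recorded.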