To clarify how the translation happens, an example is provided. Below is an image of the full process, showing all the steps of a translation. Steps three and four are simply the reverse of steps two and one, respectively, so complete descriptions of them are omitted. In this example, MARC serves as the interoperable core while we develop the system; it is not the final target of the translation.
Before records can be translated, they must first be converted from their native format to our intermediate form. To keep the translation system as extensible as possible, the syntactic and semantic operations must be separated, and the intermediate form is the instrument of that separation.
The intermediate form was designed to be simple and straightforward. Each record contains one or more fields. Each field must have a name and a namespace, and can have a value and any number of subfields. Values are objects unto themselves and have several properties, including scheme, language, and encoding.
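The structure just described can be sketched as a small set of types. This is only an illustration under stated assumptions, not the system's actual implementation: the class names `Record`, `Field`, and `Value`, the placeholder namespace, and the placeholder scheme are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Value:
    """A value is an object of its own, carrying optional properties."""
    text: str
    scheme: Optional[str] = None
    language: Optional[str] = None
    encoding: Optional[str] = None


@dataclass
class Field:
    """A field requires a name and a namespace; value and subfields are optional."""
    name: str
    namespace: str
    value: Optional[Value] = None
    subfields: List["Field"] = field(default_factory=list)


@dataclass
class Record:
    """A record contains one or more fields."""
    fields: List[Field]

    def __post_init__(self) -> None:
        if not self.fields:
            raise ValueError("a record must contain at least one field")


# Hypothetical usage: the namespace and scheme strings are placeholders.
record = Record(fields=[
    Field(
        name="contenttype",
        namespace="http://example.org/placeholder-namespace",
        value=Value(text="text/html", scheme="example-scheme"),
    )
])
```

Keeping the scheme on the value, rather than on the field, mirrors the statement above that values are objects with their own properties.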
Here is an abbreviated example of a GEM record. Here is the same record, in the intermediate form. Notice that the "scheme" child of the "format" element in the original record applies to its sibling, "contenttype". Another record format might put the scheme information in an attribute of the "contenttype" or "format" element, or in a child of the "contenttype" element. This is a trivial example of the sort of variance that necessitates the intermediate form. Without the intermediate form, a separate translation would be required for each variant, even though the variants share identical semantics. Regardless of where the scheme information sits in the original record, after parsing into the intermediate form it must be a property of the value associated with the "contenttype" field. To wit:
Namespaces will be omitted from further examples.
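The point about syntactic variance can be illustrated with a short sketch. The two XML fragments below are invented for illustration and are not actual GEM markup; the parser function names, the dictionary layout standing in for the intermediate form, and the "MIME" and "text/html" strings are likewise assumptions. The sketch only shows that two different placements of the scheme can parse to one identical result.

```python
import xml.etree.ElementTree as ET


def parse_sibling_scheme(xml_text):
    """Variant A (hypothetical): <scheme> is a sibling of <contenttype>."""
    fmt = ET.fromstring(xml_text)
    return {
        "name": "contenttype",
        "value": fmt.findtext("contenttype"),
        "scheme": fmt.findtext("scheme"),
    }


def parse_attribute_scheme(xml_text):
    """Variant B (hypothetical): the scheme is an attribute of <contenttype>."""
    fmt = ET.fromstring(xml_text)
    node = fmt.find("contenttype")
    return {
        "name": "contenttype",
        "value": node.text,
        "scheme": node.get("scheme"),
    }


# Invented fragments; only the placement of the scheme differs.
variant_a = "<format><contenttype>text/html</contenttype><scheme>MIME</scheme></format>"
variant_b = '<format><contenttype scheme="MIME">text/html</contenttype></format>'

parsed_a = parse_sibling_scheme(variant_a)
parsed_b = parse_attribute_scheme(variant_b)

# Both syntaxes yield the same parsed result, so one semantic mapping serves both.
assert parsed_a == parsed_b
```

Only the parsers know about syntax; anything that operates on the parsed result is shared between the two variants.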
Here's an example of a mapping of the Dublin Core title element, which is a component of GEM, to MARC.
(map (source-element "title")
     (core-element "245" "a")
     (addField parent ("i1" "0"))
     (addField parent ("i2" "0"))
     (addField parent ("h" "[electronic resource]")))
For this simple mapping, the system would start by finding a "title" element in the input record. For each such element, the system would create a "245" field in the core record, with an "a" subfield. The value of the "title" element would be copied to the "a" subfield. After this, the system would process the rest of the mapping, which in this case consists of three addField instructions. The second element of an addField instruction specifies where the fields will be added. If nothing is provided, the fields are added as children of the last field named in the core-element indicator. In this case, since "parent" is specified, the fields will be added as children of the parent of "a", which is "245". Within each list, the first element is the name of the field to be added and the second element is the value for that field. So, if the input record contained, in intermediate form:
title: English Grammar 101
The output record would contain this:
245
    a: English Grammar 101
    i1: 0
    i2: 0
    h: [electronic resource]
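The behavior of the title mapping can be sketched in a few lines. This is a minimal illustration of the steps described above, not the system's mapping engine: the `Field` class, the `apply_map` function, and the tuple encoding of the addField instructions are all hypothetical, and namespaces are omitted as in the other examples.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Field:
    """Simplified intermediate-form field (no namespace, plain string value)."""
    name: str
    value: Optional[str] = None
    subfields: List["Field"] = field(default_factory=list)


def apply_map(
    record: List[Field],
    source_element: str,
    core_element: Tuple[str, str],
    add_fields: List[Tuple[Optional[str], str, str]],
) -> List[Field]:
    """Apply one mapping to every matching element of the input record."""
    parent_name, leaf_name = core_element
    core_fields = []
    for src in record:
        if src.name != source_element:
            continue
        # Create the core field and its subfield, copying the source value.
        leaf = Field(leaf_name, src.value)
        parent = Field(parent_name, subfields=[leaf])
        # Each addField goes to the parent if "parent" is given,
        # otherwise to the last field of the core-element indicator.
        for target, name, value in add_fields:
            holder = parent if target == "parent" else leaf
            holder.subfields.append(Field(name, value))
        core_fields.append(parent)
    return core_fields


input_record = [Field("title", "English Grammar 101")]
core = apply_map(
    input_record,
    source_element="title",
    core_element=("245", "a"),
    add_fields=[
        ("parent", "i1", "0"),
        ("parent", "i2", "0"),
        ("parent", "h", "[electronic resource]"),
    ],
)
```

With the input above, `core` holds a single "245" field whose subfields are "a", "i1", "i2", and "h", in that order.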
If there were two
Step three would translate from the interoperable core to the final target schema, in intermediate form. Step four would turn the intermediate form of the record into the desired syntax.