Batch Processing glossary
A-
abend
- Abnormal end. A batchload abends when OCLC's batchloading routines cannot read your file. Sometimes it is because a tape was physically damaged. Often it is because the tape or file is formatted incorrectly. The file cannot be processed and must be replaced.
-
add
- To load an original bibliographic record into the database after batchloading routines cannot find a match for it.
-
archive record
- The bibliographic record OCLC creates and stores that contains local changes made by an institution to the master record. Archive records are a complete history of an institution's OCLC cataloging activity. There are two separate archives: one for batch transactions, Batchload Archive Records (BARS), and one for online transactions, Cataloging Archive Records (CARS). Archiving allows an institution to recover data that was lost or corrupted locally by ordering a copy of its archive records through OCLC's Bibliographic Record Snapshot service. Archive records are not accessible by the institution through any online means. OCLC uses archive records to create offline products, such as catalog cards and electronic files of records.
See also BARS,
Bibliographic Record Snapshot
-
ASCII
- American Standard Code for Information Interchange. A standard computer character code set, consisting of alphanumeric characters, punctuation, and a few control characters (such as a carriage return). Each ASCII character consists of 7 information bits and 1 parity bit for error checking.
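The 7-plus-1 bit layout described above can be sketched in a few lines. This is a minimal illustration only: it assumes even parity and assumes the parity bit is carried in the high-order bit, neither of which this glossary specifies, and `with_even_parity` is a hypothetical helper name.

```python
def with_even_parity(char: str) -> int:
    """Return an 8-bit value: 7 ASCII information bits plus 1 parity bit.

    Assumption: even parity, with the parity bit stored as the high bit.
    """
    code = ord(char)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    # The parity bit is 1 when the 7 information bits hold an odd number of 1s,
    # so that the full 8-bit value always has an even number of 1s.
    parity = bin(code).count("1") % 2
    return (parity << 7) | code


print(hex(with_even_parity("A")))  # 0x41 (two 1 bits, parity bit is 0)
print(hex(with_even_parity("C")))  # 0xc3 (three 1 bits, parity bit is 1)
```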
B-
BACN
-
See National bibliographic agency control number.
-
BARS
- Batchload Archive Records. For every batchload transaction, a copy of the OCLC master record is stored for the institution in the archive. An institution may ask OCLC to retain their local data in the archive record for any Batchload project. The Batchload Archive is separate from Cataloging Archive Records.
See also archive record,
CARS.
-
Batch cross reference report
-
See batchloading.
-
Batchload Processing Summary
- A summarization of statistics after processing a batch of records using the OCLC batchloading service.
-
Batchload Project Definition
- The specifications for an institution's OCLC batchload project. The Batchload Project Definition details the input format and character set, how the records will be matched, what clean-up routines will be included in the customized setup, record output if applicable, etc.
-
Batchload Status Report
- A report detailing the results of an initial tape or file evaluation or of problems with data after a project has begun in the OCLC batchloading service.
-
batchload with output
- An option that lets an institution receive a copy of the matching OCLC MARC bibliographic record when it sets its holding symbol in WorldCat through the OCLC batchloading service.
-
Batchload file analysis
- A software program that identifies critical errors in a file of bibliographic records. These errors, if left uncorrected, would prevent batchload software from processing the records.
-
batchloading
- A process by which records to be processed are collected into batches. The records in a batch are loaded all at once. In OCLC services, batchloading is an automated method of processing bibliographic and local holdings records (LHR) into WorldCat. (This is a separate service from cataloger-initiated batch processing in the Connexion client.)
See bibliographic record,
local holdings record
-
batchloading size
- The number of records batch processed at one time using the OCLC batchloading service. OCLC can process up to 90,000 records in a single batch, depending upon the type of record and the level of processing. OCLC divides larger batchload files so that no batch contains more than 90,000 records. For example, OCLC may divide a batchload file of 98,000 records into batches of 90,000 and 8,000 records.
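The division described above can be sketched as simple arithmetic. This is an illustration only, assuming batches are filled in order up to the limit; `split_into_batches` and `MAX_BATCH_SIZE` are hypothetical names, and the real limit depends on the type of record and the level of processing.

```python
MAX_BATCH_SIZE = 90_000  # upper limit quoted in this glossary; actual limits vary


def split_into_batches(record_count: int, max_size: int = MAX_BATCH_SIZE) -> list[int]:
    """Return the sizes of the batches needed to hold record_count records."""
    if record_count < 0 or max_size <= 0:
        raise ValueError("record_count must be >= 0 and max_size must be > 0")
    full_batches, remainder = divmod(record_count, max_size)
    sizes = [max_size] * full_batches
    if remainder:
        sizes.append(remainder)  # the final, partially filled batch
    return sizes


print(split_into_batches(98_000))  # [90000, 8000]
```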
-
Batch processing
-
See batchloading.
-
bibliographic record
- Contains the cataloging information that describes the physical format and intellectual content of a single entity (a book, video, computer file, CD, etc.). Catalogers create records by encoding this information using tags, indicators, and subfield codes in a standard format. MARC 21 (MAchine-Readable Cataloging) or another metadata format may be used. Each record is divided into fields (author, title, subjects, etc.). Fields are subdivided into subfields (place of publication, publisher, etc.).
-
Bibliographic Record Snapshot
- OCLC service used to obtain a complete or partial file of archival records of an institution's cataloging activity for installation on the institution's local system.
-
BL
-
See batchloading.
C-
cancel
- Action to remove an institution's OCLC symbol from a record in the database. Batchloading cancels a record when a match is found and the Record Status element (Leader byte 5) contains the character "d". This is equivalent to the Delete Holdings command (delhld) in the OCLC online system.
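As a rough illustration of the Record Status test described above, the check reduces to reading one byte of the 24-byte MARC leader. The function name and the sample leaders are hypothetical, and this sketch does not represent OCLC's batchloading code.

```python
def is_cancel_record(raw_record: bytes) -> bool:
    """Return True when Leader byte 5 (Record Status) is 'd'."""
    leader = raw_record[:24]  # the MARC leader occupies the first 24 bytes
    if len(leader) < 24:
        raise ValueError("record is shorter than a 24-byte MARC leader")
    return leader[5:6] == b"d"


# Hypothetical leaders, used only for illustration.
deleted_leader = b"00000dam a2200000 a 4500"
new_leader = b"00000nam a2200000 a 4500"

print(is_cancel_record(deleted_leader))  # True
print(is_cancel_record(new_leader))      # False
```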
-
cartridge
- A case containing reeled magnetic tape, a takeup reel and feed mechanisms, used instead of 9-track reel-to-reel tapes on some computer systems. Although several standard sizes of magnetic tape cartridges exist, OCLC accommodates only IBM 3480-compatible cartridges.
-
CARS
- Cataloging Archive Record is the archive record created from any online transaction (replace, update, cancel holdings, or produce). This archive record is a copy of the OCLC master record including any local edits to the record at the point in time the online transaction occurs. CARS is stored separately from the Batchload Archive Record (BARS).
See archive record,
BARS.
-
Cataloging Agent mode
- An authorization mode that allows the Cataloging Agent of an institution or a group to process cataloging records on a client's behalf. Agents may also process unresolved records from the group's batchloading activity.
See also processing center.
-
cluster
- A membership option for general members that share a local, online, automated library system with libraries that are not general members. The general member provides OCLC database records to the nonmembers for holdings not owned by the general member. This membership option requires a separate written application and agreement with OCLC.
-
CODEN Designation
- The Chemical Abstracts Service assigns six-character Coden Designations to serials. The first four characters are letters and have a mnemonic relationship to the serial. The fifth character is either A, B, C, or D. The sixth character is an alphabetic or numeric check character. For example, AISJB6, CADIDW. The Coden Designation is located in field 030, subfield ‡a and can be used during batchload matching.
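The character pattern described above can be checked with a regular expression. This sketch tests only the shape of the value (four letters, then A to D, then a letter or digit); it does not verify the check character, and the upper-casing of input is an assumption. `looks_like_coden` is a hypothetical helper name.

```python
import re

# Four letters, a fifth character of A-D, and an alphanumeric check character.
CODEN_PATTERN = re.compile(r"[A-Z]{4}[ABCD][A-Z0-9]")


def looks_like_coden(value: str) -> bool:
    """Return True when value has the six-character CODEN shape."""
    # Assumption: surrounding whitespace and letter case are not significant.
    return CODEN_PATTERN.fullmatch(value.strip().upper()) is not None


print(looks_like_coden("AISJB6"))  # True
print(looks_like_coden("CADIDW"))  # True
print(looks_like_coden("AISJE6"))  # False: fifth character is not A, B, C, or D
```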
-
communications format
- In machine-readable cataloging (MARC), the standards for representation and exchange of data in machine-readable form. In the USA, this is an implementation of an ANSI standard. MARC 21 Format for Bibliographic Data (formerly called USMARC) is an implementation of ANSI standard Z39.2. OCLC-MARC is an implementation of MARC 21 that conforms to the ANSI standard. Standards provide a common way of organizing machine-readable records so that they can be easily exchanged among users.
-
conditional add
- The loading of an original bibliographic record into the database after batchload routines have not found a match. The record must not contain validation errors.
-
CONTENTdm
- OCLC's digital collection software suite. It allows you to digitize, store, search and mount your digital objects on the Web.
-
content designation
- In machine-readable cataloging (MARC), the codes and conventions established to identify and characterize the data elements within a record and to support manipulation of that data. OCLC defines content designation for OCLC-MARC records in OCLC Bibliographic Formats and Standards, Authorities User Guide, and Union List User Guide.
-
content of the record
- The data in the MARC records. The content is defined by standards outside the format, such as Anglo-American Cataloguing Rules, Library of Congress Subject Headings, ANSI/NISO Standards for Serials Holdings Statements or other rules and codes used by the organization that creates the record.
See also record structure;
content designation
-
Cross reference report
- A post-processing report generated for each batch order. This report lists the institution's unique record identifier, usually from the 001 field, and the OCLC number for the matching WorldCat record. Also referred to as the Batchload Cross Reference Report or xref.
-
CSD
-
See OCLC Customer Services Division (CSD).
D-
data check
- An abend-causing error resulting from an unreadable segment in a file. A data check generally means that there is a physical problem with a tape. It may have been damaged in transit, twisted on the reel, exposed to a magnetic source, or otherwise damaged. You must replace the tape or file.
-
DDR
-
See duplicate detection and resolution.
-
default
- Selections made by the computer or a software program in the absence of specific instructions by the user.
- Elements supplied in the bibliographic or local holdings record by subroutines in the customized batchload setup for missing or invalid values. Defaults are only supplied to enable processing or to aid in accurately matching the record.
-
definition
-
See Batchload Project Definition.
-
delete
-
See cancel.
-
delimiter
- Character (‡), followed by a single letter or number code, used to define the beginning of a subfield within a variable field in a MARC bibliographic or authority record.
See also subfield.
-
The delimiter is entered as a dollar sign ($). If immediately followed by a lowercase letter or single numeral, OCLC converts the dollar sign into a delimiter. A $ followed by a numeral and period, two or more numerals, or a blank space is interpreted as a monetary value and is not converted.
See also subfield.
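The dollar-sign rules above can be expressed as a single regular expression. This is a sketch of the stated rules only, not OCLC's conversion routine; `convert_dollar_delimiters` is a hypothetical name.

```python
import re

DELIMITER = "\u2021"  # the double dagger character (‡)

# Convert "$" when it is immediately followed by a lowercase letter, or by a
# single numeral that is not itself followed by another numeral or a period.
# "$" followed by a blank space matches neither branch and is left alone.
_DOLLAR_AS_DELIMITER = re.compile(r"\$(?=[a-z]|[0-9](?![0-9.]))")


def convert_dollar_delimiters(text: str) -> str:
    """Replace delimiter-style dollar signs and keep monetary ones."""
    return _DOLLAR_AS_DELIMITER.sub(DELIMITER, text)


sample = "Title $b subtitle $c price $5.00, $25, $ 3"
print(convert_dollar_delimiters(sample))
# Title ‡b subtitle ‡c price $5.00, $25, $ 3
```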
-
Dublin Core
- An international standard that supports creation of simple, informative descriptions of electronic resources that facilitate management and discovery. The Dublin Core standard defines 15 elements for a resource description. Users can define additional elements or qualifiers for the standard elements to adapt Dublin Core to meet their needs. The standard elements provide a shared semantic framework that allows communities operating under different rules or standards to exchange metadata.
Dublin Core has as its goals the following characteristics: simple creation and maintenance, commonly understood semantics, international scope, and extensibility.
For more information, visit the official DCMI site at:
http://www.dublincore.org.
-
duplicate detection and resolution
- Software that identifies and merges duplicate records in books and serials format. The software uses complex algorithms and can match some records that failed to match in regular batchload processing. For example, regular batchloading does not match the publisher strings Charles Scribner and Chas. Scribner. DDR software may find that all other criteria match, recognize the identical Scribner strings in the publisher, and determine that the records match. OCLC uses this software at its discretion, based on a given file's probable matching rate and probable duplication rate.
-
duplicate record
- A bibliographic record that describes an item already represented in the database.
-
duplication rate
- The percentage of records in a file that would duplicate records in the OCLC database if those records were added.
E-
EBCDIC
- Extended Binary Coded Decimal Interchange Code. A code for representing alphanumeric information. EBCDIC defines a specific set of characters. OCLC does not accept files written in EBCDIC.
-
EBS
- Electronic Batchload Service. An OCLC service that allows batchload participants to transfer files to OCLC over the Internet.
-
EDX
-
See OCLC Electronic Data Exchange (EDX) account.
-
Electronic Data Exchange
-
See OCLC Electronic Data Exchange (EDX) account.
-
error rate
- The percentage of records within a file that contain validation errors.
-
eSerials Holdings Service
- OCLC service that sets and maintains title level holdings for electronic serials in WorldCat.
-
evaluation
- The evaluation of bibliographic records sent for batch processing involves an extensive analysis of the data. This evaluation is necessary because a custom setup is created to process the records. Batch Services staff identify problems with the records as well as problems with the file structure. The goal of this analysis is to obtain the highest and most accurate hit rate for a project.
-
extended matching algorithm
- A procedure in OCLC batchloading services that creates derived search keys from incoming records and compares data to find a matching record. OCLC has extended matching algorithms for all record formats except computer files and archives/manuscript materials. OCLC applies this algorithm at its discretion based upon evaluation of the data in the file.
F-
file analysis
-
See Batchload file analysis.
-
file transfer protocol (FTP)
- File Transfer Protocol (FTP) is a TCP/IP-based protocol that is generally available for file transfers to and from a large variety of hosts including IBM mainframes, Tandem Guardian systems, and Unix hosts. FTP is the method used to retrieve files from an OCLC EDX account. It is also used in the OCLC Electronic Batchload Service.
See also OCLC Electronic Data Exchange (EDX) account.
-
forced add
- The loading of a file of records to the database without attempting to find matching records and without validating their data. OCLC does a forced add rarely, when materials in the file are so unusual that matching records in the database are unlikely.
-
FTP
-
See file transfer protocol (FTP).
-
full member
- An OCLC general member that contributes current cataloging and holdings to WorldCat.
G-
GAC
- Group Access Capability. A group of institutions that use the OCLC system for resource sharing and interlibrary lending. A GAC has full and selective members. A selective user has access to only abbreviated bibliographic records, and only to records for its own group. Groups are composed of at least one Full member and may include Selective members who use WorldCat Resource Sharing only. GAC can also refer to the group itself.
-
GAC/UL
- Group Access Capability/Union List. The same as a GAC, but a selective user also has access to local holdings records for its group.
-
group project
- Data submitted on behalf of multiple libraries in a single file or tape for batchload processing. Processing options for group projects differ slightly from those for single-institution projects.
I-
IHB
- Institution Holdings Bit (IHB) is represented in online displays by the OCLC symbol.
-
institution symbol
-
See OCLC symbol.
-
International Standard Book Number (ISBN)
- A unique identification number assigned to a work by its publisher. Each ISBN has ten characters. The tenth character is a check character that may be a number or the letter x. In printed form, the ISBN has three hyphens. Hyphens are omitted in online records.
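The role of the check character can be illustrated with the standard ten-character ISBN checksum (weights 10 down to 1, total divisible by 11). The glossary does not describe this calculation, so treat it as background; `is_valid_isbn10` is a hypothetical helper name, and 13-digit ISBNs use a different scheme that is not covered here.

```python
import re


def is_valid_isbn10(isbn: str) -> bool:
    """Check the shape and check character of a ten-character ISBN."""
    chars = isbn.replace("-", "").upper()  # hyphens are omitted in online records
    if not re.fullmatch(r"[0-9]{9}[0-9X]", chars):
        return False
    # "X" in the final position stands for the value 10.
    values = [10 if c == "X" else int(c) for c in chars]
    total = sum(weight * value for weight, value in zip(range(10, 0, -1), values))
    return total % 11 == 0


print(is_valid_isbn10("0-306-40615-2"))  # True
print(is_valid_isbn10("080442957X"))     # True
print(is_valid_isbn10("0306406153"))     # False
```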
-
International Standard Serial Number (ISSN)
- A unique identification number assigned to a serial through the ISSN Network. Each ISSN has eight characters. The eighth character is a check character that may be a number or the letter x. A hyphen follows the fourth character.
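The ISSN check character can be illustrated with the standard calculation (weights 8 down to 2 over the first seven digits, modulo 11). The glossary does not spell this out, so the sketch is background only; `is_valid_issn` is a hypothetical helper name, and requiring the hyphen is an assumption based on the format described above.

```python
import re


def is_valid_issn(issn: str) -> bool:
    """Check the NNNN-NNNC shape and the check character of an ISSN."""
    chars = issn.upper()
    if not re.fullmatch(r"[0-9]{4}-[0-9]{3}[0-9X]", chars):
        return False
    digits = chars.replace("-", "")
    total = sum(weight * int(d) for weight, d in zip(range(8, 1, -1), digits[:7]))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)  # "X" stands for the value 10
    return digits[7] == expected


print(is_valid_issn("0378-5955"))  # True
print(is_valid_issn("1050-124X"))  # True
print(is_valid_issn("0378-5954"))  # False
```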
L-
LHR
-
See local holdings record.
-
LHR retrieve
- A scan process that collects and saves all LHRs associated with an OCLC institution symbol into a file so they can be replaced after a scan/delete of that OCLC symbol.
-
LHR update
- The process that returns the LHRs from an LHR retrieve file and resets the OCLC symbol on the associated bibliographic record in the OCLC database.
-
library identifier
- A locally assigned code that represents a particular institution. The code appears in a local data field of a bibliographic record submitted for group processing. A translation table converts the library identifier to an OCLC symbol for a matched record. The OCLC symbol is set on matching records for "set holds" and removed from matched records on "cancels."
-
Library of Congress Control Number
- An accession number assigned by the Library of Congress. The LCCN is usually a two- or four-character number representing the year, followed by a hyphen and up to six numbers. The Library of Congress formerly referred to control numbers as card numbers.
-
live data
- Data to be used for setting and canceling OCLC symbols (and not merely for evaluation).
-
local holdings record
- The local library's holdings record, based on the local holdings format.
- The OCLC record used to contain and exchange holdings data. It can record both institution (SIHD) and copy (SCHD) holdings information.
-
local system
- An institution's computer system that manages cataloging, acquisitions, circulation, serials and/or an online catalog.
-
local system vendor
- The company that manufactured a library's integrated library system.
M-
Machine readable cataloging
-
See MARC.
-
MARC
- Machine-Readable Cataloging. An internationally accepted standard for the exchange of bibliographic data in machine-readable form.
-
MARC 21 Format for Bibliographic Data
- The format for printed and manuscript textual materials, computer files, maps, music, serials, visual materials, and mixed materials. Bibliographic data commonly include titles, names, subjects, notes, publication data, and information about the physical description of an item.
-
MARC 21 Format for Holdings Data
- The format for representation and exchange of holdings data from the Library of Congress.
See also communications format.
-
MARC record
- MAchine-Readable Cataloging record. A bibliographic, authority, or other record type based on standards for representing and exchanging electronic data.
See also communications format.
-
mixed batch
- An offline service for batchload members sending both OCLC-derived and non-OCLC-derived records.
N-
National bibliographic agency control number
- The unique numbers assigned to a record by a national bibliographic agency other than the Library of Congress. These numbers are record control numbers used in a national bibliographic agency system and are coded in field 016, subfield ‡a. They are unique keys that can be used during batchload matching.
O-
OCLC
- Online Computer Library Center, Inc. Nonprofit membership organization serving libraries around the world to further access to the world's information and reduce library costs by offering services for libraries and their users.
-
OCLC control number
- A unique accession number assigned by the OCLC system when a record is added to WorldCat. Used to search for records.
-
OCLC Customer Services Division (CSD)
- OCLC's user assistance and support contact desk that provides support for telecommunications, hardware, and software.
-
OCLC-derived records
- Bibliographic records originally obtained from OCLC. Some institutions export records directly from OCLC into their local system, without setting holdings online. These records are then returned to OCLC as a batchload project to set and/or cancel the institution's holdings.
-
OCLC Electronic Batchload Service (EBS)
- An OCLC service that allows batchload participants to transfer files to OCLC via the Internet.
-
OCLC Electronic Data Exchange (EDX) account
- Service offered by OCLC for the transfer of data via standard File Transfer Protocol (FTP) whereby OCLC provides an Electronic Data Exchange account into which OCLC posts bibliographic and authority records, label records, and reports. The institution retrieves files from the account via the Internet.
See also OCLC Product Services Web,
file transfer protocol (FTP).
-
OCLC holding library code
- A unique code that identifies a holding library within an institution.
See also OCLC symbol.
-
OCLC-MARC
- OCLC's implementation of the MARC bibliographic format.
-
OCLC Product Services Web
- An OCLC Web site from which users retrieve labels, records, and reports from their OCLC Electronic Data Exchange (EDX) accounts. It also provides OCLC software downloads, macros, scripts, and labels.
See also OCLC Electronic Data Exchange (EDX) account.
-
OCLC symbol
- A unique identifier assigned by OCLC to members and participants. OCLC symbols in records and in holdings displays identify libraries that have entered and used the bibliographic record for cataloging.
See also OCLC holding library code.
-
OCLC User and Network Support (UNS)
-
See OCLC Customer Services Division (CSD).
-
OCLC-derived
-
See OCLC-derived records.
-
one-time batchload
- A batchloading project, such as retrospective conversion, that has a predetermined number of records and takes place within a specified period.
-
ongoing current cataloging
- Bibliographic files submitted regularly for batchload processing. These files represent materials acquired by an institution since becoming a member of OCLC.
-
order number
- Number assigned to a file of records as received from the user. Each file the user sends to OCLC is an order and is assigned its own number. An order may consist of several physical files, such as separate reels, received at one time and to be processed the same way. If a user sends more than one file and they are to be processed differently (such as one file for canceling holdings and one for adding holdings), there are two orders. An order can be split into multiple batches for processing, but each batch number relates to a specific order. Some batches may be further separated into other files for additional processing, but they still relate to the same order.
-
original cataloging
- Cataloging performed by an institution itself and not derived from any other source.
- Records produced by original cataloging might be matched in batchload processing.
-
other standard identifier (OS#)
- Standard numbers or codes published on an item that cannot be accommodated in another field (fields 020, 022, 027, etc.). Use the first indicator or subfield ‡2 to indicate the type of number or code. This number is located in field 024, subfield ‡a and is used for batchload matching.
-
other system control number (OSCN)
- OCLC uses field 029, subfield ‡a for system control numbers for records from non-OCLC automated systems (e.g., Library and Archives Canada, the British Library, vendors, etc.). OCLC uses these numbers to process and track records from other systems. They can also be used during batchload matching.
P-
PCC
- Program for Cooperative Cataloging. An international cooperative project. PCC includes BIBCO (monographic bibliographic component), CONSER (Cooperative Online Serials Program, the serials bibliographic component), NACO (name authority component), and SACO (subject authority component).
-
Post-processing report
- A report generated for a batch after it has been processed.
-
processing center
- An OCLC-defined arrangement whereby a single cataloging agent creates records or sets holdings in the online system under a single OCLC symbol for multiple institutions. Holding library codes identify the separate institutions. Holding library codes are retained in the online archive copy of bibliographic records, but only the OCLC symbol for the processing center appears in the holdings display of records.
See also Cataloging Agent mode.
-
profile
- A structured definition, supplied by a participating institution, of the content and format of its OCLC products. During the profiling process OCLC assigns symbols to institutions.
-
project coordinator
- The person responsible for acting on behalf of a number of administratively separate institutions involved in a group batchload project. The project coordinator is the main contact with OCLC for all aspects of the project throughout its duration.
-
PSWeb
-
See OCLC Product Services Web.
-
publisher number
- Plate and publishers' numbers for printed music (scores); serial and matrix numbers for sound recordings; videorecording numbers for visual materials, and publisher numbers other than those for sound recordings, music, or videorecordings. The publisher number is located in field 028, subfield ‡a or field 262, subfield ‡c and is used for batchload matching. For example, B 07042 L.
Q-
qualifier
- A means of limiting a search to specific classes of records, for example: type of material (ft), years of publication (yr), microform or not microform (mi), and cataloging source Library of Congress or other (so).
- Qualifiers are elements from the bibliographic record that are used in combination with a search term to narrow a search. Batchload matching software often uses one or more of the following qualifiers to obtain a more accurate match during batch processing: title, date, material type, and language of cataloging.
R-
record
- In the context of OCLC batch search key processing, a record can be any amount of bibliographic data that represents a library item. The data can be minimal, as in a search key, or more detailed information can be provided such as a bibliographic record.
-
record structure
- In MARC 21 formats, record structure is the order in which the content designators and content appear in the record and/or file. Record structure can include such specifications as tape media, header, blocking techniques, and character sets used in the record and in files. The terms record format and record structure are often used interchangeably. Record format is generally the broader term and often refers to a combination of record structure, content designation, and content of the record.
See also content designation;
content of the record.
-
report number
- Uniquely identifies a technical report; not a series number. There are two kinds: Standard Technical Report Numbers and other, nonstandard numbers.
-
resolved record
- A record that has been successfully processed through Batchload and then either matched to an existing record (including final action) or added to WorldCat.
-
RetroCon Batch Service
- An OCLC service that offers low-cost options for converting minimal-MARC or non-MARC records, catalog cards, or database files, to full bibliographic records in OCLC-MARC format.
-
Retrospective
- (1) Converting bibliographic information to machine-readable records. (2) For OCLC members, converting machine-readable records cataloged before becoming an OCLC member.
S-
scan/delete processing
- An offline process that cancels an institution's OCLC symbol from all bibliographic records and deletes all local holdings records for that institution. This process is not reversible, nor is any record of the deletion retained or written to an archive. The process is used only when an institution closes permanently.
-
set hold
- The attachment of an OCLC symbol to a bibliographic record through batchloading.
-
setup
- Custom programming to modify bibliographic records before every batchload project.
-
single-institution project
- A batchloading project defined for a single institution.
-
status report
-
See Batchload Status Report.
-
streamed data
- Data whose format is such that a logical record may span more than one physical record.
-
subfield
- The smallest unit of information in a variable field. Data in subfields is preceded by a delimiter (‡) and a single letter or number code. A subfield a (‡a) is implicit at the beginning of most fields (the delimiter and code do not display).
See also delimiter.
-
subfield delimiter
-
See delimiter.
-
symbol upgrade
- A one-time batchload to convert holding library codes to OCLC symbols.
T-
tape density
- The number of bits per inch (bpi).
-
Tapeload Save file
- The Tapeload Save file is part of the institution's Cataloging Save file, or Save file. It is an online storage area where unresolved (non-matching) records from batch processing can be placed for resolution by the institution.
-
tapeloading
- Term previously used for batchloading when the only method for transferring data was tape.
-
translation table
- A table used for converting holding library codes (or other local library identifiers) to OCLC symbols.
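Conceptually, a translation table is a lookup from local codes to OCLC symbols. The sketch below is illustrative only: the identifiers and symbols are placeholders rather than real OCLC symbols, the case-insensitive lookup is an assumption, and `to_oclc_symbol` is a hypothetical name.

```python
from typing import Optional

# Placeholder entries: local library identifier -> OCLC symbol.
TRANSLATION_TABLE = {
    "MAIN": "SYMA",
    "LAW": "SYMB",
}


def to_oclc_symbol(library_identifier: str,
                   table: dict = TRANSLATION_TABLE) -> Optional[str]:
    """Return the OCLC symbol for a local identifier, or None if unmapped."""
    # Assumption: identifiers are compared without regard to case or padding.
    key = library_identifier.strip().upper()
    return table.get(key)


print(to_oclc_symbol("law"))      # SYMB
print(to_oclc_symbol("UNKNOWN"))  # None
```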
U-
unconditional add
- The loading of an original bibliographic record into the database after batchload routines have not found a match. The record is loaded even if it contains MARC validation errors. OCLC Quality Control must manually correct such records.
-
uniform resource identifier
- The Uniform Resource Identifier (URI) provides electronic access data in a standard syntax. The URI is located in field 856, subfield ‡u and can be used during batchload matching.
-
union list
- A catalog combining the detailed holdings of more than one institution. Union List refers to the OCLC Union List service.
See detailed level information.
-
unique-key matching algorithm
- A procedure that matches records by these unique keys: field 035 (OCLC control number), field 010, field 020, or field 022.
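A unique-key lookup of this kind can be sketched as follows. This is a simplified illustration, not OCLC's algorithm: the record and index structures are hypothetical, the order in which fields are tried is an assumption, and normalization of the key values is omitted.

```python
from typing import Optional

# Fields named in the definition above; the order of attempts is an assumption.
UNIQUE_KEY_FIELDS = ("035", "010", "020", "022")


def find_unique_key_match(record: dict, index: dict) -> Optional[str]:
    """Return the identifier of the first database record sharing a unique key.

    record maps a field tag to a list of key values from the incoming record.
    index maps (tag, value) pairs to a database record identifier.
    """
    for tag in UNIQUE_KEY_FIELDS:
        for value in record.get(tag, []):
            match = index.get((tag, value))
            if match is not None:
                return match
    return None  # no unique key matched


# Hypothetical data for illustration.
index = {("020", "0306406152"): "12345"}
incoming = {"010": ["   99000001 "], "020": ["0306406152"]}
print(find_unique_key_match(incoming, index))  # 12345
```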
-
unresolved record
- A record that failed to match any database record after being processed by the matching algorithms and was not added to the database.
-
UNS
-
See OCLC Customer Services Division (CSD).
V-
validation error
- An error in the MARC format of a record as detected by OCLC validation software. Examples of validation errors include: invalid codes, tags, indicators and subfields, missing required elements, and repetition of non-repeatable fields.
See also error rate.
-
vendor-provided data
- Information sent by the vendor partner in the vendor manifest that is added to the bibliographic records provided to institutions, such as barcodes, invoice numbers, invoice dates, and prices.
W-
WorldCat
- A database of tens of millions of online records built from the bibliographic and ownership information of contributing libraries. The WorldCat database is the largest and most comprehensive of its kind. OCLC members use WorldCat for a full array of technical library services, including cataloging, interlibrary loan, reference, union listing, local holdings, and many more.
Z-
Z39.50
- Z39.50 is an information retrieval protocol that supports communication among different information systems. The maintenance agency for the protocol is the Library of Congress.