Please note: This experimental research project has concluded.
The research prototype application is no longer supported or maintained by OCLC services, and information on this page is provided for historical purposes only. Some portion of this content may be out-of-date and include broken links. Please visit the OCLC Research website to learn more about our current research.

NACO Normalization Service

 

This service is used to prepare text strings for machine comparison and sorting, according to the NACO normalization rules.

 

Background

 

These rules were developed for comparing name data. An example of such a comparison is determining whether a particular name already exists in an authority file such as the OCLC Research LC Name Authority Service. However, any kind of text string can be normalized by this service. We have found it useful in working with title strings as well as names.

Why do we need rules for matching and sorting? Because machine comparison is literal: two strings match only if they are identical character for character. Names, in particular, can be recorded in different ways.
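As a small illustration of the problem, the Python sketch below compares several ways of recording the same name. The variant strings and the helper name `exact_matches` are made up for this example and are not part of the service.

```python
import unicodedata

# The same name recorded five ways (illustrative variants only).
composed = "Alain-Ren\u00e9 Lesage"                  # e-acute as one code point
decomposed = unicodedata.normalize("NFD", composed)  # "e" plus a combining accent
variants = [
    composed,
    decomposed,
    "Alain-Rene Lesage",          # diacritic dropped
    "ALAIN-REN\u00c9 LESAGE",     # different capitalization
    "Alain Ren\u00e9 Lesage",     # hyphen recorded as a space
]


def exact_matches(strings):
    """Return every pair of strings that are equal character for character."""
    return [
        (a, b)
        for i, a in enumerate(strings)
        for b in strings[i + 1:]
        if a == b
    ]


# No two variants compare equal, although a human reader sees one name.
print(exact_matches(variants))
```

Even the first two variants, which look identical when displayed, fail to match because they use different underlying character sequences.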

 

Impact

The NACO normalization rules facilitate accurate machine comparison by specifying how the names to be compared are constructed, e.g., what character set is used, how capital letters and spaces are used, and how to handle diacritics and punctuation.

 

The NACO rules have been implemented in various ways. We reconciled three different ways of automating them to develop the NACO Normalization Service.

This work has been used in the OCLC Research FRBR projects. OCLC Research's NACO Normalization Service is part of the MIT/HP DSpace product.

 

More Information

 

The following resources are available (ZIP:142K/6 files) to assist you in testing your implementation of the NACO normalization algorithms:

  1. NACO.py - a sample implementation in Python
  2. NACOnocommas.script - a test script that removes all commas from normalized fields
  3. NACOnocommas.check - expected responses for the NACOnocommas script
  4. NACO.script - a test script that leaves the first comma in the first subfield 'a'
  5. NACO.check - expected responses for the NACO script
  6. NACO.java - a sample implementation in Java

Please note that each sample implementation runs itself against the test scripts and the expected-response files.

Use of this site is subject to OCLC's terms and conditions. By continuing past this point, you agree to abide by these terms.

 


Try the online demo

 

Go to the OCLC Research NACO Normalization Service.

Enter the string of characters to be normalized. Typically this would be the name of an author or other entity associated with a work:

  • Alain-René Lesage
  • Lesage, Alain-René
  • U.S. Dept. of Agriculture

The service can also accept data in the MARCMaker format:

  • $aLytton, Edward Bulwer Lytton $cBaron $d1803-1873

Press the button to submit the string.

The demo will return the text string's normalized form:

  • alain rene lesage
  • lesage, alain rene
  • u s dept of agriculture
  • lytton, edward bulwer lytton\baron\1803 1873
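The behavior shown above can be approximated in a few lines of Python. This is a simplified sketch written for this page, not the NACO.py sample and not a complete implementation of the NACO rules: the function names are hypothetical, special characters that Unicode does not decompose are left as they are, and the only rules modeled are the ones visible in the examples (lowercasing, removing diacritics, turning punctuation into spaces, keeping the first comma in the first subfield 'a', and joining subfields with a backslash).

```python
import re
import unicodedata


def strip_diacritics(text):
    """Decompose characters and drop the combining marks (e.g. e-acute -> e)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_subfield(text, keep_first_comma):
    """Lowercase, remove diacritics, and replace punctuation with spaces."""
    text = strip_diacritics(text).lower()
    out = []
    comma_kept = False
    for ch in text:
        if ch.isalnum():
            out.append(ch)
        elif ch == "," and keep_first_comma and not comma_kept:
            out.append(",")          # only the first comma survives
            comma_kept = True
        else:
            out.append(" ")          # all other punctuation becomes a space
    # Collapse runs of whitespace and trim both ends.
    return " ".join("".join(out).split())


def naco_normalize_sketch(value):
    """Normalize a plain string or a MARCMaker-style '$a... $c...' string."""
    if value.startswith("$"):
        # Assumption: each subfield is "$" + one-character code + text.
        parts = re.findall(r"\$(\w)([^$]*)", value)
    else:
        parts = [("a", value)]       # treat plain input as a single subfield 'a'
    normalized = []
    seen_a = False
    for code, text in parts:
        keep_comma = code == "a" and not seen_a
        if code == "a":
            seen_a = True
        normalized.append(normalize_subfield(text, keep_comma))
    return "\\".join(normalized)     # subfields are separated by a backslash


for sample in [
    "Alain-Ren\u00e9 Lesage",
    "Lesage, Alain-Ren\u00e9",
    "U.S. Dept. of Agriculture",
    "$aLytton, Edward Bulwer Lytton $cBaron $d1803-1873",
]:
    print(naco_normalize_sketch(sample))
```

For the four sample inputs, the sketch prints the same normalized forms as the list above. It should not be expected to agree with the service on input that exercises rules not shown on this page.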

 

 

Lead

Thom Hickey