From: Wynar's Introduction to Cataloging and Classification, 9th ed., mss.

 © 2000 Arlene G. Taylor

(last updated 2/16/2000)

Part II

__________________________________________

ELECTRONIC FORMATTING

 

 

CHAPTER 3

ENCODING

In the age of online catalogs, the content of catalog records must use some kind of electronic encoding scheme in order to be machine-manipulable for searching and display. Encoding allows each part of a record to be set off from every other part. Computer programs can then be written so that each part is displayed in a certain position according to the wishes of the programmer. Encoding also allows for access to a catalog record through search programs that search certain parts of a record (e.g., the author, the title, or a subject).

Since the mid-1960s an encoding system called MARC (MAchine-Readable Cataloging) has been used to create electronic catalog records. An excellent history of its development may be found in Deborah Byrne’s MARC Manual.1 The most current version for use in the United States is MARC 21, a harmonized version of USMARC and CAN/MARC, published in 1999.2

Other encoding systems have been developed recently in response to the desire to make catalog records available on the Web. SGML (Standard Generalized Markup Language) is an international standard for document markup. It is a set of rules for designing markup languages that describe the structure of a document. The markup languages thus designed are SGML applications called DTDs (Document Type Definitions). DTDs often used for bibliographic data include TEI (Text Encoding Initiative), HTML (HyperText Markup Language), and EAD (Encoded Archival Description). A DTD for MARC records has been developed, and at this writing is in experimental stages. XML (Extensible Markup Language) is a subset of SGML that adds some features that solve earlier problems with SGML and HTML, and it is gaining in use. A description of these encoding systems may be found in Arlene Taylor’s The Organization of Information.3

Currently, USMARC is the encoding system used for bibliographic records in the majority of the world’s online catalogs. The remainder of this chapter contains an introduction to this system.

 

Introduction to MARC

FORMATS

There are formats for five types of data. The following formats are currently defined:

Bibliographic data

Authority data

Holdings data

Classification data

Community information

This chapter introduces only the bibliographic format.

 

COMPONENTS OF THE RECORD

A record is a collection of fields. A field contains a unit of information within a record. A field may consist of one or more subfields. Tags, that is, three-digit numerical codes, identify each field. Every field ends with a field terminator (in OCLC, for example, the field terminator appears as a backwards paragraph sign). Each subfield is preceded by a delimiter sign (often represented by a $ or | or ‡) followed by a single character code (usually alphabetical, but can also be numerical).

Each record has the same components:

1. Leader

2. Record directory

3. Control fields

a. Fixed fields

4. Variable fields

Leader—The leader is like the leader on a roll of film. It identifies the beginning of a new record and provides information for the processing of the record. The leader is fixed in length and contains 24 characters.
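As a rough illustration of how a program might read the leader, the Python sketch below slices a 24-character leader into a few of its elements. The character positions used (record length in 0-4, record status in 5, type of record in 6, bibliographic level in 7, base address of data in 12-16) are assumed from the MARC 21 specification rather than taken from this chapter, and the sample leader and the function name are invented for the example.

```python
def parse_leader(leader: str) -> dict:
    """Split a 24-character MARC leader into a few named elements."""
    if len(leader) != 24:
        raise ValueError("a MARC leader must be exactly 24 characters long")
    return {
        "record_length": int(leader[0:5]),          # positions 00-04
        "record_status": leader[5],                 # position 05
        "type_of_record": leader[6],                # position 06
        "bibliographic_level": leader[7],           # position 07
        "base_address_of_data": int(leader[12:17]), # positions 12-16
    }

# Hypothetical leader for a book record (not copied from a real record).
sample_leader = "00714pam a2200229 a 4500"
info = parse_leader(sample_leader)
print(info["type_of_record"], info["bibliographic_level"])
```

Only the elements needed for basic processing are extracted here; a fuller reader would name every position.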

Record directory—The record directory contains a series of fixed length entries that identify the tag, length, and starting position of each field in the record.
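The fixed-length entries of the directory lend themselves to simple slicing. The sketch below assumes the MARC 21 layout of 12 characters per entry (a 3-character tag, a 4-digit field length, and a 5-digit starting position); that layout comes from the MARC 21 specification, not from this chapter, and the sample directory is made up for illustration.

```python
def parse_directory(directory: str) -> list:
    """Split a record directory into (tag, field_length, start) entries.

    Assumes 12 characters per entry: a 3-character tag, a 4-digit field
    length, and a 5-digit starting position.
    """
    entry_size = 12
    if len(directory) % entry_size != 0:
        raise ValueError("directory length must be a multiple of 12")
    entries = []
    for offset in range(0, len(directory), entry_size):
        entry = directory[offset:offset + entry_size]
        entries.append((entry[0:3], int(entry[3:7]), int(entry[7:12])))
    return entries

# Hypothetical directory with three entries (tags 001, 008, and 245).
sample_directory = "001001300000" + "008004100013" + "245007700054"
for tag, length, start in parse_directory(sample_directory):
    print(tag, length, start)
```

Each starting position equals the previous starting position plus the previous field length, which is how a program can jump directly to any field without scanning the whole record.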

Control fields—Control fields carry alphanumeric (often encoded) data elements. Control field tags always begin with the digit 0. Many control fields are fixed in length, that is, each fixed-length field must consist of a set number of characters (see also "Fixed field" below). Common control fields are:

001 - Control number

005 - Date and time of latest transaction

006 - Fixed-length data elements - coding information about special aspects of the item being cataloged that cannot be coded in field 008 (separate character descriptions for books, computer files, maps, music, serials, visual materials, and mixed materials)

007 - Physical description fixed field - physical characteristics of an item, usually derived from explicit information in other fields of the record, but expressed here in coded form (separate character descriptions for map, computer file, globe, projected graphic, microform, nonprojected graphic, motion picture, remote-sensing image, sound recording, text, videorecording, and one for "unspecified")

008 - Fixed-length data elements - positionally-defined data elements that provide coded information about the record as a whole or about special bibliographic aspects of the item being cataloged

010 - Library of Congress Control Number (LCCN)

020 - International Standard Book Number (ISBN)

022 - International Standard Serial Number (ISSN)

033 - Date/time and place of an event

034 - Coded cartographic mathematical data

040 - Cataloging source

041 - Language code

043 - Geographic area code

047 - Form of musical composition code

048 - Number of musical instruments or voices code

Fixed field—There is one fixed length control field that is commonly referred to as "the fixed field." Field 008 carries general information about the content of the bibliographic record. This field is often displayed in a single paragraph at the top of the screen and is usually displayed with mnemonic tags. The field has 40 character positions. The data stored in this field are used to manipulate records for retrieval, filing, indexing, etc.

A fixed field from the record for a book as it is displayed in OCLC:

> Type:  a     ELvl:        Srce:        Audn:        Ctrl:        Lang:  eng
  BLvl:  m     Form:        Conf:  0     Biog:        MRec:        Ctry:  ilu
               Cont:  b     GPub:        Fict:  0     Indx:  1
  Desc:  a     Ills:  a     Fest:  0     DtSt:  s     Dates: 1997,     <

Sometimes fill characters (displayed as n or _ or |) are used to indicate elements of the fixed field that were not in use when the record was created or were not provided by the inputting library. A blank space (represented in writing as a letter b with a forward slash through it) is meaningful. That is, a blank is input to represent a coded value. In "Srce:" above, the blank means "Library of Congress."
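Because field 008 is positionally defined, a program can recover each element by slicing. The sketch below uses the 008 field of the record shown in Fig. 3.2; the character positions are assumed from the MARC 21 definition of 008 for books rather than from this chapter, the dictionary keys mostly reuse the OCLC mnemonics, and the function name is invented for the example.

```python
# Field 008 from the record in Fig. 3.2, assembled piece by piece so that
# the runs of blanks are easy to count.
FIELD_008 = (
    "890119" + "s" + "1989" + " " * 4 + "cou" + " " * 6
    + "b" + " " * 4 + "0" + "0" + "1" + " " + "0" + " " + "eng" + " " * 2
)

# Character positions for books (an assumption taken from MARC 21).
# End positions are exclusive, as in Python slices.
POSITIONS = {
    "Entered": (0, 6), "DtSt": (6, 7), "Date1": (7, 11), "Date2": (11, 15),
    "Ctry": (15, 18), "Ills": (18, 22), "Audn": (22, 23), "Form": (23, 24),
    "Cont": (24, 28), "GPub": (28, 29), "Conf": (29, 30), "Fest": (30, 31),
    "Indx": (31, 32), "Fict": (33, 34), "Biog": (34, 35), "Lang": (35, 38),
    "MRec": (38, 39), "Srce": (39, 40),
}

def parse_008(value: str) -> dict:
    """Return the coded elements of a 40-character 008 field for a book."""
    if len(value) != 40:
        raise ValueError("field 008 must be exactly 40 characters long")
    # Blanks are kept as they are, because a blank can be a coded value.
    return {name: value[start:end] for name, (start, end) in POSITIONS.items()}

elements = parse_008(FIELD_008)
print(elements["Ctry"], elements["Lang"], repr(elements["Srce"]))
```

The blanks are deliberately not stripped: as noted above, a blank in "Srce:" is itself a coded value.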

Variable fields—Variable fields carry alphanumeric data of variable length. The variable fields carry traditional cataloging data elements. Three-digit numeric tags (100-999) identify variable fields. In order to talk about these tags in groups, a convention is followed in which all fields beginning with "1" are identified as 1xx fields; those beginning with "2," as 2xx fields; etc. Variable fields consist of heading fields and descriptive fields. Although classification fields may be considered to be control fields because they begin with 0, they are placed here with other elements that make up the "surrogate record" or the "bibliographic data" of a completed MARC record. Among the most used variable fields are:

CLASSIFICATION NOTATIONS AND/OR CALL NUMBERS (05x-08x)

050 - Library of Congress Call Number

060 - National Library of Medicine Call Number

082 - Dewey Decimal Classification Number

etc.

MAIN ENTRY FIELDS (1xx)

100 - Main entry--personal name

110 - Main entry--corporate name

111 - Main entry--meeting name

130 - Main entry--uniform title

TITLE AND TITLE-RELATED FIELDS (20x-24x)

240 - Uniform title

245 - Title proper, general material designation, remainder of title, statement of responsibility

246 - Varying form of title

etc.

EDITION, IMPRINT, ETC., FIELDS (25x-28x)

250 - Edition statement

254 - Musical presentation statement

255 - Cartographic mathematical data

256 - Computer file characteristics

260 - Publication, distribution, etc. (Imprint: place, publisher, etc., date)

etc.

PHYSICAL DESCRIPTION, ETC., FIELDS (3xx)

300 - Physical description (extent of item, other details, size, accompanying material)

310 - Current publication frequency

362 - Dates of publication and/or volume designation (serials)

etc.

SERIES STATEMENT FIELDS (4xx)

[title proper of series, remaining title information, statement of responsibility relating to series, ISSN of series, numbering, etc.]

440 - Series statement / added entry--title

490 - Series statement (not an added entry)

etc.

NOTE FIELDS (5xx)

500 - General note

502 - Dissertation note

504 - Bibliography, etc., note

505 - Formatted contents note

506 - Restrictions on access note

508 - Creation/production credits note

510 - Citation/References note

511 - Participant or performer note

516 - Type of computer file or data note

520 - Summary, etc., note

533 - Reproduction note

534 - Original version note

538 - System details note

546 - Language note

547 - Former title complexity note

580 - Linking entry complexity note

etc.

SUBJECT ACCESS FIELDS (6xx)

600 - Subject added entry--personal name

610 - Subject added entry--corporate name

611 - Subject added entry--meeting name

630 - Subject added entry--uniform title

650 - Subject added entry--topical term

651 - Subject added entry--geographic name

653 - Index term--uncontrolled

655 - Index term--genre/form

etc.

ADDED ENTRY FIELDS (70x-75x)

700 - Added entry--personal name

710 - Added entry--corporate name

711 - Added entry--meeting name

720 - Added entry--uncontrolled name

730 - Added entry--uniform title

740 - Added entry--uncontrolled related/analytical title

etc.

LINKING ENTRY FIELDS (76x-78x)

770 - Supplement/special issue entry

772 - Parent record entry

776 - Additional physical form entry

780 - Preceding entry

785 - Succeeding entry

787 - Nonspecific relationship entry

etc.

SERIES ADDED ENTRY FIELDS (80x-840)

800 - Series added entry--personal name

810 - Series added entry--corporate name

811 - Series added entry--meeting name

830 - Series added entry--uniform title

HOLDINGS, LOCATION, ALTERNATE GRAPHICS, ETC., FIELDS

852 - Location/call number

856 - Electronic location and access

880 - Alternate graphic representation
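The grouping of tags by range can be expressed as a small lookup. The sketch below follows the group headings listed above; the exact numeric boundaries (for example, treating every tag from 841 through 889 as a holdings or location field) and the function name are assumptions made for the example.

```python
# Tag ranges and labels follow the groupings listed in this chapter.
TAG_GROUPS = [
    (50, 89, "classification notation / call number"),
    (100, 199, "main entry"),
    (200, 249, "title and title-related"),
    (250, 289, "edition, imprint, etc."),
    (300, 399, "physical description, etc."),
    (400, 499, "series statement"),
    (500, 599, "note"),
    (600, 699, "subject access"),
    (700, 759, "added entry"),
    (760, 789, "linking entry"),
    (800, 840, "series added entry"),
    (841, 889, "holdings, location, alternate graphics, etc."),
]

def tag_group(tag: str) -> str:
    """Return the group label for a three-digit tag."""
    if len(tag) != 3 or not (tag.isascii() and tag.isdigit()):
        raise ValueError("a tag must be three digits")
    number = int(tag)
    for low, high, label in TAG_GROUPS:
        if low <= number <= high:
            return label
    # Tags below 050 are treated here as control fields; anything else
    # falls outside the groups listed in this chapter.
    return "control field" if number < 50 else "other"

for tag in ("008", "050", "245", "650", "856"):
    print(tag, tag_group(tag))
```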

Subfields—All subfields are distinct elements within fields. Subfield definitions vary from field to field. Some of the most commonly encountered ones are:

050 LC call number

$a classification number

$b item number and date

082 Dewey Decimal Classification number

$a classification number

$b item number

$2 edition number [edition of DDC used]

x00 Personal name

(x00 means that these subfields apply in fields 100, 600, 700, and 800.)

$a name

$q qualification of name [e.g., Lewis, C. S. $q (Clive Staples)]

$b numeration

$c titles (e.g., Mrs., Sir, Bishop)

$d dates

$e relator (e.g., ill. [for illustrator])

x10 Corporate name

$a name

$b subordinate unit

$e relator

$k form subheading

x11 Meeting name

$a name

$n number

$c place

$d date

245 Title and statement of responsibility

$a title proper

$b other title information

$c statement of responsibility or remainder of area

260 Publication, distribution, etc.

$a place

$b publisher, distributor, etc.

$c date

300 Physical description

$a extent of item

$b other physical details

$c dimensions

4xx and 8xx Series

$a name of series

$x ISSN (4xx fields only)

$v numbering of series

6xx Subject access point

$a main subject (name, topic, etc.)

$x subject subdivision

$y time period subdivision

$z geographic subdivision

$v form/genre subdivision

856 Electronic location and access

$a Host name

$b Access number

$d Path

$h Processor of request

$u Uniform Resource Locator

etc.
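Subfield coding is what lets a program pick a field apart. The sketch below splits the data of a field, as displayed with $ delimiters, into pairs of subfield code and value. It is a simplification that assumes the delimiter character never occurs inside a value; the function name is invented for the example.

```python
def parse_subfields(data: str, delimiter: str = "$") -> list:
    """Split field data into (code, value) pairs.

    Assumes the delimiter character never occurs inside a value, which is
    not always true of displayed records (a price such as $45.00 would be
    split wrongly).
    """
    pairs = []
    # Any text before the first delimiter is ignored.
    for chunk in data.split(delimiter)[1:]:
        if chunk:
            pairs.append((chunk[0], chunk[1:].strip()))
    return pairs

# Field 050 and field 260 as they appear in Fig. 3.2.
call_number = parse_subfields("$aZ674$b.R4 no. 20$aZ7837$aBX1751.2")
imprint = parse_subfields("$aEnglewood, Colo. :$bLibraries Unlimited, Inc.,$c1989.")
print(call_number)
print(imprint)
```

A list of pairs is used instead of a dictionary because a subfield code may repeat within one field, as $a does in the 050 field above.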

Indicators—The two digits that follow the tag in a MARC field are called indicators. Each digit position has a certain meaning relating to its particular field and provides computer instructions for processing the data contained in the field. For example, in the OCLC-formatted 245 field shown below, the first indicator, "1," tells the system that there should be an access point for the title, and the second indicator, "4," tells the system that four nonfiling characters (i.e., T, h, e, and [space]) precede the first significant word of the title:

245 14 The dictionary of misinformation / $c Tom Burnam.
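A filing or indexing program might use the second indicator as sketched below. The code assumes the simple layout of the example line above (a three-digit tag, a space, two indicator characters, a space, then the data), and the function names are invented for the illustration.

```python
def parse_display_field(line: str) -> dict:
    """Parse a line laid out as: tag, space, two indicators, space, data."""
    return {
        "tag": line[0:3],
        "indicator1": line[4],
        "indicator2": line[5],
        "data": line[7:],
    }

def filing_title(field: dict) -> str:
    """Drop the nonfiling characters counted by the second indicator."""
    indicator = field["indicator2"]
    skip = int(indicator) if indicator.isdigit() else 0
    # Keep only the text before the first subfield delimiter.
    title = field["data"].split("$")[0]
    # Skip the nonfiling characters, then trim the trailing " / " punctuation.
    return title[skip:].rstrip(" /")

field = parse_display_field("245 14 The dictionary of misinformation / $c Tom Burnam.")
print(filing_title(field))
```

With the second indicator set to 4, the title files under "dictionary" rather than under "The".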

 

Display of MARC Records

MARC records are distributed in the MARC Communications Format.5 Each record consists of one long character string beginning with the leader, followed by the record directory, followed by the fields one after another, with no breaks, to the end of the record (at which point there is a character to represent a record terminator). Such a record is practically unreadable if printed as transmitted, and so each system has a program that will display the record in a form that is more easily read. Figures 3.1 through 3.4 show the same MARC record as it is displayed in four different systems.
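The layout of a record as one long string can be sketched in a few lines of Python. The example below builds a small record from three fields and then reads it back by way of the leader and directory. The control characters used for the subfield delimiter, field terminator, and record terminator (hexadecimal 1F, 1E, and 1D), the 12-character directory entries, and the leader positions are assumptions taken from the MARC 21 specification rather than from this chapter. The sample data are a simplified excerpt of the record in the figures, restricted to ASCII so that character counts equal byte counts.

```python
SUBFIELD_DELIMITER = "\x1f"
FIELD_TERMINATOR = "\x1e"
RECORD_TERMINATOR = "\x1d"

def build_record(fields: list) -> str:
    """Serialize (tag, data) pairs into one communications-format string."""
    directory = ""
    body = ""
    for tag, data in fields:
        field = data + FIELD_TERMINATOR
        # Directory entry: tag, 4-digit length, 5-digit starting position.
        directory += f"{tag}{len(field):04d}{len(body):05d}"
        body += field
    directory += FIELD_TERMINATOR
    base_address = 24 + len(directory)
    record_length = base_address + len(body) + 1  # +1 for the record terminator
    leader = f"{record_length:05d}pam a22{base_address:05d} a 4500"
    return leader + directory + body + RECORD_TERMINATOR

def read_record(record: str) -> list:
    """Recover (tag, data) pairs using the leader and the directory."""
    base_address = int(record[12:17])
    directory = record[24:base_address - 1]  # drop the directory's terminator
    fields = []
    for offset in range(0, len(directory), 12):
        tag = directory[offset:offset + 3]
        length = int(directory[offset + 3:offset + 7])
        start = base_address + int(directory[offset + 7:offset + 12])
        fields.append((tag, record[start:start + length - 1]))
    return fields

fields = [
    ("001", "89002835"),
    ("245", "10" + SUBFIELD_DELIMITER
            + "aCritical guide to Catholic reference books /"
            + SUBFIELD_DELIMITER + "cJames Patrick McCabe."),
    ("250", " " * 2 + SUBFIELD_DELIMITER + "a3rd ed."),
]
record = build_record(fields)
print(repr(record))
```

Printing the record with repr() makes the otherwise invisible control characters visible, which also shows why such a string is hard to read as transmitted.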

 

Fig. 3.1. MARC record as displayed in OCLC:

  OCLC:  19124014            Rec stat:    p
  Entered:    19890119       Replaced:    19900317       Used:    19990309
> Type:  a     ELvl:        Srce:        Audn:        Ctrl:        Lang:  eng
  BLvl:  m     Form:        Conf:  0     Biog:        MRec:        Ctry:  cou
               Cont:  b     GPub:        Fict:  0     Indx:  1
  Desc:  a     Ills:        Fest:  0     DtSt:  s     Dates: 1989,     <
>   1  010     89-2835 <
>   2  040     DLC $c DLC <
>   3  020     0872876217 : $c |$|45.00 <
>   4  050 00  Z674 $b .R4 no. 20 $a Z7837 $a BX1751.2 <
>   5  082 00  020 s $a 016.282 $2 19 <
>   6  090     $b  <
>   7  049     DD0A <
>   8  100 1   McCabe, James Patrick. <
>   9  245 10  Critical guide to Catholic reference books / $c James Patrick
McCabe ; with an introduction by Russell E. Bidlack. <
>  10  250     3rd ed. <
>  11  260     Englewood, Colo. : $b Libraries Unlimited, Inc., $c 1989. <
>  12  300     xiv, 323 p. ; $c 25 cm. <
>  13  440  0  Research studies in library science ; $v no. 20 <
>  14  500     Includes indexes. <
>  15  610 20  Catholic Church $x Bibliography. <
>  16  650  0  Reference books $x Catholic Church. <


Fig. 3.2. MARC record as displayed through the Z39.50 system of LC:

001    89002835 
003 DLC
005 19900227083313.4
008 890119s1989    cou      b    001 0 eng  
010   $a   89002835 
020   $a0872876217 :$c$45.00
040   $aDLC$cDLC$dDLC
050 00$aZ674$b.R4 no. 20$aZ7837$aBX1751.2
082 00$a020 s$a016.282$219
100 10$aMcCabe, James Patrick.
245 10$aCritical guide to Catholic reference books /$cJames Patrick McCabe ;
   with an introduction by Russell E. Bidlack.
250   $a3rd ed.
260 0 $aEnglewood, Colo. :$bLibraries Unlimited, Inc.,$c1989.
300   $axiv, 323 p. ;$c25 cm.
440  0$aResearch studies in library science ;$vno. 20
500   $aIncludes indexes.
610 20$aCatholic Church$xBibliography.
650  0$aReference books$xCatholic Church.

Fig. 3.3. MARC record as displayed in RLIN:

ID:DCLC892835-B             RTYP:c    ST:p   FRN:   MS:p  EL:     AD:01-19-89 
CC:9110  BLT:am      DCF:a   CSC:    MOD:    SNR:      ATC:       UD:03-18-90 
CP:cou     L:eng     INT:    GPC:    BIO:    FIC:0     CON:b      TOC:        
PC:s      PD:1989/           REP:    CPI:0   FSI:0     ILC:       II:1       
010     892835                                                               
020     0872876217 :$c$45.00                                                 
040     DLC$cDLC$dDLC                                                        
050 00  Z674$b.R4 no. 20$aZ7837$aBX1751.2                                    
082 00  020 s$a016.282$219                                                   
100 10  McCabe, James Patrick.                                               
245 10  Critical guide to Catholic reference books /$cJames Patrick McCabe ;         
with an introduction by Russell E. Bidlack.                          
250     3rd ed.                                                              
260 0   Englewood, Colo. :$bLibraries Unlimited, Inc.,$c1989.                
300     xiv, 323 p. ;$c25 cm.                                                
440  0  Research studies in library science ;$vno. 20                        
500     Includes indexes.                                                   
610 20  Catholic Church$xBibliography.                                       
650  0  Reference books$xCatholic Church.

Fig. 3.4. MARC record as displayed in the GEAC system of New York University:

 Local Control # : 10258080    Transaction type : Reserved for LC Marc
 Last updated    : 13 JUL 1994 Leader           : pam0  2
 Cataloguer      : System
 
 008      890501s1989    cou           00110 eng
 010 BB a    80016209
 020 BB a 0872872033 :
        c 22.50
 035 BB a GLIS002580801
 035 BB a (CStRLIN)NYUG6357361B
 040 BB a DLC
        c DLC
        d NNU
 050 B4 a Z674
        b .R4 no. 20, 1989
 082 BB a 016/.282
        2 19      
 100 1B a McCabe, James Patrick.
 245 10 a Critical guide to Catholic reference books /
        c by James Patrick McCabe ; with an introd. by Russell E. Bidlack.
 250 BB a 3rd ed.
 260 BB a Englewood, Co :
        b Libraries Unlimited,
        c 1989.
 300 BB a xiv, 323 p. ;
        c 25 cm.
 440 B0 a Research studies in library science ;
        v no. 20
 500 BB a Includes indexes.
 610 20 a Catholic Church
        x Bibliography.
 650 B0 a Reference books
        x Catholic Church.
 950 BB l BRef1
        a Z674
        b .R4 no. 20, 1989
        i 01/01/01 N
        x 5


 Format                     : BK Book & monographs
 Local Control Number       : 10258080
 Transaction type           : Reserved for LC Marc
 Date record created        : 01 MAY 1989
 Date of last record update : 13 JUL 1994  18:18:53
 
  1. Record status         : p
  2. Type of record        : a
  3. Bibliographic level   : m
  4. Type of control       :
  5. Encoding level        : 0
  6. Descriptive cat. form :
  7. Indicator Length      : 2



 Format                     : BK Book & monographs
 Local Control Number       : 10258080
 Transaction type           : Reserved for LC Marc
 Date record created        : 01 MAY 1989
 Date of last record update : 13 JUL 1994  18:18:53
 
  1. Date entered on file   : 890119   12. Festschrift            : 0
  2. Type of date code      : s        13. Index                  : 1
  3. Date 1                 : 1989     14. Literary form          : 0
  4. Date 2                 :          15. Biography              :
  5. Place of publication   : cou      16. Language               : eng
  6. Illustrations          :          17. Modified record        :
  7. Target audience        :          18. Cataloguing source     :
  8. Form of item           :
  9. Nature of contents     :
 10. Government publication :
 11. Conference publication : 0

 

Endnotes:
1. Deborah J. Byrne, MARC Manual: Understanding and Using MARC Records, 2nd ed. (Englewood, Colo., Libraries Unlimited, 1998), pp. 1-15.

2. MARC 21 Format for Bibliographic Data: Including Guidelines for Content Designation (Washington, D.C., Library of Congress, Cataloging Distribution Service, 1999), 2 v.; MARC 21 Concise Format for Bibliographic Data, available: http://lcweb.loc.gov/marc/bibliographic/ecbdhome.html

3. Arlene G. Taylor, The Organization of Information (Englewood, Colo., Libraries Unlimited, 1999), pp. 63-73.

4. Ibid., p. 64.

5. For an example of a record in the communications format, see Taylor, The Organization of Information, p. 60.