From: Wynar's Introduction to Cataloging and Classification, 9th ed., mss.
©2000 Arlene G. Taylor
(last updated 2/16/2000)
Part II
__________________________________________
ELECTRONIC FORMATTING
CHAPTER 3
ENCODING
In the age of online catalogs, the content of catalog records must use some kind of electronic encoding scheme in order to be machine-manipulable for searching and display. Encoding allows each part of a record to be set off from every other part. Computer programs can then be written so that each part is displayed in a certain position according to the wishes of the programmer. Encoding also allows access to a catalog record through search programs that search particular parts of a record (e.g., the author, the title, or a subject).
Since the mid-1960s an encoding system called MARC (MAchine-Readable Cataloging) has been used to create electronic catalog records. An excellent history of its development may be found in Deborah Byrne’s MARC Manual.1 The most current version for use in the United States is MARC 21, a harmonized version of USMARC and CAN/MARC, published in 1999.2
Other encoding systems have been developed recently in response to the desire to make catalog records available on the Web. SGML (Standard Generalized Markup Language) is an international standard for document markup. It is a set of rules for designing markup languages that describe the structure of a document. The markup languages thus designed are SGML applications called DTDs (Document Type Definitions). DTDs often used for bibliographic data include TEI (Text Encoding Initiative), HTML (HyperText Markup Language), and EAD (Encoded Archival Description). A DTD for MARC records has been developed and, at this writing, is in experimental stages. XML (Extensible Markup Language) is a subset of SGML that adds some features that solve earlier problems with SGML and HTML, and it is gaining in use. A description of these encoding systems may be found in Arlene Taylor’s The Organization of Information.3
Currently, the USMARC encoding system is the one used for bibliographic records in the majority of the world’s online catalogs. The remainder of this chapter contains an introduction to this system.
Introduction to MARC
FORMATS
There are formats for five types of data. The following formats are currently defined: bibliographic, authority, holdings, classification, and community information.
COMPONENTS OF THE RECORD
A record is a collection of fields. A field contains a unit of information within a record. A field may consist of one or more subfields. Tags, that is, three-digit numerical codes, identify each field. Every field ends with a field terminator (in OCLC, for example, the field terminator appears as a backwards paragraph sign). Each subfield is preceded by a delimiter sign (often represented by a $ or | or ‡) followed by a single character code (usually alphabetical, but can also be numerical).
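The relationship between a field and its subfields can be sketched in a few lines of Python. This is only an illustration, assuming the delimiter is displayed as a dollar sign; the function name parse_subfields is hypothetical, and the sample data is the 260 field as it appears in Fig. 3.2.

```python
from typing import List, Tuple


def parse_subfields(data: str, delimiter: str = "$") -> List[Tuple[str, str]]:
    """Split the data of one field into (code, value) pairs.

    Each subfield starts with the delimiter, followed by a
    single-character code, followed by the subfield's content.
    """
    subfields = []
    # Anything before the first delimiter is skipped; it is empty
    # when the data begins with a delimiter, as it does here.
    for chunk in data.split(delimiter)[1:]:
        if chunk:  # ignore empty chunks produced by doubled delimiters
            subfields.append((chunk[0], chunk[1:].strip()))
    return subfields


field_260 = "$aEnglewood, Colo. :$bLibraries Unlimited, Inc.,$c1989."
for code, value in parse_subfields(field_260):
    print(code, "->", value)
```

A split on a printable sign fails when that sign also occurs in the data, as with the price in field 020 of the same record. In the transmitted record the delimiter is a control character rather than a printable sign, so the clash arises only in displays.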
Each record has the same components:
1. Leader
2. Record directory
3. Control fields
4. Variable fields
Leader—The leader is like the leader on a roll of film. It identifies the beginning of a new record and provides information for the processing of the record. The leader is fixed in length and contains 24 characters.
Record directory—The record directory contains a series of fixed length entries that identify the tag, length, and starting position of each field in the record.
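How a program reads the record directory can be sketched as follows. The 12-character entry layout used here (a 3-character tag, a 4-digit field length, and a 5-digit starting position) comes from the MARC 21 record structure specification rather than from this chapter, and the sample directory string and the names DirectoryEntry and parse_directory are hypothetical.

```python
from typing import List, NamedTuple


class DirectoryEntry(NamedTuple):
    tag: str     # three-digit field tag
    length: int  # length of the field, including its terminator
    start: int   # starting position relative to the first field


ENTRY_SIZE = 12  # 3 (tag) + 4 (field length) + 5 (starting position)


def parse_directory(directory: str) -> List[DirectoryEntry]:
    """Cut a directory string into fixed-length entries."""
    if len(directory) % ENTRY_SIZE != 0:
        raise ValueError("directory length is not a multiple of 12")
    entries = []
    for i in range(0, len(directory), ENTRY_SIZE):
        entry = directory[i:i + ENTRY_SIZE]
        entries.append(
            DirectoryEntry(entry[0:3], int(entry[3:7]), int(entry[7:12]))
        )
    return entries


# Hypothetical directory describing three fields (001, 005, 245).
sample_directory = "001000900000" + "005001700009" + "245003000026"
for e in parse_directory(sample_directory):
    print(e.tag, e.length, e.start)
```

Because every entry has the same length, a program can locate any field by arithmetic alone, without scanning the data that follows the directory.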
Control fields—Control fields carry alphanumeric (often encoded) data elements. Control field tags always begin with the digit 0. Many control fields are fixed in length, that is, each fixed-length field must consist of a set number of characters (see also "Fixed field" below). Common control fields are:
001 - Control number
005 - Date and time of latest transaction
006 - Fixed-length data elements - coding information about special aspects of the item being cataloged that cannot be coded in field 008 (separate character descriptions for books, computer files, maps, music, serials, visual materials, and mixed materials)
007 - Physical description fixed field - physical characteristics of an item, usually derived from explicit information in other fields of the record, but expressed here in coded form (separate character descriptions for map, computer file, globe, projected graphic, microform, nonprojected graphic, motion picture, remote-sensing image, sound recording, text, videorecording, and one for "unspecified")
008 - Fixed-length data elements - positionally-defined data elements that provide coded information about the record as a whole or about special bibliographic aspects of the item being cataloged
010 - Library of Congress Control Number (LCCN)
020 - International Standard Book Number (ISBN)
022 - International Standard Serial Number (ISSN)
033 - Date/time and place of an event
034 - Coded cartographic mathematical data
040 - Cataloging source
041 - Language code
043 - Geographic area code
047 - Form of musical composition code
048 - Number of musical instruments or voices code
Fixed field—There is one fixed length control field that is commonly referred to as "the fixed field." Field 008 carries general information about the content of the bibliographic record. This field is often displayed in a single paragraph at the top of the screen and is usually displayed with mnemonic tags. The field has 40 character positions. The data stored in this field are used to manipulate records for retrieval, filing, indexing, etc.
A fixed field from the record for a book as it is displayed in OCLC:
> Type: a  ELvl:    Srce:    Audn:    Ctrl:    Lang: eng
  BLvl: m  Form:    Conf: 0  Biog:    MRec:    Ctry: ilu
           Cont: b  GPub:    Fict: 0  Indx: 1
  Desc: a  Ills: a  Fest: 0  DtSt: s  Dates: 1997, <
Sometimes fill characters (displayed as n or _ or |) are used to indicate elements of the fixed field that were not in use when the record was created or were not provided by the inputting library. A blank space (represented in writing as a letter b with a forward slash through it) is meaningful. That is, a blank is input to represent a coded value. In "Srce:" above, the blank means "Library of Congress."
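Because field 008 is positionally defined, a program reads it by slicing the 40-character string at known offsets. The sketch below assumes the character positions given for books in the MARC 21 bibliographic format; the dictionary keys, the function name decode_008, and the sample string (assembled by hand from the values displayed in Fig. 3.1) are illustrative only.

```python
# (start, end) offsets for a book record, end exclusive.
# Position 32 is undefined for books and is omitted.
BOOK_008_LAYOUT = {
    "date_entered": (0, 6),
    "type_of_date": (6, 7),
    "date_1": (7, 11),
    "date_2": (11, 15),
    "place_of_publication": (15, 18),
    "illustrations": (18, 22),
    "target_audience": (22, 23),
    "form_of_item": (23, 24),
    "nature_of_contents": (24, 28),
    "government_publication": (28, 29),
    "conference_publication": (29, 30),
    "festschrift": (30, 31),
    "index": (31, 32),
    "literary_form": (33, 34),
    "biography": (34, 35),
    "language": (35, 38),
    "modified_record": (38, 39),
    "cataloging_source": (39, 40),
}


def decode_008(field: str) -> dict:
    """Return the coded values of a 40-character 008 field by name."""
    if len(field) != 40:
        raise ValueError("field 008 must have exactly 40 characters")
    return {name: field[a:b] for name, (a, b) in BOOK_008_LAYOUT.items()}


# Built piece by piece so that each blank is visible and countable.
sample_008 = (
    "890119" + "s" + "1989" + "    " + "cou"   # positions 00-17
    + "    " + " " + " " + "b   "              # positions 18-27
    + " " + "0" + "0" + "1" + " " + "0" + " "  # positions 28-34
    + "eng" + " " + " "                        # positions 35-39
)

decoded = decode_008(sample_008)
print(decoded["place_of_publication"], decoded["language"], decoded["date_1"])
```

The blank in the last position is kept as a value rather than stripped, since a blank in the fixed field represents a coded value.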
Variable fields—Variable fields carry alphanumeric data of variable length. The variable fields carry traditional cataloging data elements. Three-digit numeric tags (100-999) identify variable fields. In order to talk about these tags in groups, a convention is followed in which all fields beginning with "1" are identified as 1xx fields; those beginning with "2," as 2xx fields; etc. Variable fields consist of heading fields and descriptive fields. Although classification fields may be considered to be control fields because they begin with 0, they are placed here with other elements that make up the "surrogate record" or the "bibliographic data" of a completed MARC record. Among the most used variable fields are:
CLASSIFICATION NOTATIONS AND/OR CALL NUMBERS (05x-08x)
050 - Library of Congress Call Number
060 - National Library of Medicine Call Number
082 - Dewey Decimal Classification Number
etc.
MAIN ENTRY FIELDS (1xx)
100 - Main entry--personal name
110 - Main entry--corporate name
111 - Main entry--meeting name
130 - Main entry--uniform title
TITLE AND TITLE-RELATED FIELDS (20x-24x)
240 - Uniform title
245 - Title proper, general material designation, remainder of title, statement of responsibility
246 - Varying form of title
etc.
EDITION, IMPRINT, ETC., FIELDS (25x-28x)
250 - Edition statement
254 - Musical presentation statement
255 - Cartographic mathematical data
256 - Computer file characteristics
260 - Publication, distribution, etc. (Imprint: place, publisher, etc., date)
etc.
PHYSICAL DESCRIPTION, ETC., FIELDS (3xx)
300 - Physical description (extent of item, other details, size, accompanying material)
310 - Current publication frequency
362 - Dates of publication and/or volume designation (serials)
etc.
SERIES STATEMENT FIELDS (4xx)
[title proper of series, remaining title information, statement of responsibility relating to series, ISSN of series, numbering, etc.]
440 - Series statement / added entry--title
490 - Series statement (not an added entry)
etc.
NOTE FIELDS (5xx)
500 - General note
502 - Dissertation note
504 - Bibliography, etc., note
505 - Formatted contents note
506 - Restrictions on access note
508 - Creation/production credits note
510 - Citation/References note
511 - Participant or performer note
516 - Type of computer file or data note
520 - Summary, etc., note
533 - Reproduction note
534 - Original version note
538 - System details note
546 - Language note
547 - Former title complexity note
580 - Linking entry complexity note
etc.
SUBJECT ACCESS FIELDS (6xx)
600 - Subject added entry--personal name
610 - Subject added entry--corporate name
611 - Subject added entry--meeting name
630 - Subject added entry--uniform title
650 - Subject added entry--topical term
651 - Subject added entry--geographic name
653 - Index term--uncontrolled
655 - Index term--genre/form
etc.
ADDED ENTRY FIELDS (70x-75x)
700 - Added entry--personal name
710 - Added entry--corporate name
711 - Added entry--meeting name
720 - Added entry--uncontrolled name
730 - Added entry--uniform title
740 - Added entry--uncontrolled related/analytical title
etc.
LINKING ENTRY FIELDS (76x-78x)
770 - Supplement/special issue entry
772 - Parent record entry
776 - Additional physical form entry
780 - Preceding entry
785 - Succeeding entry
787 - Nonspecific relationship entry
etc.
SERIES ADDED ENTRY FIELDS (80x-840)
800 - Series added entry--personal name
810 - Series added entry--corporate name
811 - Series added entry--meeting name
830 - Series added entry--uniform title
HOLDINGS, LOCATION, ALTERNATE GRAPHICS, ETC., FIELDS
852 - Location/call number
856 - Electronic location and access
880 - Alternate graphic representation
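The grouping of tags listed above lends itself to a simple range lookup. The following sketch uses the ranges from the group headings in this chapter; the range 841-889 for the last group is an assumption, since its heading gives no range, and the function name tag_group is hypothetical.

```python
TAG_GROUPS = [
    (50, 89, "Classification notations and/or call numbers"),
    (100, 199, "Main entry"),
    (200, 249, "Title and title-related"),
    (250, 289, "Edition, imprint, etc."),
    (300, 399, "Physical description, etc."),
    (400, 499, "Series statement"),
    (500, 599, "Note"),
    (600, 699, "Subject access"),
    (700, 759, "Added entry"),
    (760, 789, "Linking entry"),
    (800, 840, "Series added entry"),
    # Assumed range: the heading for this group gives none.
    (841, 889, "Holdings, location, alternate graphics, etc."),
]


def tag_group(tag: str) -> str:
    """Return the name of the group to which a three-digit tag belongs."""
    if len(tag) != 3 or not tag.isdigit():
        raise ValueError("a tag must consist of three digits")
    number = int(tag)
    for low, high, label in TAG_GROUPS:
        if low <= number <= high:
            return label
    return "Not in a listed group"


for tag in ("082", "245", "650", "856"):
    print(tag, "-", tag_group(tag))
```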
Subfields—All subfields are distinct elements within fields. Subfield definitions vary from field to field. Some of the most commonly encountered ones are:
050 LC call number
$a classification number
$b item number and date
082 Dewey Decimal Classification number
$a classification number
$b item number
$2 edition number [edition of DDC used]
x00 Personal name
(x00 means that these subfields apply in fields 100, 600, 700, and 800.)
$a name
$q qualification of name [e.g., Lewis, C. S. $q (Clive Staples)]
$b numeration
$c titles (e.g., Mrs., Sir, Bishop)
$d dates
$e relator (e.g., ill. [for illustrator])
x10 Corporate name
$a name
$b subordinate unit
$e relator
$k form subheading
x11 Meeting name
$a name
$n number
$c place
$d date
245 Title and statement of responsibility
$a title proper
$b other title information
$c statement of responsibility or remainder of area
260 Publication, distribution, etc.
$a place
$b publisher, distributor, etc.
$c date
300 Physical description
$a extent of item
$b other physical details
$c dimensions
4xx and 8xx Series
$a name of series
$x ISSN (4xx fields only)
$v numbering of series
6xx Subject access point
$a main subject (name, topic, etc.)
$x subject subdivision
$y time period subdivision
$z geographic subdivision
$v form/genre subdivision
856 Electronic location and access
$a Host name
$b Access number
$d Path
$h Processor of request
$u Uniform Resource Locator
etc.
Indicators—The two digits that follow the tag in a MARC field are called indicators. Each digit position has a meaning specific to its field and provides instructions to the computer for processing the data contained in the field. For example, in the OCLC-formatted 245 field shown below, the first indicator, "1," tells the system that there should be an access point for the title, and the second indicator, "4," tells the system that four nonfiling characters (i.e., T, h, e, and [space]) precede the first significant word of the title:
245 14 The dictionary of misinformation / $c Tom Burnam.
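What a system might do with the nonfiling indicator can be sketched briefly. This is an illustration only, not the method of any particular system; the function name filing_title is hypothetical, and the second title is taken from the record shown in the figures later in this chapter.

```python
def filing_title(title: str, nonfiling: int) -> str:
    """Drop the number of leading characters given by the second indicator."""
    if not 0 <= nonfiling <= 9:
        raise ValueError("the nonfiling indicator is a single digit, 0-9")
    return title[nonfiling:]


# Each pair holds the title proper and the second indicator of field 245.
titles = [
    ("The dictionary of misinformation", 4),
    ("Critical guide to Catholic reference books", 0),
]

# File on the first significant word rather than on the initial article.
filed = sorted(titles, key=lambda pair: filing_title(*pair).lower())
for title, nonfiling in filed:
    print(title)
```

The title is displayed in full but filed under "dictionary," so that it does not sort among the many titles beginning with "The."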
Display of MARC Records
MARC records are distributed in the MARC Communications Format.5 Each record consists of one long character string beginning with the leader, followed by the record directory, followed by the fields one after another, with no breaks, to the end of the record (at which point there is a character to represent a record terminator). Such a record is practically unreadable if printed as transmitted, and so each system has a program that will display the record in a form that is more easily read. Figures 3.1 through 3.4 show the same MARC record as it is displayed in four different systems.
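The layout of a record in the communications format can be illustrated by assembling a small one. The sketch below makes several assumptions that do not come from this chapter: the control characters used as terminators and delimiter (hex 1E, 1D, and 1F) and the positions within the leader follow the MARC 21 record structure specification, the leader codes "nam" are illustrative values, and lengths are counted in characters, which matches the byte counts of a real record only for plain ASCII data. The function name build_record is hypothetical.

```python
FIELD_TERMINATOR = "\x1e"
RECORD_TERMINATOR = "\x1d"
SUBFIELD_DELIMITER = "\x1f"


def build_record(fields):
    """Join (tag, data) pairs into one communications-format string."""
    directory = ""
    body = ""
    for tag, data in fields:
        field = data + FIELD_TERMINATOR
        # Directory entry: tag, field length, starting position.
        directory += f"{tag}{len(field):04d}{len(body):05d}"
        body += field
    directory += FIELD_TERMINATOR
    base_address = 24 + len(directory)  # the leader is 24 characters
    record_length = base_address + len(body) + 1  # +1 for record terminator
    leader = f"{record_length:05d}nam  22{base_address:05d}   4500"
    return leader + directory + body + RECORD_TERMINATOR


fields = [
    ("001", "89002835"),
    ("245", "10" + SUBFIELD_DELIMITER
     + "aCritical guide to Catholic reference books /"),
]
record = build_record(fields)

# Control characters are replaced here only to make the string printable.
print(record.replace(FIELD_TERMINATOR, "^")
            .replace(SUBFIELD_DELIMITER, "$")
            .replace(RECORD_TERMINATOR, "\\"))
```

Printing the result shows why such a record is hard to read as transmitted: the leader, the directory, and the fields run together in a single unbroken string.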
Fig. 3.1. MARC record as displayed in OCLC:
OCLC: 19124014   Rec stat: p
Entered: 19890119   Replaced: 19900317   Used: 19990309
> Type: a  ELvl:    Srce:    Audn:    Ctrl:    Lang: eng
  BLvl: m  Form:    Conf: 0  Biog:    MRec:    Ctry: cou
           Cont: b  GPub:    Fict: 0  Indx: 1
  Desc: a  Ills:    Fest: 0  DtSt: s  Dates: 1989, <
> 1 010 89-2835 <
> 2 040 DLC $c DLC <
> 3 020 0872876217 : $c |$|45.00 <
> 4 050 00 Z674 $b .R4 no. 20 $a Z7837 $a BX1751.2 <
> 5 082 00 020 s $a 016.282 $2 19 <
> 6 090 $b <
> 7 049 DD0A <
> 8 100 1 McCabe, James Patrick. <
> 9 245 10 Critical guide to Catholic reference books / $c James Patrick McCabe ; with an introduction by Russell E. Bidlack. <
> 10 250 3rd ed. <
> 11 260 Englewood, Colo. : $b Libraries Unlimited, Inc., $c 1989. <
> 12 300 xiv, 323 p. ; $c 25 cm. <
> 13 440 0 Research studies in library science ; $v no. 20 <
> 14 500 Includes indexes. <
> 15 610 20 Catholic Church $x Bibliography. <
> 16 650 0 Reference books $x Catholic Church. <
Fig. 3.2. MARC record as displayed through the Z39.50 system of LC:
001 89002835
003 DLC
005 19900227083313.4
008 890119s1989 cou b 001 0 eng
010 $a 89002835
020 $a0872876217 :$c$45.00
040 $aDLC$cDLC$dDLC
050 00$aZ674$b.R4 no. 20$aZ7837$aBX1751.2
082 00$a020 s$a016.282$219
100 10$aMcCabe, James Patrick.
245 10$aCritical guide to Catholic reference books /$cJames Patrick McCabe ; with an introduction by Russell E. Bidlack.
250 $a3rd ed.
260 0 $aEnglewood, Colo. :$bLibraries Unlimited, Inc.,$c1989.
300 $axiv, 323 p. ;$c25 cm.
440 0$aResearch studies in library science ;$vno. 20
500 $aIncludes indexes.
610 20$aCatholic Church$xBibliography.
650 0$aReference books$xCatholic Church.
Fig. 3.3. MARC record as displayed in RLIN:
ID:DCLC892835-B RTYP:c ST:p FRN: MS:p EL: AD:01-19-89
CC:9110 BLT:am DCF:a CSC: MOD: SNR: ATC: UD:03-18-90
CP:cou L:eng INT: GPC: BIO: FIC:0 CON:b TOC:
PC:s PD:1989/ REP: CPI:0 FSI:0 ILC: II:1
010 892835
020 0872876217 :$c$45.00
040 DLC$cDLC$dDLC
050 00 Z674$b.R4 no. 20$aZ7837$aBX1751.2
082 00 020 s$a016.282$219
100 10 McCabe, James Patrick.
245 10 Critical guide to Catholic reference books /$cJames Patrick McCabe ; with an introduction by Russell E. Bidlack.
250 3rd ed.
260 0 Englewood, Colo. :$bLibraries Unlimited, Inc.,$c1989.
300 xiv, 323 p. ;$c25 cm.
440 0 Research studies in library science ;$vno. 20
500 Includes indexes.
610 20 Catholic Church$xBibliography.
650 0 Reference books$xCatholic Church.
Fig. 3.4. MARC record as displayed in the GEAC system of New York University:
Local Control # : 10258080
Transaction type : Reserved for LC Marc
Last updated : 13 JUL 1994
Leader : pam0 2
Cataloguer : System
008 890501s1989 cou 00110 eng
010 BB a 80016209
020 BB a 0872872033 : c 22.50
035 BB a GLIS002580801
035 BB a (CStRLIN)NYUG6357361B
040 BB a DLC c DLC d NNU
050 B4 a Z674 b .R4 no. 20, 1989
082 BB a 016/.282 2 19
100 1B a McCabe, James Patrick.
245 10 a Critical guide to Catholic reference books / c by James Patrick McCabe ; with an introd. by Russell E. Bidlack.
250 BB a 3rd ed.
260 BB a Englewood, Co : b Libraries Unlimited, c 1989.
300 BB a xiv, 323 p. ; c 25 cm.
440 B0 a Research studies in library science ; v no. 20
500 BB a Includes indexes.
610 20 a Catholic Church x Bibliography.
650 B0 a Reference books x Catholic Church.
950 BB l BRef1 a Z674 b .R4 no. 20, 1989 i 01/01/01 N x 5

Format : BK Book & monographs
Local Control Number : 10258080
Transaction type : Reserved for LC Marc
Date record created : 01 MAY 1989
Date of last record update : 13 JUL 1994 18:18:53
1. Record status : p
2. Type of record : a
3. Bibliographic level : m
4. Type of control :
5. Encoding level : 0
6. Descriptive cat. form :
7. Indicator Length : 2

Format : BK Book & monographs
Local Control Number : 10258080
Transaction type : Reserved for LC Marc
Date record created : 01 MAY 1989
Date of last record update : 13 JUL 1994 18:18:53
1. Date entered on file : 890119     12. Festschrift : 0
2. Type of date code : s             13. Index : 1
3. Date 1 : 1989                     14. Literary form : 0
4. Date 2 :                          15. Biography :
5. Place of publication : cou        16. Language : eng
6. Illustrations :                   17. Modified record :
7. Target audience :                 18. Cataloguing source :
8. Form of item :
9. Nature of contents :
10. Government publication :
11. Conference publication : 0
Endnotes:
2. MARC 21 Format for Bibliographic Data: Including Guidelines for Content Designation (Washington, D.C., Library of Congress, Cataloging Distribution Service, 1999), 2 v.; MARC 21 Concise Format for Bibliographic Data, available: http://lcweb.loc.gov/marc/bibliographic/ecbdhome.html
3. Arlene G. Taylor, The Organization of Information (Englewood, Colo., Libraries Unlimited, 1999), pp. 63-73.
5. For an example of a record in the communications format, see Taylor, The Organization of Information, p. 60.