1999 Louisiana Library Association: Meta Data applications
XML
This session discusses XML and Java and how they can work together for digital library and distance learning applications. For more information, please see the following URL (NOTE: to advance, click on the Sun Microsystems logo): http://www.princeton.edu/~casey/lita98/
Extensible Markup Language (XML), http://www.w3.org/TR/PR-xml
RDF
Resource Description Framework (RDF) Model and Syntax, http://www.w3.org/TR/WD-rdf-syntax
An Introduction to the Resource Description Framework, Eric Miller
Research Scientist, Online Computer Library Center, Inc., Office of Research, Dublin, Ohio emiller@oclc.org
http://www.dlib.org/dlib/may98/miller/05miller.html
EAD
EAD (Encoded Archival Description) and XML (eXtensible Markup Language) formats in the SAGE digital archive project at Emory University, as well as technical considerations in servers and protocols for digital archive projects. For more information about the project, please see:
http://elan.library.emory.edu/Misc/Conf/IRIG-XML/
http://sage.library.emory.edu/
Meta Data Dublin Core
Dublin Core Metadata Element Set: Reference Description http://purl.org/metadata/dublin_core
Dublin Core Resource Types
Minimalist DRAFT: July 17, 1997
The Dublin Core proposal for metadata contains a Resource Type element. This element is to describe the genre of the item being described. The following is a draft proposal for a standard set of genre types for use in the Dublin Core Type element. Please submit comments to the Dublin Core mailing list.
Text
A work that is mostly textual in nature, but may include images, maps, tables or other formats. Examples include books, articles, pamphlets, essays, email messages, theses, and technical reports.
Image
Examples: photograph, graphic, animation, video.
Sound
All sound types. Examples include speech, music, and ambient.
Software
Binary executables and source code. For software that exists only to create an interactive environment use Interactive instead.
Data
Alphanumeric collections of data. Examples include spatial data, bibliographic records, statistics, and remotely sensed spectral data.
Interactive
A setting designed for interactive involvement with one or more users. Examples: games, chat services and virtual reality.
------------------------------------------------------------------------
Text
A work that is mostly textual in nature, but may include images, maps, tables or other formats.
Image
Sound
Software
Binary executables and source code. Both may be further subdivided with the name of the programming language used. For software that exists only to create an interactive environment, use Interactive instead.
Data
Interactive
A setting designed exclusively for interactive involvement with one or more users. Examples: chat services, virtual reality, and games.
The top-level resource types above may be subdivided as required using a convention of prefacing any extension with "x-". If the subdivision entry is derived from a controlled vocabulary, the vocabulary may be identified using the Scheme attribute. For example:
<META NAME="DC.Type" CONTENT="Text">
<META NAME="DC.Type" CONTENT="Text.x-Thesis" SCHEME="FooBar">
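The subdivision convention can be checked mechanically. The following is a minimal sketch only, not part of the draft: it assumes, as the example above suggests, that a dot separates the top-level type from each "x-" subdivision. The function name split_dc_type is hypothetical.

```python
# Top-level genre types taken from the draft list in this handout.
TOP_LEVEL_TYPES = {"Text", "Image", "Sound", "Software", "Data", "Interactive"}


def split_dc_type(content):
    """Split a DC.Type CONTENT value into (top_level, extensions).

    Assumption (not stated in the draft): a dot separates the top-level
    type from each subdivision, as in "Text.x-Thesis".
    """
    top_level, *extensions = content.split(".")
    if top_level not in TOP_LEVEL_TYPES:
        raise ValueError(f"unknown top-level type: {top_level!r}")
    for extension in extensions:
        # Every subdivision must carry the "x-" prefix.
        if not extension.startswith("x-"):
            raise ValueError(f"subdivision must start with 'x-': {extension!r}")
    return top_level, extensions
```

For example, split_dc_type("Text.x-Thesis") yields the top-level type "Text" with one subdivision, "x-Thesis", while a value such as "Text.Thesis" is rejected.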
------------------------------------------------------------------------
Dublin Core Resource Types: http://sunsite.berkeley.edu/Metadata/minimalist.html
World Wide Web Consortium
ftp://ftp.ietf.org/internet-drafts/draft-kunze-dc-02.txt
Dublin Core Workshop Series                                 S. Weibel
Internet-Draft                                               J. Kunze
draft-kunze-dc-02.txt                                       C. Lagoze
10 February 1998                                Expires in six months
Search engine
http://www.stars.com/Search/Meta/Tag.html
Searcher Magazine, June 1998 issue, "Web Search Services in 1998." http://www.ifla.org/II/catalog.htm
Try the chart linked to this article by Diana Botluk. http://www.infotoday.com/searcher/jun/story2.htm
Searchenginewatch.Com
http://www.searchenginewatch.com/resources/tutorials.html
http://www.searchenginewatch.com/reports/reviewchart.html
http://www.lrx.com/columns/engine.htm
A good "cheat sheet" can be found at:
http://www.infopeople.berkeley.edu:8000/src/chart.html
Meta Tag Builder
http://vancouver-webpages.com/META/mk-metas.html
Dublin Core Metadata for Simple Resource Discovery
This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.''
To learn the current status of any Internet-Draft, please check the ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).
Distribution of this document is unlimited. Please send comments to weibel@oclc.org, or to the discussion list meta2@mrrl.lut.ac.uk.
2. Introduction
Finding relevant information on the World Wide Web has become increasingly problematic in proportion to the explosive growth of networked resources. Current Web indexing evolved rapidly to fill the demand for resource discovery tools, but that indexing, while useful, is a poor substitute for richer varieties of resource description.
An invitational workshop held in March of 1995 brought together librarians, digital library researchers, and text-markup specialists to address the problem of resource discovery for networked resources. This activity evolved into a series of related workshops and ancillary activities that have become known collectively as the Dublin Core Metadata Workshop Series.
The goals that motivate the Dublin Core effort are:
These requirements work at cross purposes to some degree, but all are desirable goals. Much of the effort of the Workshop Series has been directed at minimizing the tensions among these goals.
One of the primary deliverables of this effort is a set of elements that are judged by the collective participants of these workshops to be the core elements for cross-disciplinary resource discovery. The term ``Dublin Core'' applies to this core of descriptive elements.
Early experience with Dublin Core deployment has made clear the need to support additional qualification of elements for some applications. Thus, Dublin Core elements may be expressed in simple unqualified ways that minimal discovery and retrieval tools can use, or they may be expressed with additional structure to support semantics-sharpening qualifiers that minimal tools can safely ignore but that more complex tools can employ to increase discovery precision.
The broad agreements about syntax and semantics that have emerged from the workshop series will be expressed in a series of five Informational RFCs, of which this document is the first. These RFCs (currently they are Internet-Drafts) will comprise the following documents.
2.1. Dublin Core Metadata for Simple Resource Discovery
An introduction to the Dublin Core and a description of the semantics of the 15-element Dublin Core element set without qualifiers. This is the present document.
2.2. Encoding Dublin Core Metadata in HTML
A formal description of the convention for embedding unqualified Dublin Core metadata in an HTML file.
2.3. Qualified Dublin Core Metadata for Simple Resource Discovery
The principles of element qualification and the semantics of Dublin Core metadata when expressed with a recommended qualifier set known as the Canberra Qualifiers.
2.4. Encoding Qualified Dublin Core Metadata in HTML
A formal description of the convention for embedding qualified Dublin Core metadata in an HTML file.
2.5. Dublin Core on the Web: RDF Compliance and DC Extensions
A formal description for encoding Dublin Core metadata with qualifiers in RDF (Resource Description Framework) [1] compliant metadata, and how to extend the core element set.
3. Description of Dublin Core Elements
The following is the reference definition of the Dublin Core Metadata Element Set. The evolving reference description, including any defined qualifiers, resides at [2]:
http://purl.org/metadata/dublin_core
In the element descriptions below, each element has a descriptive name intended to convey a common semantic understanding of the element, as well as a formal single-word label intended to make the syntactic specification of elements simpler for encoding schemes.
Although some environments, such as HTML, are not case-sensitive, it is recommended best practice always to adhere to the case conventions in the element labels given below to avoid conflicts in the event that the metadata is subsequently extracted or converted to a case-sensitive environment, such as XML (Extensible Markup Language) [3].
Each element is optional and repeatable. Furthermore, metadata elements may appear in any order, and with no significance being attached to that order.
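As an illustration of "optional and repeatable", the sketch below turns a list of (label, value) pairs into META tags, preserving the case of the labels. It is a hedged example, not the encoding defined by the HTML companion document: the "DC." prefix is borrowed from the DC.Type example earlier in this handout, and the function name dc_meta_tags is hypothetical.

```python
from html import escape

# The fifteen single-word labels defined in section 3 (case preserved).
DC_LABELS = (
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)


def dc_meta_tags(pairs):
    """Return one META tag per (label, value) pair.

    Any label may be absent (optional) or appear several times
    (repeatable); the input order is kept but carries no meaning.
    Assumption: the "DC." name prefix, as in the DC.Type example.
    """
    tags = []
    for label, value in pairs:
        if label not in DC_LABELS:
            raise ValueError(f"not a Dublin Core label: {label!r}")
        # Escape the value so quotes and ampersands cannot break the tag.
        tags.append(
            f'<META NAME="DC.{label}" CONTENT="{escape(value, quote=True)}">'
        )
    return tags
```

A record with two Creator pairs and no Title is acceptable input; a misspelled or lower-case label such as "creator" is rejected so that the case conventions survive a later move to XML.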
To promote global interoperability, a number of the element descriptions suggest a controlled vocabulary for the respective element values. It is assumed that other controlled vocabularies will be developed for interoperability within certain local domains.
A metadata element's meaning is unaffected by whether or not the element is embedded in the resource that it describes.
The metadata elements fall into three groups which roughly indicate the class or scope of information stored in them: (1) elements related mainly to the Content of the resource, (2) elements related mainly to the resource when viewed as Intellectual Property, and (3) elements related mainly to the Instantiation of the resource.
Content        Intellectual Property    Instantiation
-----------    ---------------------    -------------
Title          Creator                  Date
Subject        Publisher                Type
Description    Contributor              Format
Source         Rights                   Identifier
Language
Relation
Coverage
3.1. Title Label: "Title"
The name given to the resource, usually by the Creator or Publisher.
3.2. Author or Creator Label: "Creator"
The person or organization primarily responsible for creating the intellectual content of the resource. For example, authors in the case of written documents, artists, photographers, or illustrators in the case of visual resources.
3.3. Subject and Keywords Label: "Subject"
The topic of the resource. Typically, subject will be expressed as keywords or phrases that describe the subject or content of the resource. The use of controlled vocabularies and formal classification schemes is encouraged.
3.4. Description Label: "Description"
A textual description of the content of the resource, including abstracts in the case of document-like objects or content descriptions in the case of visual resources.
3.5. Publisher Label: "Publisher"
The entity responsible for making the resource available in its present form, such as a publishing house, a university department, or a corporate entity.
3.6. Other Contributor Label: "Contributor"
A person or organization not specified in a Creator element who has made significant intellectual contributions to the resource but whose contribution is secondary to any person or organization specified in a Creator element (for example, editor, transcriber, and illustrator).
3.7. Date Label: "Date"
A date associated with the creation or availability of the resource. Such a date is not to be confused with one belonging in the Coverage element, which would be associated with the resource only insofar as the intellectual content is somehow about that date. Recommended best practice is defined in a profile of ISO 8601 [4] that includes (among others) dates of the forms YYYY and YYYY-MM-DD. In this scheme, for example, the date 1994-11-05 corresponds to November 5, 1994.
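A cataloguing tool might check Date values before they are stored. The sketch below is one possible approach, not part of the draft: it accepts only the two forms named above (YYYY and YYYY-MM-DD) and ignores the other forms the profile allows. The function name parse_dc_date is hypothetical.

```python
import re
from datetime import date

# Matches exactly YYYY or YYYY-MM-DD; other profile forms are out of scope.
_DATE_FORMS = re.compile(r"(\d{4})(?:-(\d{2})-(\d{2}))?")


def parse_dc_date(value):
    """Return (year, month, day); month and day are None for the YYYY form."""
    match = _DATE_FORMS.fullmatch(value)
    if match is None:
        raise ValueError(f"not in YYYY or YYYY-MM-DD form: {value!r}")
    year, month, day = match.groups()
    if month is None:
        return int(year), None, None
    # datetime.date rejects impossible calendar dates such as 1994-02-30.
    checked = date(int(year), int(month), int(day))
    return checked.year, checked.month, checked.day
```

With this sketch, "1994-11-05" is read as year 1994, month 11, day 5 (November 5, 1994), while a value such as "11/05/1994" is rejected.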
3.8. Resource Type Label: "Type"
The category of the resource, such as home page, novel, poem, working paper, technical report, essay, dictionary. For the sake of interoperability, Type should be selected from an enumerated list that is currently under development in the workshop series.
3.9. Format Label: "Format"
The data format of the resource, used to identify the software and possibly hardware that might be needed to display or operate the resource. For the sake of interoperability, Format should be selected from an enumerated list that is currently under development in the workshop series.
3.10. Resource Identifier Label: "Identifier"
A string or number used to uniquely identify the resource. Examples for networked resources include URLs and URNs (when implemented).
Other globally-unique identifiers, such as International Standard Book Numbers (ISBN) or other formal names are also candidates for this element.
3.11. Source Label: "Source"
Information about a second resource from which the present resource is derived. While it is generally recommended that elements contain information about the present resource only, this element may contain a date, creator, format, identifier, or other metadata for the second resource when it is considered important for discovery of the present resource; recommended best practice is to use the Relation element instead. For example, it is possible to use a Source date of 1603 in a description of a 1996 film adaptation of a Shakespearean play, but it is preferred instead to use Relation "IsBasedOn" with a reference to a separate resource whose description contains a Date of 1603. Source is not applicable if the present resource is in its original form.
3.12. Language Label: "Language"
The language of the intellectual content of the resource. Where practical, the content of this field should coincide with RFC 1766 [5]; examples include en, de, es, fi, fr, ja, th, and zh.
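Language values can be given a rough shape check. The sketch below tests only the syntax that RFC 1766 gives for a tag (a primary tag of one to eight letters, optionally followed by hyphen-separated subtags of one to eight letters each); it does not verify that a code is actually registered. The function name is hypothetical.

```python
import re

# Shape only: 1-8 letters, then zero or more "-" + 1-8 letter subtags.
_LANGUAGE_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")


def has_rfc1766_shape(tag):
    """Return True if the tag is shaped like an RFC 1766 language tag."""
    return _LANGUAGE_TAG.fullmatch(tag) is not None
```

All of the examples listed above (en, de, es, fi, fr, ja, th, zh) pass this check, as does a tag with a subtag such as en-US.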
3.13. Relation Label: "Relation"
An identifier of a second resource and its relationship to the present resource. This element permits links between related resources and resource descriptions to be indicated. Examples include an edition of a work (IsVersionOf), a translation of a work (IsBasedOn), a chapter of a book (IsPartOf), and a mechanical transformation of a dataset into an image (IsFormatOf). For the sake of interoperability, relationships should be selected from an enumerated list that is currently under development in the workshop series.
3.14. Coverage Label: "Coverage"
The spatial or temporal characteristics of the intellectual content of the resource. Spatial coverage refers to a physical region (e.g., celestial sector); use coordinates (e.g., longitude and latitude) or place names that are from a controlled list or are fully spelled out. Temporal coverage refers to what the resource is about rather than when it was created or made available (the latter belonging in the Date element); use the same date/time format (often a range) [4] as recommended for the Date element or time periods that are from a controlled list or are fully spelled out.
3.15. Rights Management Label: "Rights"
A rights management statement, an identifier that links to a rights management statement, or an identifier that links to a service providing information about rights management for the resource.
4. Security Considerations
The Dublin Core element set poses no risk to computers and networks. It poses minimal risk to searchers who obtain incorrect or private information due to careless mapping from rich data descriptions to the simple Dublin Core scheme. No other security concerns are likely to be raised by the element description consensus documented here.
5. References
[1] Resource Description Framework (RDF) Model and Syntax, http://www.w3.org/TR/WD-rdf-syntax
[2] Dublin Core Metadata Element Set: Reference Description, http://purl.org/metadata/dublin_core
[3] Extensible Markup Language (XML), http://www.w3.org/TR/PR-xml
[4] Date and Time Formats (based on ISO 8601), W3C Technical Note, http://www.w3.org/TR/NOTE-datetime
[5] RFC 1766, Tags for the Identification of Languages, http://ds.internic.net/rfc/rfc1766.txt
6. Authors' Addresses
Stuart L. Weibel
OCLC Online Computer Library Center, Inc.
Office of Research
6565 Frantz Rd.
Dublin, Ohio, 43017, USA
Email: weibel@oclc.org
Voice: +1 614-764-6081
Fax: +1 614-764-2344

John A. Kunze
Center for Knowledge Management
University of California, San Francisco
530 Parnassus Ave, Box 0840
San Francisco, CA 94143-0840, USA
Email: jak@ckm.ucsf.edu
Voice: +1 415-502-6660
Fax: +1 415-476-4653

Carl Lagoze
Digital Library Research Group
Department of Computer Science
Cornell University
Ithaca, NY 14853, USA
Email: lagoze@cs.cornell.edu
Voice: +1-607-255-6046
Fax: +1-607-255-4428
Meta Data Examples and Resources
DIGITAL LIBRARIES: Metadata Resources
http://www.ifla.org/II/metadata.htm
Dublin Library links
http://www.digitallibrary.net/resources.asp?id=18
Dublin Core Qualifiers
ROADS Project, Department of Computer Studies, Loughborough University.
http://www.roads.lut.ac.uk/Metadata/DC-SubElements.html
DIGITAL LIBRARIES: Cataloguing and Indexing of Electronic Resources
Exercise:
Searching
1. Starting with Dogpile, then Metafind and Metacrawler, do a search for Bulletin 1134.
2. What differences do you notice? Which worked best from your perspective?
3. Which of these searches gave you the broadest type of information on Bulletin 1134?
Using Meta Data
<meta name="review" content="04 Jan, 1999">
<meta name="filename" content="Main.html">
<meta name="subject" content="Sandia National Laboratories home page">
<link rel="schema.sandia" href="http://www.sandia.gov/html_schema.htm">
<meta name="sandia.approval_type" content="formal">
<meta name="sandia.approved" content="1996-2533">
<meta name="sandia.create_date" content="10/31/97">
<link rev="owns" title="Manny Ontiveros" href="mailto:mpontiv@sandia.gov">
<link rev="made" title="Kay Rivers" href="mailto:klriver@sandia.gov">
<link rev="made" title="Mona Aragon" href="mailto:mlrage@sandia.gov">
<meta name="Author" content="Manny Ontiveros">
<meta name="Description" content="Sandia National Lab's Home Page">
<meta name="Classification" content="DOE:DOE Web sites via organizational structure:Laboratories and Other Field Facilities">
<meta name = "keywords" content = "Sandia National Laboratories, National Security, Energy and Environment, Research and Development, Defense Programs, Science and Technology, Technology Transfer">
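To see what a search tool could read from a page header like the one above, the following sketch collects name/content pairs from META tags using Python's standard html.parser module. It is an illustration only, not how any particular search engine works; the class and function names are hypothetical, and lower-casing the names is a convenience chosen here for lookup.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect META name/content pairs; a name may repeat."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lower case.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name is not None and content is not None:
            # Assumption: names are lower-cased here for easy lookup.
            self.meta.setdefault(name.lower(), []).append(content)


def collect_meta(html_text):
    """Return a dict mapping each META name to its list of values."""
    collector = MetaCollector()
    collector.feed(html_text)
    collector.close()
    return collector.meta
```

Fed the header above, collect_meta would return the Author, Description, keywords, and sandia.* values keyed by lower-cased name; the LINK tags are ignored by this sketch.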
Internet, Technology & Research Consultant: Sandy Colby, MLIS