Metadata
The World Wide
Web is the fastest growing and the most rapidly and widely deployed technology
in the history of technologies. The explosive growth of the Web has made information
and access to information resources ubiquitous. But it has also produced
information anarchy and chaos. A number of metaphors depict the Web as a
vast ocean of information and many a Web surfer as lost in that ocean.
The Web is not
well organized for searching and retrieving information. A prerequisite for
more effective organization and searching is knowledge of the structure of the
data and databases. But the big problem is that Web data and Web databases are
notoriously fuzzy and (dis)organized in many different ways: their structures vary,
they constantly evolve over time, and their consistency is low.
It has long been
recognized that some standardized description or language is needed
to increase the functionality of the Web. In other words, what is needed is a mechanism
for describing things on the Web more precisely, moving from machine-readable
to machine-understandable. This was missing in the original Web architecture.
Enter a solution:
METADATA! Metadata is defined, by now tiresomely, as data about data, or information
about information. It refers to a standardized description of what a text
or any other object is about. It labels parts of the text (or object) with
standardized, agreed-upon labels or tags.
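As a rough illustration of labeling an object with agreed-upon tags, the sketch below describes a document with a few element names borrowed from the Dublin Core set (listed among the standards further down) and renders them as HTML meta tags. The record values and the helper name to_meta_tags are invented for this example; the DC. prefix follows a common convention for embedding Dublin Core in HTML, and this is only one of several possible encodings.

```python
from html import escape

# Illustrative record: keys are Dublin Core element names,
# values are made up for this example.
record = {
    "title": "Metadata",
    "creator": "Example Author",
    "date": "2002-03-20",
    "language": "en",
}


def to_meta_tags(record):
    """Render each label/value pair as an HTML <meta> tag with a DC. prefix."""
    return [
        f'<meta name="DC.{name}" content="{escape(value, quote=True)}">'
        for name, value in record.items()
    ]


for tag in to_meta_tags(record):
    print(tag)
```

The point is only that the labels (title, creator, date) come from a shared vocabulary, so a program that knows the vocabulary can interpret the values without guessing from the page layout.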
But what standards?
Who will develop them? How? How should they be implemented? These and a host of similar
problems are the gist of a great many metadata efforts, activities, projects,
and discussions worldwide.
Libraries and librarians
have been involved with metadata for a long time. Centuries. But they did not
call it metadata. They called it cataloging rules, controlled vocabulary, indexing
formats, and the like. For machines they developed Machine Readable Cataloging
(MARC) - a set of conventions that enables machine exchange of cataloging records.
But with the development of digital libraries, librarians have joined the other
Web efforts related to metadata.
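To make the link between library cataloging and Web metadata concrete, here is a minimal sketch of a crosswalk from a MARC-like record to Dublin Core element names. It is heavily simplified and rests on several assumptions: real MARC records carry indicators, many subfields, and a binary exchange format, none of which are modeled here; the record values are invented; and the CROSSWALK table is a hypothetical three-entry subset, loosely in the spirit of published MARC-to-Dublin-Core mappings rather than a copy of any of them.

```python
# Simplified MARC-like record: (field tag, {subfield code: value}).
# Tags 100 (personal name), 245 (title) and 650 (topical subject) are
# MARC 21 bibliographic tags; the values are made up.
marc_record = [
    ("100", {"a": "Doe, Jane"}),
    ("245", {"a": "An example title"}),
    ("650", {"a": "Metadata"}),
    ("650", {"a": "Cataloging"}),
]

# Hypothetical, partial crosswalk from MARC tags to Dublin Core elements.
CROSSWALK = {"100": "creator", "245": "title", "650": "subject"}


def marc_to_dc(record):
    """Collect subfield 'a' of each mapped field under its Dublin Core element."""
    dc = {}
    for tag, subfields in record:
        element = CROSSWALK.get(tag)
        if element is None:
            continue  # field not covered by this partial crosswalk
        dc.setdefault(element, []).append(subfields.get("a", ""))
    return dc


print(marc_to_dc(marc_record))
```

Repeated fields, such as the two subject entries, are kept as lists because both formats allow an element to occur more than once.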
The selection of
metadata sites below is but a sample of the variety of metadata projects, standards,
and sources. It is a good starting point for organized surfing in the vast ocean
of metadata information.
Standards and projects
Berkeley Digital Library Sun Site. Z39.50 Information
http://sunsite.berkeley.edu/Z39.50/
"Z39.50 is a computer-to-computer communications
protocol designed to support searching and retrieval of information -- full-text
documents, bibliographic data, images, multimedia -- in a distributed network
environment. Based on client/server architecture and operating over the Internet,
the Z39.50 protocol is supporting an increasing number of applications. And
like the dynamic network environment in which it is used, the standard is
evolving to meet the changing needs of information creators, providers, and
users." Applied in many libraries worldwide. The Berkeley site has links to
standards, applications, and articles.
Dublin Core Metadata Initiative
http://www.dublincore.org
"The Dublin Core Metadata Initiative
is an open forum engaged in the development of interoperable online metadata
standards that support a broad range of purposes and business models. DCMI's
activities include consensus-driven working groups, global workshops, conferences,
standards liaison, and educational efforts to promote widespread acceptance
of metadata standards and practices." Includes description and references
to many activities. The DC metadata set is at http://www.dublincore.org/documents/dces/
Text Encoding Initiative Consortium (TEI)
http://www.tei-c.org/
"The TEI is an international project
to develop guidelines for the encoding of textual material in electronic form
for research purposes." Started in 1988 - an early project geared toward the humanities.
Now hosted by four universities and sponsored by a number of organizations.
Provides guidelines for electronic text encoding and interchange. Also includes
a popular TEI Lite at http://www.tei-c.org/Lite/index.html
The UK Office for Library and Information Networking, UKOLN
http://www.ukoln.ac.uk/
"UKOLN is a national focus of expertise
in digital information management. It provides policy, research and awareness
services to the UK library, information and cultural heritage communities."
Includes a number of projects related to metadata at http://www.ukoln.ac.uk/metadata/
and mapping between metadata formats at http://www.ukoln.ac.uk/metadata/interoperability/
U.S. Federal Geographic Data Committee.
http://www.fgdc.gov/metadata/metadata.html
An example of a government initiative.
This one involves the Content Standard for Digital Geospatial Metadata.
U.S. Library of Congress. Standards.
http://lcweb.loc.gov/standards/
"...key standards used in the information
community that are maintained by the Library of Congress. Their Web pages
supply information on their maintenance and use. Other links below connect
to information on the Library's collection of standards and key standards-settings
organizations." Includes: MARC Formats - Digital Library Standards; Z39.50
Retrieval Protocol; Encoded Archival Description; ISO Language Codes; International
Standard Serial Number; Standards Collections; Related Standards Organizations.
U.S. Library of Congress. Encoded Archival Description Official
Web site
http://www.loc.gov/ead/ead.html
"The EAD Document Type Definition (DTD)
is a standard for encoding archival finding aids using the Standard Generalized
Markup Language (SGML). The standard is maintained in the Network Development
and MARC Standards Office of the Library of Congress (LC) in partnership with
the Society of American Archivists."
World Wide Web Consortium. (W3C)
http://www.w3.org/
"The World Wide Web Consortium (W3C)
develops interoperable technologies (specifications, guidelines, software,
and tools) to lead the Web to its full potential as a forum for information,
commerce, communication, and collective understanding." THE AUTHORITY on tools
for Web access. A rich site for a number of metalanguage standards, including
HTML and XML. Also includes the Web Content Accessibility Guidelines and tutorials
on building accessibility into your own Web sites, such as those found
at: http://www.w3.org/WAI/gettingstarted.
General information
Ariadne
http://www.ariadne.ac.uk/issue12/metadata/
Ariadne is a leading online library publication.
Metadata Corner is a regular feature about metadata issues, projects, and
descriptions - an excellent source of current information. The same column in
earlier issues can also be accessed.
International Federation of Library Associations (IFLA). Digital
libraries: Metadata resources
http://www.ifla.org/II/metadata.htm
One of the richest resources for links
to many national and international metadata efforts and projects - a whole
alphabet soup of them. Also links to documents in a variety of metadata areas.
A good place to start.
Koehler, W. (2000). Tutorial on author tools.
http://www.ou.edu/cas/slis/courses/LIS5990A/slis5990/authortools/index.htm
Descriptions, examples and links to a
variety of markup languages, metatags and initiatives, including SGML/XML,
Dublin Core and W3C initiatives in metadata. Many examples; for instance, 'Page 3 MetaTags'
shows how to apply HTML meta tags to your own HTML pages.
Memorial University of Newfoundland Libraries. Metadata standards,
crosswalks, and standard organizations.
http://www.mun.ca/library/cat/standards.htm
An extensive set of links to anything
on metadata.
Rutgers University Libraries. Center for Electronic Texts
in the Humanities (CETH)
http://scc01.rutgers.edu/ceth/
Lists a number of projects, some extensive
and complex. Includes guidelines, workshops and presentations on XML, SGML,
HTML, Cold Fusion and others, and as such is also a good educational site on
metadata.
Schwartz, C. (2000). Simmons College. Metadata Resources
http://web.simmons.edu/~schwartz/meta.html
An eclectic resource about metadata.
Includes readings, national and international efforts, examples of projects,
MARC cataloging, SGML, EAD, and TEI, W3C efforts, and much more. Also, a great
bibliography with links to original papers. Part of a larger set of electronic
resources maintained by Candy Schwartz at Simmons College, including a course
on digital libraries.
last update 20 March
2002 tefko@scils.rutgers.edu