Data Virtualization
Main article: Data Virtualization
Data virtualization has emerged as a software technology that completes the virtualization stack in the enterprise. Data virtualization servers, which are enterprise infrastructure components alongside database and application servers, rely on metadata. In these servers, metadata is stored in a persistent repository and describes business objects in various enterprise systems and applications.
Statistics and census services
Standardisation work has had a large impact on efforts to build metadata systems in the statistical community. Several metadata standards are described, and their importance to statistical agencies is discussed. Applications of the standards at the Census Bureau, Environmental Protection Agency, Bureau of Labor Statistics, Statistics Canada, and many others are described. Emphasis is on the impact a metadata registry can have in a statistical agency.
Library and information science
Libraries employ metadata in library catalogues, most commonly as part of an integrated library management system (ILMS). Metadata is obtained by cataloguing resources such as books, periodicals, DVDs, web pages or digital images. This data is stored in the ILMS using the MARC metadata standard. The purpose is to direct patrons to the physical or electronic location of the items or areas they seek, as well as to provide a description of the items in question.
More recent and specialised instances of library metadata include digital libraries such as e-print repositories and digital image libraries. While often based on library principles, these focus on use by non-librarians, especially in providing metadata, so they do not follow traditional or common cataloguing approaches. Given the custom nature of the included materials, metadata fields are often specially created, e.g. taxonomic classification fields, location fields, keywords or a copyright statement. Standard file information such as file size and format is usually included automatically.
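The combination of automatically captured file information and specially created fields can be illustrated with a minimal sketch. It assumes a repository record is a plain dictionary; the function names and the custom field names (`keywords`, `copyright_statement`) are hypothetical and not drawn from any particular repository system.

```python
import mimetypes
import os


def basic_file_metadata(path):
    """Collect the file information that can be captured automatically."""
    # guess_type infers the format from the file extension only.
    mime_type, _encoding = mimetypes.guess_type(path)
    return {
        "filename": os.path.basename(path),
        "filesize_bytes": os.path.getsize(path),
        "format": mime_type or "application/octet-stream",
    }


def build_record(path, **custom_fields):
    """Merge automatic file information with hand-entered custom fields."""
    record = basic_file_metadata(path)
    record.update(custom_fields)
    return record
```

A depositor would then supply only the custom fields, for example `build_record("specimen.jpg", keywords=["beetle"], copyright_statement="CC BY 4.0")`, while size and format are filled in from the file itself.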
Standardisation of library operations has been a key topic in international standardisation (ISO) for decades. Standards for metadata in digital libraries include Dublin Core, METS, MODS, DDI, the ISO standard Digital Object Identifier (DOI), Uniform Resource Name (URN), the PREMIS schema, and OAI-PMH. Leading libraries have published outlines of their metadata standards strategies.[12][13]
Metadata and the law
United States
Problems involving metadata in litigation in the United States are becoming widespread. Courts have looked at various questions involving metadata, including the discoverability of metadata by parties. Although the Federal Rules of Civil Procedure have only specified rules about electronic documents, subsequent case law has elaborated on the requirement of parties to reveal metadata.[14] In October 2009, the Arizona Supreme Court ruled that metadata records are public record.[15]
Document metadata has proven particularly important in legal environments where litigants have requested metadata, which can include sensitive information detrimental to a party in court.
Using metadata removal tools to "clean" documents can mitigate the risk of unwittingly sending sensitive data. This process partially (see Data remanence) protects law firms from potentially damaging leaks of sensitive data through electronic discovery.
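The idea behind "cleaning" can be sketched in a few lines. The example below is only an illustration: it assumes document metadata has already been read into a dictionary, and the field names in `SENSITIVE_FIELDS` are hypothetical. Real removal tools work on the file format itself, and, as noted above, removing fields does not address data remanence.

```python
# Hypothetical field names; actual names depend on the document format.
SENSITIVE_FIELDS = frozenset(
    {"author", "last_modified_by", "comments", "revision_history"}
)


def clean_metadata(metadata, sensitive_fields=SENSITIVE_FIELDS):
    """Return a copy of `metadata` without the sensitive fields."""
    return {
        key: value
        for key, value in metadata.items()
        if key not in sensitive_fields
    }
```

Returning a new dictionary rather than modifying the input leaves the original record available for internal use.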
Metadata in healthcare
Australian medical researchers initiated much of the work on metadata definition for applications in health care. That approach was the first recognised attempt to adhere to international standards in the medical sciences rather than first defining a proprietary standard under the WHO umbrella.
Despite this research, the medical community has not yet accepted the need to follow metadata standards.[16]
Metadata and data warehousing
A data warehouse (DW) is a repository of an organization's electronically stored data. Data warehouses are designed to manage and store the data, whereas business intelligence (BI) focuses on the usage of data to facilitate reporting and analysis.[17]
The purpose of a data warehouse is to house standardized, structured, consistent, integrated, correct, cleansed and timely data, extracted from various operational systems in an organization. The extracted data is integrated in the data warehouse environment in order to provide an enterprise-wide perspective, one version of the truth. Data is structured specifically to address the reporting and analytic requirements.
An essential component of a data warehouse/business intelligence system is the metadata and tools to manage and retrieve metadata. Ralph Kimball[18] describes metadata as the DNA of the data warehouse as metadata defines the elements of the data warehouse and how they work together.
Kimball et al.[19] refer to three main categories of metadata: technical metadata, business metadata and process metadata. Technical metadata is primarily definitional, while business metadata and process metadata are primarily descriptive. The categories sometimes overlap.
* Technical metadata defines the objects and processes in a DW/BI system, as seen from a technical point of view. It includes the system metadata that defines data structures such as tables, fields, data types, indexes and partitions in the relational engine, and databases, dimensions, measures, and data mining models. Technical metadata also defines the data model and the way it is displayed to users, together with the reports, schedules, distribution lists and user security rights.
* Business metadata is content from the data warehouse described in more user-friendly terms. The business metadata tells you what data you have, where it comes from, what it means and what its relationship is to other data in the data warehouse. Business metadata may also serve as documentation for the DW/BI system. Users who browse the data warehouse are primarily viewing the business metadata.
* Process metadata is used to describe the results of various operations in the data warehouse. Within the ETL process, all key data from tasks are logged on execution. This includes start time, end time, CPU seconds used, disk reads, disk writes and rows processed. When troubleshooting the ETL or query process, this sort of data becomes valuable. Process metadata is the fact measurement when building and using a DW/BI system. Some organizations make a living out of collecting and selling this sort of data to companies; in that case the process metadata becomes the business metadata for the fact and dimension tables. Process metadata is of interest to business people, who can use the data to identify the users of their products, which products they are using and what level of service they are receiving.
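The logging of process metadata described in the last item can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any particular ETL tool: the names `TaskRun` and `run_with_process_metadata` are hypothetical, and disk reads and writes are omitted because the Python standard library offers no portable way to measure them.

```python
import time
from dataclasses import dataclass


@dataclass
class TaskRun:
    """Process metadata recorded for one execution of an ETL task."""
    task_name: str
    start_time: float   # wall-clock start, seconds since the epoch
    end_time: float     # wall-clock end, seconds since the epoch
    cpu_seconds: float  # CPU time consumed by this process during the task
    rows_processed: int


def run_with_process_metadata(task_name, task, rows):
    """Apply `task` to every row and return (results, TaskRun)."""
    start_wall = time.time()
    start_cpu = time.process_time()
    results = [task(row) for row in rows]
    run = TaskRun(
        task_name=task_name,
        start_time=start_wall,
        end_time=time.time(),
        cpu_seconds=time.process_time() - start_cpu,
        rows_processed=len(results),
    )
    return results, run
```

In practice the `TaskRun` records would be written to a log table, so that slow or failing tasks can be identified when troubleshooting the ETL process.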
Metadata on the Internet
The HTML format used to define web pages allows for the inclusion of a variety of types of metadata, from basic descriptive text, dates and keywords to further advanced metadata schemes such as the Dublin Core, e-GMS, and AGLS[20] standards. Pages can also be geotagged with coordinates. Metadata may be included in the page's header or in a separate file. Microformats allow metadata to be added to on-page data in a way that users don't see, but computers can readily access.
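Metadata held in a page's header can be read with a few lines of code. The sketch below uses Python's standard `html.parser` module to collect `<meta name="..." content="...">` pairs; the sample page, the class name `MetaExtractor` and the specific names shown (`DC.title`, `geo.position`) are illustrative assumptions rather than a complete treatment of any metadata scheme.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect name/content pairs from <meta> elements."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.metadata[name] = content


def extract_meta(html):
    """Return a dict of the named <meta> elements found in `html`."""
    parser = MetaExtractor()
    parser.feed(html)
    return parser.metadata


# Illustrative page: descriptive text, keywords, a Dublin Core title
# and a geotag, all invisible to the reader of the rendered page.
SAMPLE_PAGE = """<html><head>
<title>Example page</title>
<meta name="description" content="A short example page">
<meta name="keywords" content="metadata, example">
<meta name="DC.title" content="Example page">
<meta name="geo.position" content="48.8584;2.2945">
</head><body></body></html>"""
```

Calling `extract_meta(SAMPLE_PAGE)` returns a dictionary keyed by the `name` attribute of each element.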
Many search engines are cautious about using metadata in their ranking algorithms, because metadata has been exploited in the practice of search engine optimization (SEO) to improve rankings. See the Meta element article for further discussion.
Metadata in the broadcast industry
In the broadcast industry, metadata are linked to audio and video broadcast media to:
* identify the media: clip or playlist names, duration, timecode, etc.
* describe the content: notes regarding the quality of the video content, rating, description (for example, during a sports event, keywords such as "goal" or "red card" are associated with some clips)
* classify the media: metadata make it possible to sort the media or to find video content easily and quickly (a TV news programme may urgently need archive content for a story).
These metadata can be linked to the video media by the video servers. Recent broadcast sports events such as the FIFA World Cup and the Olympic Games have used these metadata to distribute their video content to TV stations through keywords. The host broadcaster[21] is often in charge of organizing metadata through its International Broadcast Centre and its video servers. The metadata are recorded with the images and are entered by metadata operators (loggers), who associate live footage with the metadata available in metadata grids through software (such as Multicam(LSM) or IPDirector, used during the FIFA World Cup and the Olympic Games).[22][23]
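The three uses listed above (identifying, describing and classifying media) can be sketched with a small data structure. The example is a simplified assumption about how a logged clip might be represented; the `Clip` fields and the `find_clips` function are hypothetical and do not reflect the data model of any broadcast product.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    """Metadata attached to one video clip by a logger."""
    name: str                # identifies the media
    timecode: str            # start position, written as HH:MM:SS:FF
    duration_seconds: float
    keywords: set = field(default_factory=set)  # describes the content


def find_clips(clips, keyword):
    """Return the clips tagged with `keyword`, ignoring letter case."""
    wanted = keyword.lower()
    return [
        clip
        for clip in clips
        if wanted in {k.lower() for k in clip.keywords}
    ]
```

A newsroom looking for archive footage could then call `find_clips(archive, "red card")` to retrieve every clip that a logger tagged with that keyword.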
Geospatial metadata
Metadata that describe geographic objects (such as datasets, maps, features, or simply documents with a geospatial component) have a history dating back to at least 1994 (see the MIT Library page on FGDC Metadata). This class of metadata is described more fully on the Geospatial metadata page.
Metadata on CDs and DVDs
CDs such as music recordings carry a layer of metadata about the recordings, such as dates, artist, genre and copyright owner. The metadata, not normally displayed by CD players, can be accessed and displayed by specialized music playback or editing applications.
Cloud applications
With the availability of cloud applications, including those that add metadata to content, metadata is increasingly available over the Internet.