Metadata publishing is the process of making metadata data elements available to external users through a formal process and a commitment to change control.
Metadata publishing is the foundation upon which advanced distributed computing functions are being built. As with the foundations of a building, care must be taken in metadata publishing systems to ensure the structural integrity of the systems built on top of them.
Definition of metadata publishing
Published metadata has the following characteristics:
- Metadata structures are available to the general public on a web site or by download
- There is a document review and approval process for adding or updating data elements in the system
- New releases are made available without disturbing prior versions
- The publishing organization commits to a change control process
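As a rough illustration of the third characteristic, the following sketch shows one way a registry could release new versions without disturbing earlier ones. The class names, the `PersonBirthDate` element and its definitions are all hypothetical; they are assumptions made for the example and are not taken from any particular metadata registry product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataElement:
    """One published data element (hypothetical minimal shape)."""
    name: str
    definition: str
    version: int


@dataclass
class PublishedRegistry:
    """Append-only store: each release adds a version, none is overwritten."""
    _releases: dict = field(default_factory=dict)

    def publish(self, name, definition):
        versions = self._releases.setdefault(name, [])
        element = DataElement(name, definition, version=len(versions) + 1)
        versions.append(element)
        return element

    def get(self, name, version=None):
        versions = self._releases[name]
        if version is None:
            return versions[-1]  # latest release
        if not 1 <= version <= len(versions):
            raise KeyError(f"{name} has no version {version}")
        return versions[version - 1]  # prior releases stay addressable


registry = PublishedRegistry()
registry.publish("PersonBirthDate", "Date on which a person was born.")
registry.publish("PersonBirthDate", "Calendar date on which a person was born.")
```

A consumer that was built against version 1 can keep calling `registry.get("PersonBirthDate", 1)` after version 2 is published, which is the property the list above describes.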
Benefits of metadata publishing
When classifying the benefits of metadata publishing, two groups are usually considered. External parties are usually consumers of information who are not part of the publishing organization. Internal parties are usually the various business units or departments within an organization.
Benefits to external parties
- Allows external systems (both people and agents) to have a clear understanding of the semantics of data elements in a system
- Allows third parties to build semantic maps between data models and to import and export data between systems
- Promotes service-oriented architectures and allows horizontal sharing of information between traditional information silos
- Allows systems to participate in accurately indexed and federated search processes
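The semantic maps mentioned above can be sketched as follows. The example assumes that two systems each publish a mapping from their local field names to shared data element names; the field names, element names and helper functions are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical published mappings: local field name -> published data element name.
SYSTEM_A_MAP = {"dob": "PersonBirthDate", "lname": "PersonFamilyName"}
SYSTEM_B_MAP = {"birth_date": "PersonBirthDate", "surname": "PersonFamilyName"}


def build_semantic_map(source_map, target_map):
    """Pair source and target fields that refer to the same published element."""
    target_by_element = {element: fld for fld, element in target_map.items()}
    return {
        src: target_by_element[element]
        for src, element in source_map.items()
        if element in target_by_element
    }


def translate(record, field_map):
    """Rename the fields of a record, dropping fields that have no mapping."""
    return {field_map[key]: value for key, value in record.items() if key in field_map}


a_to_b = build_semantic_map(SYSTEM_A_MAP, SYSTEM_B_MAP)
exported = translate({"dob": "1980-02-01", "lname": "Ng", "internal_id": 7}, a_to_b)
```

Because both systems refer to the same published element names, a third party can derive the field-to-field map without access to either system's internals.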
Benefits to internal parties
- Allows parties from various business units
- Makes extract, transform, load (ETL) operations more accurate for data warehousing
- Allows user interface designers to access a common pool
- Promotes model-driven architecture
Objections to metadata publishing
- Organizations that publish their metadata could make it easier for unauthorized people to find data if they breach the organization’s firewall
- Vendors that publish their metadata risk having customers create tools to export their data from the vendor’s systems, making it easier to migrate off those systems
Core processes in metadata publishing
The following are some of the core processes in metadata publishing:
- Gathering of metadata requirements
- Selection of metadata registry and metadata publishing tools
- Training of project participants in metadata concepts
- Stakeholder group formation
- Metadata harvesting
- Glossary consolidation
- Initial upper ontology construction (abstract data elements)
- Draft data element loading
- Data element review process
- Publishing approved metadata elements in a variety of output formats (see below)
- Creation and maintenance of versions and deprecation of redundant data elements
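The later steps in this list (draft loading, review, publishing and deprecation) can be thought of as a life cycle for each data element. The following is a minimal sketch of such a life cycle as a state machine; the status names and the allowed transitions are assumptions chosen for illustration, not a prescribed workflow.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in_review"
    APPROVED = "approved"
    PUBLISHED = "published"
    DEPRECATED = "deprecated"


# Hypothetical transition table: which status may follow which.
ALLOWED = {
    Status.DRAFT: {Status.IN_REVIEW},
    Status.IN_REVIEW: {Status.DRAFT, Status.APPROVED},  # reviewers may send a draft back
    Status.APPROVED: {Status.PUBLISHED},
    Status.PUBLISHED: {Status.DEPRECATED},
    Status.DEPRECATED: set(),
}


def advance(current, target):
    """Return the new status, or raise if the review process does not allow the move."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

With this table a draft element cannot be published directly; it has to pass through review and approval first, which is one way to enforce the review process in software.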
File format metadata publishing
Organizations that create applications can also publish the metadata definitions of their file formats. One common way to do this is with a compressed XML file format. The XML files can be uncompressed and validated against an external XML Schema. An example of this approach is the open source FreeMind tool.
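The uncompress-and-check step can be sketched as follows. The archive layout, the member name `metadata.xml` and the element names are hypothetical. The Python standard library does not validate against XML Schema, so this sketch only checks that the file is well-formed and has the expected structure; full schema validation would need a third-party validator such as lxml.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical metadata file; the element and attribute names are assumptions.
XML_DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<dataElements>
  <dataElement name="PersonBirthDate" version="2"/>
  <dataElement name="PersonFamilyName" version="1"/>
</dataElements>
"""


def pack(xml_bytes, member="metadata.xml"):
    """Store the XML in an in-memory ZIP archive and return the archive bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(member, xml_bytes)
    return buffer.getvalue()


def unpack_and_check(archive_bytes, member="metadata.xml"):
    """Uncompress the XML, parse it, and apply a hand-written structural check."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        root = ET.fromstring(archive.read(member))  # raises ParseError if not well-formed
    if root.tag != "dataElements":
        raise ValueError("unexpected root element")
    names = []
    for child in root:
        if child.tag != "dataElement" or not {"name", "version"} <= child.attrib.keys():
            raise ValueError("malformed dataElement entry")
        names.append(child.get("name"))
    return names


element_names = unpack_and_check(pack(XML_DOC))
```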
Metadata publishing formats
- HTML – used for browsing a website and indexing by text-based search engines
- Web Ontology Language (OWL) – used by the metadata search engine Swoogle
- XML Metadata Interchange (XMI) – an OMG standard for exchanging metadata
- Common Warehouse Metamodel (CWM) – an OMG standard for data warehouse metadata
- Topic maps – an ISO standard for the representation and interchange of knowledge, with an emphasis on the findability of information
- KM3 (Kernel Meta Meta Model) – used in the Metamodel Zoos. The AtlanticZoo is an open source library of more than 100 metamodels under the EPL license. KM3 is a simple domain-specific language for specifying metamodels, and a number of transformations are available from KM3 to other notations such as XMI
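To illustrate publishing the same approved element in more than one output format, here is a small sketch that renders one data element as an HTML fragment and as a generic XML element. The element, the XML vocabulary and the function names are hypothetical; a real publisher targeting OWL, XMI or CWM would follow those standards' own schemas rather than this ad hoc layout.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical approved data element.
ELEMENT = {"name": "PersonBirthDate", "definition": "Date on which a person was born."}


def to_html(element):
    """Render the element as a definition-list fragment for browsing and text indexing."""
    return "<dt>{}</dt><dd>{}</dd>".format(
        html.escape(element["name"]), html.escape(element["definition"])
    )


def to_xml(element):
    """Render the element as a small XML fragment for machine consumption."""
    node = ET.Element("dataElement", name=element["name"])
    ET.SubElement(node, "definition").text = element["definition"]
    return ET.tostring(node, encoding="unicode")


html_output = to_html(ELEMENT)
xml_output = to_xml(ELEMENT)
```

Keeping a single source record and generating each format from it means that a change approved in review appears consistently in every published output.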