mmx metadata framework
...the DNA of your data
MMX metadata framework is a lightweight implementation of OMG Metadata Object Facility built on relational database technology. MMX framework
is based on three general concepts:
Metamodel | MMX Metamodel provides a storage mechanism for various knowledge models. The data model underlying the metadata framework is more abstract in nature than metadata models in general. The model consists of only a few abstract entities... see more.
Access layer | Object oriented methods can be exploited using inheritance to derive the whole data access layer from a small set of primitives created in SQL. MMX Metadata Framework provides several diverse methods of data access to fulfill different requirements... see more.
Generic transformation | A large part of relationships between different objects in metadata model are too complex to be described through simple static relations. Instead, universal data transformation concept is put to use enabling definition of transformations, mappings and transitions of any complexity... see more.

MMX G4 Design Rationale

July 11, 2010 14:00 by mmx

The primary goals for the next major revision (G4) of MMX Metadata Framework are:

  • OMG MOF compliance, providing facilities to map metamodels created in MOF to MMX Data Model and realize those metamodels in MMX Repository;
  • XML Schema compliance, providing methodology for realization of any metamodel expressed as an XML Schema;
  • support for very basic workflow functionality directly in Core MMX Metamodel;
  • inclusion of Dublin Core attributes and simplified RBAC support directly in MMX Core Metamodel.  

Changes from G3 to G4 in MMX Physical Data Model

Data column Change Comments


remove UML association types (association, aggregation, composition) are supported via containment_ind and reliance_ind flags in G4
md_relation_type.semantic_type_cd remove Semantics of a relationship is expressed directly in relationship type in G4
remove Deprecated. Not supported in G4.
md_relation_type.containment_ind add Indicates that this is a 'contains' association ('aggregation' in UML terms).
md_relation_type.reliance_ind add Indicates that this is an 'owns' or 'relies on' association ('composition' in UML terms). Note that an 'owns' association is also a 'contains' association so to properly indicate an UML composition both flags should contain True. 
md_property_type.domain_cd change Refers to an Enumeration class in metamodel. Replaces domain_type_cd from G3, that combined the meaning of both domain_cd and datatype_cd. 
md_property_type.datatype_cd add Refers to one of the Datatype classes in Core Metamodel. 
md_object.public_ind add Indicates that the object is publicly visible and RBAC permissions for the object are not to be checked. 
md_object.mutable_ind add Indicates that the object can be changed. False would freeze the object. 
md_object.workflow_num add Can be used to tag an object with a workflow state. Note that there are no universal states defined: an application is free to interpret this figure however it chooses to.
md_object.security_id remove Deprecated. Simplified RBAC implementation in G4 does not provide support for RBAC Object concept, G4 RBAC API functions realize this functionality in a more direct way instead. 
md_poperty.domain_id remove Deprecated. In G4 Enumerations are tracked on M2 level.
md_property.value_id add In MOF, a property can be used to identify an object instance. In that case this field would contain object_id of an object. 

Note 1. Enumerations are treated as Core Metamodel classes derived from DataType class and are realized on M2 level. EnumLiterals are realized as instances of one particular Enumeration class and are therefore stored on M1 level, ie. md_object.object_type_cd points to an object_type_cd on M2 level (md_object_type.object_type_cd).

Note 2. Core Metamodel provides very simple facilities to implement basic workflow functionality. These simple facilities come in form of state_ind, public_ind, mutable_ind and workflow_num fields of md_object table that, in combination with published_, edited_, created_ and changed_ timestamps are sufficient to handle basic workflow management needs. The details of a specific workflow implementation are left to an application developers.

Note 3. A simplified, 'low-calorie' RBAC implementation is part of Core Metamodel. User, Role and Permission are implemented as classes of RBAC metamodel. However, RBAC Object is realized as a property of a Permission object. This allows Permissions with multiple Objects, and both class-based (referring a class in M2) and object-based (referring a concrete object in M1) permissions are possible, even in mixed manner. Similarly, RBAC Operation is realized as a property of a Permission object, with enumerated value list stored in M1 as EnumLiterals. Again, interpretation of these Operation tokens is up to an application 'owning' those operations. Finally, in addition to standard RBAC features classes Privilege and Pattern provide additional functionality required to build easy-to-use permission management applications. Privilege acts as a template for Permission objects, Pattern as a template for Role objects.

Trees and Hierarchies the MMX Way

March 30, 2009 22:34 by marx

Implementing trees and hierarchies in a relational database is an issue that has been puzzling many and has triggered numerous posts, articles and even some books on the topic. 

As stated by Joe Celko in Chapter 26, Trees [1]: "Unfortunately, SQL provides poor support for such data. It does not directly map hierarchical data into tables, because tables are based on sets rather than on graphs. SQL directly supports neither the retrieval of the raw data in a meaningful recursive or hierarchical fashion nor computation of recursively defined functions that commonly occur in these types of applications. <...> Since the nodes contain the data, we can add columns to represent the edges of a tree. This is usually done in one of two ways in SQL: a single table or two tables." The single table representation enables one-to-many relationships via self-references (parent-child) while more general, two table representation handles many-to-many relationships of arbitrary cardinality. Based on the principles of Meta-Object Facility (MOF), MMX implements both M1 (model) and M2 (metamodel) layers of abstraction. Two most important relationship types defined by UML, Generalization and Association, are realized.

Generalization is defined on M2 level and is implemented via SQL self-relationship mechanism. Each class defined in M2 must belong to one class hierarchy, and only single inheritance is allowed. In terms of semantic relationship types in Controlled Vocabularies [2], this is an 'isA' relationship. Associations (as well as aggregations and compositions) are realized as a relationship table (an associative or a 'join table') allowing any class to be related to any other class with an arbitrary number of associations of different type (with support for mandatory and multiplicity constraints). This implementation enables straightforward translation of metamodels expressed as UML class diagrams into equivalent representation as MMX M2 level class objects.

M1 level deals with instances of M2 classes and parent-child hierarchies here denote 'inclusion', 'broader-narrower' or structural relationships between objects ('partOf' relationship in Controlled Vocabularies world). UML Links are implemented as a many-to-many relationship table, with both parent-child and link relationships being inherited from associations defined on M2 level. This inheritance enables automatic validation of M1 models against M2 metamodels by defining general rules to reinforce the integrity of models based on the characteristics of respective metamodel elements.

('single table', parent-child, one-to-many)
('two tables', relationship table, many-to-many)
Class hierarchy ('isA'),
UML Generalization
UML Associations
Object hierarchies ('whole-part'),
UML Links
UML Links

There seems to be a huge controversy in data management community whether implementing hierarchies in SQL should employ recursion support built into modern database systems or not. While a technique employing manual traversal and management of tree structures is proposed by Joe Celko in [1], the book is 15 years old and meanwhile the world (and databases) have changed a bit. Recursion is now part of ANSI SQL-99 with most big players providing at least basic support for it, and in many cases arguable gain in performance without taking advantage of recursive processing makes way to the gain in ease and speed of application development with it.

MMX Framework encapsulates all the details of handling inheritance, traversing hierarchies, navigating linked object paths etc. in MMX Metadata API realized as a set of table functions (database functions that return table as the result) that can be easily mapped by Object-Relational Mappers [3]. The performance penalty paid for recursion that might be an issue in an enterprise scale DWH is not an issue here - after all, MMX Framework is designed for (and mostly used in) metadata management, where data amounts are not beyond comprehension. 

[1] Joe Celko's SQL For Smarties: Advanced SQL Programming, 1995.

[2] Zeng, Marcia Lei. Construction of Controlled Vocabularies, A Primer (based on Z39.19), 2005.

[3] Scott W. Ambler. Mapping Objects to Relational Databases, 2000.

MMX Knowledge Model

September 29, 2008 12:35 by marx

MMX Metamodel provides a storage mechanism for various knowledge models. The data model underlying the metadata framework is more abstract in nature than metadata models in general. The model consists of only a few abstract entities, most remarkably, OBJECT, RELATION, EXPRESSION and PROPERTY. The rest of the entities and relationships are 'hidden' behind these root objects and can be derived (inherited) by typifying those. Most of the structure of the data model normally exposed in ER diagram is therefore actually stored as data (meta-metadata). 

MMX Metamodel can be seen as a general-purpose storage mechanism for different knowledge models, eg. Frame system, Description Logic (RDF). Data models for these knowledge models are instantiated inside MMX Metadata Model by defining a set of classes (MD_OBJECT TYPE records) and relations (MD_RELATION_TYPE records) between them. MMX Metamodel corresponds to M3 level (metametamodel) in MOF terms housing both M2 (metamodel) and M1 (model) levels. An arbitrary number of different data models can exist inside MMX Metadata Model simultaneously with relationships between them. Each of these data models constitutes a hierarchy of classes where the hierarchy might denote an instance relationship, a whole-part relationship or some other form of generic relationship between hierarchy members (see Construction of Controlled Vocabularies).  

Several metadata models are predefined in MMX Metamodel, eg:

However, any other type of data model can be described in MMX, eg.

  • Business process elements (business rules, mappings, transformations, computational methods);
  • Data processing events (schedule, batch, task);
  • Data acquisition and transformation processes (container, step, extract, transform, load);
  • Data demographics, statistics and quality measures, etc., 
until the following conditions are fulfilled:
  • Each and every class is part of a primary hierarchy, implemented through parent key;
  • Every hierarchy has a root class denoting the data model;
  • Hierarchies need not be balanced;
  • Members of a hierarchy need not belong to the same class;

There are 2 basic methods of implementing a hierarchy that can be mixed:

  • a hierarchy of objects with different type: object type would infer an implicit name for a group (level);
  • a hierarchy of objects of the same type: no implicit names for levels are provided;


External references to MD_OBJECT and MD_RELATION metaobjects are referenced with URI-s that have different meaning in context of different metamodels. For external well-defined Internet resources this reference takes the form of a URL. In RDF context these references might refer to RDF vocabularies defined elsewhere. In cases of technical metadata (eg. relational model, file system, etc.) where standard URI schemes are not available new unregistered URI-s have to be be used.

Model Architecture

MMX Data Model is based on principles and guidelines of EAV (Entity-Attribute-Value) or EAV/CR (Entity-Attribute-Value with Classes and Relationships) Modelling technique suitable for modelling highly heterogenous data with very dynamic nature. Unlike in traditional ER design, data element names in EAV are stored as data instead of column names and data can only be interpreted with the help of metadata. Therefore modifications to schema on 'data' level can easily be done without physical modifications by just modifiying corresponding metadata.