Data Flow Management
How can you give Business Users autonomy in their daily data flow management?
Are the links between your data killing your data flow management and traceability?
Where are the Business Rules located that establish the integrity constraints needed to make safe associations between Business Objects?
This question becomes acute when data is scattered across several physical databases. In the left figure, every business object is located in a dedicated physical database. A more realistic system would also contain duplicated data, which adds a further difficulty. In this situation, the data integrity mechanisms provided by each database system can no longer be used, because they do not reach across database boundaries.
As a result, the Business Rules required to manage integrity between business objects are hard-coded within the data integration layer, such as an ESB. These rules are handled by IT specialists only. They are also duplicated and dispersed across every functional silo, inside every database, through triggers and other hard-coded software approaches.
This architecture creates a dangerous lack of data flow integrity and audit trail, since all business logic is hard-coded, including the referential integrity constraints that span databases. It is therefore out of line with business regulations and IS governance best practices that require complete data flow traceability, such as CobiT, Sarbanes-Oxley, Solvency II, IAS-IFRS, etc. To tackle this concern, MDM and Data Governance come into play and complement the ESB.
MDM and Data Governance are established to improve data flow management and traceability
First of all, a shared Data Model must be established. This Common Information Model (CIM) is a Business Data Model that spans physical database boundaries. It describes all links between business objects, whatever their location within the silos. The model is built through an iterative design lifecycle, without any tunnel effect or big-bang approach. For more information about the modeling procedures used to achieve this goal, please visit our sister community, MDM Alliance Group.
Then, the CIM is fully implemented within a Master Data Management (MDM) system, which stores the information required to manage and oversee data flows: business object identifiers (Primary and Foreign Keys), provider and consumer identifiers, referential integrity constraints (the outcomes of business rules), technical headers, and any other body data the requirements call for.
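As an illustration only, the kind of information listed above could be sketched in Python as follows. The names FlowRecord and MasterDataStore, and the in-memory storage, are assumptions made for this example; they do not describe any particular MDM product. The sketch shows a referential integrity check that spans business objects living in different physical databases.

```python
from dataclasses import dataclass, field


@dataclass
class FlowRecord:
    """One data flow message as an MDM might store it (hypothetical layout)."""
    object_type: str    # business object name in the CIM, e.g. "Order"
    primary_key: str    # identifier of the business object
    foreign_keys: dict  # link name -> (target object type, target key)
    provider: str       # system that emitted the flow
    consumers: tuple    # systems that receive the flow


@dataclass
class MasterDataStore:
    """In-memory stand-in for the MDM repository (an assumption, not a product API)."""
    known_keys: dict = field(default_factory=dict)   # object type -> set of keys
    audit_trail: list = field(default_factory=list)  # (record, errors) entries

    def register(self, object_type, key):
        """Declare that a business object with this key exists somewhere."""
        self.known_keys.setdefault(object_type, set()).add(key)

    def check(self, record):
        """Return the broken links of a record; an empty list means it is valid."""
        errors = []
        for link, (target_type, target_key) in record.foreign_keys.items():
            if target_key not in self.known_keys.get(target_type, set()):
                errors.append(f"{link}: {target_type} {target_key} not found")
        return errors

    def store(self, record, errors):
        """Save the record's key and keep an audit entry, errors included."""
        self.register(record.object_type, record.primary_key)
        self.audit_trail.append((record, tuple(errors)))


# Usage: the customer lives in one database, the order in another.
mdm = MasterDataStore()
mdm.register("Customer", "C-1")
good = FlowRecord("Order", "O-9", {"customer": ("Customer", "C-1")}, "CRM", ("Billing",))
bad = FlowRecord("Order", "O-10", {"customer": ("Customer", "C-404")}, "CRM", ("Billing",))
good_errors = mdm.check(good)
bad_errors = mdm.check(bad)
mdm.store(good, good_errors)
mdm.store(bad, bad_errors)
```

Because the constraint is held as data in one place rather than in triggers or ESB code, the same audit trail can be shown to business users as well as to IT.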
The MDM must provide Business Users with Data Governance features such as querying, version and permission management, authoring when data flows need to be complemented, a full business audit trail to guarantee data flow traceability, etc.
Could an MDM within the ESB layer cause performance problems?
To address this point, three types of integration are available. They can coexist, so that performance issues can be managed case by case:
First – Synchronous integration
Data flow is checked and saved within the MDM before sending to consumers.
Second – Partially synchronous integration
Data flow is only checked, then pushed to consumers. In other words, the data flow is stored within the MDM after it has been sent to consumers.
Third – Asynchronous integration
Data flow is checked and stored only after it has been sent to consumers.
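The difference between the three types comes down to the order in which a flow is checked, stored and sent. A minimal Python sketch of that ordering follows. The function integrate and its check, store and send callbacks are hypothetical names chosen for this example. The sketch also assumes that a flow failing its check is held back whenever the check runs before sending, which the description above does not specify.

```python
from enum import Enum


class Mode(Enum):
    SYNCHRONOUS = "synchronous"
    PARTIALLY_SYNCHRONOUS = "partially synchronous"
    ASYNCHRONOUS = "asynchronous"


def integrate(message, mode, check, store, send):
    """Run check, store and send in the order each mode implies.

    check(message) returns a list of errors, store(message, errors) saves the
    flow and its audit entry, send(message) delivers it to consumers.
    Returns the names of the steps in the order they ran.
    """
    steps = []

    if mode is Mode.ASYNCHRONOUS:
        # Consumers are served first; the MDM work happens afterwards
        # (in a real system it would run off the ESB's critical path).
        send(message)
        steps.append("send")
        errors = check(message)
        steps.append("check")
        store(message, errors)
        steps.append("store")
        return steps

    errors = check(message)
    steps.append("check")
    if mode is Mode.SYNCHRONOUS:
        store(message, errors)
        steps.append("store")
        if not errors:  # assumption: invalid flows are held back
            send(message)
            steps.append("send")
    else:  # partially synchronous: storage is deferred until after sending
        if not errors:  # assumption: invalid flows are held back
            send(message)
            steps.append("send")
        store(message, errors)
        steps.append("store")
    return steps


# Usage with trivial callbacks that record what happened.
sent, stored = [], []

def always_valid(message):
    return []

def always_invalid(message):
    return ["broken link"]

def keep(message, errors):
    stored.append((message, tuple(errors)))

sync_steps = integrate("m1", Mode.SYNCHRONOUS, always_valid, keep, sent.append)
partial_steps = integrate("m2", Mode.PARTIALLY_SYNCHRONOUS, always_valid, keep, sent.append)
async_steps = integrate("m3", Mode.ASYNCHRONOUS, always_invalid, keep, sent.append)
blocked_steps = integrate("m4", Mode.SYNCHRONOUS, always_invalid, keep, sent.append)
```

In the asynchronous case the invalid flow still reaches its consumers, but the error is recorded in the stored audit entry, so it can be reported and corrected afterwards.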
How to choose among these three types of integration?
It depends on your data governance objectives. You can begin with the easiest approach, “Asynchronous integration”, which has no impact on your current ESB architecture. Even with this type of integration, you obtain a full business audit trail that guarantees complete data flow traceability, together with error reports. When data flow errors are detected, you set up IS governance processes to restore a stable situation, relying on the full business data flow traceability that the MDM provides. When needed, you can involve your business users, since you now have business governance of data flows and not just an IT audit trail that business users cannot read.