The Architecture Musings | An Orientation to Integration Architecture

Integration is a complex architectural challenge. Often its goal is to make systems collaborate and interoperate. In this post, I'll look at how an architect might approach system integration challenges, and which aspects of an integration are most worth documenting. I'm not trying to reinvent the wheel or be prescriptive here, but rather to compile a starting point.

Setting The Stage (System-of-Systems)

A system of integrated systems is not a new discipline; it is, however, an opportunity to define complex systems. Such systems are known as Systems-of-Systems (SoS), which Mark W. Maier defines as follows:

A system of systems is an assemblage of components which individually may be regarded as systems, and which possesses five additional properties:
1. Operational Independence of the Components, 2. Managerial Independence of the Components, 3. Evolutionary Development, 4. Emergent Behavior, 5. Geographic Distribution

Let us begin with a quick definition of each attribute:

Operational Independence of the Components: If the SoS is disassembled into its component systems, each must be capable of operating meaningfully on its own. In other words, the components must provide customer and/or operator functions on their own.
Managerial Independence of the Components: The component systems not only can, but also do, function separately. The component systems are acquired and integrated individually, and they continue to operate independently of the SoS.
Evolutionary Development: The SoS does not appear fully formed. Its development and existence are evolutionary in nature, with functions and goals added, eliminated, and altered as experience allows.
Emergent Behavior: The SoS performs functions and meets goals that are not found in any component system. These behaviors are emergent characteristics of the overall SoS and cannot be localized to any individual component system. These behaviors achieve the fundamental goals of the SoS.
Geographic Distribution: The component systems cover a broad geographical area. As communications improve, "large" becomes a fuzzy notion, but at a minimum it signifies that the components can readily exchange only information, not substantial amounts of mass or energy.

The Architectural Concern

Is this required from the IT architect's point of view? Integration is an old yet fresh topic that has already been dealt with in a variety of ways, in valuable books such as “Software Architecture in Practice” and “Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions”, and in many more publications. I am not criticizing the good work in these references; on the contrary, I studied them, along with other papers on system integration, and used them in this piece. The intention here is not to focus on the generic integration problem. The emphasis in this post is on integration with the system of systems in mind, and thus on the taxonomies connected to that.

This post contains no recipe or explanation of a complete solution. In an architectural context, the emphasis is instead on having the best available classification at hand, which provides a clearer understanding.

Integration Approach Taxonomy

Deciding not to decide is a decision in its own right!

In his book "The Software Architect Elevator", Gregor Hohpe talks about one of the important roles of an architect, making decisions, and quotes Martin Fowler:

One of an architect’s most important tasks is to eliminate irreversibility in software design.

We can generalize this to integration architecture: before selecting an integration pattern, an architect must be clear on the nature of the integration to be carried out, as this will have a significant impact on the type of integration pattern chosen.

It's a good idea to remind ourselves that Architectural Patterns are simply a collection of design decisions.

While the need for a sense of purpose may appear evident, selecting a pattern is not an easy decision, for a couple of reasons:

Individual systems serve distinct functions, hence we should expect different behavior.
Developers rarely articulate their intentions for building them or the issues at stake when they are used.

Hence, an architect must establish what type of system (integrated system, so to speak) is intended, since it will be the driving force behind essential architectural decisions. In addition, the architect has to invest some effort in adapting the essence of a pattern (its threats and opportunities) to its expected use.

The key integration patterns are distinguished by a couple of design factors, most importantly flow (or control) and data. Let us take a closer look.

Data can be Shared (public) or Isolated (private). If data is shared, any system involved in the integration can access it. In reality, only a few integrated systems give all other systems (public) access to their data. At the same time, a set of integrated systems that shares no data at all is impossible to imagine, because the systems involved would have no way to interact.

Flow, at one end of the spectrum, may be organized hierarchically across a specific sequence of activities, so that one system controls another. At the other end, none of the systems controls any other; to share data they may at some point operate in a loose hierarchy while still retaining individual control.

A well-known example shows that it is not contradictory to regard every single system in the integration as independent from a management and operation perspective while treating the whole set as one system:

Imagine that a payment system and an online store are totally autonomous entities, although the online store employs the payment system as part of its payment flow during a sale transaction. As a result, from the standpoint of that precise activity sequence, the online store possesses limited control.
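
The store and payment example can be sketched in a few lines of Python. This is only an illustration under assumptions: the class names `PaymentSystem` and `OnlineStore`, the `charge` and `checkout` methods, and the rule that rejects non-positive amounts are all hypothetical and do not describe any real payment API.

```python
from dataclasses import dataclass, field


@dataclass
class PaymentSystem:
    """Hypothetical autonomous system: it accepts charges from any caller."""
    ledger: list = field(default_factory=list)

    def charge(self, payer: str, amount_cents: int) -> bool:
        if amount_cents <= 0:
            return False  # the payment system applies its own (assumed) rules
        self.ledger.append((payer, amount_cents))
        return True


@dataclass
class OnlineStore:
    """Hypothetical autonomous system that drives the payment system only at checkout."""
    payments: PaymentSystem
    orders: list = field(default_factory=list)

    def checkout(self, customer: str, amount_cents: int) -> str:
        # Limited control: the store decides *when* to charge,
        # the payment system decides *whether* the charge succeeds.
        if self.payments.charge(customer, amount_cents):
            self.orders.append((customer, amount_cents))
            return "confirmed"
        return "rejected"


payments = PaymentSystem()
store = OnlineStore(payments)
print(store.checkout("alice", 2500))         # confirmed
print(store.checkout("bob", 0))              # rejected
# The payment system keeps serving other callers without the store.
print(payments.charge("utility-bill", 900))  # True
```

The store controls the activity sequence of a sale, yet either system could be replaced or operated without the other, which is the combination of limited control and independence described above.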

Real-world systems probably involve more choices than the data and flow aspects alone. These can stem from constraints imposed by legacy systems, ownership challenges, or even the quality measures that an architect is aiming to achieve.

Nearly every goal may be achieved in a computing system with enough allocated resources (time and money); nevertheless, the architectural pattern chosen will affect how easily the flow and data aims are met.

Service Oriented and Agent Based architectures, for instance, are far less suitable for systems with resource constraints. Whenever quality attributes such as Availability and Performance are the motivation for the architecture, finer-grained integration along with tighter control over the activity sequence is often preferred. In such cases architects had better consider patterns like RPC, which give a higher level of control. On the other hand, centralized control tends to stifle innovation and progress in a predominantly hierarchical society.

This is indeed a vital decision for an architect to make: the choice of a system-wide pattern. Whatever pattern is decided on, it will affect the whole integrated environment. The intention of bringing up the Data and Flow taxonomy here is not to endorse it specifically for choosing a pattern, but rather to clarify the complexities an architect faces in such choices.

Not to mention that in practice the choice is frequently quite complex: it is not binary, and it often involves pre-existing decisions about design principles that the architect must consider.

Development Taxonomy

The first stage in engineering (particularly architecting) a system of integrated systems is to establish the system's nature and context. As a result, I feel it is critical to provide classifications that help us better understand the development environment, scope, and constraints.

A development approach that drastically limits future design plans is decided at this stage. So, it is better to know upfront whether we are building an entirely new system of systems or merely a new system within a larger context of integrated systems. Whether to consider prior developments should also be decided explicitly.

Scope

The decision on Scope is key because it determines whether an architect must consider existing decisions on the system of integrated systems, or is in a position where decisions must be made actively because the System of Systems is yet to be defined.

Typically, an architect must make decisions regarding both establishing the system as a “whole” (a complex of linked systems) and forming the individual system in that context at the same time.

Overall, the more complex the scope, the greater the impact on the design.

Context

Understanding the development context is the other factor of the development taxonomy. Oftentimes, systems are integrated incrementally rather than with the larger picture in mind. There are various decisions that can only be taken after establishing the context. The development context can be categorized into three groups: Greenfield, Brownfield, and Closed Source, and the following scenarios can be imagined based on the nature of the job (whether the architect is thinking about an individual system or the SoS):

	Greenfield	Brownfield	Closed Source
Individual System	Develop a new system to be integrated into an existing larger complex of integrated systems.	Alter an existing system so it can be incorporated into an existing larger complex of integrated systems.	Bundle an existing system into an existing larger complex of integrated systems.
System as a whole	Develop a new system without taking legacy into consideration.	Develop a new system by introducing new APIs and deprecating old ones.	Wrap (rather than change) an existing system to provide an interface to the existing environment.

As shown in the table above, different contexts imply significantly different architectural limitations that determine viable integration methods. In reality, the boundaries are not so clear.

Obviously, in the Closed Source group we are dealing with a black box, which can be more challenging due to the lack of access to the system's implementation.
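
The scope and context table above can also be kept as a small lookup, which makes the chosen scenario explicit in a decision record. The sketch below is only one possible encoding; the names `Scope`, `Context`, `SCENARIOS`, and `scenario` are hypothetical, and the scenario texts are shortened from the table.

```python
from enum import Enum


class Scope(Enum):
    INDIVIDUAL_SYSTEM = "individual system"
    SYSTEM_AS_A_WHOLE = "system as a whole"


class Context(Enum):
    GREENFIELD = "greenfield"
    BROWNFIELD = "brownfield"
    CLOSED_SOURCE = "closed source"


# One entry per table cell; wording shortened from the table above.
SCENARIOS = {
    (Scope.INDIVIDUAL_SYSTEM, Context.GREENFIELD):
        "Develop a new system to integrate into an existing complex of systems.",
    (Scope.INDIVIDUAL_SYSTEM, Context.BROWNFIELD):
        "Alter an existing system so it can join an existing complex of systems.",
    (Scope.INDIVIDUAL_SYSTEM, Context.CLOSED_SOURCE):
        "Bundle an existing system into an existing complex of systems.",
    (Scope.SYSTEM_AS_A_WHOLE, Context.GREENFIELD):
        "Develop a new system without taking legacy into consideration.",
    (Scope.SYSTEM_AS_A_WHOLE, Context.BROWNFIELD):
        "Develop a new system by introducing new APIs and deprecating old ones.",
    (Scope.SYSTEM_AS_A_WHOLE, Context.CLOSED_SOURCE):
        "Wrap an existing system to provide an interface to the environment.",
}


def scenario(scope: Scope, context: Context) -> str:
    """Return the development scenario for a scope/context combination."""
    return SCENARIOS[(scope, context)]


print(scenario(Scope.INDIVIDUAL_SYSTEM, Context.CLOSED_SOURCE))
```

Because real boundaries are blurry, a project may well sit between two cells; the lookup is meant as a prompt for discussion, not as a classifier.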

Integration Function

Another challenge that needs to be addressed right away is the function or aim of integration. This will frequently require definitions at two levels: the objective of the system integration as a whole versus the function of an individual interface. A few well-known potential functions are:

One-Directional (Inform): A system needs to provide information to one or more other systems. Based on the information provided, the Consumer/Receiver system decides how to act.
Bi-Directional (Sync): To keep each other in sync, more than one system must share information. There is no specified Provider and Consumer relationship in the systems. In this case, too, the Consumer or Receiver system selects how to act based on the information provided.
Control: One system manages the other systems. The direction of control may change over time, or distinct facets may be controlled in different directions. In contrast to One-Directional and Bi-Directional (which are types of information exchange), here the Provider or Sender is already aware of the reaction of the Consumer or Receiver system.
Negotiation: To achieve their specific goal, many systems must negotiate. This typically entails specific patterns, such as auctions, and may include the above-mentioned goals as special circumstances.

Technical Integration Attributes Taxonomy

Although it is vital to understand the overall context, the taxonomies presented thus far have been relatively abstract. The emphasis now is very much on specific technical attributes of integration that may be relevant to individual interactions. Indeed, this taxonomy must always be viewed in conjunction with the previous two.

The main purpose of such a technology-oriented taxonomy is to provide a complete classification that can be used to identify relevant patterns for overcoming integration challenges. These taxonomies are not coarse-grained or generic, but rather narrow. They should not be used to determine the overall integration of a system as a whole (SoS).

It is always necessary to determine the various elements of the system and how tightly they should be integrated. Having this information in hand will lead an architect to the right integration pattern.

Integration Level

The integration level describes how tightly the various systems (or, better, applications!) are integrated with one another. Since there is no widely accepted list of integration categories, I will stick to the one proposed by Rick Kazman, Claus Nielsen, and Klaus Schmid in their SEI paper:

Data Level - Information Exchange: One system supplies data that some other system uses as part of its routine operations. The technical concern in this case is common data access.
Service Level - Basic Behavior Interaction: One system utilizes another's capabilities. This can be as straightforward as a request and response, which necessarily includes Data Level information exchange as part of its communication.
Business Process Level - Complex Behavior Interaction: There is a complex interaction between the various systems in this scenario. In SOA, this is frequently referred to as choreography and orchestration.
UI Level - User Interface Sharing: In this instance, multiple systems may be required to share the user interface without even knowing about each other.

Data Abstraction Level

This classification adds granularity to the Data Level category of the Integration Level. Any communicated information requires a mutual understanding, which in turn requires a baseline at various levels. Typical levels are highlighted here:

Structural: States how to comply with the relevant standards that describe data exchange. This could be a low-level standard that does not fully specify the details of the data format.
Syntactic: States how data is exchanged in the right format. Syntactic indicates that higher-level data types are correctly mapped.
Semantic: States how to provide the correct data as part of the data exchange.

Many of the patterns I could find at the time of writing do not enforce any particular level.
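
To make the three levels concrete, here is a rough sketch of how a receiving system might check an incoming payload level by level. It is one interpretation, not a standard: JSON as the exchange standard, the `order_id`, `amount`, and `currency` fields, and the `KNOWN_CURRENCIES` set are all assumptions made up for the example.

```python
import json

KNOWN_CURRENCIES = {"EUR", "USD"}  # assumed shared vocabulary


def parse_structural(raw: str):
    """Structural: the payload follows the agreed exchange standard (here JSON)."""
    try:
        return True, json.loads(raw)
    except json.JSONDecodeError:
        return False, None


def check_syntactic(doc) -> bool:
    """Syntactic: expected fields exist and map to the expected data types."""
    return (
        isinstance(doc, dict)
        and isinstance(doc.get("order_id"), str)
        and isinstance(doc.get("amount"), (int, float))
        and not isinstance(doc.get("amount"), bool)
        and isinstance(doc.get("currency"), str)
    )


def check_semantic(doc) -> bool:
    """Semantic: the values carry a meaning both sides accept."""
    return doc["amount"] > 0 and doc["currency"] in KNOWN_CURRENCIES


def highest_level_passed(raw: str) -> str:
    ok, doc = parse_structural(raw)
    if not ok:
        return "none"
    if not check_syntactic(doc):
        return "structural"
    if not check_semantic(doc):
        return "syntactic"
    return "semantic"


print(highest_level_passed('{"order_id": "A1", "amount": 10, "currency": "EUR"}'))
print(highest_level_passed('{"order_id": "A1", "amount": -5, "currency": "EUR"}'))
print(highest_level_passed('{"order_id": 7}'))
```

The three calls report `semantic`, `syntactic`, and `structural` respectively, which shows that each level presupposes the one below it.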

Data Level Integration

Data level integration is concerned with how data is integrated; it seems to have a considerable impact on the strength of coupling of different systems and, as a result, the optimal integration pattern. In any case, this is another decision that an architect must make ahead of time in order to determine the data level. We will investigate some common mechanisms:

File Transfer: This is the strategy with the least coupling. Both sides of the integration agree on the file format to be exchanged. However, in terms of events generated, this is a constraining mechanism.
Message Exchange: A loosely coupled technique that has been highlighted by noteworthy research by Hohpe and Woolf, among others. Messages are exchanged between individual systems to accomplish integration in this approach.
Streams: Oftentimes, systems communicate via continuous data streams such as video streams. A stream is essentially a continuous message whose concerns include retention, minimizing delays and disruptions, and accessibility within a short time span.
Common Data: Several individual systems communicate and access data in an interconnected manner using this technique, which is fundamentally different from Messaging and Streaming. There appear to be consistency assurance difficulties with this technique, which is a story for another day.

Employing more than one technique is normal, and the choice is primarily determined by the data type. However, a single pattern typically tackles only one data level integration mechanism, and only for the information relevant to it.
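
The difference between the first two mechanisms can be shown with a deliberately tiny sketch that uses only the Python standard library. A temporary directory stands in for an agreed file drop location and an in-process `queue.Queue` stands in for a message channel; both are stand-ins chosen for the example, and the file name `orders.json` and the `order` payload are made up.

```python
import json
import queue
import tempfile
from pathlib import Path

order = {"order_id": "A1", "amount": 10}

# File Transfer: both sides agree only on a file format and a location.
with tempfile.TemporaryDirectory() as drop_dir:
    path = Path(drop_dir) / "orders.json"
    path.write_text(json.dumps([order]))               # producer writes a batch
    received_from_file = json.loads(path.read_text())  # consumer reads it later

# Message Exchange: each change travels as its own message through a channel.
channel = queue.Queue()
channel.put(json.dumps(order))                         # producer sends
received_from_message = json.loads(channel.get())      # consumer receives

print(received_from_file)
print(received_from_message)
```

The file variant naturally carries batches and gives the consumer no signal that something new has arrived, while the message variant delivers one event at a time; a real integration would replace the in-process queue with a messaging system.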

Interaction Mode

Interaction Mode is tied to Integration Function and explains the form an integration takes. Here the focus is on the interaction mode of an individual system.

Send: Information is sent unidirectionally from one system to one or more others.

Call: In this mode, a system not only sends data synchronously but also initiates specific actions. In practice, this mode is normally implemented as either Call and Return or Call and Callback.

Call and Return: An enhanced version of Call in which a system calls another system synchronously and thereby hands over control to the other end; it waits until the other system responds and returns the results.

Call and Callback: An asynchronous form of Call and Return in which control is not passed to the other end of the integration; instead, the called system later makes an indirect call back to the originating system.

Time Based: A mode of interaction in which the behavior of the involved systems is synchronized via timing. This mode has a few variants based on the well-known concepts of Pulling (calling the source system at regular intervals) and Pushing (the source system sends the data out, again at regular intervals). Obviously, each has its own drawbacks!

Multi Call: An enhancement to the Call-oriented modes (Call, Call and Return, Call and Callback) that focuses on several interactions rather than just one (one-to-many calls, for instance). This mode can handle more complicated scenarios, mostly for negotiation use cases.
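
Three of the modes above (Send, Call and Return, Call and Callback) can be contrasted in a single in-process sketch. Everything here is hypothetical: the `Inventory` class, its method names, and the explicit `process_pending` step that simulates the called system answering later. A real integration would cross a network boundary.

```python
from typing import Callable


class Inventory:
    """Hypothetical called system exposing three interaction modes."""

    def __init__(self) -> None:
        self.stock = {"book": 3}
        self.inbox = []    # events received via Send
        self.pending = []  # queued (item, callback) requests

    # Send: the sender pushes information and expects nothing back.
    def notify(self, event: str) -> None:
        self.inbox.append(event)

    # Call and Return: the caller waits until the result is returned.
    def count(self, item: str) -> int:
        return self.stock.get(item, 0)

    # Call and Callback: the request is queued; the answer arrives later
    # through the callback supplied by the caller.
    def count_later(self, item: str, callback: Callable[[int], None]) -> None:
        self.pending.append((item, callback))

    def process_pending(self) -> None:
        while self.pending:
            item, callback = self.pending.pop(0)
            callback(self.stock.get(item, 0))


inventory = Inventory()
inventory.notify("price-changed")              # Send
print(inventory.count("book"))                 # Call and Return: 3

answers = []
inventory.count_later("book", answers.append)  # Call and Callback
print(answers)                                 # [] (the caller carried on)
inventory.process_pending()
print(answers)                                 # [3]
```

Time Based and Multi Call are left out to keep the sketch short; Pulling would amount to invoking `count` on a timer, and Multi Call to several such exchanges that belong together.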

Now that we have defined distinct interaction modes, let's map some Integration Functions to Interaction Modes:

Function	Mode
One-Directional	Send, Time Based
Bi-Directional	Send, Call and Return, Call and Callback
Control	Possibly a variation of Call
Negotiation	Multi Call

Be wary about generalizing these types of mappings; they are not prescriptive.
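
If the mapping is kept alongside other design notes, it can be written down as plain data. The sketch below simply transcribes the table; the names `FUNCTION_TO_MODES`, `candidate_modes`, and `functions_using` are hypothetical, and "Control" is recorded as `Call` even though the table only says "possibly a variation of Call".

```python
FUNCTION_TO_MODES = {
    "One-Directional": ["Send", "Time Based"],
    "Bi-Directional": ["Send", "Call and Return", "Call and Callback"],
    "Control": ["Call"],  # the table says "possibly a variation of Call"
    "Negotiation": ["Multi Call"],
}


def candidate_modes(function: str) -> list:
    """Return candidate interaction modes for an integration function."""
    return FUNCTION_TO_MODES.get(function, [])


def functions_using(mode: str) -> list:
    """Return the integration functions that list the given mode."""
    return sorted(f for f, modes in FUNCTION_TO_MODES.items() if mode in modes)


print(candidate_modes("Bi-Directional"))
print(functions_using("Send"))  # ['Bi-Directional', 'One-Directional']
```

As with the table, the result is a list of candidates to consider, not a rule to enforce.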

Quality of Integration

The importance of quality in integration cannot be overstated. Therefore, in this section, I have compiled a non-exhaustive summary of the most relevant (or perhaps most popular) quality characteristics that your stakeholders may require of the actual integration. Please do not mix them up with the quality attributes of the applications themselves!

Reliability: The integration ought to be reliable enough that, in the worst case, any difficulties with the integration are detected by the originating system.
Performance: Refers to how efficiently the integration works. One easy symptom check for performance is that the integration does not have too many transitional stages, too much data transfer, or too many executions.
Security: This attribute pertains to the integration's security. For illustration, being able to validate the data source system.
Availability: Both ends of the integration must be available during the integration's lifetime.
Interoperability: This facet primarily concerns communication technology issues such as data management, architectural incompatibility, and the like.
Scalability: The integration must not only be scalable over a large(r) number of systems, but it must also perform correctly irrespective of the number of systems involved or the volume of data exchanged.
Manageability: Alludes to the simplicity with which management operations such as adding/removing/changing nodes, or better yet, systems participating in integration, may be performed.
Consistency: The integration of systems assures the legitimacy and integrity of the data sent between them.

It is worth noting that any number of patterns and techniques can achieve integration while also serving these qualities, and a given pattern may be combined with all of them.
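
As an illustration of the Reliability attribute, the sketch below shows one simple way for the originating system to detect a failed delivery: wait for an acknowledgement, retry a few times, and raise an error if none arrives. This is just one technique among many; `send_with_ack`, `DeliveryFailed`, and the `flaky_receiver` test double are hypothetical names invented for the example.

```python
class DeliveryFailed(Exception):
    """Raised so the originating system learns that delivery did not succeed."""


def send_with_ack(send, message, attempts=3):
    """Call `send` until it acknowledges (returns True) or attempts run out.

    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, attempts + 1):
        try:
            if send(message):
                return attempt
        except ConnectionError:
            pass  # treat a transport error like a missing acknowledgement
    raise DeliveryFailed(f"no acknowledgement after {attempts} attempts")


calls = {"n": 0}


def flaky_receiver(message):
    """Test double: fails twice with a transport error, then acknowledges."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("link down")
    return True


print(send_with_ack(flaky_receiver, "order-42"))  # 3
```

Retrying only makes sense when the receiver can tolerate duplicates, which ties Reliability back to the Consistency attribute above.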

Levels of Information System Interoperability (LISI)

Morris and colleagues presented a well-known model for System of Systems Interoperability that includes the levels indicated below:

0- “Isolated interoperability in a manual environment between stand-alone systems”: The main characteristic of Level 0 is human intervention aimed at providing interoperability in situations where systems are segregated from one another.

1- “Connected interoperability in a peer-to-peer environment”: The main aspect of Level 1 is physical connectivity, which allows for direct communication between systems. Systems at this level use a well-established electronic link with distinct peer-to-peer connections.

2- “Functional interoperability in a distributed environment”: The capacity of separate applications to communicate and utilize independent data components in a direct or distributed manner across systems is a major aspect of Level 2. Systems at this level are required to share and process sophisticated (heterogeneous) media.

3- “Domain-based interoperability in an integrated environment”: Level 3's defining characteristic is a domain perspective, which comprises domain data models and procedures for sharing data among independent applications that may begin to work together in an integrated fashion.

4- “Enterprise-based interoperability in a universal environment”: Level 4 is distinguished by a top-level approach that encompasses enterprise data models and procedures, in which data is effectively exchanged with applications that collaborate across domains in a universal access environment.

Knowing which level of integration must be attained before developing the actual system is essential for identifying the most suitable patterns, as certain patterns only address specific levels of integration, and in different ways.
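
Since the levels are ordered, they lend themselves to a simple ordered type, which makes it easy to state a current and a target level and list what lies between them. The sketch below assumes nothing beyond the five levels listed above; the names `LisiLevel` and `levels_to_climb` are hypothetical.

```python
from enum import IntEnum


class LisiLevel(IntEnum):
    ISOLATED = 0    # manual environment
    CONNECTED = 1   # peer-to-peer environment
    FUNCTIONAL = 2  # distributed environment
    DOMAIN = 3      # integrated environment
    ENTERPRISE = 4  # universal environment


def levels_to_climb(current: LisiLevel, target: LisiLevel) -> list:
    """Return the levels above `current` up to and including `target`."""
    return [level for level in LisiLevel if current < level <= target]


steps = levels_to_climb(LisiLevel.CONNECTED, LisiLevel.DOMAIN)
print([level.name for level in steps])  # ['FUNCTIONAL', 'DOMAIN']
```

Recording the target level this way keeps the pattern discussion anchored to an explicit goal.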

PAID Attributes

The PAID attributes are part of the LISI model and specify which integration aspects are supported. LISI classifies different facets of information system interoperability into four broad, interconnected attributes: Procedures, Applications, Infrastructure, and Data (PAID). PAID provides a technique for specifying the set of attributes required for exchanging information and services at each level. It outlines a methodology that results in interoperability profiles. LISI focuses on technological interoperability and the complexity of system interoperations but does not address environmental or organizational difficulties that contribute to the development and maintenance of interoperable systems.

Procedure: It includes all specified guidance and operational controls that affect all aspects of system development, integration, and operational functionality. This attribute addresses the precise implementation options chosen for a system or systems, and the overall standards and design guidelines for an organization. It consists of operational and functional program development guidelines, along with technical and system architecture standards. The items that comprise the Procedure attribute are divided into four broad groups that span the interoperability levels:
- Standard: Technical Standards, Technical Architectures, and Common Operating Environment.
- Management: Mission, Doctrine, Systems Requirement Definitions, Installation, and Training
- Security Policy: Classified, Unclassified, or Secret
- Operations: Network, E-Mail Servicing, and Bandwidth Considerations.
Application: It includes the essential purpose and function for which any system is designed, such as the system mission. The functional requirements given by users to conduct an operational task are the software application's core substance. Whether the task is basic word processing or advanced nuclear targeting, the functions performed and the applications that enable them reflect the system's capabilities to the user. For effective interoperability to happen, identical capabilities or a shared understanding of the information exchanged needs to exist across systems; without it, users have no common set of standards.
Infrastructure: It facilitates the creation and the use of a "connection" across systems or applications. This connection could be a basic, very low-level transfer, or it could be a network of wireless IP networks with varying levels of security. Infrastructure also contains "system services," which aid in the operation and interaction of systems. These are elements such as communication protocol stacks and object request brokers that are utilized by functions to establish and influence system interactions. Infrastructure also includes the security equipment and technical capabilities employed to implement security measures.
Data: It focuses on the data that the system processes. This attribute is concerned with both the syntax and the semantics of the data. It covers all data forms that serve every tier of a system's operations, from the OS and communications infrastructure to the entire range of end-user applications. The data attribute encompasses all information styles and formats, including free text, formatted text, databases (formal and informal), video, sound, photography, visual (map) information, and so on. As a result, the data attribute is clearly the most important part of achieving system interoperability. A large part of today's effort toward establishing interoperable systems is occurring inside this attribute, such as defining standard file formats, database standards, and data definitions.

Final words

In this post I have covered the Development and Technical Taxonomies. My proposal is to document all the attributes listed in the various categories for your integration scenarios. This is essentially a record of the decisions you, as a Solution Architect, took in relation to the integration scenarios in question.
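
One lightweight way to keep such a record is a small data structure with one field per taxonomy, so that undecided attributes stand out. The sketch below is only a suggestion: the `IntegrationRecord` class, its field names, and the example values are hypothetical and should be adapted to whatever documentation format you already use.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class IntegrationRecord:
    """Hypothetical decision record covering the taxonomies in this post."""
    name: str
    scope: str              # Individual System / System as a whole
    context: str            # Greenfield / Brownfield / Closed Source
    function: str           # One-Directional / Bi-Directional / Control / Negotiation
    integration_level: str  # Data / Service / Business Process / UI
    data_abstraction: str   # Structural / Syntactic / Semantic
    data_mechanism: str     # File Transfer / Message Exchange / Streams / Common Data
    interaction_mode: str   # Send / Call and Return / Call and Callback / ...
    lisi_level: int         # 0 to 4
    qualities: list = field(default_factory=list)

    def missing(self) -> list:
        """Names of attributes still undecided (empty string or empty list)."""
        return [key for key, value in asdict(self).items() if value in ("", [])]


record = IntegrationRecord(
    name="store-to-payment",
    scope="Individual System",
    context="Closed Source",
    function="Control",
    integration_level="Service Level",
    data_abstraction="Semantic",
    data_mechanism="Message Exchange",
    interaction_mode="Call and Return",
    lisi_level=2,
)
print(record.missing())  # ['qualities']
```

A record like this can be reviewed alongside the architecture documentation, and `missing()` gives a quick check that no taxonomy was skipped.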

Cheers,
Mohammad Malekmakan

..and Some References

Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions
Architecting Principles for Systems-of-Systems
Pattern-Oriented Software Architecture
Collaborative Virtual Environments - Digital Places and Spaces for Interaction

https://calhoun.nps.edu/handle/10945/6207
https://pdfs.semanticscholar.org/4c9d/1d30ad20ecbd1a6bca71741c2cbd5b68827a.pdf
http://web.cse.msstate.edu/~hamilton/C4ISR/LISI.pdf
https://lirias.kuleuven.be/retrieve/171909
http://jite.org/documents/Vol1/v1n3p201-211.pdf
https://www.knowledge-communication.org/pdf/schmeil-eppler-iknow-08.pdf
https://resources.sei.cmu.edu/asset_files/TechnicalReport/2004_005_001_14375.pdf

Disclaimer: All opinions and content published in my blog and my social networks are solely my own, not those of my employer(s) and the communities I am contributing in.