White Paper
In our days of digital omni-interconnection, “#integration” is one of the top buzzwords, as “#innovation” and “#architecture” were recently. It seems that all IT specialists – developers, designers, architects, and managers – need a better understanding and use of the term “#integration”. For example, what is the difference between “#integration” and “#interaction”? Also, if one re-keys data from one system into another on a continuous basis, is this integration between the systems? We’ll try to answer these questions and explain the majority of related questions about integration, starting from the major one: what is integration for?
Is Integration a Panacea?
In real-world business, tasks exist at one of several levels of granularity, and we have to apply a few artificial scales to identify these levels for a particular case; for another case (context) the levels may be different. However, there is a universal rule that every business follows: a certain number of particular business tasks can be combined/composed to resolve another business task; the task granularity levels may be mixed and are not really important for such a composition. If we represent each business task by a “technical system”, we can talk about composing several systems to resolve a common business task.
Integration is one of two major mechanisms that let systems work in composition to solve a task together. The other mechanism is interaction. Behind them stands another pair of terms, of which one is a buzzword and the other is almost forgotten: collaboration and cooperation. Collaboration is about working together. However, cooperation is also about working together… Well, the devil is in the details, right?
Historically, people worked together by cooperating with each other. Nowadays, small businesses in towns and countries still cooperate, and we know several Cooperative Banks. Collaboration appeared relatively recently and means a type of joint work where the goal/task of this work becomes one of the tasks of each participant. Since our company or team has a new task/goal derived from the joint work, we likely do something special in our organisation to reach this goal. In other words, collaboration assumes (though does not necessarily demand) some internal changes in the participants of the collective work. In contrast, cooperation does not assume any internal changes for the sake of collective work. Cooperating entities are used or engaged as-is and may even be unaware of their participation in the cooperation. As a result, cooperating entities can participate in as many joint works as they want with no additional effort. Unfortunately, this is not true for collaboration.
If you, our reader, have clearly understood the difference between collaboration and cooperation, it will be very easy for you to understand the difference between integration, which realises collaboration, and interaction, which realises cooperation. For example, if a person drives a car, it is integration because of the physical and emotional effort the driver spends on driving; the same person taking a ride on a bus merely cooperates with the bus (apart from possibly paying a fare).
We can already draw two conclusions:
- Integration between systems requires certain changes in one or both systems while interaction does not.
- If a system is created with a lot of interfaces and related internal mechanisms for collective work with other systems, i.e. it has been designed for minimum changes over its life-cycle, this system usually cooperates with other systems while the related businesses may collaborate as much as needed.
Pseudo-Integration
From the early days of connecting systems to the sources of the data they need, technologists have called such connectivity “integration”. This is nothing but jargon because no integration takes place when a system takes data from a file, from a data store, or via the User Interface. The system and the data source have no common, shared business task and do not need to work together on resolving/realising a task. The system and the data source exist on their own for their own purposes.
Here are two examples of pseudo-integration. First, a person sits in front of the displays of two systems, reads the data from one system’s display (or from a hard copy of the data on a piece of paper) and types the data into the other system. This is not a fairy tale – in my practice, I saw such processes in a couple of insurance companies in London just a few years ago. So, the data residing in one system were needed in another system. These data sets might be in any data container and collected from many different sources, even ones unrelated to the first system, i.e. we cannot talk about integration between the two systems.
Second, we have two systems and a stand-alone database. Both systems have the same access to this database. All IT specialists are perfectly aware of the old-days “integration” in which one system placed data into the database and then another system fetched this data. At that time, the popular solution for such “integration” was a process/pattern called Extract-Transform-Load (ETL). It is widely used nowadays as well. This is classic data input into an intermediary data store, which does not constitute any integration: any system with database access rights can get and even delete the data, i.e. the systems are not integrated but share the same source of data. Neither system knows about the other, and each can work without the other as long as somebody places data in the database. In the financial industry, an alternative data source for the same data is commonplace, and applications that use alternative or shared data sources are not considered integrated.
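The shared-database case can be sketched in a few lines. This is only an illustration, assuming an in-memory SQLite store; the table name `quotes` and the functions `system_a_write`, `system_b_read`, and `anyone_delete` are hypothetical names invented for this sketch.

```python
import sqlite3
from typing import List

# A stand-alone data store that exists on its own (in-memory for this sketch).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE quotes (id INTEGER PRIMARY KEY, payload TEXT)")


def system_a_write(conn: sqlite3.Connection, payload: str) -> None:
    """System A places data into the store; it knows nothing about readers."""
    conn.execute("INSERT INTO quotes (payload) VALUES (?)", (payload,))
    conn.commit()


def system_b_read(conn: sqlite3.Connection) -> List[str]:
    """System B fetches whatever is in the store; it knows nothing about writers."""
    rows = conn.execute("SELECT payload FROM quotes ORDER BY id")
    return [row[0] for row in rows]


def anyone_delete(conn: sqlite3.Connection) -> None:
    """Any party with access rights can remove the data as well."""
    conn.execute("DELETE FROM quotes")
    conn.commit()


system_a_write(db, "policy-123")
# Data placed by "somebody" other than System A serves System B equally well.
db.execute("INSERT INTO quotes (payload) VALUES (?)", ("policy-456",))
db.commit()
print(system_b_read(db))
```

Neither function refers to the other system; the only thing they share is the store, which is why the arrangement is a shared data source rather than integration.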
A Joint Work by Interaction
The world of system composition demonstrates another example of the dual nature of collective work: APIs and Events. As we know, an API’s end-point per se neither provides any value nor makes much sense. The value and sense are provided by the part of the system that sits behind the end-point. This part represents the contribution of the system to the integration. If this part is changed, the integration is changed. This usually forces the counterpart of the integration to change. Thus, integration has a problem of coupling and calls for special mechanisms for decoupling.
Events take place in a system and may or may not be announced/declared. An announcement of an event in the form of a notification or message is called “firing an event”. Other systems can register and “listen” for certain events/messages/notifications or may ignore them. This is typical cooperation. Also, we have an event model where, instead of being fired, the event is recorded in a data store (known as Event Sourcing). Those systems that monitor the content of this data store can obtain information about new and/or all events that belong to the particular system.
Our conclusion in this regard is:
- Event-driven architecture is not an integration solution. Every entity is free to listen to certain events or not. The problem here is that if an entity has failed to listen – it missed an event or was down for a while – the common cooperation can be destroyed and remain in such a state for a relatively long time, making the problem deeper. The only exception to this scenario is Event Sourcing.
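The difference between a missed notification and a recorded event can be sketched as follows. This is a minimal in-memory illustration, not a production event store; `EventStore`, `Listener`, and `catch_up` are hypothetical names introduced for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EventStore:
    """Append-only record of events (the Event Sourcing variant)."""
    events: List[str] = field(default_factory=list)

    def append(self, event: str) -> None:
        self.events.append(event)

    def read_from(self, position: int) -> List[str]:
        return self.events[position:]


@dataclass
class Listener:
    """A listener that may be down when a notification is fired."""
    up: bool = True
    seen: List[str] = field(default_factory=list)
    position: int = 0  # number of stored events already processed

    def notify(self, event: str) -> None:
        # Fire-and-forget notification: lost if the listener is down.
        if self.up:
            self.seen.append(event)

    def catch_up(self, store: EventStore) -> None:
        # Replay everything recorded since the last processed position.
        new_events = store.read_from(self.position)
        self.seen.extend(new_events)
        self.position += len(new_events)


store = EventStore()
notified = Listener()
sourced = Listener()

for i, event in enumerate(["order-created", "order-paid", "order-shipped"]):
    notified.up = i != 1    # the notified listener is down for the second event
    notified.notify(event)  # fired once, never repeated
    store.append(event)     # recorded regardless of who is listening

sourced.catch_up(store)     # reads the record whenever it is ready
print(notified.seen)
print(sourced.seen)
```

The listener that relies on notifications never learns about the event fired while it was down, whereas the listener that reads the store recovers the full sequence at its own pace.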
Integration/Interaction Intermediaries
In SOA, service-systems exist to serve the needs of consumer-systems. Each consumer-system integrates with the service-systems because the consumer-system has to make internal changes to work with a particular service-system’s interface – its end-point and data semantic model. When one service-system acts as a consumer of another service-system, integration also takes place. Full decoupling in SOA is not possible, but the coupling can be made very lightweight. A class of intermediaries under the umbrella name “service bus” is used in the industry. SOA services are capable of working within and across company boundaries. As a result, SOA services may not rely on the inter-trust that a company provides to its internal systems. This is why SOA services should establish trust for intercommunication, i.e. a consumer should know which service provides the required business value/business capability. This is a mandatory precondition for trust because the service has been not only discovered but also analysed against the consumer’s needs up-front. So, when a consumer calls the service via a service bus, the name/ID of the service (provider) should be specified. In SOA, the service bus has no right to decide which service to engage to fulfil the request. A resilient solution on the service side is for the service to provide its own substitute (like in Sib Pattern) in case of failure.
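The rule that the consumer names the provider and the bus never chooses one can be sketched like this. It is an illustrative toy, not the API of any real service bus product; `ServiceBus`, `UnknownServiceError`, and the service IDs are hypothetical.

```python
from typing import Callable, Dict


class UnknownServiceError(LookupError):
    """Raised when the consumer names a service the bus does not know."""


class ServiceBus:
    """Passive intermediary: delivers a request to the service the consumer named."""

    def __init__(self) -> None:
        self._services: Dict[str, Callable[[str], str]] = {}

    def register(self, service_id: str, handler: Callable[[str], str]) -> None:
        self._services[service_id] = handler

    def call(self, service_id: str, request: str) -> str:
        # The bus never picks a provider on the consumer's behalf:
        # an unknown ID is an error, not a cue to find "something similar".
        if service_id not in self._services:
            raise UnknownServiceError(service_id)
        return self._services[service_id](request)


service_bus = ServiceBus()
service_bus.register("pricing-v1", lambda req: f"pricing-v1 handled {req}")
service_bus.register("pricing-v2", lambda req: f"pricing-v2 handled {req}")

# The consumer analysed "pricing-v1" up-front and names it explicitly.
print(service_bus.call("pricing-v1", "quote#7"))
```

A request addressed to a bare capability such as "pricing" fails here instead of being routed to whichever pricing service the bus prefers, which keeps the choice of provider with the consumer.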
For completeness of the White Paper, we have to mention an “event bus” as well. The industry does not have a commonly accepted definition of the event bus, though many say that it is a mechanism for making the event-firing entity and the event-listening entity interact without knowing about each other. Well, this is only a half-truth. The event-listening entity should know about the certain incident that happened, which is usually associated with a particular entity (however, there are a few exceptions to this “rule”). In an event-driven solution/architecture/design, the creators realise the overall interaction logic via listeners for specified events in certain entities. Of course, we can imagine an error-service that listens to all error-events from many devices/sensors (like in IoT), but in the majority of cases, accurate error-handling is possible only if these sensors/events are of the same kind.
An event bus allows the event-firing entity to communicate with other entities without knowing who is listening to its events. This is good and bad at the same time. It is good because the event-firing entity is decoupled from other entities; it is bad because an undesirable listener can listen as well (a security problem). This decoupling is not equal to communication without depending on each other – the listening entity does depend on the event, i.e. on the event-firing entity. If the latter fails, the former will never execute what it has to.
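Both sides of this trade-off show up even in a toy event bus. The sketch below is an assumption-laden illustration (the class `EventBus`, the event type `order-paid`, and the two listeners are invented for it), not a description of any particular product.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class EventBus:
    """The firing side never sees the list of listeners."""

    def __init__(self) -> None:
        self._listeners: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, listener: Callable[[str], None]) -> None:
        # No check of who subscribes: this is the security concern noted above.
        self._listeners[event_type].append(listener)

    def fire(self, event_type: str, payload: str) -> int:
        """Deliver to all current listeners; return how many were reached."""
        listeners = self._listeners.get(event_type, [])
        for listener in listeners:
            listener(payload)
        return len(listeners)


event_bus = EventBus()
billing_log: List[str] = []
eavesdropper_log: List[str] = []

event_bus.subscribe("order-paid", billing_log.append)
event_bus.subscribe("order-paid", eavesdropper_log.append)  # undesirable, yet accepted

# Before anything is fired, the listeners have nothing to act on:
# they depend on the firing entity even though they never reference it.
logs_before_firing = (list(billing_log), list(eavesdropper_log))

reached = event_bus.fire("order-paid", "order#42")
print(reached, billing_log, eavesdropper_log)
```

The firing code has no reference to either listener, yet both logs stay empty until `fire` is called, and the unwanted subscriber receives exactly what the intended one does.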
As you have probably noticed already, an integration constitutes a so-called point-to-point interaction. Special Integration Platforms like Boomi and Informatica, and those that can be built on MS Azure, Amazon AWS, etc., deliver a “point of indirection” and interrupt such interaction. Another example of a “point of indirection”, in addition to a “bus”/messaging, can be an API Gateway. We are not enthusiasts of placing additional duties such as data transformation and security controls on an API Gateway – these features should belong to the Platforms, and data/messages should not be transferred to the Gateway if they are not acceptable/permitted at the destination (either because of security or because of mismatching data semantics/schemas).
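One way to read this division of duties is sketched below: the gateway only maps a route to a destination, while the platform holds the checks and stops unacceptable messages before they reach the gateway. All names (`ApiGateway`, `IntegrationPlatform`, `declare_schema`, the `/claims` route) are hypothetical, and the "schema" is reduced to a set of required fields for brevity.

```python
from typing import Callable, Dict, Set


class ApiGateway:
    """Point of indirection only: maps a route to a destination, nothing else."""

    def __init__(self) -> None:
        self._routes: Dict[str, Callable[[dict], str]] = {}
        self.forwarded = 0  # counts messages that actually reached the gateway

    def add_route(self, route: str, destination: Callable[[dict], str]) -> None:
        self._routes[route] = destination

    def forward(self, route: str, message: dict) -> str:
        self.forwarded += 1
        return self._routes[route](message)


class IntegrationPlatform:
    """Holds the checks, so unacceptable messages never reach the gateway."""

    def __init__(self, gateway: ApiGateway) -> None:
        self._gateway = gateway
        self._required_fields: Dict[str, Set[str]] = {}

    def declare_schema(self, route: str, required_fields: Set[str]) -> None:
        self._required_fields[route] = required_fields

    def send(self, route: str, message: dict) -> str:
        missing = self._required_fields.get(route, set()) - set(message)
        if missing:
            return f"rejected before gateway: missing {sorted(missing)}"
        return self._gateway.forward(route, message)


gateway = ApiGateway()
gateway.add_route("/claims", lambda msg: f"claim {msg['id']} accepted")
platform = IntegrationPlatform(gateway)
platform.declare_schema("/claims", {"id", "amount"})

accepted = platform.send("/claims", {"id": 1, "amount": 250})
rejected = platform.send("/claims", {"id": 2})
print(accepted)
print(rejected)
```

The `forwarded` counter makes the point visible: only the message that satisfies the destination's expectations ever touches the gateway.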
Integration Platforms
We have two types of Integration Platform behaviour – active and passive. In the early days of the IT industry, the majority of systems/applications were designed as self-sufficient monoliths, i.e. they did not provide integration interfaces, and linking them together or with other systems was a federal case every time. This forced the Integration Platforms to proactively communicate with such systems, even via back-doors like data storage. The next generation of systems/applications started to expose integration interfaces. However, IT development has been obsessed with the idea of integration as an exchange of data and data only. (SOA services and WSDL interfaces had a concept of integration via calling the functionality of a provider with no data, but this was totally displaced by data resources with RESTful APIs; now, the gRPC interface brings functional integration back into play.) As a result, the majority of modern applications are passive and expect the Integration Platform to play the role of an active consumer.
Active platforms with passive systems/applications pose a serious problem for companies because such a “consumer” is a part of the IT infrastructure, not a business system/application, and thus has no responsibilities in the business processes or transactions. In our opinion, proper integration has to comprise active business systems/applications that integrate or interact with each other via passive integration and/or interaction means. The systems/applications themselves should decide, and be responsible for the decisions, which data provider to integrate with and which functionality/capability provider to invoke, explicitly via an interface or implicitly via events.
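The arrangement argued for here, an active business system using passive integration means, might look like the following sketch. Everything in it is an assumption made for illustration: the class names `PassiveChannel` and `UnderwritingSystem`, the provider IDs, the rates, and the business rule that routes high risks to a specialist provider.

```python
from typing import Callable, Dict, List, Tuple


class PassiveChannel:
    """Passive integration means: delivers a call, takes no business decisions."""

    def __init__(self) -> None:
        self._providers: Dict[str, Callable[[str], float]] = {}

    def expose(self, provider_id: str, capability: Callable[[str], float]) -> None:
        self._providers[provider_id] = capability

    def invoke(self, provider_id: str, argument: str) -> float:
        return self._providers[provider_id](argument)


class UnderwritingSystem:
    """Active business system: it chooses the provider and owns that decision."""

    def __init__(self, channel: PassiveChannel) -> None:
        self._channel = channel
        self.decisions: List[Tuple[str, str]] = []

    def price(self, risk_class: str) -> float:
        # Assumed business rule: high risks go to the specialist provider.
        provider_id = "specialist-rates" if risk_class == "high" else "standard-rates"
        # The decision is recorded in the business system, not in the infrastructure.
        self.decisions.append((risk_class, provider_id))
        return self._channel.invoke(provider_id, risk_class)


channel = PassiveChannel()
channel.expose("standard-rates", lambda risk: 100.0)
channel.expose("specialist-rates", lambda risk: 340.0)

underwriting = UnderwritingSystem(channel)
high_price = underwriting.price("high")
low_price = underwriting.price("low")
print(high_price, low_price, underwriting.decisions)
```

The channel contains no rule about which provider suits which request, so responsibility for the choice stays inside the business system that made it.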
24-07-2021