Logging for SOA – an enterprise-level solution

SOA & WOA, 2008


What could be the problem with logging in SOA when we have such wonderful tools as log4j, Java’s own logging library, and the like? Why might we need something special for SOA, and why aren’t existing techniques enough? The answer is simple and complex at the same time: in SOA we deal with distributed, composed entities that cause problems in log maintenance, not in log creation.

Local Build, Distributed Analysis

Let’s follow a typical service development process and see what might go wrong with logging along the way. Assume we have three business services – F1, F2, and F3 – each implementing one business feature. Each service comprises two components built independently by different people and at different times.

Each component developer used log4j to log informational messages and exceptions at different severity levels. Since the components weren’t built for the same task, each component writes to its own log file. Because the components are reused, we can’t modify them, but we can reconfigure log4j for each one so that, at the very least, the log files of one service are created in the same location. We can’t merge the files as they are written, because each component may have its own log file management policy, which might conflict with the management rules of another component. Thus, we have six log files in, potentially, three different locations, distributed or co-located: F1C1, F1C2, F2C3, F2C4, F3C5, and F3C6.

Now suppose the business defines two business functions in which two or all three of these features may be reused. Being in SOA, we develop two new business services, B1 and B2, as follows: B1={F1, F2, F3}, B2={F1, F3}. Even in this relatively simple SOA case of two business services, we have to deal with 12 different log files, four of which might be shared by the independent services B1 and B2. This case is shown in Figure 1. Imagine for a moment what the operations and maintenance teams would have to say to us about such a development.

This isn’t all. In SOA, each business service is expected to have a service contract that includes a service level agreement (SLA). For B1, the SLA potentially depends on the performance of six foreign logging procedures and on the robustness of up to three file systems (or other data stores, such as databases). Moreover, the B1 provider might not be the owner of the F1, F2, and F3 services. This means the service contract between a consumer and the B1 service provider may depend on three other service contracts and their SLAs. The B2 service presents a similar picture.

Each SOA service may also be reused in different service compositions and processes, in different contexts with different logging policies. You can imagine the complexity of managing all these SLAs, relationships, and performance dependencies just for logging.

One Possible ‘How To’
I believe people familiar with logging, and with log4j in particular, will immediately recognize that the performance problems can be addressed by asynchronous logging through a log4j messaging appender, such as its JMS appender. Each component, and the service as a whole, then suffers minimal performance degradation, because a logging call returns as soon as the log message has been sent into the messaging channel. The messaging system is responsible for delivering the log messages to their destinations, all the way to storage (reliable messaging can do this for you).
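The hand-off described above can be sketched without any messaging product. The following is a minimal illustration, not the log4j appender itself: it assumes an in-memory queue as a stand-in for the messaging channel and a list as a stand-in for the log storage, and the class name `AsyncLogChannel` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical stand-in for a messaging appender: log() only enqueues. */
final class AsyncLogChannel {
    // Distinct String instance used as an end-of-stream marker (compared by identity).
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> channel = new LinkedBlockingQueue<>();
    private final List<String> storage = new ArrayList<>(); // stand-in for the log store
    private final Thread consumer;

    AsyncLogChannel() {
        consumer = new Thread(this::drain, "log-consumer");
        consumer.setDaemon(true);
        consumer.start();
    }

    /** Plays the role of the messaging system: moves messages from channel to storage. */
    private void drain() {
        try {
            for (String m = channel.take(); m != STOP; m = channel.take()) {
                synchronized (storage) {
                    storage.add(m);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Called by component code; returns as soon as the message is queued. */
    void log(String message) {
        channel.add(message);
    }

    /** Signals end of stream and waits until every queued message is stored. */
    void close() {
        channel.add(STOP);
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    List<String> stored() {
        synchronized (storage) {
            return new ArrayList<>(storage);
        }
    }

    public static void main(String[] args) {
        AsyncLogChannel log = new AsyncLogChannel();
        log.log("F1C2 INFO request accepted");
        log.close();
        System.out.println(log.stored());
    }
}
```

In a real deployment the queue would be replaced by a durable messaging destination, so that delivery survives a crash of the logging component.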

By putting message filters into the message destinations, we can select the log messages that belong to different entities, such as components and services. Thus, if we introduce an identifier of the log origin, we can group log messages accordingly. In our example, we can define an F1 ID, an F2 ID, and so on, including a B1 ID and a B2 ID. However, one problem remains: a service can be reused in different service compositions and processes. If we simply collect log messages based on the service ID, we end up with a mess, because log messages from several independent processes land in one place. This won’t be of much help to the maintenance teams.

A possible solution is a composite identifier that is embedded in the log message and composed of the IDs of all the services invoked along the way, starting with the top-level service. In other words, the composite ID is a transaction identifier. For example, a request or transaction originates in the B2 service, which invokes the F1 service, whose C2 component generates the log. The composite ID for this case might be B2F1C2. This idea is illustrated in Figure 2.
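One way to picture the composite ID is as a value that each service extends with its own ID before passing the call on. The sketch below is only an illustration under that assumption; the class `CompositeId` and its method names are hypothetical. It keeps the elements as a list, so that the individual IDs stay recoverable, and renders them without a separator, as in the B2F1C2 example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical immutable composite ID: top-level service first, log origin last. */
final class CompositeId {
    private final List<String> elements;

    private CompositeId(List<String> elements) {
        this.elements = Collections.unmodifiableList(elements);
    }

    /** Created by the service where the request or transaction originates. */
    static CompositeId startingAt(String topLevelServiceId) {
        List<String> list = new ArrayList<>();
        list.add(topLevelServiceId);
        return new CompositeId(list);
    }

    /** Each invoked service or component appends its own ID; the original is untouched. */
    CompositeId extend(String id) {
        List<String> next = new ArrayList<>(elements);
        next.add(id);
        return new CompositeId(next);
    }

    String topLevel() {
        return elements.get(0);
    }

    List<String> elements() {
        return elements;
    }

    /** Concatenated form, as embedded in the log message. */
    String render() {
        return String.join("", elements);
    }

    public static void main(String[] args) {
        CompositeId inB2 = CompositeId.startingAt("B2");
        CompositeId inC2 = inB2.extend("F1").extend("C2");
        System.out.println(inC2.render()); // prints B2F1C2
    }
}
```

Keeping the elements separate avoids having to parse the concatenated string later, which would be ambiguous if one ID happened to be a prefix of another.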

The composite ID is then inserted into the log message and sent over the messaging system. The message filter recognizes the elements of the composite ID, in particular B2 as the starting point, and puts the log into the log storage for the B2 service. This lets the system operations team, or an appropriate program, analyze the log and recognize all the components and services that contributed to the execution of service B2 for a given log message. As a result, in our example, we get just two log files or storages, for B1 and B2, instead of 12.
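The filtering step can be sketched as a router that looks only at the first element of the composite ID. This is an illustrative sketch, not a messaging-system filter: the classes `LogRouter` and `LogMessage` are hypothetical, and the per-service log storage is assumed to be an in-memory list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical filter: routes each message to the store of its top-level service. */
final class LogRouter {

    /** A log message carrying its composite ID as a list of elements. */
    static final class LogMessage {
        final List<String> compositeId;
        final String text;

        LogMessage(List<String> compositeId, String text) {
            this.compositeId = compositeId;
            this.text = text;
        }
    }

    // One store per top-level service ID, in order of first appearance.
    private final Map<String, List<LogMessage>> stores = new LinkedHashMap<>();

    void route(LogMessage message) {
        String topLevel = message.compositeId.get(0); // starting point of the transaction
        stores.computeIfAbsent(topLevel, k -> new ArrayList<>()).add(message);
    }

    Map<String, List<LogMessage>> stores() {
        return stores;
    }

    public static void main(String[] args) {
        LogRouter router = new LogRouter();
        router.route(new LogMessage(Arrays.asList("B1", "F1", "C1"), "started"));
        router.route(new LogMessage(Arrays.asList("B2", "F1", "C2"), "started"));
        router.route(new LogMessage(Arrays.asList("B1", "F3", "C5"), "finished"));
        System.out.println(router.stores().keySet()); // prints [B1, B2]
    }
}
```

Messages from the shared F1 service end up in different stores depending on which business service started the transaction, which is the separation the maintenance teams need.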

Logging Service
To demonstrate identity-based logging first, we’ve postponed the conversation about the logging service until now. As we know, in Java we can use the commons-logging library to shield component code from the actual logging procedures, or Spring injection for the same purpose, or a combination of the two. The same effect can be achieved with WCF on the Microsoft platform. That is, it seems like a good idea to develop a logging proxy through which a real logging mechanism is plugged into the code. This allows the logging service to be used at the service level instead of the code level.
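A logging proxy of this kind can be as small as one interface. The sketch below does not use commons-logging or Spring; it is a hand-written illustration of the same idea, and the names `LogProxy`, `JulLogProxy`, `RecordingLogProxy`, and `QuoteComponent` are all hypothetical. The component depends only on the interface, and the real mechanism is handed in from outside.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical proxy: the only logging type that component code sees. */
interface LogProxy {
    void log(Level level, String originId, String message);
}

/** One pluggable mechanism: delegates to java.util.logging. */
final class JulLogProxy implements LogProxy {
    private final Logger delegate = Logger.getLogger("soa");

    @Override
    public void log(Level level, String originId, String message) {
        delegate.log(level, "[{0}] {1}", new Object[] {originId, message});
    }
}

/** Another mechanism: keeps lines in memory, e.g. for tests. */
final class RecordingLogProxy implements LogProxy {
    final List<String> lines = new ArrayList<>();

    @Override
    public void log(Level level, String originId, String message) {
        lines.add(level + " [" + originId + "] " + message);
    }
}

/** Hypothetical component C2 of service F1; it never names a logging library. */
final class QuoteComponent {
    private final LogProxy log;

    QuoteComponent(LogProxy log) {
        this.log = log; // the mechanism is injected, not chosen here
    }

    int quote(int units) {
        log.log(Level.INFO, "F1C2", "quote requested for " + units + " units");
        return units * 10;
    }

    public static void main(String[] args) {
        // Swapping JulLogProxy for a proxy that calls a remote logging service
        // would not change QuoteComponent at all.
        System.out.println(new QuoteComponent(new JulLogProxy()).quote(3));
    }
}
```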

A logging service represents an entity that can invoke SOA monitoring capabilities as well. In an identity-based log, monitoring can analyze the composite ID and find where, in which composition and process, and how frequently a service was used. It also can recognize the execution context of the service, which is very important for problem resolution.
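As a rough illustration of such an analysis, the sketch below counts how often a given service appears under each top-level service. It assumes the composite IDs have already been extracted from the log messages as lists of elements; the class `UsageMonitor` and its method are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical monitoring helper working on composite IDs taken from the log. */
final class UsageMonitor {

    /**
     * Returns, per top-level service, how many logged composite IDs show
     * serviceId being invoked below that top-level service.
     */
    static Map<String, Integer> usageByTopLevel(List<List<String>> compositeIds,
                                                String serviceId) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> id : compositeIds) {
            // Skip the first element: it names the composition, not an invoked service.
            if (id.size() > 1 && id.subList(1, id.size()).contains(serviceId)) {
                counts.merge(id.get(0), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> logged = Arrays.asList(
                Arrays.asList("B1", "F1", "C1"),
                Arrays.asList("B2", "F1", "C2"),
                Arrays.asList("B1", "F2", "C3"),
                Arrays.asList("B1", "F1", "C2"));
        System.out.println(usageByTopLevel(logged, "F1")); // prints {B1=2, B2=1}
    }
}
```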

In this case, the execution context can be identified from the composite ID. For example, the log from service F1 can show whether it was engaged by service B1 or by service B2. This relates directly to the SOA service contract. It’s assumed that the service contract lists all the execution contexts in which the service provider guarantees certain service characteristics. If a service is used in a non-contracted context, the service provider might not be responsible for the service behavior.
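A check of this kind might look as follows. The sketch makes a simplifying assumption that is not stated in the article: an execution context is identified solely by the top-level service ID in the composite ID. The class `ServiceContract` and the service ID B9 are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical view of a service contract: the contexts the provider has agreed to. */
final class ServiceContract {
    private final String serviceId;
    private final Set<String> contractedContexts;

    ServiceContract(String serviceId, String... contractedTopLevelIds) {
        this.serviceId = serviceId;
        this.contractedContexts = new HashSet<>(Arrays.asList(contractedTopLevelIds));
    }

    /**
     * True if the composite ID shows this service being invoked below a
     * top-level service that the contract lists as an execution context.
     */
    boolean covers(List<String> compositeId) {
        boolean invoked = compositeId.size() > 1
                && compositeId.subList(1, compositeId.size()).contains(serviceId);
        return invoked && contractedContexts.contains(compositeId.get(0));
    }

    public static void main(String[] args) {
        ServiceContract f1 = new ServiceContract("F1", "B1", "B2");
        System.out.println(f1.covers(Arrays.asList("B2", "F1", "C2"))); // prints true
        System.out.println(f1.covers(Arrays.asList("B9", "F1", "C2"))); // prints false
    }
}
```

A log entry for which `covers` returns false points to a use of the service outside the contexts its provider agreed to support.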

The logging service has all the advantages of a SOA service across heterogeneous platforms and distributed components. However, such a service has some specifics. In particular, it is widely reused, so it has to be available and accessible with minimal or no downtime. Then, depending on the content of the log messages, the service might have to provide different levels of security protection. Guaranteed delivery of log messages might also be required if the logging makes records for compliance purposes (for regulations such as SOX in the US or MiFID in the EU). We can add storage management issues on top.

Overall, the logging service system is not nearly as trivial as regular logging appears to be. This is why, in SOA, logging is elevated from a good programming practice to an enterprise-level solution.

This article has described how to solve a maintenance problem with logging in SOA services that is caused by service reuse in service compositions and processes, i.e., by execution in different contexts. The solution is based on asynchronous logging, provided by a messaging system, for example. We have proposed a composite identifier that accumulates the service IDs added by each service, top to bottom in the service invocation hierarchy. This allows logging information to be gathered on a per-service basis in service-dedicated data storage.

