In a modern service landscape, especially when using containers, you are probably using something like the ELK stack (Elasticsearch, Logstash, Kibana) to flow all the logging into. But how to find from all those loglines what caused the nasty bug after a innocent buttonpress? One of the easy answers to this is what’s called a correlation id - basically a unique number assigned to that buttonpress which gets carried around between the services and added to every logline. Sounds good you say? it is so let’s see how to do this.

First we need to generate and send the correlation id from the browser. To do this in a transparent way we add a header with every request to the backend and put a unique id (UUID is mostly used) in it. Below is a piece of AngularJS code to do this:

(function(angular) {
  var InfoFactory =  function($resource) {
      return $resource('/info', {}, {
              get: {
                  method: 'GET',
                  isArray: false,
                  headers: {
                     'CID': calcCorrelationId
                  }
              }
          });
    }

    function calcCorrelationId(){
        // see http://www.ietf.org/rfc/rfc4122.txt
        var s = [];
        var hexDigits = "0123456789abcdef";
        for (var i = 0; i < 36; i++) {
            s[i] = hexDigits.substr(Math.floor(Math.random() * 0x10), 1);
        }
        s[14] = "4";  // bits 12-15 of the time_hi_and_version field to 0010
        s[19] = hexDigits.substr((s[19] & 0x3) | 0x8, 1);  // bits 6-7 of the clock_seq_hi_and_reserved to 01
        s[8] = s[13] = s[18] = s[23] = "-";

        var uuid = s.join("");
        return uuid;
    }

    InfoFactory.$inject = ['$resource'];
    myMod = angular.module("myApp.services");
    myMod.factory("Info", InfoFactory);
}(angular));

This sends a header named CID with each info request to the backend. Next we want to catch this header and log it with each logline in the backend. To do this we will use the MDC class of Slf4J. This class is Thread-local and can be accessed from your logging definition file. A similar class exists in Log4j. On the backend side we can solve this globally with a servlet filter. Since we are using Spring Boot we only need to annotate it with @Component to get it started.

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

import org.slf4j.MDC;
import org.springframework.stereotype.Component;

@Component
public class CidFilter implements Filter {

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain) throws IOException, ServletException {

		if (req instanceof HttpServletRequest) {

			HttpServletRequest request = (HttpServletRequest) req;
			String requestCid = request.getHeader("CID");

			// add cid to the MDC
			MDC.put("CID", cid);
		}

        try {
			// call filter(s) upstream for the real processing of the request
            chain.doFilter(req, res);
        } finally {
			// it's important to always clean the cid from the MDC,
			// this Thread goes to the pool but it's loglines would still contain the cid.
            MDC.remove("CID");
        }

    }

    @Override
    public void destroy() {
        // nothing
    }
    @Override
    public void init(FilterConfig fc) throws ServletException {
        // nothing
    }
}

Now every time something is logged while handling the request the correlation id is part of the MDC. The Logback documentation for MDC describes pretty well how to add MDC properties to log output but our target was Logstash, the log gatherer part of ELK. As luck will have there is a nice Logstash encoder for Logback which emits the JSON that Logstash loves. The JSON will include all properties found in the MDC, including our CID property, giving us the following neat and simple logback.xml:

OK, now all logging generated during the handling of the request is correlated…​. except for all the other services we need to call to properly handle the request. What we need to do is add the correlation id to every call to a service. To do that we use the same convention we already used, we add the correlation id as a header to the call. In spring Boot we can solve that with another generic solution if we call services with the RestTemplate. Instead of adding the header manually in each service call we can add an Interceptor to the RestTemplate and add the header in there. The interceptor:

import org.slf4j.MDC;
import org.springframework.http.HttpHeaders;
import org.springframework.http.HttpRequest;
import org.springframework.http.client.ClientHttpRequestExecution;
import org.springframework.http.client.ClientHttpRequestInterceptor;
import org.springframework.http.client.ClientHttpResponse;

public class CorrelationIdInterceptor implements ClientHttpRequestInterceptor {
    @Override
    public ClientHttpResponse intercept(HttpRequest request, byte[] body, ClientHttpRequestExecution execution) {
        HttpHeaders headers = request.getHeaders();
        headers.add("CID", MDC.get("CID"));
        return execution.execute(request, body);
    }
}

As you can see the correlation id is transferred from the MDC where it was put by the request filter into the headers used to call the next REST service. Adding the Interceptor to the RestTemplate used by Spring Boot can be done in the Application class by supplying a custom RestTemplate bean:

@SpringBootApplication
public class Application {

    @Bean
    public RestTemplate restTemplate() {
        return new RestTemplate() {
            {
                setInterceptors(Collections.singletonList(new CorrelationIdInterceptor()));
            }
        };
    }
}

We can do this for all (micro) services on both the request side and the call side. All the logging generated by the pressed button will be flowing into ELK with the correlation id added. Now we can find all logging by searching in Kibana for the 'CID' field with a specific id. lot’s more to tell, but this is it for now…​.

shadow-left