Java: centralized logging with Spring Boot, Elasticsearch, Logstash and Kibana

Logging is a crucial aspect of any application. And when we’re dealing with a distributed environment, such as a microservice architecture or multiple instances running behind a load balancer, centralized logging becomes a necessity.

In this article, we’ll see how to enable centralized logging in a typical Spring Boot application with the ELK stack, which is composed of Elasticsearch, Logstash and Kibana.

Meet the ELK stack


In a nutshell, this is how centralized logging works:


The application will store logs in a log file. Logstash will read and parse the log file and ship log entries to an Elasticsearch instance. Finally, we will use Kibana (Elasticsearch’s web frontend) to search and analyze the logs.

We’ll use a CRUD RESTful API from a previous post as the application.

To achieve this, we need to put several pieces together. Let’s begin.

Install Elasticsearch

Just follow the official instructions: https://www.elastic.co/downloads/elasticsearch

After that, let’s check if it’s running:

curl -XGET http://localhost:9200

You should see an output similar to this:

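(The node name, cluster name and version numbers below are illustrative; yours will differ.)

{
  "name" : "node-1",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "...",
  "version" : {
    "number" : "6.6.2",
    ...
  },
  "tagline" : "You Know, for Search"
}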

Install Logstash

Just follow the official instructions: https://www.elastic.co/downloads/logstash

Install Kibana

Finally, the last piece. Just follow the official instructions: https://www.elastic.co/products/kibana

Point your browser to http://localhost:5601 (if the Kibana page shows up, we’re good; we’ll configure it later).

Configure Spring Boot

For Logstash to ship logs to Elasticsearch, we need to configure our application to store logs in a file. All we need to do is configure the log file name in ‘src/main/resources/application.yml’:

logging:
  file: application.log

The log file will be rotated automatically by Spring Boot’s default Logback configuration (by default, when it reaches 10 MB).
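
A side note: on newer Spring Boot versions (2.2 and later) the ‘logging.file’ property was replaced by ‘logging.file.name’, so the equivalent configuration there would be:

logging:
  file:
    name: application.log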

Configure Logstash to Understand Spring Boot’s Log File Format

Now comes the tricky part. We need to create a Logstash config file.

A typical Logstash config file consists of three main sections: input, filter and output:

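In skeleton form, a config file looks like this (the comments indicate what each section is responsible for):

input {
  # where log events are read from (files, beats, syslog, ...)
}

filter {
  # how each log event is parsed and enriched
}

output {
  # where the processed events are sent (Elasticsearch, stdout, ...)
}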

Each section contains plugins that do the relevant part of the processing (such as the file input plugin, which reads log events from a file, or the Elasticsearch output plugin, which sends log events to Elasticsearch).

We’ll create a file called ‘logstash.conf’ that will be used when starting Logstash, as shown later. It will be placed under the ‘src/main/resources’ directory.

Let’s dig into each section.

Input section

It defines from where Logstash will read input data.

This is the input section:

input {
  file {
    type => "java"
    path => "/home/tiago/desenv/java/crud-exceptionhandling-example/application.log"
    codec => multiline {
      pattern => "^%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}.*"
      negate => "true"
      what => "previous"
    }
  }
}

Explanation:

1) We’re using file plugin

2) type is set to java - it’s just an additional piece of metadata in case you use multiple types of log files in the future

3) path is the absolute path to the log file. It must be absolute - Logstash is picky about this

4) We’re using multiline codec, which means that multiple lines may correspond to a single log event

5) In order to detect lines that should logically be grouped with a previous line we use a detection pattern:

5.1) pattern => ”^%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}.*“: Each new log event needs to start with a date

5.2) negate => “true”: if it doesn’t start with a date…

5.3) what => “previous”: …then it should be grouped with a previous line.

The file input plugin, as configured, will tail the log file (i.e. only read new entries appended to the end of the file). Therefore, when testing, you will need to generate new log entries in order for Logstash to read something.
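
If, while testing, you would rather have Logstash read the existing file from the beginning instead of only tailing it, the file input plugin supports the ‘start_position’ and ‘sincedb_path’ options; here is a sketch of the input section with them added (same path and codec as before):

input {
  file {
    type => "java"
    path => "/home/tiago/desenv/java/crud-exceptionhandling-example/application.log"
    start_position => "beginning"  # read the whole file on startup instead of only tailing it
    sincedb_path => "/dev/null"    # do not persist the read position between runs
    codec => multiline {
      pattern => "^%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}.*"
      negate => "true"
      what => "previous"
    }
  }
}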

Filter section

It contains plugins that perform intermediary processing on a log event. In our case, an event will be either a single log line or a multiline log event grouped according to the rules described above.

In the filter section we will do:

  • Tag a log event if it contains a stacktrace. This will be useful when searching for exceptions later on
  • Parse out (or grok, in Logstash terminology) the timestamp, log level, pid, thread, class name (actually the logger name) and log message
  • Specify the timestamp field and format - Kibana will use that later for time-based searches

This is the filter section:

filter {
  #If log line contains tab character followed by 'at' then we will tag that entry as stacktrace
  if [message] =~ "\tat" {
    grok {
      match => ["message", "^(\tat)"]
      add_tag => ["stacktrace"]
    }
  }

  #Grokking Spring Boot's default log format
  grok {
    match => [ "message",
               "(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME})  %{LOGLEVEL:level} %{NUMBER:pid} --- \[(?<thread>[A-Za-z0-9-]+)\] [A-Za-z0-9.]*\.(?<class>[A-Za-z0-9#_]+)\s*:\s+(?<logmessage>.*)",
               "message",
               "(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME})  %{LOGLEVEL:level} %{NUMBER:pid} --- .+? :\s+(?<logmessage>.*)"
             ]
  }

  #Parsing out timestamps which are in timestamp field thanks to previous grok section
  date {
    match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss.SSS" ]
  }
}

Explanation:

1) if [message] =~ “\tat”: if the message contains a tab character followed by ‘at’ then…

2) … use grok plugin to tag stacktraces:

  • match => [“message”, “^(\tat)”]: when the message starts with a tab character followed by ‘at’ then…
  • add_tag => [“stacktrace”]: … tag the event with stacktrace tag

3) Use grok plugin for regular Spring Boot log message parsing:

  • The first pattern extracts the timestamp, level, pid, thread, class name (this is actually the logger name) and the log message.
  • Unfortunately, some log messages don’t have a logger name that resembles a class name (Tomcat logs, for example), hence the second pattern, which skips the logger/class field and parses out the timestamp, level, pid, thread and the log message.

4) Use date plugin to parse and set the event date:

  • match => [ “timestamp” , “yyyy-MM-dd HH:mm:ss.SSS” ]: timestamp field (grokked earlier) contains the timestamp in the specified format
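
To make this concrete, here is how the first grok pattern breaks down one of the log lines we will generate later in this article:

2019-03-17 23:14:58.197  INFO 22174 --- [http-nio-8080-exec-1] com.tiago.controller.StudentController   : getAllStudents() returned 6 records

timestamp  => 2019-03-17 23:14:58.197
level      => INFO
pid        => 22174
thread     => http-nio-8080-exec-1
class      => StudentController
logmessage => getAllStudents() returned 6 records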

Output section

It contains output plugins that send event data to a particular destination. Outputs are the final stage in the event pipeline. We will be sending our log events to stdout (console output, for debugging) and to Elasticsearch.

This is the output section:

output {
  # Print each event to stdout, useful for debugging. Should be commented out in production.
  # Enabling 'rubydebug' codec on the stdout output will make logstash
  # pretty-print the entire event as something similar to a JSON representation.
  stdout {
    codec => rubydebug
  }

  # Sending properly parsed log events to elasticsearch
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
}

Explanation:

1) We are using multiple outputs: stdout and elasticsearch.

2) stdout { … }: stdout plugin prints log events to standard output (console)

  • codec => rubydebug: pretty print events using JSON-like format

3) elasticsearch { … }: elasticsearch plugin sends log events to Elasticsearch server.

  • hosts => [“localhost:9200”]: hostname where Elasticsearch is located - in our case, localhost
  • index=>”logstash-%{+YYYY.MM.dd}”: the index in Elasticsearch to which the data will be streamed. You can name it after your application if you want.

Putting it all together

Finally, the three parts - input, filter and output - need to be copy-pasted together and saved into the ‘logstash.conf’ config file. Once the config file is in place and Elasticsearch is running, we can run Logstash:

$ /path/to/logstash/bin/logstash -f src/main/resources/logstash.conf
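
Before starting it for real, it is worth validating the config file syntax; Logstash provides a flag for that:

$ /path/to/logstash/bin/logstash -f src/main/resources/logstash.conf --config.test_and_exit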

If everything went well, Logstash is now shipping log events to Elasticsearch. It may take a while before the startup output appears; notice these lines in it:

[INFO ] 2019-03-17 17:44:18.226 [[main]-pipeline-manager] elasticsearch - New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost:9200"]}
[INFO ] 2019-03-17 17:44:18.249 [Ruby-0-Thread-5: :1] elasticsearch - Using mapping template from {:path=>nil}
[INFO ] 2019-03-17 17:44:18.271 [Ruby-0-Thread-5: :1] elasticsearch - Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[INFO ] 2019-03-17 17:44:18.585 [[main]-pipeline-manager] file - No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/usr/share/logstash/data/plugins/inputs/file/.sincedb_97bb92ea0ae35e33de7cca814bc912c7", :path=>["/home/tiago/desenv/java/crud-exceptionhandling-example/application.log"]}

It seems that a template for indices named ‘logstash-*’ was installed and the log file ‘/home/tiago/desenv/java/crud-exceptionhandling-example/application.log’ was correctly located.

Let’s check if the index was really created in Elasticsearch with this command:

~$ curl -X GET "localhost:9200/_cat/indices?v"

And this is the output; the index was created as expected:

health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   logstash-2019.03.17 ReyKaZpgRq659xiNXWcUiw   5   1         77            0    234.5kb        234.5kb
green  open   .kibana_1           o-dbTdTZTu-MGR3ECzPteg   1   0          4            0     19.7kb         19.7kb
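
With the index in place, you can also query the documents directly; for example, a quick search for error-level entries in the daily index shown above would look like this:

~$ curl -X GET "localhost:9200/logstash-2019.03.17/_search?q=level:ERROR&pretty"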

Configure Kibana

Let’s open the Kibana UI at http://localhost:5601 and click on ‘Management’.


Now click on ‘Index Patterns’ under ‘Kibana’.


And here we put the name of the index that was created in Elasticsearch: ‘logstash-*’. Then, hit ‘Next step’.


Now we choose ‘@timestamp’ as the field to filter data by time and then hit ‘Create index pattern’.


It’s show time!

Let’s launch the application and perform some operations in order to generate some data. Then we’ll check them later using the Kibana UI.

Fire up the server:

$ mvn spring-boot:run

So, as mentioned earlier, our app is a CRUD RESTful API from a previous article.

Let’s call the endpoint that lists all students. In the controller we have:

private static final Logger LOGGER = LoggerFactory.getLogger(StudentController.class);

/**
 * Get all students
 *
 * @return the list of students
 */
@GetMapping("/students")
public List<StudentDTO> getAllStudents() {
  List<Student> students = service.findAll();

  LOGGER.info(String.format("getAllStudents() returned %s records", students.size()));

  return students.stream().map(student -> convertToDTO(student)).collect(Collectors.toList());
}

Let’s call it:

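Assuming the application runs on the default port 8080 and the controller is mapped at the root context (adjust the URL to your setup):

$ curl -X GET http://localhost:8080/students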

In the server’s log we have:

2019-03-17 23:14:58.197  INFO 22174 --- [http-nio-8080-exec-1] com.tiago.controller.StudentController   : getAllStudents() returned 6 records

And if we take a look at what’s going on with Logstash, we see that it was captured:

{
          "path" => "/home/tiago/desenv/java/crud-exceptionhandling-example/application.log",
          "host" => "tiagomelo",
           "pid" => "22174",
         "class" => "StudentController",
       "message" => "2019-03-17 23:14:58.197  INFO 22174 --- [http-nio-8080-exec-1] com.tiago.controller.StudentController   : getAllStudents() returned 6 records",
      "@version" => "1",
    "@timestamp" => 2019-03-18T02:14:58.197Z,
    "logmessage" => "getAllStudents() returned 6 records",
          "type" => "java",
         "level" => "INFO",
     "timestamp" => "2019-03-17 23:14:58.197",
        "thread" => "http-nio-8080-exec-1"
}

Now let’s perform an action that will log an error. We can do it by trying to delete a nonexistent student, for example. This is our exception handler:

private static final Logger LOGGER = LoggerFactory.getLogger(ExceptionHandlingController.class);

/**
 * This exception is thrown when a resource is not found
 *
 * @param ex
 * @return {@link ExceptionResponse}
 */
@ExceptionHandler(ResourceNotFoundException.class)
public ResponseEntity<ExceptionResponse> resourceNotFound(ResourceNotFoundException ex) {
  ExceptionResponse response = new ExceptionResponse();
  response.setErrorCode("Not Found");
  response.setErrorMessage(ex.getMessage());

  LOGGER.error(String.format("ResourceNotFoundException: %s", ex.getMessage()));

  return new ResponseEntity<ExceptionResponse>(response, HttpStatus.NOT_FOUND);
}

Let’s call it:

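Again assuming port 8080 and a root-mapped controller (adjust the URL to your setup), we try to delete a student id that doesn’t exist, 323:

$ curl -X DELETE http://localhost:8080/students/323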

In the server’s log we have:

2019-03-17 23:20:03.729  WARN 22174 --- [http-nio-8080-exec-3] .m.m.a.ExceptionHandlerExceptionResolver : Resolved [com.tiago.exception.ResourceNotFoundException: Student not found with id: '323']

And if we take a look at what’s going on with Logstash, we see that it was captured:

{
          "path" => "/home/tiago/desenv/java/crud-exceptionhandling-example/application.log",
          "host" => "tiagomelo-zoom",
           "pid" => "22174",
         "class" => "ExceptionHandlerExceptionResolver",
       "message" => "2019-03-17 23:20:03.729  WARN 22174 --- [http-nio-8080-exec-3] .m.m.a.ExceptionHandlerExceptionResolver : Resolved [com.tiago.exception.ResourceNotFoundException: Student not found with id: '323']",
      "@version" => "1",
    "@timestamp" => 2019-03-18T02:20:03.729Z,
    "logmessage" => "Resolved [com.tiago.exception.ResourceNotFoundException: Student not found with id: '323']",
          "type" => "java",
         "level" => "WARN",
     "timestamp" => "2019-03-17 23:20:03.729",
        "thread" => "http-nio-8080-exec-3"
}

We’ve performed two actions: listed all students and tried to delete a nonexistent one. Now it’s time to visualize them in the Kibana UI at http://localhost:5601: searching the ‘logstash-*’ index pattern there, we can see the log entry of the first operation (listing all students) as well as the log entry of the second one (deleting a nonexistent student).

Conclusion

Through this simple example, we learned how to integrate a Spring Boot application with the ELK stack in order to have centralized logging.

Download the source

Here: https://bitbucket.org/tiagoharris/crud-exceptionhandling-example/src/master/