Red Hat
Jun 20, 2013
by Shane K Johnson

History

I wanted to create a proof of concept that demonstrated how to use JBoss Data Grid (JDG) with JBoss EAP, and in particular how to use the map / reduce framework in JDG. However, I needed a data source: a big data source.

I decided to use log data because analyzing log files is an established use case for big data / analytics: for example, analyzing log files with Apache Hadoop.

However, analyzing log files with Apache Hadoop is an offline, batch-oriented process. First, the log file is imported into HDFS. Then, it is analyzed by running a MapReduce job.

Inspiration

A log file contains log messages.

What if the log messages were persisted to a distributed data store in real time in addition to or instead of a log file?

What if they were persisted to an in-memory data grid?

Inspired by Splunk:

  • Do without access to log files.
    No more requests for log files or access to log files.
  • Aggregate log data from multiple servers / applications.
    No more opening of multiple log files in a single text editor.

Inspired by NoSQL / Big Data:

  • The log data could be distributed.
  • The log data could be analyzed with map / reduce tasks.
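To make the map / reduce idea concrete, here is a minimal sketch in plain Java. It does not use the JDG API; the class name, method names, and the two in-memory "nodes" are assumptions made for illustration. Each node counts its own log records per logger, and the partial counts are then merged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain Java stand-in for a distributed map / reduce task.
final class LoggerCountSketch {

    // Map phase (with a local combine): runs against one node's share of the
    // data and produces a partial count per logger name.
    static Map<String, Integer> mapPhase(List<String> loggerNames) {
        Map<String, Integer> partial = new HashMap<>();
        for (String name : loggerNames) {
            partial.merge(name, 1, Integer::sum);
        }
        return partial;
    }

    // Reduce phase: sums the partial counts produced by every node.
    static Map<String, Integer> reducePhase(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new HashMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((name, count) -> total.merge(name, count, Integer::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        // Two made-up nodes, each holding part of the log data.
        List<String> nodeA = Arrays.asList("org.example.web", "org.example.web", "org.example.dao");
        List<String> nodeB = Arrays.asList("org.example.dao", "org.example.web");

        List<Map<String, Integer>> partials = new ArrayList<>();
        partials.add(mapPhase(nodeA));
        partials.add(mapPhase(nodeB));

        System.out.println(reducePhase(partials));
    }
}
```

In a real data grid the map phase would run where the data lives, so only the small partial counts travel over the network.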

Project

GridLogZ is a set of components that enable the persistence and analysis of log messages from JBoss EAP in JBoss Data Grid (JDG).
https://github.com/shane-k-j/gridlogz

  • Common – The model. It is based on the java.util.logging.LogRecord class.
  • Services – REST services for persisting and analyzing log messages.
  • Log Handler – A log handler that persists log messages via the services.
  • Web – The front end. HTML5 + D3 (link)
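As a rough illustration of the log handler idea, the sketch below extends java.util.logging.Handler and stores each LogRecord in a map. This is not the GridLogZ code: the class name is hypothetical, and the in-memory map is an assumption standing in for the REST services and the data grid behind them.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical handler: keeps every published record in a map that stands in
// for the remote store.
final class MapBackedLogHandler extends Handler {

    private final Map<Long, LogRecord> store = new ConcurrentHashMap<>();
    private final AtomicLong nextKey = new AtomicLong();

    @Override
    public void publish(LogRecord record) {
        // Respect the handler's level and filter before storing.
        if (record == null || !isLoggable(record)) {
            return;
        }
        store.put(nextKey.incrementAndGet(), record);
    }

    @Override
    public void flush() {
        // Nothing is buffered in this sketch.
    }

    @Override
    public void close() {
        // No resources to release in this sketch.
    }

    Map<Long, LogRecord> store() {
        return store;
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("org.example.demo");
        logger.setUseParentHandlers(false); // keep the console quiet
        MapBackedLogHandler handler = new MapBackedLogHandler();
        logger.addHandler(handler);

        logger.info("user logged in");
        logger.warning("cache miss");

        for (LogRecord r : handler.store().values()) {
            System.out.println(r.getLevel() + " " + r.getLoggerName() + " " + r.getMessage());
        }
    }
}
```

A real handler would replace the map write with a call to the persistence service, and would probably do so asynchronously so that logging does not block the application.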

Screenshot

This is a tree map chart that shows the number of log records per logger (package) and per class. A class is displayed as a rectangle. The size of the rectangle is based on the number of log messages. Further, the class rectangles are grouped by package to form a larger rectangle.
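The data behind such a chart is a two-level count: package first, then class. The sketch below shows one way to build it in plain Java. The class and method names are hypothetical, and it assumes each log record contributes a fully qualified class name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: builds package -> class -> count, the shape a tree map needs.
final class TreemapCountsSketch {

    static Map<String, Map<String, Integer>> countByPackageAndClass(List<String> classNames) {
        Map<String, Map<String, Integer>> counts = new TreeMap<>();
        for (String fqcn : classNames) {
            // Split the fully qualified name at the last dot.
            int dot = fqcn.lastIndexOf('.');
            String pkg = dot < 0 ? "(default)" : fqcn.substring(0, dot);
            String cls = fqcn.substring(dot + 1);
            counts.computeIfAbsent(pkg, k -> new TreeMap<>()).merge(cls, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up source class names, one per log record.
        List<String> classNames = Arrays.asList(
                "org.example.web.LoginServlet",
                "org.example.web.LoginServlet",
                "org.example.dao.UserDao");
        System.out.println(countByPackageAndClass(classNames));
    }
}
```

Each inner count maps to the size of a class rectangle, and each outer entry maps to the enclosing package rectangle.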

gridlogz-treemap

Screencast

This is the screencast that was shown at Red Hat Summit last week. It is one part presentation and one part demonstration. It does not include an audio soundtrack.


What’s Next?

I’m planning to publish a post tomorrow morning with a screencast that demonstrates how to build and install GridLogZ with JBoss EAP and JDG. In addition, I’m planning to update the README on GitHub. While the UI is limited to a few charts with the initial commit, I’m writing distributed tasks to retrieve log messages (e.g. by time) and return them as JSON. However, I could use some help with the front end.
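As a rough idea of what retrieving log messages by time and returning them as JSON could look like, here is a small sketch in plain Java. Everything in it is an assumption for illustration: the LogEntry holder, the method names, and the JSON field names are hypothetical, and the filtering runs locally rather than as a distributed task.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: select entries in a time range and render them as JSON.
final class TimeRangeJsonSketch {

    // Minimal stand-in for a stored log message.
    static final class LogEntry {
        final long millis;
        final String logger;
        final String message;

        LogEntry(long millis, String logger, String message) {
            this.millis = millis;
            this.logger = logger;
            this.message = message;
        }
    }

    // Returns entries with fromMillis <= millis < toMillis, oldest first.
    static List<LogEntry> between(Collection<LogEntry> entries, long fromMillis, long toMillis) {
        return entries.stream()
                .filter(e -> e.millis >= fromMillis && e.millis < toMillis)
                .sorted(Comparator.comparingLong((LogEntry e) -> e.millis))
                .collect(Collectors.toList());
    }

    // Escapes quotes, backslashes, and control characters for a JSON string.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    static String toJson(List<LogEntry> entries) {
        return entries.stream()
                .map(e -> "{\"millis\":" + e.millis
                        + ",\"logger\":\"" + escape(e.logger)
                        + "\",\"message\":\"" + escape(e.message) + "\"}")
                .collect(Collectors.joining(",", "[", "]"));
    }

    public static void main(String[] args) {
        List<LogEntry> entries = Arrays.asList(
                new LogEntry(300L, "org.example.web", "request done"),
                new LogEntry(100L, "org.example.web", "request start"),
                new LogEntry(200L, "org.example.dao", "query ran"));
        System.out.println(toJson(between(entries, 100L, 300L)));
    }
}
```

A production version would use a JSON library instead of hand-built strings; the manual escaping here only keeps the sketch free of dependencies.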

Update: As promised: build, install, and configure GridLogZ (link).

