A few thoughts on aaLog

A couple of weeks ago I posted the first commit for aaLog:

https://github.com/aaOpenSource/aaLog

I posted a few thoughts on some LinkedIn groups but I thought it would be more appropriate to write in a longer form here on our home blog.

The Beginnings

For many years I’ve been frustrated by the limited nature of the log management software built into Wonderware products.  The SMC is a pretty nice interface and gives you some powerful functions to sift and sort your logs.  Unfortunately you can only look at the logs from a single host at a time.  Yes, you can jump from host to host easily, but this simply isn’t practical when you are trying to troubleshoot an issue that spans multiple platforms.  Take the simplest example, a deploy.  It starts on the GR, hops to the platform, and then goes back to the GR.  Why can’t I see the entire process in a unified view?  In the past I read bits and pieces about syslog and it seemed like a nice solution to this issue.  My only problem was that the Wonderware logging subsystem had no feature for forwarding logs to syslog, nor was there a pre-built collector for Wonderware logs, unlike, say, 90% of the other software in the world.  Once again we were living in our own little world where we can’t play with the toys everyone else is playing with.

The Motivation

At the Dallas users conference a few years ago I was having a rollicking good time late at night with a few folks, one of them an engineer who had recently done some work around this exact pain point.  He had the same misgivings I did and he decided to do something about it.  He built a tool that would read the log files and forward them to several destinations: SQL Server, CSV, and syslog.  I saw the work and was duly impressed.  He had a nice GUI for configuring the service and everything.  The core code was quite nicely thought out and I thought it was a winner.  Unfortunately he could never get support from the PTB (powers that be) to actually productize and release the tool, either as a free tool or as something that customers might pay for.

The Opportunity

A few times I made half-hearted attempts at recreating his work so that it could be shared with the world but I floundered each time.  Why wouldn’t the file watcher start working?  Why can’t I read the log records?  So what does any good engineer do when they can’t get someone else’s code to work like they expect?  They study the functionality of the code and then recreate it in a form they better understand.  Sometimes the constructs of the original code are brilliant and you just copy them straight up.  Sometimes they may be brilliant but you simply don’t understand them, so you instead try to figure out the inputs and the results and then build your own algorithms to mimic them.

The Boundaries

As I alluded to in the previous section, one of the really mind-numbing issues I ran into was the fact that I couldn’t get the stupid file watcher to fire every time the log file was updated.  So instead of wasting my time with that I decided I would work on the most important piece: the actual log reader functionality.  When you review the repo you will see a few example projects that actually use the library, but these are really just basic examples to get you up and running.  They are not intended to be finished works.

The Understanding

The first step in writing this library involved understanding the log files themselves.  The first thing you notice when attempting to read the log files is that if you open them in a text editor like Notepad++ they simply look like gobbledygook.

Log File gobbledygook

However, as you look a little closer you can definitely see readable text in the messages.  But how do you figure out the delimiters, and what the heck is that first line?

What we have here are binary log files.  What are binary log files, you ask?  A binary log file is one that is written as raw bytes instead of readable text.  Because I’m simply not that smart I don’t know a better way to explain it in words.  But I can explain the pseudocode to read it.

  1. Open the file for reading
  2. Don’t read text from the file.  Instead read the actual bytes.
  3. Study the array of bytes returned and try to figure out which values are numbers and which values are text.  Text fields will typically have a NULL between them to signal that we are changing from one field to the next.  In a typical text log this might be a tab or a comma.  One good way to do this is to stop your code when you have the raw array of bytes and dump the contents to Excel.  From there you can more easily see the patterns.
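
As a rough illustration of the three steps above, here is a minimal Python sketch.  It is not the aaLog code, and the sample record is invented for the example: a 4-byte little-endian number followed by two NULL-terminated text fields, assuming single-byte text.  The function names are hypothetical, and the real layouts are the ones described in LogRecordFormat.md and LogHeaderFormat.md.

```python
import csv
import io
import struct


def read_raw_bytes(path):
    """Steps 1 and 2: open the file in binary mode and return the raw bytes."""
    with open(path, "rb") as handle:
        return handle.read()


def split_null_delimited(data, encoding="utf-8"):
    """Step 3: split bytes on NULL (0x00) and decode each non-empty piece.

    Assumes single-byte text; a real format may use a different encoding.
    """
    return [
        part.decode(encoding, errors="replace")
        for part in data.split(b"\x00")
        if part
    ]


def bytes_to_csv(data, width=16):
    """Render bytes as CSV rows (offset, hex values, printable characters)
    so they can be opened in a spreadsheet to look for patterns."""
    out = io.StringIO()
    writer = csv.writer(out)
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_cells = [f"{b:02X}" for b in chunk]
        printable = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        writer.writerow([offset] + hex_cells + [printable])
    return out.getvalue()


if __name__ == "__main__":
    # Invented sample record: one number, then two NULL-terminated text fields.
    sample = struct.pack("<I", 42) + b"Info\x00Deploy started\x00"
    number = struct.unpack_from("<I", sample, 0)[0]
    fields = split_null_delimited(sample[4:])
    print(number, fields)
    print(bytes_to_csv(sample))
```

The CSV dump plays the role of the Excel step: with one byte per cell, runs of readable characters and the NULLs between them tend to stand out.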

Here are links to a couple of documents on GitHub showing details about each of the formats.

LogRecordFormat.md

LogHeaderFormat.md

If you want a better understanding of what each of the ASCII codes means check out this link on Wikipedia.

One question you might have is why in the world would someone write log files in a cryptic, unreadable (at least with a text reader) format?

Well, after an exhaustive search (ok, about 2 minutes on Google) I found someone who agreed with my understanding so I shall cite them as an authoritative source.

From https://gogs.io/github.com/eliothedeman/binlog

Why should I use a binary log?

Binary logs are far more compact than text based logs, especially when using a text based encoding like JSON or XML. Binary logs also offer a more CPU efficient solution to encoding and decoding large logs of instruction sets.
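
To make the size argument concrete, here is a small Python sketch that encodes the same record once as packed bytes and once as JSON.  The record layout (an 8-byte timestamp, a 4-byte process id, a 2-byte message length, then the message) and the function names are assumptions made up for this comparison.  They are not the Wonderware format or the binlog library's format.

```python
import json
import struct

# Hypothetical fixed header: 8-byte timestamp, 4-byte process id,
# 2-byte message length, all little-endian with no padding.
HEADER = "<QIH"


def encode_binary(timestamp, process_id, message):
    """Pack the record as a fixed header followed by the UTF-8 message bytes."""
    body = message.encode("utf-8")
    return struct.pack(HEADER, timestamp, process_id, len(body)) + body


def decode_binary(data):
    """Reverse encode_binary: unpack the header, then slice out the message."""
    timestamp, process_id, length = struct.unpack_from(HEADER, data, 0)
    start = struct.calcsize(HEADER)
    return timestamp, process_id, data[start:start + length].decode("utf-8")


def encode_json(timestamp, process_id, message):
    """Encode the same record as a JSON object, field names included."""
    record = {
        "timestamp": timestamp,
        "process_id": process_id,
        "message": message,
    }
    return json.dumps(record).encode("utf-8")


if __name__ == "__main__":
    binary = encode_binary(1419984089, 1234, "Deploy started")
    text = encode_json(1419984089, 1234, "Deploy started")
    print(len(binary), "bytes as binary,", len(text), "bytes as JSON")
```

The binary form spends a fixed 14 bytes on the header no matter how large the numbers are, while the JSON form repeats every field name and writes each number out digit by digit.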

The Work

Over the course of a long weekend and a few extra days I trudged through some of the original log reader DLL code to understand exactly how it worked.  Once you understand the format of the log files the rest is just work: adding features and generally making the library easier for others to use.

So What Now?

I will continue adding features as suggestions come in but what I would really like is for others to start building tools and services around this log reader.

3 thoughts on “A few thoughts on aaLog”

  1. Pretty cool. It would be neat to make that integrate with Microsoft’s Log Parser http://technet.microsoft.com/en-us/scriptcenter/dd919274.aspx and a tool that uses the Log Parser called Log Parser Lizard. If you are not familiar with Log Parser, it is a command line tool that lets you execute SQL queries against log files, XML files, CSV files, etc.

    Log Parser Lizard is a GUI wrapper around the Log Parser. You can see more of it here.
    http://www.hanselman.com/blog/AnalyzeYourWebServerDataAndBeEmpoweredWithLogParserAndLogParserLizardGUI.aspx

    1. Or… you could just send things to Splunk and use all the power built into that bad boy. It’s free for up to 500 MB a day as long as you don’t need any security 🙂

      Check out the newest addition. A Splunk Modular Input

      -andy

  2. Very interesting. So could I use Log Parser in connection with my DLL to let it do the heavy lifting in terms of providing front ends and alternate output formats? That opens up some really interesting possibilities to complement something like sending it to syslog directly.

    Care to take on the challenge of producing a small POC for it? I’m sure I know someone who will accept your pull request on the gits repo 🙂

    -a
