A couple of weeks ago I posted the first commit for aaLog.
I posted a few thoughts on some LinkedIn groups, but I thought it would be more appropriate to write in a more long-form style here on our home blog.
For many years I’ve been frustrated by the limited nature of the built-in Log Management software with Wonderware products. The SMC is a pretty nice interface and gives you some powerful functions to sift and sort your logs. Unfortunately, you can only look at the logs from a single host at a time. Yes, you can jump from host to host easily, but this simply isn’t very practical when you are trying to troubleshoot an issue that spans multiple platforms. Take the simplest example, a deploy. It starts on the GR, hops to the platform, and then back to the GR. Why can’t I see the entire process in a unified view? In the past I read bits and pieces about syslog and it seemed like a nice solution to this issue. My only issue was that the Wonderware Logging subsystem did not have a feature to allow for forwarding logs to syslog, nor was there a pre-built collector for Wonderware logs, unlike, say, 90% of the other software in the world. Once again we were living in our own little world where we can’t play with the toys everyone else is playing with.
At the Dallas users conference a few years ago I was having a rollicking good time late at night with a few folks, one of them being an engineer who had recently done some work around this exact pain point. He had the same misgivings as I did and he decided to do something about it. He built a tool that would read the log files and forward them to several different destinations: SQL Server, CSV, and syslog. I saw the work and was duly impressed. He had a nice GUI for configuring the service and everything. The core code was quite nicely thought out and I thought it was a winner. Unfortunately, he could never get support from the PTB (powers that be) to actually productize and release the tool, either as a free tool or as something that customers might pay for.
A few times I made half-hearted attempts at recreating his work so that it could be shared with the world, but I floundered each time. Why wouldn’t the file watcher start working? Why can’t I read the log records? So what does any good engineer do when they can’t get someone else’s code to work like they expect? They study the functionality of the code and then recreate it in a form they better understand. Sometimes the constructs of the original code are brilliant and you just copy them straight up. Sometimes they may be brilliant but you simply don’t understand them, so you instead try to figure out the inputs and the results and then build your own algorithms to mimic them.
As I alluded to in the previous section, one of the really mind-numbing issues I ran into was the fact that I couldn’t get the stupid file watcher to fire every time the log file was updated. So instead of wasting my time with that, I decided I would work on the most important piece: the actual log reader functionality. When you review the repo you will see a few example projects that actually use the library, but these are really just basic examples to get you up and running. They are not intended to be finished works.
The first step in writing this library involved understanding the log files themselves. The first thing you notice when attempting to read the log files is that if you open them in a text editor like Notepad++ they simply look like gobbledygook.
However, as you look a little closer you can definitely see something in the messages but how to figure out delimiters and what the heck is that first line?
What we have here are binary log files. What are binary log files you ask? A binary log file is one that is written to in byte form instead of readable text. Because I’m simply not that smart I don’t know a better way to explain it in words. But I can explain the pseudocode to read it.
- Open the file for reading
- Don’t read text from the file. Instead read the actual bytes.
- Study the array of bytes returned and try to figure out which values are numbers and which values are text. The text fields typically will have a NULL in between them to signal that we are changing from one field to the next. In a typical text log this might be a tab or a comma. One good way to do this is to stop your code when you have the raw array of bytes and dump the contents to Excel. From there you can more easily see the patterns.
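The steps above can be sketched in a few lines of Python. To be clear, this is only an illustration and not the aaLog code or the real Wonderware record layout: the record layout (a 4-byte little-endian number followed by NULL-separated single-byte text) and the function names `read_raw`, `parse_record`, and `dump_bytes_csv` are all made up for the example. The real files may well use a different text encoding and field order.

```python
import struct


def read_raw(path):
    """Open the file in binary mode and return the actual bytes, not text."""
    with open(path, "rb") as handle:
        return handle.read()


def split_null_terminated(raw):
    """Split a run of bytes into text fields, using NULL (0x00) as the delimiter."""
    return [
        chunk.decode("ascii", errors="replace")
        for chunk in raw.split(b"\x00")
        if chunk  # skip the empty piece after a trailing NULL
    ]


def parse_record(raw):
    """Parse a HYPOTHETICAL record: 4-byte unsigned int, then NULL-separated text."""
    (number,) = struct.unpack_from("<I", raw, 0)
    return {"number": number, "fields": split_null_terminated(raw[4:])}


def dump_bytes_csv(raw):
    """Build CSV text with one row per byte, suitable for opening in Excel."""
    lines = ["offset,dec,hex,char"]
    for offset, value in enumerate(raw):
        # Only show printable ASCII; leave out characters that would break the CSV.
        printable = 32 <= value < 127 and chr(value) not in ',"'
        char = chr(value) if printable else ""
        lines.append(f"{offset},{value},{value:02X},{char}")
    return "\n".join(lines)


# A made-up record, built in memory so the example runs without a real log file.
sample = struct.pack("<I", 1234) + b"Deploy\x00GR\x00Started\x00"
record = parse_record(sample)
print(record)
print(dump_bytes_csv(sample[:6]))
```

Dumping one byte per row is what makes the patterns jump out: the runs of printable characters are your text fields, the zeros between them are the delimiters, and whatever is left over in fixed-width groups is probably a number.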
Here are links to a couple documents on GitHub showing details about each of the formats.
If you want a better understanding of what each of the ASCII codes means check out this link on Wikipedia.
One question you might have is why in the world would someone write log files in a cryptic, unreadable (at least with a text reader) format?
Well, after an exhaustive search (ok, about 2 minutes on Google) I found someone who agreed with my understanding so I shall cite them as an authoritative source.
Why should I use a binary log?
Binary logs are far more compact than text based logs, especially when using a text based encoding like JSON or XML. Binary logs also offer a more CPU efficient solution to encoding and decoding large logs of instruction sets.
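To get a rough feel for the size claim, here is a small Python comparison of the same record encoded as JSON and as packed bytes. The record fields and the binary layout (two 4-byte integers followed by NULL-terminated text) are invented for the example and are not the Wonderware format.

```python
import json
import struct

# A made-up log record; the field names are illustrative only.
record = {"timestamp": 1700000000, "process_id": 4321, "message": "Deploy started"}

# Text encoding: every field name and every digit is spelled out as characters.
text_encoded = json.dumps(record).encode("utf-8")

# Binary encoding (hypothetical layout): two 4-byte little-endian unsigned
# integers, then the message as ASCII with a NULL terminator.
message_bytes = record["message"].encode("ascii") + b"\x00"
binary_encoded = (
    struct.pack("<II", record["timestamp"], record["process_id"]) + message_bytes
)

print("JSON bytes:  ", len(text_encoded))
print("binary bytes:", len(binary_encoded))
```

Most of the savings come from not repeating the field names in every record and from storing numbers in a fixed number of bytes instead of one character per digit.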
Over the course of a long weekend and a few extra days I trudged through some of the original log reader DLL code to understand exactly how it worked. After understanding the format of the log files the rest is just work, adding features and in general making the library easier to use for others.
So What Now?
I will continue adding features as suggestions come in but what I would really like is for others to start building tools and services around this log reader.