I've come up with the following so far:
- An event is something with "when, where, and what"-ness. Other properties, such as significance and who reported it, are secondary. The key bits are when the event occurred, where it occurred, and what the event was.
- Software logs happen to have these three key properties. Put simply: store them in a database that lets you search over a range of times and you have yourself a time machine.
- Couple this with visualizations and statistical analysis. Trends are important. Automatic novelty detection is important.
- Trends can be seen by viewing data over time, either visually or through a formula (though the former is easier for Joe Average to read). An example trend would be a gradual increase in disk usage over a period of time.
- Novelty detection can be done in a number of ways. Something as simple as a homoskedasticity test (a check that the variance of the data stays constant) could show whether the data were behaving "normally" - though homoskedasticity tests only work well for linear models, iirc. I am not a statistician.
- Trend calculation can provide useful models for predicting resource exhaustion, MTBF, and other important data.
- Novelty detection aids in fire fighting and post-hoc "Oops it's broken" forensics.
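The "time machine" idea from the list above can be sketched in a few lines. This is only a rough illustration, assuming SQLite as the database and Unix timestamps for the "when"; the table layout and the names `open_eventdb`, `record`, and `events_between` are made up for this example and are not part of any existing eventdb code.

```python
import sqlite3


def open_eventdb(path=":memory:"):
    """Open (or create) a hypothetical event store with when/where/what columns."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " ts REAL NOT NULL,"       # when: Unix timestamp (an assumption)
        " source TEXT NOT NULL,"   # where: host or service name
        " message TEXT NOT NULL)"  # what: free-form description
    )
    # An index on the timestamp keeps range searches cheap as the table grows.
    conn.execute("CREATE INDEX IF NOT EXISTS events_ts ON events (ts)")
    return conn


def record(conn, ts, source, message):
    """Store one event."""
    conn.execute("INSERT INTO events VALUES (?, ?, ?)", (ts, source, message))


def events_between(conn, start, end):
    """Return events with start <= ts < end, oldest first."""
    cur = conn.execute(
        "SELECT ts, source, message FROM events"
        " WHERE ts >= ? AND ts < ? ORDER BY ts",
        (start, end),
    )
    return cur.fetchall()
```

Calling `events_between(conn, start, end)` with two timestamps is the whole "time machine": everything that happened anywhere in that window, in order.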
The overall goal is to partially automate problem detection and to make searching for a problem's cause and effect much easier.
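To make the trend and novelty ideas concrete, here is a small sketch using only the Python standard library. It assumes a straight-line fit is good enough for something like disk usage, and it swaps in a plain z-score check for novelty rather than the homoskedasticity test mentioned earlier, simply because it is the easiest thing to show. The function names and the threshold of 3 standard deviations are my own placeholders.

```python
from statistics import mean, stdev


def linear_trend(times, values):
    """Least-squares line through (time, value) points; returns (slope, intercept)."""
    t_bar, v_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, v_bar - slope * t_bar


def time_until(limit, slope, intercept):
    """Time at which the fitted line reaches `limit`, or None if it is not growing."""
    if slope <= 0:
        return None
    return (limit - intercept) / slope


def novel_points(values, threshold=3.0):
    """Indexes of values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []  # perfectly flat data has nothing novel in it
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]
```

For example, disk usage of 50, 52, 54, 56, 58 percent on days 0 through 4 fits a slope of 2 percent per day, so `time_until(100, 2.0, 50.0)` puts exhaustion at day 25. A real tool would want something sturdier than a global mean and standard deviation, since one large outlier inflates both.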
The eventdb system will likely support many interfaces:
- syslog - hosts can log directly to eventdb
- command line - push data to eventdb from scripts or by hand
- generic numeric data input - a lame frontend to rrdtool, perhaps
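Whatever the interface, the input has to be boiled down to when, where, and what. As a rough sketch of the syslog case, the following assumes the traditional BSD-style line layout ("Mon DD HH:MM:SS host message"), which carries no year, so the caller has to supply one. The function name `parse_syslog_line` is hypothetical, and the month parsing relies on an English locale.

```python
import re
from datetime import datetime

# Traditional syslog layout: "Oct 11 22:14:15 host rest-of-message".
SYSLOG_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<msg>.*)$"
)


def parse_syslog_line(line, year):
    """Split one syslog line into (when, where, what), or None if it does not match."""
    m = SYSLOG_RE.match(line)
    if m is None:
        return None
    # The line has no year, so prepend the one supplied by the caller.
    when = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
    return when, m.group("host"), m.group("msg")
```

A command line frontend could reuse the same three-field shape, taking the host and message as arguments and defaulting the time to "now".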
This sort of trend and novelty mapping would be extremely useful in a production software environment for comparing configuration or software changes. That is, last month's syscall averages might be much lower than this month's - and perhaps the only change was a configuration file change or new software being pushed to production. You would be able to track the problem back to when it first showed up - hopefully correlating with some change that was known about. After all, the first step in solving a problem is knowing of its existence.
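The before/after comparison described above could look something like this. It is a bare-bones sketch that assumes the metric arrives as (timestamp, value) pairs and that the time of the change is already known; `compare_around` is a name invented for this example.

```python
from statistics import mean


def compare_around(samples, change_ts):
    """Compare the average of a metric before and after a known change.

    `samples` is an iterable of (timestamp, value) pairs. Returns
    (mean_before, mean_after, relative_change), where relative_change is
    None when the "before" average is zero.
    """
    samples = list(samples)
    before = [v for ts, v in samples if ts < change_ts]
    after = [v for ts, v in samples if ts >= change_ts]
    if not before or not after:
        raise ValueError("need samples on both sides of the change")
    mean_before, mean_after = mean(before), mean(after)
    if mean_before == 0:
        return mean_before, mean_after, None
    return mean_before, mean_after, (mean_after - mean_before) / mean_before
```

A jump in the relative change right at a known config push is not proof of cause, but it is a good hint about where to start looking.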
My experience with data analysis techniques is not extensive, so I wouldn't expect the data analysis tools in the prototype to sport anything fancy.
I need more hours in a day! Oh well, back to doing homework. Hopefully I'll have some time to prototype something soon.