Growing logstash's value

I spent a while today thinking about nerdy stuff - logstash, etc. I want to grow logstash in terms of performance, use cases, deployments, happy users, and community.

While musing on my mental roadmap for logstash, I found that most things boil down to costs and returns on investment, even with open source software. Money, time, energy, and patience are all costs. Just because something doesn't cost any money doesn't mean it won't consume any time or energy.

I see two distinct groups of users with respect to cost: new users and existing users. In terms of inputs and outputs, the phrase 'return on investment' comes to mind. New users are likely in the evaluation and investigation phase, mainly estimating ROI or judging, "is this solution useful to me?" Existing users are likely in the maintenance and integration phases, mainly trying to maintain or improve the value the tool provides.

These two user groups are, in my observation and from a seller's perspective, quite distinct in terms of strategy. How you acquire happy new users is not necessarily how you maintain and energize existing users.

Targeting New Users

New users easily stumble on bad documentation, complex architectures, and excessive steps.

For new users, I have two critical goals: reducing mean time to demo and mean time to deploy. Reducing 'time to demo' means minimizing the steps required to answer 'is logstash useful?' and ensuring each remaining step is as simple as possible. Reducing 'time to deploy' means publishing high-quality init scripts (upstart, systemd, sysv init) as well as high-quality puppet, chef, and cfengine modules.

The time someone spends as what I consider a 'new user' is actually quite short. My goal with users in this stage is to help them quickly and accurately answer, "will this tool benefit me?"

Targeting Existing Users

As an existing user of a tool, I'm often looking for ways to reduce operational, maintenance, and integration costs. Operational costs appear as physical resource usage (servers, or fuel-like resources). Maintenance costs appear as bugs and related investigations, monitoring, etc. Integration costs appear as time and energy spent making the tool work well with your infrastructure.

These kinds of users are usually a bit more invested in the tool, but I want to avoid abusing that fact. Time and energy invested in a free tool can create as much vendor lock-in as money invested in a commercial one. Don't treat your existing users like dirt, right?

My goal for the next few months is to become one of these existing users of logstash (to date, as I've confessed, I've never run logstash in production). I'll be able to do that at DreamHost (starting tomorrow!).

Additionally, I will be focusing strongly on improving the logstash new-user experience. Happier new users should reflect nicely in community growth and activity. That's my goal, anyway!

logstash's first major release - 1.0.0

Ready for log and event management that doesn't suck or drain your budget? It's time to logstash.

After lots of refactoring and improvement since the first minor release last November, logstash is now ready for wider usage.

Read my announcement here.

The logstash site is also online and has docs, intros, slides, and videos.

http://logstash.net

Happy logstashing!

logstash is ready for use

I've talked for a while about logging problems: parsing, storing, searching, reacting.

Today is the first release of logstash.

What is logstash? Free and open source log/event management and search. It's designed to scale, to help you more easily manage your logs, and to give you a great way to search them for debugging, postmortems, and other analysis.

You can read more about it on logstash.googlecode.com.

Smart logging hacks in ruby

Ruby has Logger. It is good, but strings suck. In a world where more and more people are using log data for inputs and analysis, structured data is good. I want to log structured data.

This led to me subclassing Logger and providing my own log formatter class. The code for this is in logstash, in logging.rb.

What did I add? I had two main goals: first, improve context; second, log structured data (objects). This is achieved through style changes (log objects, not strings), awesome_print support, code context on each log entry (file/line/method), etc.

To support the first goal (context), if the log level is 'debug', I inspect the call stack and include the file and line of the code doing the logging. I also set the 'progname' to the name of the program by default. To support the second goal, I log objects and improve how they are formatted into strings with Object#inspect (or awesome_inspect, if available).
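Here's a rough sketch of the idea, assuming only the stock Ruby Logger. This is not the actual logstash code (that lives in logging.rb), and the class names here are illustrative:

require "logger"

begin
  require "ap"  # awesome_print is optional; we degrade gracefully without it
rescue LoadError
end

class StructuredFormatter < Logger::Formatter
  # msg2str is what the default formatter uses to turn the logged object
  # into text; override it to prefer awesome_inspect over plain inspect.
  def msg2str(msg)
    return msg if msg.is_a?(String)
    msg.respond_to?(:awesome_inspect) ? msg.awesome_inspect : msg.inspect
  end
end

class StructuredLogger < Logger
  def initialize(*args)
    super
    self.formatter = StructuredFormatter.new
  end

  def add(severity, message = nil, progname = nil, &block)
    # Logger shuffles its arguments: logger.warn("x") arrives here with
    # the message in 'progname'. Untangle that before adding context.
    if message.nil? && !block_given?
      message, progname = progname, @progname
    end
    # Only include file/line/method context at DEBUG level; walking the
    # call stack is slow, and the extra detail is noisy otherwise.
    # caller(2) assumes we were called via a level helper like #warn.
    if level == DEBUG && caller(2).first =~ /^(.+?):(\d+)(?::in `(.*)')?/
      progname = "#{$1}:#{$2}##{$3}"
    end
    super(severity, message, progname, &block)
  end
end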

Some examples:

>> logger = LogStash::Logger.new(STDOUT)
I, [2010-11-12T15:19:48.388469 #18782]  INFO -- : ["Failed: require 'ap' (aka awesome_print); some logging features may be disabled", #<LoadError: no such file to load -- ap>]
# This is an example of what javascript folks call 'progressive enhancement'
# - we still function if awesome_print is not available.

>> logger.level = Logger::WARN
>> logger.warn("Hello")
W, [2010-11-12T15:20:05.465705 #18782]  WARN -- irb: Hello

>> logger.warn(["rejecting bad client", { :client => "1.2.3.4" }])
W, [2010-11-12T15:21:04.639404 #18782]  WARN -- irb: ["rejecting bad client", {:client=>"1.2.3.4"}]

>> logger.level = Logger::DEBUG
>> logger.warn("Hello")
W, [2010-11-12T15:21:57.754874 #18782]  WARN -- (irb):12#irb_binding: Hello
# Notice the context (file, line, method)       ^^^^^^^^^^^^^^^^^^^^
# Context is only added when the DEBUG level is set, due to performance
# and verbosity concerns.

The main benefit, for me, is logging objects instead of strings, which you can technically do today with the standard Logger. However, the standard Logger doesn't play nicely with awesome_print or add file/line/method context. Logging objects also lets you later hook a smarter error-handling tool up to your logging, one that can inspect the structured data rather than having to regex its way through a single string.

If you have awesome_print available, the object output by my formatter gets even more useful for human viewing.

Why log structured? Easier to parse and query later, like in a logstash query.
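As a small, hypothetical illustration of that point (not code from logstash): with structured events you pull fields out directly, while flat strings leave you writing regexes.

# A structured event: a message plus a hash of fields, as in the
# 'rejecting bad client' example above.
message, fields = "rejecting bad client", { :client => "1.2.3.4" }
puts fields[:client]                 # => 1.2.3.4

# The flat-string version of the same event forces you to regex your
# way back to data you already had.
line = 'rejecting bad client {:client=>"1.2.3.4"}'
puts line[/:client=>"([^"]+)"/, 1]   # => 1.2.3.4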

New project: eventmachine-tail

Logstash uses EventMachine, an event-driven library for Ruby. Part of logstash's requirements is the ability to watch logfiles like 'tail -f' would. Previously, I was using File::Tail, but it was not EventMachine-friendly.

Additionally, it's pretty common for applications to write to files with generated names that include timestamps, etc., so it was clear logstash would need a way to watch a pattern of files, like a glob such as /var/log/*.log

Thus was born eventmachine-tail.
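Usage looks something like this; a minimal sketch based on the gem's file_tail helper (check the project README for the current API):

require "rubygems"
require "eventmachine-tail"

EventMachine.run do
  # Watch a single file the way 'tail -f' would; the block is called
  # once for each new line appended to the file.
  EventMachine.file_tail("/var/log/messages") do |filetail, line|
    puts line
  end
end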

You can install it with:

gem install eventmachine-tail

And try the 'rtail' tool that comes with it:

rtail -x "*.gz" "/var/log/**/*"

The project is hosted on github: jordansissel/eventmachine-tail.

This is the first project I've tried to use git with. I'm not really happy with git as it only seems to complicate my workflow, but I'll try to stick with it.