Personal Workflow Blog

Tag - munin

Saturday, 13 April 2013

Spinoffs in the munin ecosystem

KISS is the core design of Munin

Munin's greatest strength is its very KISS architecture. It therefore gets many things right, such as great modularity.

Each component (master/node/plugin) has a simple API to communicate with the others.

Spin-offs ...

I admit that the master, and even the node, have convoluted code. In fact, some rewrites already exist.

... are welcomed ...

And they are a really good thing, as they enable rapid prototyping of things that the stock munin (currently) has trouble doing.

The stock munin is a piece of software that many depend upon, so it has to move at a much slower pace than one would like, myself included. As much as I really want to add many features to it, I still have to take extra care that it doesn't break anything, even the least-known features.

So I take munin's offspring very seriously and offer as much help as I can for them to succeed.

... because they are very valuable in the long term

In my opinion, competition is only bad in the short term; in the long term it usually adds significant value to the whole ecosystem. That said, there's always a risk of slowly becoming irrelevant, but I think that's the real power of open source's evolutionary paradigm : embrace the competitors or become obsolete and get replaced.

After all, if someone takes the time to write a competitor that has real threat potential, it mostly means that there's a real itch to scratch and that there is a lot to learn from it.

Different layers of spin-offs

The munin ecosystem is divided into 3 main categories, obviously related to the 3 main components of munin : master, node & plugin.

Plugins

That's the most obvious part as custom plugins are the real bread and butter of munin.

Stock plugins are mostly written in Perl or POSIX shell, as Perl is munin's own language and POSIX shell is ubiquitous. This is reflected in core munin providing 2 libraries (Perl & shell) to help plugin authoring.

So, it's quite natural that each mainstream language has grown its own plugin library. Some languages even have two of them.
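
To give an idea of how little a plugin needs, here is a minimal sketch in Python. It assumes the usual plugin convention: called with the config argument, the plugin prints the graph description; called without argument, it prints one value line per field. The metric (number of entries in the temporary directory) and the field name files are made up for the example and do not come from any stock plugin.

```python
#!/usr/bin/env python3
"""Minimal munin plugin sketch (hypothetical metric: entries in the temp dir)."""
import os
import sys
import tempfile


def config_lines():
    """Lines printed when the node runs the plugin with the "config" argument."""
    return [
        "graph_title Files in temp directory",
        "graph_vlabel files",
        "graph_category system",
        "files.label files",
    ]


def fetch_lines(count=None):
    """Lines printed when the plugin runs without argument: one value per field."""
    if count is None:
        # Assumption for the example: the metric is the number of directory entries.
        count = len(os.listdir(tempfile.gettempdir()))
    return ["files.value %d" % count]


if __name__ == "__main__":
    wants_config = sys.argv[1:2] == ["config"]
    print("\n".join(config_lines() if wants_config else fetch_lines()))
```

Everything goes through standard output, which is why plugins can be written in any language.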

C

Some plugins even got rewritten in plain C, as shell plugins were shown to have a significant impact on very underpowered nodes, such as embedded routers.

Node

This component is very simple. Yet, it has to be run on all the nodes that one wants to monitor. It is currently written in Perl, and while that's not an issue on UNIX-like systems, it can be quite problematic on embedded ones.

Simple munin

The official package comes with a POSIX shell rewrite that has to be run from inetd. It is quite useful for embedded routers running OpenWRT, but still suffers from a hard dependency on POSIX shell and inetd.

SNMP

SNMP is another way to monitor nodes. While it works really well, it mostly suffers from the fact that its configuration is quite different from the usual way, so I guess some things will change on that side.

Win32 ports

Win32 has long been a very difficult OS to monitor, as it doesn't offer many of the UNIX-esque features. Yet the number of win32 nodes that one wants to monitor is quite high, and supporting them makes munin one of the few systems that can easily monitor heterogeneous systems.

Therefore, while you can install the stock munin-node, several projects emerged. We decided to adopt munin-node-win32.

Android

There's also a dedicated node for Android. It makes sense, given that Android, although Linux-derived, lacks Perl and is a mostly-Java platform. This node also has some basic capabilities of pushing data to the master instead of the usual polling.

This is especially interesting given the fact that Android nodes are usually loosely connected, so the node spools values itself and pushes them when it recovers connectivity.

Note that this is specifically an aspect that is currently lacking in munin, and I'm planning to address it in the 2.1 series. So thanks to its author for showing a relevant use-case.
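
The spool-then-push idea can be sketched in a few lines of Python. This is only an illustration of the principle, not the Android node's actual code: the Spool class, its method names and the send callback are all hypothetical.

```python
from collections import deque


class Spool:
    """Buffer samples while offline; push them in order once sending works again."""

    def __init__(self, maxlen=1000):
        # When the spool is full, the oldest samples are dropped first.
        self._pending = deque(maxlen=maxlen)

    def record(self, timestamp, field, value):
        """Store one sample locally, whatever the connectivity."""
        self._pending.append((timestamp, field, value))

    def flush(self, send):
        """Try to push every pending sample; stop at the first failure."""
        pushed = 0
        while self._pending:
            try:
                send(self._pending[0])
            except OSError:
                break  # still offline: keep the sample for the next attempt
            self._pending.popleft()
            pushed += 1
        return pushed

    def __len__(self):
        return len(self._pending)
```

A sample is only removed from the spool after send returned without error, so a connection that drops mid-flush loses nothing.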

C

That's my latest experiment. It started with a simple question : how difficult would it be to code a fairly portable version of the node ?

It turned out that it wasn't that difficult. I'm even asking myself about eventually replacing the win32 specific port with this one, as the code is much simpler. The win32 node has several plugins built in, mostly due to platform specifics. I still have to find a way to work around that, but it's in quite good shape.

This post was originally written to promote it, but while writing it I noticed that the ecosystem deserved a post of its own. So I'll write another one, specific to the C port of munin-node and plugins.

Master

The master is the most complex component, so straight rewrites of it are unlikely. Spin-offs usually take the form of a bridge between the munin protocol and another graphing system, such as Graphite.

Clients

There are also client libraries that can directly query munin nodes, in order to reuse the vast plugin ecosystem. They come in various languages, from the obvious Python and Ruby to a quite modern node.js one.
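
As a rough idea of what such a client does, here is a minimal Python sketch. It relies on my understanding of the node's plain-text protocol: the node listens on TCP port 4949 by default, greets with a banner line, and answers a fetch command with value lines terminated by a single "." line. The function names are hypothetical and do not come from any existing client library.

```python
import socket


def read_multiline(reader):
    """Collect reply lines until the terminating "." line."""
    lines = []
    for raw in reader:
        line = raw.rstrip("\r\n")
        if line == ".":
            break
        lines.append(line)
    return lines


def parse_values(lines):
    """Turn "field.value 0.42" lines into a {field: text} dict."""
    values = {}
    for line in lines:
        key, _, value = line.partition(" ")
        if key.endswith(".value"):
            # Values are kept as text, since a node may report "U" (unknown).
            values[key[: -len(".value")]] = value
    return values


def fetch(host, plugin, port=4949, timeout=5.0):
    """Ask a munin node for the current values of one plugin."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")
        stream.readline()  # skip the banner line sent by the node
        stream.write("fetch %s\n" % plugin)
        stream.flush()
        values = parse_values(read_multiline(stream))
        stream.write("quit\n")
        stream.flush()
        return values
```

Calling fetch("localhost", "load") against a running node would return something like a one-entry dict keyed by the field name; the parsing helpers can be tried without any node at all.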

Sunday, 24 February 2013

When having good relationships with package maintainers can also be a curse

I advise every user to only use the packaged version of munin. Here's a short article to explain the background of my reluctance to ask users to directly use the official tarball.

I became upstream of munin a while ago now. As such, I'm in contact with package maintainers. They take the official releases and cram them into their own distribution of choice[1].

I have to admit that the various epic war stories read throughout the web about upstream vs packagers are very far from the truth here. They are a charm to work with. Often challenging and demanding, but always because there's a real need. And that's quite a good thing, as I'm still a rookie in terms of open source software management. Therefore I'm quite grateful when they gently point out my mistakes[2].

Yet, this nice team comes at a price. Since we mostly hang out on IRC together, there is way more inter-distro communication than for other software. But I'm the sole owner of the tarball distro.

Yet, as I don't like to build everything from source, I obviously use a distro. There, since the packaging is very nicely done, I don't feel like taking the hassle of using my own "tarball" for testing. I just build a package for my distro out of the release code.

That's also a curse, as I admit that although I test the code, I only seldom test the packaging. This means that I cannot really advise anyone to use the tarball, or the git code directly, as even I don't do it.

But, that said, I still think I'm the luckiest upstream around. Thanks guys !

Notes

[1] Be it Linux-based like Gentoo, Redhat..., BSD-based like FreeBSD, OpenBSD..., or even multi-kernel based like Debian.

[2] Defaulting to CGI graphing was a move that was way too premature, end-user wise. So thanks to them, it defaults to cron again.

Friday, 1 February 2013

Avoid those milli-hits in Munin

A recurring question on IRC is : why do I have 500 million hit/s in my graph ?

It turns out that they are really seeing 500m hit/s, and that the lower-case m means milli (so 0.5 hit/s), not mega, which the metric system writes as an upper-case M. This scaling is done automatically by RRD.

To avoid this, just specify graph_scale no, as specified in the documentation.
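
To make the confusion concrete, here is a small Python illustration of SI-prefix scaling. It only mimics the idea of automatic scaling and is not RRD's actual code; the function name si_format and the list of prefixes are assumptions made for the example.

```python
def si_format(value):
    """Format a number with an SI prefix, roughly like an auto-scaled graph label."""
    prefixes = [
        (1e9, "G"),
        (1e6, "M"),   # upper-case M: mega, a million
        (1e3, "k"),
        (1.0, ""),
        (1e-3, "m"),  # lower-case m: milli, a thousandth
        (1e-6, "u"),
    ]
    for factor, symbol in prefixes:
        if abs(value) >= factor:
            return "%g%s" % (value / factor, symbol)
    return "%g" % value


print(si_format(0.5))          # half a hit per second
print(si_format(500_000_000))  # five hundred million hits per second
```

The first call yields 500m and the second 500M: a single letter's case separates a nearly idle server from a very busy one.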

Monday, 20 June 2011

Enhance RRD I/O performance in Munin 1.4 and Scale

As with most RRD-based monitoring software (Cacti, Ganglia, ...), Munin is quite difficult to scale.

The bad part is that updating lots of small RRD files seems like pure random I/O to the OS, as stated in their documentation.

The good part is that we are not alone, and therefore the RRD developers did tackle the issue with rrdcached. It spools the updates, and flushes them to disk in a batched manner, or when needed by an RRD read command such as graphing. That's why it scales well when using CGI graphing. Otherwise, munin-graph will read every RRD, and therefore force a flush of the whole cache.

And the icing on the cake is that, although it is only fully integrated into munin 2.0, you can use it right away in the 1.4.x series.

You only need to define the environment variable RRDCACHED_ADDRESS while running the scripts accessing the RRDs.

Then, you have to remove the munin-graph part of munin-cron and run it on its own cron line, usually only every hour or so, so that data can accumulate in rrdcached before being flushed to disk all at once when graphing.

Updating to 2.0 is also an option to get real CGI support (CGI exists in 1.4, but its performance is nowhere near decent).
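
The reason batching helps can be shown with a toy model in Python. This is a sketch of the principle only and has nothing to do with rrdcached's real implementation: the UpdateCache class, its methods and the disk_writes counter are all hypothetical.

```python
from collections import defaultdict


class UpdateCache:
    """Toy model: spool updates per file, write each file once per flush."""

    def __init__(self):
        self._pending = defaultdict(list)
        self.disk_writes = 0  # number of simulated write operations

    def update(self, path, timestamp, value):
        """Spool an update in memory; nothing touches the disk yet."""
        self._pending[path].append((timestamp, value))

    def flush(self, path):
        """Write all spooled updates of one file in a single batch."""
        batch = self._pending.pop(path, [])
        if batch:
            self.disk_writes += 1
        return batch

    def flush_all(self):
        """What reading every file at once forces: a flush of the whole cache."""
        for path in list(self._pending):
            self.flush(path)
```

Twelve updates spooled for one file cost one write instead of twelve, and a file that nobody reads is only written when the cache decides to, which is the behaviour that makes on-demand CGI graphing a good match.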

Monday, 23 August 2010

Waiting for Munin 2.0 - Keep more data with custom data retention plans

RRD is Munin's backbone.

Munin keeps its data in an RRD database. It's a wonderful piece of software, designed for this very purpose : keeping a history of numeric data.

All you need is to tell RRD for how long and at what precision you want to keep your data. RRD then manages all the underlying work : pruning old data, averaging to decrease precision if needed, ...

Munin automatically creates the RRD databases it needs.

1.2 - Only one set

In 1.2, every database creation was done with the same temporal & precision parameters. Since the output parameters were constant (day, week, month, year graphs), there was little need to have a different set of parameters.

1.4 - 2 sets : normal & huge

In 1.4, various users showed their need to have different graphing outputs, and began to hack around Munin's fixed graphing. It rapidly became obvious that the 1.2 preset wasn't a fit for everyone.

Therefore a huge dataset was made available, extending the finest precision (5min) to the whole Munin timeframe. This comes at a price though : more space is required, and the graph generation is slower, especially when generating the yearly one, since more data has to be read and analysed.

The switch is done for the whole munin installation by changing the system-wide graph_data_size, although already created RRD databases aren't changed. It is therefore even possible for a user to pre-customize the RRD files. Munin will then happily use them transparently, thanks to the RRD layer.

Manual overriding

Altering an RRD file after it is created is possible, but not as simple. Standard export & import from RRD take the structure with them. So data has to be moved around with special tools. rrdmove is my attempt to create such a tool. It copies data between 2 already existing RRD files, even asking RRD to interpolate the data when needed.

2.0 - Full control

Starting with 2.0, the parameter graph_data_size is per service. It also has a special mode : custom. Its format is very simple :

 
graph_data_size custom FULL_NB, MULTIPLIER_1 MULTIPLIER_1_NB, ... MULTIPLIER_N MULTIPLIER_N_NB
graph_data_size custom 300, 15 1600, 30 3000

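The first number is the number of data points kept at full resolution. It is usually followed by entries of gradually decreasing resolution, each given as a multiplier of the base resolution and a number of data points.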

A decreasing resolution has 2 uses :

  • Limit the space consumption : keeping full resolution for the whole period (default : 5min for 2 years) is sometimes too precise.
  • Increase performance : RRD will choose the best fitting resolution to generate its graphs. Already aggregated data is faster to compute.
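
To see what a custom specification actually retains, the arithmetic can be sketched in Python. The sketch assumes the 5 minute base step mentioned above and reads each entry as a multiplier of that step followed by a number of data points; the function name parse_custom is hypothetical and this is not munin's own parser.

```python
def parse_custom(spec, step_seconds=300):
    """Return (resolution_seconds, points, span_seconds) for each entry."""
    parts = [part.strip() for part in spec.split(",")]
    # The first entry is full resolution, i.e. a multiplier of 1.
    archives = [(1, int(parts[0]))]
    for part in parts[1:]:
        multiplier, points = part.split()
        archives.append((int(multiplier), int(points)))
    return [
        (mult * step_seconds, points, mult * step_seconds * points)
        for mult, points in archives
    ]


for resolution, points, span in parse_custom("300, 15 1600, 30 3000"):
    print("%5d s resolution x %4d points = %.1f days" % (resolution, points, span / 86400))
```

For the example line above, this gives 300 points at 5 minutes (25 hours), then 1600 points at 75 minutes (about 83 days), then 3000 points at 150 minutes (about 312 days).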
