Waiting for Munin 2.0 - Performance - Architecture
Munin has a very simple architecture on the master: munin-cron is launched via cron every 5 minutes. Its only job is to launch, in order, munin-update, munin-graph, munin-html and munin-limits.
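As a rough illustration of this sequencing, here is a minimal Python sketch, not the actual munin-cron script. The function name `run_stages`, the injectable `runner` and the use of bare command names (real installations usually need full paths) are assumptions made for the example.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

# Stage order as described in the text. Bare command names are an assumption.
STAGES = ("munin-update", "munin-graph", "munin-html", "munin-limits")


def _default_runner(command: str) -> int:
    """Run one command and return its exit code."""
    return subprocess.call([command])


def run_stages(
    stages: Sequence[str] = STAGES,
    runner: Callable[[str], int] = _default_runner,
) -> List[Tuple[str, int]]:
    """Run every stage strictly one after the other.

    Returns (stage, exit_code) pairs in the order the stages were run.
    """
    results = []
    for stage in stages:
        # Each stage starts only after the previous one has returned.
        results.append((stage, runner(stage)))
    return results
```

Passing a fake `runner` makes it possible to check the ordering without having Munin installed.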
The various processes
munin-update
This process retrieves the values from the various nodes and updates the rrd files. It should never take more than 5 minutes to run; otherwise there will be gaps in the data, because runs are lockfile-protected and the next update is not launched while the previous one is still running.
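The effect of an overlong run can be worked out with a little arithmetic. The sketch below is a simplified model, not Munin code: it assumes cron fires exactly every 300 seconds and that a tick is simply dropped while the previous run still holds the lock. The name `skipped_ticks` is hypothetical.

```python
from typing import List, Sequence


def skipped_ticks(run_durations: Sequence[int], period: int = 300) -> List[int]:
    """Return the cron tick times (in seconds) at which no update could start.

    run_durations lists how long each successive update run takes.
    Model assumption: a tick that fires while the lock is held is lost.
    """
    skipped = []
    busy_until = 0  # time at which the running update releases the lock
    tick = 0        # time of the next cron tick
    for duration in run_durations:
        # Drop every tick that fires while the previous run is still going.
        while tick < busy_until:
            skipped.append(tick)
            tick += period
        busy_until = tick + duration
        tick += period
    return skipped
```

For example, with runs of 100, 400 and 100 seconds, the second run starts at t=300 and ends at t=700, so the tick at t=600 is lost and the data has a gap there.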
This process stresses the I/O on the master, and its duration depends on the plugin execution time on the various nodes. In 1.4 the retrieval is multi-threaded[1], so a slow node does not slow down the whole process too much.
2.0 proposes asynchronous updates and vectorized updates.
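To illustrate why parallel retrieval limits the impact of one slow node, here is a minimal Python sketch. It is not Munin's implementation: it uses a thread pool for brevity, whereas 1.4 uses worker processes (see note [1]), and `fetch_all` and the `fetch` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def fetch_all(
    nodes: Iterable[str],
    fetch: Callable[[str], dict],
    max_workers: int = 8,
) -> Dict[str, dict]:
    """Query all nodes concurrently.

    The returned dict is filled in completion order, so a slow node ends up
    last instead of delaying the nodes queued after it.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, node): node for node in nodes}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results
```

The total wall-clock time is then bounded by the slowest node (as long as there are enough workers), rather than by the sum of all node response times.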
munin-graph
This process generates all the image files from the rrd files.
It is usually quite CPU-bound, and it also generates a fair amount of I/O. Since 1.4, graph generation can also run in parallel in order to take advantage of multiple CPUs and multiple I/O paths.
A simple optimization is to generate only the graphs that are needed instead of all of them each time. This leads to CGI generation of graphs. 1.2 & 1.4 took a first step in this direction, but it's quite a hack since it's only a very basic script that calls munin-graph with the correct parameters.
A FastCGI port of the wrapper (munin-cgi-graph) removes the overhead of starting the wrapper for each call, but in 1.4 the code is quite experimental and has some serious bugs that would need extensive patching to be fixed.
2.0 completes the integration of CGI graphing by removing the overhead of calling munin-graph, and carries out the extensive patching needed to fix these bugs.
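The idea of generating only the graphs that are actually requested can be sketched as follows. This is an illustrative Python example, not how munin-cgi-graph is written: the staleness test based on file modification times, the function name `graph_on_demand` and the `render` callback are all assumptions.

```python
import os
from typing import Callable


def graph_on_demand(
    rrd_path: str,
    png_path: str,
    render: Callable[[str, str], None],
) -> bool:
    """Render png_path from rrd_path only when a request needs it.

    The image is reused if it exists and is at least as recent as the rrd
    file. Returns True if the image was (re)generated, False if it was reused.
    """
    if os.path.exists(png_path):
        if os.path.getmtime(png_path) >= os.path.getmtime(rrd_path):
            return False  # cached image is still current
    render(rrd_path, png_path)
    return True
```

With this approach, graphs that nobody looks at cost nothing, at the price of some latency on the first request after each update.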
munin-html
This process generates all the html files from the rrd files. This one is quite fast for now.
munin-limits
This process checks the values against the configured limits to see whether a warning or alert has to be sent via mail or to Nagios. This one is also quite fast for now.
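A limit check of this kind boils down to comparing a value against an allowed range. The following Python sketch shows the idea under stated assumptions: the `(minimum, maximum)` tuple representation, the state names and the function names are hypothetical and do not reproduce munin-limits or its configuration syntax.

```python
from typing import Optional, Tuple

# (minimum, maximum); None means "no bound on that side".
Range = Tuple[Optional[float], Optional[float]]


def in_range(value: float, allowed: Range) -> bool:
    """Return True if value lies inside the allowed range (bounds included)."""
    low, high = allowed
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


def classify(value: float, warning: Range, critical: Range) -> str:
    """Return 'critical', 'warning' or 'ok' for one value.

    The critical range is tested first so that the most severe state wins.
    """
    if not in_range(value, critical):
        return "critical"
    if not in_range(value, warning):
        return "warning"
    return "ok"
```

For instance, with a warning range of `(None, 80)` and a critical range of `(None, 95)`, a value of 90 is classified as a warning and a value of 99 as critical.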
Notes
[1] Strictly speaking, it is multi-process rather than multi-threaded.