Personal Workflow Blog


Thursday, 10 June 2010

Waiting for Munin 2.0 - Performance - Architecture

A short introduction/refresher on Munin's architecture on the master

Munin has a very simple architecture on the master : munin-cron is launched via cron every 5 minutes. Its only job is to launch in order munin-update, munin-graph, munin-html & munin-limits.

The various processes

munin-update

This process retrieves the values from the various nodes and updates the rrd files. It should never take more than 5 minutes to run, otherwise there will be gaps in the data, since the next update will not be launched (runs are lockfile-protected).
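To make the "lockfile-protected run" idea concrete, here is a minimal sketch in Java. It is an illustration only and not Munin's code: the class LockedRun, the method runExclusively and the lock path are all hypothetical names. A run that finds the lock already taken is simply skipped, which is exactly what produces a gap.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustration only, not Munin's code: LockedRun and its lock path are hypothetical.
class LockedRun {
    // Runs the job only if no other run currently holds the lock file.
    // Returns false when the run is skipped (the case that leaves a gap).
    static boolean runExclusively(Path lockFile, Runnable job) {
        try {
            Files.createFile(lockFile); // atomic: fails if the file already exists
        } catch (FileAlreadyExistsException e) {
            return false; // the previous run is still in progress: skip this one
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try {
            job.run();
            return true;
        } finally {
            try {
                Files.deleteIfExists(lockFile); // release the lock, even if the job failed
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

Note that in this naive sketch a run that is killed abruptly leaves its lock file behind, so a real implementation needs some way to detect stale locks.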

This process stresses the I/O on the master, and its duration depends on the plugins' execution time on the various nodes. In 1.4 the retrieval is multi-threaded[1], so a slow node doesn't slow down the whole process too much.
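Why parallel retrieval helps can be sketched as follows. This is a thread-based illustration in Java, not Munin's implementation: ParallelPoller, pollAll and the fetch function are hypothetical names. Each node is polled by its own task, so the total time is close to that of the slowest node instead of the sum of all nodes.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Illustration only, not Munin's code: all names here are hypothetical.
class ParallelPoller {
    // Polls every node in a worker pool and returns the values of the
    // nodes that answered before the timeout. fetch must not return null.
    static Map<String, String> pollAll(List<String> nodes,
                                       Function<String, String> fetch,
                                       int workers,
                                       long timeoutMillis) {
        Map<String, String> results = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (String node : nodes) {
            pool.execute(() -> results.put(node, fetch.apply(node)));
        }
        pool.shutdown(); // no new tasks; the submitted ones keep running
        try {
            if (!pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                pool.shutdownNow(); // give up on the nodes that are too slow
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return new TreeMap<>(results); // snapshot of what was collected in time
    }
}
```

With workers set to 1 this degrades to the sequential behaviour, where one slow node delays every node queued behind it.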

2.0 proposes asynchronous updates and vectorized updates.

munin-graph

This process generates all the image files from the rrd files.

It is usually quite CPU-bound, and it also generates a fair amount of I/O. Since 1.4 the graphs can also be generated in parallel, in order to take advantage of multiple CPUs / multiple I/O paths.

A simple optimization is to generate only the needed graphs instead of all of them each time. This leads to CGI generation of graphs. 1.2 & 1.4 took a first step in this direction, but it's quite a hack since it's only a very basic script that calls munin-graph with the correct parameters.

A FastCGI port of the wrapper (munin-cgi-graph) removes the overhead of starting the wrapper for each call, but in 1.4 the code is quite experimental and has some serious bugs that would need extensive patching to be fixed.

2.0 completes the integration of CGI graphing by removing the overhead of calling munin-graph, and does the extensive patching needed to fix these bugs.

munin-html

This process generates all the html files from the rrd files. This one is quite fast for now.

munin-limits

This process checks the limits to see if there is a warning/alert to send via mail or nagios. This one is also quite fast for now.

Notes

[1] Multi-process rather than multi-threaded, actually.

Tuesday, 8 June 2010

Waiting for Munin 2.0 - Introduction

This is the first article of a series about the coming version 2.0 of Munin.

The idea came from the series Waiting for 8.5 about PostgreSQL.

The ironic part is that their 8.5 release has become a 9.0, just like our 1.5 will be a 2.0.

I'll post several small articles about new or enhanced-enough features. They will all be tagged munin20.

Planned summary :

  1. Performance - Architecture context
  2. Performance - FastCGI
  3. Performance - Asynchronous updates
  4. Performance - Misc
  5. Native SSH transport
  6. Custom data retention plans (keep more data)
  7. Dynamic zooming

Thursday, 1 April 2010

Don't use Excerpt... At least with DotClear.

DotClear automatically generates a meta description tag from the blog entry, but it doesn't take the excerpt into account.

It just takes the beginning of the article content. Since the excerpt is also shown at the beginning of the article, I cannot just write the same content twice.

The meta description is quite interesting since it is usually used for the little snippet shown under a search result in the usual search engines, so having the beginning of the post there is very nice.

This defeats the purpose of having excerpts.

I'm now falling back to progressively removing all the excerpts from my posts...

Wednesday, 31 March 2010

API Design: Avoid hidden costs of simple features

Programmers are usually like water : they always use the path of least resistance.

Let's see how to use this fact to predict the usage of an API when you design it.

Initial API

Consider the very simple DB API that consumes a connected ResultSet and presents a disconnected version of it.

class DisconnectedResultSet{
        public DisconnectedResultSet (ResultSet rs);
        public boolean next();
        public Object getObject(int col_idx);
}
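A minimal sketch of how such a class might be implemented, assuming it simply copies every row into memory at construction time. The second constructor, taking a list of rows, is a hypothetical addition that is handy for trying the class without a database.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// A sketch under stated assumptions, not a reference implementation.
class DisconnectedResultSet {
    private final List<Object[]> rows;
    private int cursor = -1; // positioned before the first row

    // Copies every row, so the connection is not needed afterwards.
    public DisconnectedResultSet(ResultSet rs) throws SQLException {
        this(copyRows(rs));
    }

    // Hypothetical helper constructor (not part of the API above).
    DisconnectedResultSet(List<Object[]> rows) {
        this.rows = rows;
    }

    private static List<Object[]> copyRows(ResultSet rs) throws SQLException {
        int columns = rs.getMetaData().getColumnCount();
        List<Object[]> copy = new ArrayList<>();
        while (rs.next()) {
            Object[] row = new Object[columns];
            for (int i = 0; i < columns; i++) {
                row[i] = rs.getObject(i + 1); // JDBC columns are 1-based
            }
            copy.add(row);
        }
        return copy;
    }

    // Moves to the next row; returns false once the rows are exhausted.
    public boolean next() {
        if (cursor < rows.size()) {
            cursor++;
        }
        return cursor < rows.size();
    }

    // Returns the value of the given 1-based column in the current row.
    public Object getObject(int col_idx) {
        return rows.get(cursor)[col_idx - 1];
    }
}
```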

Its usage is quite easy :

while (drs.next()) {
        int col_idx = 1;
        drs.getObject(col_idx++); // Do something w/ 1st col
        drs.getObject(col_idx++); // Do something w/ 2nd col
        //...
}

Just a little evolution...

Since the DisconnectedResultSet is disconnected, we can imagine that it should implement a rewind() method in order to use it several times without running the initial query again. We now have an updated class :

class DisconnectedResultSet{
        public DisconnectedResultSet (ResultSet rs);
        public boolean next();
        public Object getObject(int col_idx);   
        public void rewind(); // Be able to rewind it
}

And its classical usage :

while (drs.next()) {
        // do stuff...
}
// ...
drs.rewind();
while (drs.next()) {
        // do something else with the same data...
}
// ...
drs.rewind();
while (drs.next()) {
        // do something else with the same data...
}
// ...

A new need comes

A new need comes : see if the DisconnectedResultSet is empty or not, in order to avoid sending headers for an empty result.

The usual way is to send them once, while iterating, like this :

boolean is_headers_sent = false;
while (drs.next()) {
        if (! is_headers_sent) { 
                send_headers(); 
                is_headers_sent = true;
        }
        // do something else with the same data...
}

But since there is a nice rewind() method, just waiting to be used, the code might become :

if (drs.next()) {
        send_headers(); 
}
drs.rewind();
while (drs.next()) {
        // do something else with the same data...
}

Now, this code isn't generic anymore : it cannot accommodate a connected ResultSet that can only be read forward.
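To see what is lost, note that the is_headers_sent variant only ever calls next(), so it can be written once against any forward-only source. Here is a small sketch, where ForwardCursor, HeaderOnceReport and the in-memory adapter are hypothetical names introduced only for the illustration :

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical abstraction: anything that can only move forward.
interface ForwardCursor {
    boolean next();
    Object getObject(int col_idx);
}

class HeaderOnceReport {
    // Emits the header lazily, on the first row.
    // An empty cursor therefore yields an empty report, without any rewind().
    static String render(ForwardCursor cursor) {
        StringBuilder out = new StringBuilder();
        boolean is_headers_sent = false;
        while (cursor.next()) {
            if (!is_headers_sent) {
                out.append("HEADER\n");
                is_headers_sent = true;
            }
            out.append(cursor.getObject(1)).append('\n');
        }
        return out.toString();
    }

    // Demo adapter: a single-column cursor over an in-memory list.
    static ForwardCursor over(List<?> values) {
        Iterator<?> it = values.iterator();
        return new ForwardCursor() {
            private Object current;

            public boolean next() {
                if (!it.hasNext()) {
                    return false;
                }
                current = it.next();
                return true;
            }

            public Object getObject(int col_idx) {
                return current; // only one column in this demo
            }
        };
    }
}
```

Both a connected and a disconnected result set could sit behind such an interface, whereas the rewind() variant only works with the disconnected one.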

So, as John Carmack said :

The cost of adding a feature isn't just the time it takes to code it. The cost also includes the addition of an obstacle to future expansion.

That's really true when you design APIs since their purpose is to last long and to be extended.

So, think twice when you propose an extension "just in case".

The little evolution, revisited...

To solve this case, don't propose a rewind() method, but offer a duplicate() one. It offers the same functionality, just in a new object.

The usage will be almost the same, as shown below, but since it feels more performance-sensitive, it won't be used as lightly : the boolean is_headers_sent pattern now has a better chance of being used.

while (drs.next()) {
        // do stuff...
}
// ...
drs = drs.duplicate();
while (drs.next()) {
        // do something else with the same data...
}
// ...
drs = drs.duplicate();
while (drs.next()) {
        // do something else with the same data...
}
// ...
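One way duplicate() could be implemented is sketched here, assuming the rows are stored in a list that is never modified after construction. The constructor taking a list of rows is a hypothetical stand-in for the ResultSet-consuming one, to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// A sketch under stated assumptions, not a reference implementation.
class DisconnectedResultSet {
    private final List<Object[]> rows; // never modified after construction
    private int cursor = -1;           // positioned before the first row

    // Hypothetical constructor standing in for the ResultSet-consuming one.
    DisconnectedResultSet(List<Object[]> rows) {
        this.rows = Collections.unmodifiableList(new ArrayList<>(rows));
    }

    // Used by duplicate(): shares the row storage, starts a fresh cursor.
    private DisconnectedResultSet(DisconnectedResultSet other) {
        this.rows = other.rows;
    }

    public boolean next() {
        if (cursor < rows.size()) {
            cursor++;
        }
        return cursor < rows.size();
    }

    public Object getObject(int col_idx) {
        return rows.get(cursor)[col_idx - 1]; // 1-based, like JDBC
    }

    // Returns a new object positioned before the first row.
    // The receiver's own cursor is left untouched.
    public DisconnectedResultSet duplicate() {
        return new DisconnectedResultSet(this);
    }
}
```

In this sketch the duplicate is cheap because the row storage is shared, but the caller cannot know that from the API, which is the point : the name alone suggests a cost.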

It's another example that immutable objects are the way to go, but for a different reason this time.

Note: Just finished my March 2010 article, even on time... I'm still trying to keep a blogging rate of at least one article per month. So far so good for 2010, still 9 months to go !

Saturday, 20 February 2010

Free Exception lunch : Use unchecked exceptions, but still announce which ones you might throw.

In a previous article I chose my side : Unchecked Exceptions are much simpler to use.

But, on the other side of this great division, there is a very valid point : You usually declare checked exceptions. Sure it's possible to only declare to throw Exception, but that would defeat the whole purpose of using checked exceptions.

The nicest thing is that you can also have a custom exception hierarchy, but based on RuntimeException instead of a plain Exception. This way it's like in C++ : anything might be thrown, and you don't need to handle it.

Declaring them, on the other hand, is very interesting because you are documenting your interface almost for free.

So, use unchecked exceptions to free yourself of the checked catch-slavery, but still declare the custom ones you might throw.
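A short sketch of what this looks like in practice. The exception hierarchy and the RecordStore class are hypothetical names, made up for the example. Java accepts unchecked exceptions in a throws clause, and callers are still free not to catch them.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical custom hierarchy, rooted at RuntimeException (unchecked).
class StorageException extends RuntimeException {
    StorageException(String message) {
        super(message);
    }
}

class RecordNotFoundException extends StorageException {
    RecordNotFoundException(String message) {
        super(message);
    }
}

class RecordStore {
    private final Map<String, String> records = new HashMap<>();

    void put(String key, String value) {
        records.put(key, value);
    }

    // The throws clause is pure documentation here: the compiler
    // does not force any caller to catch or redeclare it.
    String get(String key) throws RecordNotFoundException {
        String value = records.get(key);
        if (value == null) {
            throw new RecordNotFoundException("no record for key: " + key);
        }
        return value;
    }
}
```

A caller can write store.get("x") without any try/catch, and the signature still tells the reader which failure to expect.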
