Personal Workflow Blog


Saturday, 20 February 2010

Free Exception lunch : Use unchecked exceptions, but still announce which ones you might throw.

In a previous article I chose my side: unchecked exceptions are much simpler to use.

But the other side of this great divide has a very valid point: checked exceptions usually get declared. Sure, it's possible to just declare that you throw Exception, but that would defeat the whole purpose of using checked exceptions.

The nicest thing is that you can still have a custom exception hierarchy, but based on RuntimeException instead of plain Exception. This way it's like in C++: anything might be thrown, and you are not forced to handle it.

Declaring them, on the other hand, is well worth it, because you document your interface almost for free. Java lets a throws clause list unchecked exceptions; the compiler simply doesn't force callers to catch them.

So, use unchecked exceptions to free yourself from checked catch-slavery, but still declare the custom ones you might throw.
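Here is a minimal sketch of that advice. All the names in it (StorageException, RecordNotFoundException, RecordStore, Caller) are hypothetical and made up for illustration; the only point is that the hierarchy hangs off RuntimeException while the method still announces what it may throw.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical root of a custom hierarchy: unchecked, because it
// extends RuntimeException instead of plain Exception.
class StorageException extends RuntimeException {
    StorageException(String message) {
        super(message);
    }
}

// Hypothetical, more specific failure inside the same hierarchy.
class RecordNotFoundException extends StorageException {
    RecordNotFoundException(String key) {
        super("No record for key: " + key);
    }
}

// Hypothetical class whose interface announces its unchecked exception.
class RecordStore {
    private final Map<String, String> records = new HashMap<>();

    void put(String key, String value) {
        records.put(key, value);
    }

    /**
     * The throws clause is pure documentation here: the compiler
     * accepts it but does not force callers to catch anything.
     *
     * @throws RecordNotFoundException if no record exists for the key
     */
    String get(String key) throws RecordNotFoundException {
        String value = records.get(key);
        if (value == null) {
            throw new RecordNotFoundException(key);
        }
        return value;
    }
}

class Caller {
    // Compiles with neither try/catch nor a throws clause of its own.
    static String lookup(RecordStore store, String key) {
        return store.get(key);
    }

    // Catching stays possible wherever it actually makes sense.
    static String lookupOrDefault(RecordStore store, String key, String fallback) {
        try {
            return store.get(key);
        } catch (RecordNotFoundException e) {
            return fallback;
        }
    }
}
```

A caller that doesn't care just calls Caller.lookup(store, key) and lets the exception travel up; a caller that does care reads the throws clause and knows exactly what to catch.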

Immutability of a URL

In the pure spirit of "Data is King", I think that a URL should never change. Even the W3C agrees, with its "Cool URIs don't change" article.

But we all know that in IT, "never" only means "not in the foreseeable future". So URLs do change, at least after a while, and usually for technical reasons[1].

You can update your own website to use the new URLs, but inbound links from other sites cannot easily be updated. To handle this need, the HTTP protocol specifies the 301 (Moved Permanently) response code.

The solution is for the site to remember all the URLs it has ever generated and redirect accordingly. This way you'll never lose a potential reader to the infamous 404 (this page does not exist).
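The bookkeeping this implies can be sketched roughly as follows. This is not how any particular blog engine does it: RedirectRegistry and its methods are hypothetical names, the storage is just in-memory collections, and a real site would persist the table and emit the actual HTTP response.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical registry: remembers every path the site ever served.
class RedirectRegistry {
    private final Set<String> livePaths = new HashSet<>();
    private final Map<String, String> movedTo = new HashMap<>();

    // Register a path that currently serves content.
    void publish(String path) {
        livePaths.add(path);
        movedTo.remove(path);
    }

    // Record that content moved, keeping the old path as a redirect.
    void move(String oldPath, String newPath) {
        livePaths.remove(oldPath);
        livePaths.add(newPath);
        movedTo.remove(newPath);
        movedTo.put(oldPath, newPath);
        // Re-point older redirects so that nobody follows a chain.
        for (Map.Entry<String, String> entry : movedTo.entrySet()) {
            if (entry.getValue().equals(oldPath)) {
                entry.setValue(newPath);
            }
        }
    }

    // 200 for a live page, 301 for a remembered one, 404 otherwise.
    int statusFor(String path) {
        if (livePaths.contains(path)) {
            return 200;
        }
        if (movedTo.containsKey(path)) {
            return 301;
        }
        return 404;
    }

    // Value for the Location header of a 301, or null if none applies.
    String locationFor(String path) {
        return movedTo.get(path);
    }
}
```

After publish("/post/12") and then a move to some new path, a request for "/post/12" gets a 301 with the new path as its Location, and only paths the site never generated end up as 404.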

Some sites even try to guess the intended page on a custom 404 page. That's another reason to have user-friendly URLs: you can point your reader to appropriate pages when you can't find his initial destination.

Sadly, this redirect behavior isn't supported by my blog engine (dotclear)... So much for eating my own dog food, but I'm looking forward to doing it on my current blogging platform.

Notes

[1] upgrade to another blog engine...

Friday, 11 December 2009

Avoid the Preprocessor : Use Compile-Time Polymorphism for Cross-platform Development

When writing portable cross-platform code, don't litter your code with preprocessor macros; use compile-time polymorphism instead.

A flexible build system will enable you to use advanced OOP-like compile-time polymorphism. That way you can hide all the specifics of the different platforms behind an interface firewall. This is the usual way most cross-platform toolkits and frameworks (such as Qt, GTK or wxWidgets) are designed.


Thursday, 26 November 2009

Native SSH transport for Munin

I blogged about the munin monitoring system a while ago.

The Munin team did quite a remarkable job of cleaning up the 1.2 code for the 1.4 release. That enabled me to add a native SSH transport for Munin, and to get rid of all my SSH tunnels.


Saturday, 14 November 2009

Sed is much slower than Perl, or not...

I wanted to do some text replacement in a huge file (think ~18 GiB), filled with huge lines (think ~2 MiB per line)[1].

I naïvely piped it through sed and was quite shocked to find that it was CPU bound, not I/O bound. The average rate was about 5 MiB/s (measured with pv), and the CPU was at almost 100%. The text file was gzipped on the filesystem, but with a 1/100 ratio, so the gzip process took less than 2% CPU. I then replaced the sed -e with the Perl one-liner perl -lnpe, and... tadaa, it was flying at a rate of 50 MiB/s!

While I'm a big fan of Perl, and know how effective it is at handling text streams, I was still astonished: being 10x faster than sed was something.

But, following the good old saying "too good to be true means suspect", I remembered something about the character encoding used by regular expressions. Since the system is entirely configured in UTF-8, I suspected the infamous UTF-8 overhead over plain ASCII.

I was right: a little LANG=C in front of the sed command line brought its rate up to 50 MiB/s as well.

So, beware of the performance impact of UTF-8 string handling, and avoid it when you can.

Notes

[1] For the record, it was a MySQL dump
