Personal Workflow Blog

Friday, 11 December 2009

Avoid the Preprocessor: Use "Compile-Time Polymorphism" for Cross-platform Development

When writing portable cross-platform code, don't litter your code with preprocessor macros; use compile-time polymorphism instead.

A flexible build system enables you to use advanced OOP-like compile-time polymorphism. That way you can hide all the specifics of the different platforms behind an interface firewall. This is how most cross-platform toolkits and frameworks (such as Qt, GTK or wxWidgets) are designed.

Thursday, 26 November 2009

Native SSH transport for Munin

I blogged about the Munin monitoring system a while ago.

The Munin team did a remarkable job cleaning up the 1.2 code for the 1.4 release, which enabled me to add a native SSH transport for Munin and get rid of all the SSH tunnels.

Saturday, 14 November 2009

Sed is much slower than Perl, or not...

I wanted to do some text replacement in a huge file (think ~18 GiB), filled with huge lines (think ~2 MiB per line)[1].

I naïvely piped it through sed and was quite shocked to find that it was CPU bound, not I/O bound. The average rate was about 5 MiB/s (measured with pv), and the CPU was at almost 100%. The text file was gzipped on the filesystem, but with a 1/100 ratio, so the gzip process took less than 2% CPU. I then replaced sed -e with the Perl one-liner perl -lnpe, and... tadaa, it was flying at a rate of 50 MiB/s!

While I'm a big fan of Perl and know how effective it is at handling text streams, I was still astonished: being 10x faster than sed was something.

But, as the good old saying goes, too good to be true means suspect, and I remembered something about character encoding affecting regular expressions. Since the system is entirely configured for UTF-8, I suspected the infamous overhead of UTF-8 over plain ASCII.

I was right: a little LANG=C in front of the sed command line brought its rate up to 50 MiB/s, on par with Perl.

So, beware of the performance impact of UTF-8 strings, and avoid them if you can.
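
To check on a smaller scale that switching locale changes only the speed and not the result, here is a minimal sketch. The sample file and the foo/baz substitution are made up for illustration and are not the original dump or command. It uses LC_ALL=C rather than LANG=C because LC_ALL overrides the other locale variables, whereas LANG=C has no effect if LC_ALL or LC_CTYPE is already set.

```shell
# Build a small sample file of long ASCII lines (a stand-in for the real dump).
sample=$(mktemp)
awk 'BEGIN {
    for (i = 0; i < 200; i++) {
        line = ""
        for (j = 0; j < 500; j++) line = line "foo bar "
        print line
    }
}' > "$sample"

# Same substitution, once in the current locale and once in the C locale.
# Prefix each pipeline with `time` to compare their speed on your system.
out_locale=$(sed -e 's/foo/baz/g' "$sample" | cksum)
out_c=$(LC_ALL=C sed -e 's/foo/baz/g' "$sample" | cksum)

# For pure ASCII input and an ASCII pattern, both checksums should match.
echo "current locale: $out_locale"
echo "C locale:       $out_c"

rm -f "$sample"
```

In the C locale sed treats the input as plain bytes, so this shortcut is only safe when the pattern does not need to match multi-byte characters as single units.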

Notes

[1] For the record, it was a MySQL dump.

Friday, 11 September 2009

Quickly replicate the clock between remote hosts with SSH

NTP is very handy for server clock synchronisation, but it can be cumbersome to deploy.

Sometimes you just need a one-shot clock synchronisation, so you use the standard date command. But it has no flag to easily copy the time from one host to another.

From a remote host

Quite easy :

# date `ssh remoteuser@remotehost date +%m%d%H%M%Y.%S`

To a remote host

It's also very easy[1] :

# ssh root@remotehost date `date +%m%d%H%M%Y.%S`
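
Both commands pass the clock around as a MMDDhhmmCCYY.ss string in local time, so they are only correct when the two hosts share a time zone. A hedged variant, sketched below as a dry run, adds -u on both sides so the value is exchanged in UTC. The -u flag is my addition, not part of the commands above, and root@remotehost is the same placeholder.

```shell
# Timestamp in the MMDDhhmmCCYY.ss form that `date` accepts for setting the clock.
# -u produces UTC, so differing time zones on the two hosts do not skew the result.
stamp=$(date -u +%m%d%H%M%Y.%S)

# Dry run: build and print the command rather than executing it.
# The remote `date -u` interprets the timestamp as UTC as well.
cmd="ssh root@remotehost date -u $stamp"
echo "$cmd"
```

Once the printed command looks right, run it by hand. Keep in mind that the copy is only as accurate as the SSH round trip allows, typically within a second or so.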

Notes

[1] Yes, I do know that logging in remotely as root is a security pitfall...

Tuesday, 8 September 2009

Databases: Efficient Case-insensitive searches with Function-based Indexing

Doing a case-insensitive search is a very common task, but it is quite hard to optimize correctly: since it's done via UPPER(MY_COLUMN) = UPPER('MY_DATA'), it doesn't use any index that may exist on MY_COLUMN.

Different RDBMSs call for different approaches.
