Personal Workflow Blog

To content | To menu | To search

sysadmin

Entries feed

Monday, 21 June 2010

CGI on steroids with FastCGI, but on a CGI-only server - The FastCGI wrapper

FastCGI is really CGI on steroids

FastCGI is very common way to increase performance of a CGI installation. It is based on the fact that usually the startup of CGI scripts is slow, whereas the response is quite fast.

So if you have a persistent process, you only have to take care of the startup once, and you then experience a real speedup.

FastCGI vs mod_perl (or mod_python, ...)

Once a big fan of mod_perl, I'm converted to FastCGI since. mod_perl was for a long time the answer for speeding up Perl CGI scripts. It has a very good track record of stability and has real hooks deep in the Apache processing requests.

FastCGI focuses on a different feature set that is more actual than mod_perl[1] :

  • It is much simpler to install and configure, especially when having multiple applications.
  • Able to connect to a distant server (running as a different UID, chrooted or even on a remote host)
  • Able to mix scripting languages without any need to compile some other apache modules.
  • Able to be used with several webservers, even closed-source ones : FastCGI is a protocol, not an API.

But steroids do have some side effects

CGI issues

One downside is that your CGI script should be adapted to FastCGI and the fact that the script doesn't end with the end of the request.

In the real world that's quite easy. Every language that is commonly used for CGI offers CGI-wrapper libraries that works in a FastCGI context as well as a plain CGI one.

Webserver issues

Another issue can also come from the webserver. Since CGI is dead simple to implement even the micro-webserver thttpd implements it.

FastCGI on the other hand is a little more difficult to implement, since the webserver needs to create a container that monitors and calls the FastCGI-enabled script.

A standalone FastCGI container

Fortunately, the FastCGI team provided us with a ready-to-use container and a very simple client that acts a plain CGI script, but proxies it to a full-blown container.

Since the plain CGI part is a very small native executable its overhead is negligible compared to the reply time, even without comparison with the startup time of the whole script.

Its installation is also quite straightforward. I just installed the libfcgi package on Debian : it provides /usr/bin/cgi-fcgi.

I created a simple CGI wrapper for my previous munin benchmarking needs :

#! /bin/sh

exec /usr/bin/cgi-fcgi -connect /tmp/munin-cgi.sock \
     /usr/lib/cgi-bin/munin-cgi-graph

Notes

[1] who really need deep apache hooks ?

Saturday, 14 November 2009

Sed is much slower than Perl, or not...

I wanted to do some text replacement with a huge file (think ~18GiB), filled with huge lines (think ~2MiB per ligne)[1].

I naïvely piped it through sed and I was quite shocked that it was CPU bound, and not I/O bound. The average rate was about 5 MiB/s (measured with pv, and the CPU was at almost 100%.The text file was gzipped on the filesystem, but with a 1/100 ratio, so the gzip process just took less than 2% CPU. I replaced then the sed -e with the Perl one-liner perl -lnpe, and .... tadaa, it was flying at a rate of 50MiB/s !

While I'm a big fan of Perl, and know its effectiveness to handle text streams, I'm was still astonished : being 10x faster than sed was something.

But in the good old saying Too good to be true means suspect, I remembered something about the character encoding of the regular expression. Since the system is entirely configured in UTF8, I suspected the infamous UTF8 overhead over plain ASCII.

I was right : a little LANG=C in front of the sed command line restored the rate to 50MiB/s.

So, beware of the performance impact of UTF8 strings, and try to avoid it if you can.

Notes

[1] For the record, it was a MySQL dump

Friday, 11 September 2009

Sychronize clock between hosts with SSH

NTP is very handy for server clock synchronisation, but it can be cumbersome to deploy.

Sometimes you just need to do a one-shot clock synchronisation, so you use the standard date command. But there isn't a flag to easily copy a setting to another.

From a remote host

Quite easy :

# date `ssh remoteuser@remotehost date +%m%d%H%M%Y.%S`

To a remote host

It's also very easy[1] :

# ssh root@remotehost date `date +%m%d%H%M%Y.%S`

Notes

[1] Yes, I do know that logging remotely as root is a security pitfall...

Monday, 10 August 2009

A Simple Dns Server for a SOHO Network

I'm in search of a very simple DNS Server for a small network. It should be :

  • recursive & caching (can be used as a proxy)
  • very simple administration (parsing /etc/hosts would be perfect, raw DNS zones like BIND would be a little bit overkill)
  • quite lightweight (aka no dependency on an SQL engine like MySQL, such as MyDNS)
  • Seamless integration to Windows lookups (nmblookup) via proxying functions (DNS to/from NMB)