Django PostgreSQL ORM Overhead

This is another post I’ve been sitting on for the better part of a year. I’m putting it out there in case the raw numbers are useful to anybody.

So I’ve been dealing with some database performance issues with a fairly large Django application and have been trying to track down exactly where the bottlenecks are. Interestingly, neither the application servers nor the database server displays high cpu utilization, so something is locking outside of pure CPU.

All tests were run on an AWS EC2 image, running CentOS 5.3, Python 2.6 and the PGDG PostgreSQL83 RPM packages. The database server is an identical AWS image running PGDG postgresql83.

The quick takeaways from all this are the following:

  1. PostgreSQL singleton selects are pretty fast
  2. The libpq library imposes more overhead than the database server does
  3. For the trivial case of an application that uses PostgreSQL as a key-value store containing a working set less than 1/2 the system RAM, you’ll need many application servers to saturate it. No, I don’t know how many, but it doesn’t really matter because for a non-trivial application the PostgreSQL client library overhead will become negligible.
  4. The Django ORM and psycopg2 drivers add approximately 50% overhead vs. pure c+libpq program.
  5. There are some interesting bottlenecks out there that will prevent CPU saturation of a trivial workload. No, I do not know what they are (yet).
  6. PgBouncer adds a bit of overhead to maximum throughput, around 5% in this case.
  7. Local connections are faster than network connections. Add this to the fact there seem to be some wierd bottlenecks and you might find that your app runs faster on 1 server than 2. Interesting.

I put together some very simple tests to try to figure out what’s going on.

First, the “Django” program:

#!/usr/bin/python26
from pprint import pprint
import sys
import os

from django.core.management import execute_manager

try:
import local_settings
except ImportError:
import sys
sys.stderr.write("Unable to find settings file")
sys.exit(1)

from django.contrib.sites.models import Site

def stest():
s = Site.objects.get(domain = 'example.com')

if __name__ == '__main__':
from timeit import Timer
s = Timer(stmt=stest)
print 'different query 1'
print s.timeit(number=1)
print 'different query 2'
print s.timeit(number=1)
print 'different query 10 more'
print s.timeit(number=1000000)

The C program:

/*
* testlibpq.c
*
* Test the C version of libpq, the PostgreSQL frontend library.
*/
#include
#include
#include "libpq-fe.h"

static void
exit_nicely(PGconn *conn)
{
PQfinish(conn);
exit(1);
}

int
main(int argc, char **argv)
{
const char *conninfo;
PGconn *conn;
PGresult *res;
int nFields;
int count,
i,
j;

/*
* If the user supplies a parameter on the command line, use it as the
* conninfo string; otherwise default to setting dbname=postgres and using
* environment variables or defaults for all other connection parameters.
*/
if (argc > 1)
conninfo = argv[1];
else
conninfo = "dbname = ngdm_wpf_content";

/* Make a connection to the database */
conn = PQconnectdb(conninfo);

/* Check to see that the backend connection was successfully made */
if (PQstatus(conn) != CONNECTION_OK)
{
fprintf(stderr, "Connection to database failed: %s",
PQerrorMessage(conn));
exit_nicely(conn);
}

/*
* Our test case here involves using a cursor, for which we must be inside
* a transaction block. We could do the whole thing with a single
* PQexec() of "select * from pg_database", but that's too trivial to make
* a good example.
*/

for ( count=0; count < 100; count++) {
/* Start a transaction block */
res = PQexec(conn, "SELECT * FROM django_site WHERE django_site.domain = 'example.com' ORDER BY django_site.domain;");
if (PQresultStatus(res) != PGRES_TUPLES_OK)
{
fprintf(stderr, "SELECT command failed (%d): %s", PQresultStatus(res), PQerrorMessage(conn));
PQclear(res);
exit_nicely(conn);
}

PQclear(res);

}

/* close the connection to the database and cleanup */
PQfinish(conn);

return 0;
}


gcc -o bench -lpq bench.c
time ./bench "dbname=content hostaddr=127.0.0.1 user=user"

Running 8 concurrent python programs, via pgbouncer, client:

top - 22:05:16 up  3:50,  5 users,  load average: 3.94, 3.42, 2.45
Tasks: 135 total,   4 running, 131 sleeping,   0 stopped,   0 zombie
Cpu(s): 12.3%us,  1.0%sy,  0.0%ni, 84.2%id,  0.0%wa,  0.0%hi,  0.8%si,  1.7%st
Mem:   7347752k total,  5608152k used,  1739600k free,    70732k buffers
Swap:        0k total,        0k used,        0k free,   452504k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3124 root      15   0  762m 623m 4976 S   40  8.7   5:07.23 bench.py
 3125 root      15   0  722m 583m 4976 S   14  8.1   4:15.34 bench.py
 3122 root      15   0  701m 562m 4976 S   14  7.8   3:29.58 bench.py
 2523 postgres  15   0 17092 1240  792 S   10  0.0   3:31.90 pgbouncer
 3126 root      15   0  699m 560m 4976 R    8  7.8   3:10.76 bench.py
 3121 root      15   0  822m 683m 4976 S    8  9.5   6:03.10 bench.py
 3123 root      15   0  700m 561m 4976 R    7  7.8   2:47.05 bench.py
 3128 root      15   0  700m 561m 4976 S    7  7.8   3:23.07 bench.py
 3127 root      15   0  714m 575m 4976 R    6  8.0   4:52.92 bench.py

And the server:

top - 21:39:50 up  6:36,  4 users,  load average: 7.44, 3.44, 1.49
Tasks: 142 total,  11 running, 131 sleeping,   0 stopped,   0 zombie
top - 22:06:04 up  7:02,  4 users,  load average: 3.15, 3.19, 4.10
Tasks: 127 total,   4 running, 123 sleeping,   0 stopped,   0 zombie
Cpu(s):  2.2%us,  1.3%sy,  0.0%ni, 95.5%id,  0.4%wa,  0.0%hi,  0.3%si,  0.3%st
Mem:   7347752k total,  2313112k used,  5034640k free,     4124k buffers
Swap:        0k total,        0k used,        0k free,  1996524k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 6910 postgres  15   0 2160m 5748 3888 S    6  0.1   1:15.31 postmaster
 6902 postgres  15   0 2160m 5740 3880 S    5  0.1   1:17.22 postmaster
 6913 postgres  15   0 2160m 5748 3888 R    4  0.1   1:08.97 postmaster
 6911 postgres  15   0 2160m 5748 3888 S    3  0.1   1:08.78 postmaster
 6912 postgres  15   0 2160m 5756 3896 S    3  0.1   1:07.76 postmaster
 6908 postgres  15   0 2160m 5740 3880 S    3  0.1   1:05.38 postmaster
 6909 postgres  15   0 2160m 5748 3888 S    3  0.1   1:05.88 postmaster
 6917 postgres  15   0 2160m 5748 3888 R    3  0.1   1:06.88 postmaster

Running the the Django benchmark with 10,000 iterations via pgbouncer demonstrates a 300ms startup time, and a sustained trivial query rate of almost exactly 1000/sec:

[root@domU-12-31-38-04-59-91 ~]# time ./bench.py
initial
0.306927919388
second
0.00195598602295
10 more
0.0144731998444
different query 1
0.00151586532593
different query 2
0.00109100341797
different query 10 more
9.89285588264

real    0m10.650s
user    0m4.217s
sys     0m0.229s

Exactly the same test, without pgbouncer, results in approximately 5% improvement in throughput:

[root@domU-12-31-38-04-59-91 ~]# time ./bench.py
initial
0.304031133652
second
0.00196003913879
10 more
0.0196559429169
different query 1
0.00163292884827
different query 2
0.000943899154663
different query 10 more
9.48121595383

real    0m10.238s
user    0m4.188s
sys     0m0.219s

And finally, running it directly on the database server knocks off almost 30%:

[root@domU-12-31-38-04-58-E1 data]# time ./bench.py
initial
0.651133060455
second
0.00181007385254
10 more
0.0109980106354
different query 1
0.00134086608887
different query 2
0.000818014144897
different query 10 more
6.95133709908

real    0m9.535s
user    0m5.814s
sys     0m0.275s

And now,running 10,000 queries via the c program:

[root@domU-12-31-38-04-59-91 ~]# time ./bench "dbname=content hostaddr=127.0.0.1 user=user"

real    0m4.648s
user    0m0.003s
sys     0m0.031s

And via pgbouncer:

[root@domU-12-31-38-04-59-91 ~]# time ./bench "dbname=content hostaddr=10.220.91.15 user=user"

real    0m3.799s
user    0m0.004s
sys     0m0.005s

And directly on the server:

[root@domU-12-31-38-04-58-E1 tmp]#  time ./bench "dbname=ngdm_wpf_content hostaddr=127.0.0.1 user=ngdm_wpf"

real    0m1.758s
user    0m0.048s
sys     0m0.062s

Now, let’s try to melt things down:
8 clients via pgbouncer:

8 clients directly

8 C clients with pgbouncer, client load:
top – 22:23:49 up 4:08, 5 users, load average: 1.36, 0.50, 0.94
Tasks: 148 total, 3 running, 144 sleeping, 1 stopped, 0 zombie
Cpu(s): 1.1%us, 4.4%sy, 0.0%ni, 91.5%id, 0.0%wa, 0.0%hi, 2.4%si, 0.6%st
Mem: 7347752k total, 822812k used, 6524940k free, 72052k buffers
Swap: 0k total, 0k used, 0k free, 452656k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2523 postgres 15 0 17092 1240 792 R 50 0.0 4:10.58 pgbouncer
3771 root 15 0 48840 1884 1464 S 6 0.0 0:00.70 bench
3765 root 15 0 48836 1888 1464 S 3 0.0 0:00.72 bench
3761 root 15 0 48836 1888 1464 R 1 0.0 0:01.86 bench
3773 root 15 0 48836 1884 1464 S 1 0.0 0:00.21 bench
3780 root 15 0 48836 1884 1464 S 1 0.0 0:00.11 bench
3759 root 15 0 48836 1884 1464 S 0 0.0 0:00.25 bench
3763 root 15 0 48840 1888 1464 S 0 0.0 0:00.19 bench

8 C clients, server load:
top – 22:23:35 up 7:19, 4 users, load average: 2.52, 0.79, 1.58
Tasks: 129 total, 9 running, 120 sleeping, 0 stopped, 0 zombie
Cpu(s): 18.9%us, 7.0%sy, 0.0%ni, 71.8%id, 0.0%wa, 0.0%hi, 2.1%si, 0.3%st
Mem: 7347752k total, 2466176k used, 4881576k free, 6736k buffers
Swap: 0k total, 0k used, 0k free, 2143620k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7148 postgres 15 0 2160m 4920 3240 R 23 0.1 0:10.94 postmaster
7152 postgres 15 0 2160m 4916 3236 R 21 0.1 0:08.88 postmaster
7150 postgres 15 0 2160m 4916 3236 S 20 0.1 0:08.65 postmaster
7147 postgres 15 0 2160m 4916 3236 S 20 0.1 0:11.63 postmaster
7155 postgres 15 0 2160m 4916 3236 R 20 0.1 0:03.98 postmaster
7151 postgres 15 0 2160m 4916 3236 R 19 0.1 0:08.14 postmaster
7154 postgres 15 0 2160m 4920 3240 S 19 0.1 0:04.05 postmaster
7145 postgres 15 0 2160m 4916 3236 R 18 0.1 0:11.57 postmaster
7144 postgres 15 0 2160m 4912 3232 R 17 0.1 0:12.08 postmaster
6435 postgres 15 0 60920 1012 320 S 15 0.0 3:32.23 postmaster

And 8 C clients, eliminating pgbouncer, client load:
top – 22:25:55 up 4:10, 5 users, load average: 0.76, 0.64, 0.94
Tasks: 136 total, 2 running, 133 sleeping, 1 stopped, 0 zombie
Cpu(s): 0.6%us, 1.3%sy, 0.0%ni, 97.6%id, 0.0%wa, 0.0%hi, 0.2%si, 0.3%st
Mem: 7347752k total, 821436k used, 6526316k free, 72184k buffers
Swap: 0k total, 0k used, 0k free, 452656k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3791 root 15 0 48840 1888 1464 R 3 0.0 0:00.88 bench
3784 root 15 0 48836 1884 1464 S 3 0.0 0:00.83 bench
3788 root 15 0 48836 1884 1464 S 3 0.0 0:00.89 bench
3790 root 15 0 48836 1884 1464 S 3 0.0 0:00.88 bench
3785 root 15 0 48836 1884 1464 S 2 0.0 0:00.82 bench
3789 root 15 0 48836 1884 1464 S 2 0.0 0:00.88 bench
3787 root 15 0 48840 1888 1464 S 2 0.0 0:00.97 bench
3786 root 15 0 48840 1888 1464 S 1 0.0 0:00.68 bench

Server load:
Tasks: 132 total, 5 running, 127 sleeping, 0 stopped, 0 zombie
Cpu(s): 16.5%us, 7.0%sy, 0.0%ni, 74.2%id, 0.0%wa, 0.0%hi, 1.5%si, 0.7%st
Mem: 7347752k total, 2726576k used, 4621176k free, 7340k buffers
Swap: 0k total, 0k used, 0k free, 2392360k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7165 postgres 15 0 2160m 4916 3236 S 27 0.1 0:11.50 postmaster
7166 postgres 15 0 2160m 4916 3236 R 26 0.1 0:11.05 postmaster
7171 postgres 15 0 2160m 4916 3236 S 25 0.1 0:10.56 postmaster
7164 postgres 15 0 2160m 4916 3236 S 24 0.1 0:11.62 postmaster
7169 postgres 15 0 2160m 4920 3240 R 24 0.1 0:10.86 postmaster
7170 postgres 15 0 2160m 4916 3236 S 23 0.1 0:10.66 postmaster
7168 postgres 15 0 2160m 4916 3236 R 21 0.1 0:10.69 postmaster
7167 postgres 15 0 2160m 4916 3236 R 18 0.1 0:10.72 postmaster

16 c clients, no pgbouncer:
top – 22:28:21 up 4:13, 5 users, load average: 1.06, 0.80, 0.95
Tasks: 144 total, 2 running, 141 sleeping, 1 stopped, 0 zombie
Cpu(s): 1.1%us, 3.0%sy, 0.0%ni, 94.8%id, 0.0%wa, 0.0%hi, 0.9%si, 0.3%st
Mem: 7347752k total, 824680k used, 6523072k free, 72344k buffers
Swap: 0k total, 0k used, 0k free, 452660k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3785 root 15 0 48836 1884 1464 S 5 0.0 0:04.30 bench
3788 root 15 0 48836 1884 1464 S 4 0.0 0:04.58 bench
3798 root 15 0 48840 1888 1464 S 4 0.0 0:01.40 bench
3799 root 15 0 48836 1884 1464 S 4 0.0 0:01.38 bench
3784 root 15 0 48836 1884 1464 S 3 0.0 0:04.16 bench
3794 root 15 0 48840 1888 1464 S 3 0.0 0:01.35 bench
3787 root 15 0 48840 1888 1464 R 3 0.0 0:04.73 bench
3790 root 15 0 48836 1884 1464 S 3 0.0 0:04.25 bench
3789 root 15 0 48836 1884 1464 S 3 0.0 0:04.27 bench
3795 root 15 0 48840 1888 1464 S 3 0.0 0:01.07 bench
3786 root 15 0 48840 1888 1464 S 2 0.0 0:03.88 bench
3800 root 15 0 48836 1884 1464 S 2 0.0 0:01.19 bench
3793 root 15 0 48840 1888 1464 S 2 0.0 0:01.40 bench
3796 root 15 0 48840 1888 1464 S 2 0.0 0:01.13 bench
3791 root 15 0 48840 1888 1464 S 1 0.0 0:04.29 bench

16 c clients, no pgbouncer, server load:
top – 22:28:28 up 7:24, 4 users, load average: 6.68, 3.73, 2.57
Tasks: 138 total, 10 running, 128 sleeping, 0 stopped, 0 zombie
Cpu(s): 23.5%us, 9.6%sy, 0.0%ni, 63.7%id, 0.4%wa, 0.0%hi, 2.2%si, 0.7%st
Mem: 7347752k total, 3111620k used, 4236132k free, 8060k buffers
Swap: 0k total, 0k used, 0k free, 2756304k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6435 postgres 16 0 60920 1012 320 R 27 0.0 4:17.22 postmaster
7168 postgres 15 0 2160m 4920 3240 S 20 0.1 0:40.52 postmaster
7296 postgres 15 0 2160m 4928 3240 R 20 0.1 0:08.51 postmaster
7292 postgres 15 0 2160m 4924 3240 S 19 0.1 0:08.49 postmaster
7166 postgres 15 0 2160m 4920 3240 S 19 0.1 0:41.11 postmaster
7290 postgres 15 0 2160m 4924 3244 S 19 0.1 0:08.41 postmaster
7295 postgres 15 0 2160m 4924 3240 S 19 0.1 0:08.21 postmaster
7170 postgres 15 0 2160m 4920 3240 R 18 0.1 0:40.96 postmaster
7169 postgres 15 0 2160m 4924 3244 S 18 0.1 0:40.23 postmaster
7171 postgres 15 0 2160m 4920 3240 R 18 0.1 0:40.17 postmaster
7167 postgres 15 0 2160m 4920 3240 S 17 0.1 0:40.05 postmaster
7293 postgres 15 0 2160m 4924 3240 R 17 0.1 0:08.74 postmaster
7297 postgres 15 0 2160m 4936 3248 R 16 0.1 0:07.92 postmaster
7165 postgres 16 0 2160m 4920 3240 R 15 0.1 0:41.41 postmaster
7291 postgres 15 0 2160m 4924 3240 R 15 0.1 0:08.61 postmaster
7294 postgres 15 0 2160m 4924 3240 S 14 0.1 0:08.13 postmaster
7164 postgres 15 0 2160m 4920 3240 R 13 0.1 0:41.61 postmaster
6440 postgres 15 0 61468 1444 444 S 0 0.0 0:00.17 postmaster

4 local c processes:
top – 22:31:31 up 7:27, 4 users, load average: 4.52, 4.18, 2.97
Tasks: 130 total, 7 running, 123 sleeping, 0 stopped, 0 zombie
Cpu(s): 33.5%us, 15.4%sy, 0.0%ni, 48.5%id, 1.0%wa, 0.0%hi, 0.0%si, 1.5%st
Mem: 7347752k total, 3594636k used, 3753116k free, 9000k buffers
Swap: 0k total, 0k used, 0k free, 3239112k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7324 postgres 25 0 2160m 4888 3208 R 84 0.1 0:51.21 postmaster
7328 postgres 25 0 2160m 4888 3208 R 81 0.1 0:51.21 postmaster
7326 postgres 25 0 2160m 4888 3208 R 81 0.1 0:53.49 postmaster
7322 postgres 25 0 2160m 4888 3208 R 78 0.1 1:03.22 postmaster
6435 postgres 15 0 60920 1012 320 R 23 0.0 4:50.81 postmaster
7321 root 15 0 48836 1864 1444 S 14 0.0 0:08.83 bench
7325 root 15 0 48840 1868 1444 S 11 0.0 0:07.32 bench
7327 root 15 0 48840 1864 1444 S 11 0.0 0:06.04 bench
7323 root 15 0 48836 1864 1444 R 9 0.0 0:05.58 bench

8 local processes:
top – 22:32:25 up 7:28, 4 users, load average: 6.16, 4.62, 3.18
Tasks: 138 total, 6 running, 132 sleeping, 0 stopped, 0 zombie
Cpu(s): 25.9%us, 16.0%sy, 0.0%ni, 55.0%id, 1.6%wa, 0.0%hi, 0.0%si, 1.5%st
Mem: 7347752k total, 3821680k used, 3526072k free, 9352k buffers
Swap: 0k total, 0k used, 0k free, 3451556k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6435 postgres 16 0 60920 1012 320 R 43 0.0 5:09.25 postmaster
7339 postgres 15 0 2160m 4892 3212 D 39 0.1 0:06.23 postmaster
7328 postgres 25 0 2160m 4888 3208 R 34 0.1 1:24.86 postmaster
7324 postgres 25 0 2160m 4888 3208 D 33 0.1 1:24.64 postmaster
7326 postgres 25 0 2160m 4888 3208 D 32 0.1 1:26.53 postmaster
7322 postgres 25 0 2160m 4888 3208 D 30 0.1 1:34.39 postmaster
7337 postgres 16 0 2160m 4888 3208 R 28 0.1 0:06.02 postmaster
7343 postgres 16 0 2160m 4888 3208 R 28 0.1 0:05.04 postmaster
7341 postgres 15 0 2160m 4888 3208 S 27 0.1 0:05.56 postmaster
7338 root 15 0 48840 1864 1444 S 7 0.0 0:01.05 bench
7325 root 15 0 48840 1868 1444 S 6 0.0 0:12.12 bench
7342 root 15 0 48836 1868 1444 S 6 0.0 0:01.02 bench
7321 root 15 0 48836 1864 1444 S 6 0.0 0:11.73 bench
7340 root 15 0 48836 1864 1444 R 5 0.0 0:00.77 bench
7323 root 15 0 48836 1864 1444 S 4 0.0 0:10.10 bench
7327 root 15 0 48840 1864 1444 S 4 0.0 0:10.24 bench
7336 root 15 0 48836 1864 1444 S 3 0.0 0:00.76 bench

16 local processes:
top – 22:33:11 up 7:29, 4 users, load average: 10.49, 5.91, 3.69
Tasks: 154 total, 9 running, 145 sleeping, 0 stopped, 0 zombie
Cpu(s): 20.8%us, 9.3%sy, 0.0%ni, 59.7%id, 1.0%wa, 0.0%hi, 0.0%si, 9.1%st
Mem: 7347752k total, 4032604k used, 3315148k free, 9660k buffers
Swap: 0k total, 0k used, 0k free, 3639952k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7347 postgres 15 0 2160m 4892 3212 D 17 0.1 0:02.80 postmaster
7328 postgres 15 0 2160m 4888 3208 D 15 0.1 1:35.36 postmaster
6435 postgres 15 0 60920 1012 320 S 15 0.0 5:22.28 postmaster
7357 postgres 16 0 2160m 4892 3208 D 14 0.1 0:02.46 postmaster
7359 postgres 16 0 2160m 4892 3208 R 14 0.1 0:02.46 postmaster
7361 postgres 15 0 2160m 4900 3212 D 13 0.1 0:02.37 postmaster
7343 postgres 16 0 2160m 4888 3208 R 13 0.1 0:16.00 postmaster
7353 postgres 16 0 2160m 4892 3208 D 12 0.1 0:02.47 postmaster
7322 postgres 16 0 2160m 4888 3208 R 12 0.1 1:45.13 postmaster
7326 postgres 16 0 2160m 4888 3208 R 12 0.1 1:37.69 postmaster
7339 postgres 15 0 2160m 4892 3212 D 12 0.1 0:16.68 postmaster
7349 postgres 15 0 2160m 4892 3208 D 12 0.1 0:02.52 postmaster
7324 postgres 16 0 2160m 4888 3208 D 11 0.1 1:34.90 postmaster
7355 postgres 16 0 2160m 4892 3208 R 11 0.1 0:02.41 postmaster
7341 postgres 16 0 2160m 4888 3208 D 11 0.1 0:16.48 postmaster
7351 postgres 16 0 2160m 4892 3208 R 10 0.1 0:02.36 postmaster
7337 postgres 15 0 2160m 4888 3208 D 9 0.1 0:15.76 postmaster
7348 root 15 0 48840 1864 1444 S 3 0.0 0:00.42 bench
7338 root 15 0 48840 1864 1444 S 3 0.0 0:02.80 bench
7325 root 15 0 48840 1868 1444 R 2 0.0 0:13.61 bench
7342 root 15 0 48836 1868 1444 S 2 0.0 0:02.64 bench
7356 root 15 0 48840 1868 1444 S 2 0.0 0:00.31 bench
7358 root 15 0 48836 1868 1444 S 2 0.0 0:00.39 bench
7346 root 15 0 48840 1868 1444 S 2 0.0 0:00.43 bench
7350 root 15 0 48840 1864 1444 S 2 0.0 0:00.38 bench
7327 root 15 0 48840 1864 1444 S 2 0.0 0:11.59 bench
7336 root 15 0 48836 1864 1444 S 2 0.0 0:02.33 bench
7340 root 15 0 48836 1864 1444 S 2 0.0 0:02.51 bench
7321 root 15 0 48836 1864 1444 S 1 0.0 0:13.01 bench
7323 root 15 0 48836 1864 1444 S 1 0.0 0:11.48 bench
7354 root 15 0 48836 1868 1444 S 1 0.0 0:00.24 bench
7360 root 15 0 48840 1864 1444 S 1 0.0 0:00.33 bench

32 local processes:
top – 22:34:50 up 7:31, 4 users, load average: 23.32, 11.03, 5.69
Tasks: 181 total, 4 running, 177 sleeping, 0 stopped, 0 zombie
Cpu(s): 18.1%us, 7.2%sy, 0.0%ni, 66.6%id, 1.4%wa, 0.0%hi, 0.0%si, 6.7%st
Mem: 7347752k total, 4381808k used, 2965944k free, 10244k buffers
Swap: 0k total, 0k used, 0k free, 3952404k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6435 postgres 16 0 60920 1012 320 S 9 0.0 5:33.12 postmaster
7347 postgres 16 0 2160m 4904 3224 D 8 0.1 0:10.34 postmaster
7353 postgres 16 0 2160m 4904 3220 D 8 0.1 0:09.48 postmaster
7349 postgres 16 0 2160m 4904 3220 D 8 0.1 0:10.25 postmaster
7375 postgres 15 0 2160m 4908 3220 D 8 0.1 0:03.45 postmaster
7326 postgres 16 0 2160m 4900 3220 D 7 0.1 1:45.29 postmaster
7373 postgres 16 0 2160m 4908 3220 D 7 0.1 0:03.65 postmaster
7383 postgres 16 0 2160m 4908 3220 D 7 0.1 0:03.06 postmaster
7391 postgres 15 0 2160m 4912 3224 D 7 0.1 0:03.02 postmaster
7324 postgres 16 0 2160m 4900 3220 D 7 0.1 1:41.96 postmaster
7369 postgres 16 0 2160m 4908 3220 R 6 0.1 0:03.81 postmaster
7385 postgres 16 0 2160m 4912 3224 D 6 0.1 0:02.90 postmaster
7361 postgres 16 0 2160m 4912 3224 D 5 0.1 0:10.54 postmaster
7379 postgres 16 0 2160m 4912 3224 D 5 0.1 0:03.40 postmaster
7393 postgres 16 0 2160m 4908 3220 D 5 0.1 0:02.84 postmaster
7395 postgres 16 0 2160m 4912 3224 D 5 0.1 0:03.15 postmaster
7397 postgres 16 0 2160m 4912 3224 R 5 0.1 0:03.12 postmaster
7343 postgres 16 0 2160m 4900 3220 D 5 0.1 0:23.27 postmaster
7328 postgres 16 0 2160m 4900 3220 D 5 0.1 1:42.52 postmaster
7341 postgres 16 0 2160m 4900 3220 D 5 0.1 0:23.69 postmaster
7355 postgres 16 0 2160m 4904 3220 D 5 0.1 0:09.82 postmaster
7359 postgres 16 0 2160m 4904 3220 D 5 0.1 0:09.84 postmaster
7381 postgres 16 0 2160m 4908 3220 D 5 0.1 0:03.22 postmaster
7337 postgres 16 0 2160m 4900 3220 D 4 0.1 0:23.36 postmaster
7339 postgres 16 0 2160m 4904 3224 D 4 0.1 0:24.46 postmaster
7367 postgres 15 0 2160m 4908 3220 D 4 0.1 0:03.26 postmaster
7357 postgres 16 0 2160m 4904 3220 D 4 0.1 0:10.21 postmaster
7371 postgres 16 0 2160m 4912 3224 D 4 0.1 0:02.97 postmaster
7387 postgres 16 0 2160m 4908 3220 D 4 0.1 0:02.87 postmaster
7389 postgres 16 0 2160m 4908 3220 D 3 0.1 0:03.03 postmaster
7377 postgres 15 0 2160m 4908 3220 D 3 0.1 0:03.03 postmaster
7351 postgres 16 0 2160m 4904 3220 D 2 0.1 0:09.71 postmaster
7358 root 15 0 48836 1868 1444 S 2 0.0 0:01.62 bench
7325 root 15 0 48840 1868 1444 S 1 0.0 0:14.73 bench
7346 root 15 0 48840 1868 1444 S 1 0.0 0:01.44 bench
7352 root 15 0 48836 1864 1444 S 1 0.0 0:01.45 bench
7374 root 15 0 48840 1864 1444 S 1 0.0 0:00.48 bench
7323 root 15 0 48836 1864 1444 S 1 0.0 0:12.52 bench
7342 root 15 0 48836 1868 1444 S 1 0.0 0:03.75 bench
7366 root 15 0 48836 1868 1444 S 1 0.0 0:00.48 bench
7370 root 15 0 48840 1868 1444 S 1 0.0 0:00.43 bench
7378 root 15 0 48840 1868 1444 S 1 0.0 0:00.36 bench
7382 root 15 0 48840 1868 1444 S 1 0.0 0:00.31 bench
7388 root 15 0 48836 1864 1444 S 1 0.0 0:00.43 bench
7390 root 15 0 48836 1868 1444 S 1 0.0 0:00.36 bench
2455 root 18 0 247m 1580 860 S 1 0.0 0:12.46 collectd
7336 root 15 0 48836 1864 1444 S 1 0.0 0:03.32 bench
7348 root 15 0 48840 1864 1444 S 1 0.0 0:01.38 bench
7350 root 15 0 48840 1864 1444 S 1 0.0 0:01.26 bench
7354 root 15 0 48836 1868 1444 S 1 0.0 0:01.09 bench
7356 root 15 0 48840 1868 1444 S 1 0.0 0:01.47 bench
7368 root 15 0 48840 1864 1444 S 1 0.0 0:00.42 bench
7386 root 15 0 48840 1864 1444 S 1 0.0 0:00.40 bench
7392 root 15 0 48836 1868 1444 S 1 0.0 0:00.42 bench
7327 root 15 0 48840 1864 1444 S 0 0.0 0:12.57 bench
7340 root 15 0 48836 1864 1444 S 0 0.0 0:03.56 bench
7376 root 15 0 48836 1868 1444 S 0 0.0 0:00.45 bench
7380 root 15 0 48840 1864 1444 S 0 0.0 0:00.38 bench
7394 root 15 0 48840 1864 1444 S 0 0.0 0:00.59 bench

JavaScript QuickStart Reference

I wrote this quite some time ago when I had to re-acquiant myself with JavaScript, but never posted it. It’s not complete, but it’s got some of the basics so I’m going to go ahead and hit publish anyway. Javascript (actually ECMAScript nowadays)  is everywhere. If you’ve never had to use it, you’ve somehow managed to avoid writing any modern web apps. The good thing is, it’s really very simple! A few things you should know before getting started:

  1. JavaScript uses (more or less) the standard Algol (C-like) syntax, including most C programming constructs.
    
    if (myVar == True) { doMyFunction(1, 2); myVar = False; }
    while (myVar == True) { doMyFunction(2,3); myVar = False; }
    for ( var i = 0; i < 10; i++ ) { doMyFunction(i, 2); }
    switch(myVar) {
        case "A": doMyFunction(1, 3);
        case "B": doMyFunction(2, 1);
    }
  2. In JavaScript, functions are “first-class”, simply meaning they are objects. As objects, the can be passed around and modified like any other object. This functionality is key to most JavaScript idioms, so make sure you understand it.
    
    function doMyFunction(x, y) {
        if (x % 1 != 0)
            throw new TypeError(x + " is not an integer");
      return x * y;
    }
    
    /*
     * This callback variable probably should be global to be useful!
     * You'll see this idiom a LOT with any sort of AJAX programming
     */
    function setCallBack(callback) {
        var callbackVar = callback;
    }
    
  3. JavaScript itself does not provide a class library or define any runtime services. This results in JavaScript often looking more complicated than it really is because it’s always intertwined with a bunch of wierd browser runtime stuff. For instance, the following is probably the easiest way to write a “Hello World” in  JavaScript, but note that the document object is provided by the browser!
    
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd"></code>
    <html>
      <head><title>simple page</title></head>
      <body>
        <script type="text/javascript">
          document.write('Hello World!');
        </script>
        <noscript>
          <p>Your browser either does not support JavaScript, or you have JavaScript turned off.</p>
        </noscript>
      </body>
    </html>
    
  4. JavaScript array and object literals are form the basis for JavaScript Object Notation or JSON, which is commonly used as a data transport in AJAX applications. The syntax  looks like this:
    
    {
        "pos": {
            "x": 5,
            "y": 7
         },
        "name": "JavaScript",
        "numbers": [ "first", "second", "third" ]
    }
    
  5. JavaScript is a dynamic language. As a dynamic language, you get the “eval” function. Eval opens the door to everything that makes JavaScript powerful, and also most of the possible security holes. Do not eval anything unless you have complete confidence that is is trustworthy code!
    
    str = "{ "pos": { "x": 5, "y": 7 } }";
    var myObj = eval(str);
    alert(myObj.pos.x);
    

How to show hidden files in Mac OS Finder

I’m not sure why this command doesn’t get more play, so I’m gonna put it out there so google picks it up. At least in Snow Leopard, simply hitting cmd-shift-. will show hidden files for that browse session.


defaults write com.apple.finder AppleShowAllFiles TRUE
killall Finder

At least for me, that’s a much more usable solution than permanently showing all hidden files, which is what you get by executing the more commonly documented solution:

See also:
[1] http://www.macworld.com/article/142884/2009/09/106seehidden.html
[2] http://www.mactricksandtips.com/2008/04/show-hidden-files.html

Get started using Mercurial source control in 5 minutes or less

While, in a previous post I talked about how DVCS is the modern form of source control and promised I’d show you how to do it, quickly and easily. So let’s get started! I’m going to use Mercurial because, well, I am.

First, you need to download the Mercurial package for your system. If you use a mac with macports you can just use type sudo port install mercurial. You could also use the very nice mac .dmg packages from berkwood. On Ubuntu, you should be able to sudo apt-get install mercurial. On windows, you’ll probably want to download and install TortoiseHG. BitBucket makes it complicated to find the download link so just click this one instead. You’ll want the file in the top of the list. Right now, that is TortoiseHg-0.7.5-hg-1.2.1.exe.

So, you should now have a working mercurial command line executable. To try it out, open your shell of choice and type hg.You should get something like this:

loki:dtest erik$ hg
Mercurial Distributed SCM

basic commands:

 add        add the specified files on the next commit
 annotate   show changeset information per file line
 clone      make a copy of an existing repository
 commit     commit the specified files or all outstanding changes
 diff       diff repository (or selected files)
 export     dump the header and diffs for one or more changesets
 init       create a new repository in the given directory
 log        show revision history of entire repository or files
 merge      merge working directory with another revision
 parents    show the parents of the working dir or revision
 pull       pull changes from the specified source
 push       push changes to the specified destination
 remove     remove the specified files on the next commit
 serve      export the repository via HTTP
 status     show changed files in the working directory
 update     update working directory

use "hg help" for the full list of commands or "hg -v" for details

Now comes the fun part. Simply navigate to the directory you want to put under revision control and run

hg init

. This will create the .hg directory which stores your local repository.

Your next step should be to create a .hgignore file. This file will tell mercurial which file types to ignore. It can use two syntaxes, standard shell globs and also regular expressions. This should give you enough flexibility to eliminate all those pesky auto-generated files, movies, etc from your project directory. Here’s what I’ve been using for drupal projects lately, it should give you a good idea of what sort of patterns you might use.


syntax: glob
*.pyc
*~
hostmeta.ini
Thumbs.db
.DS_Store
*.exe
*.flv
*.mov
*.zip
*.avi
*.wmv
*.dv
*.psd
*.LCK

syntax: regexp
.*\#.*\#$
^files.*
^web/files.*
.*CVS.*

Now that we’ve got an .hgignore file, let’s check it into revision control. Simply execute


hg add .hgignore

and then


hg commit -m 'added .hgignore file'

. The add tells mercurial to flag the file revision control. The commit command will actually push the contents of the file into the revision control repository.

Now, let’s put your files under revision control. At this point, since you have a .hgignore file that eliminates all the files you don’t want controlled, you can run the


hg status

command. It will show you all the status of all the files in the revision controlled tree. File which are checked in and already up to date or ignored will not show up on the listing. For a newly created Django project with a single app in it, you might see something like this:


loki:dtest erik$ hg status
? __init__.py
? manage.py
? myapp/__init__.py
? myapp/models.py
? myapp/tests.py
? myapp/views.py
? settings.py
? urls.py
loki:dtest eri

Now, if all the with ? in front of them are ones you want to add to revision control, simply execute


hg addremove

. This will recurse the tree and add all the missing files, and mark any files that have disappeared from your local tree as deleted in the repository. Then, you just run


hg commit -m 'added first set of files'

in order check everything in.

If you had files with ? that you don’t want under revision control, you will need to add expressions to your .hgignore file to ignore them and re-run status. You can also just use add manually on your files, but in my opinion the addremove feature is such a nice addition and hg status is such a powerful feature you will be much better off taking the time to maintain an ignore file.

So, you’ve now got a copy of your code in revision control. A simple hg status should return blank, indicating that your working copy is in sync with the repository. So let’s check out that safety net.

Let’s make a random change to our urls.py.


echo "# this comment is really lame" >> urls.py

And now run hg status one more time. You should see something like this:


loki:dtest erik$ hg status
M urls.py

The M prefix indicates that the file has been modified. Now, let’s see what exactly was modified. Run hg diff. You should get a result like this:


loki:dtest erik$ hg diff
diff -r 7844b323276e urls.py
--- a/urls.py   Tue May 12 02:56:05 2009 -0400
+++ b/urls.py   Tue May 12 02:58:47 2009 -0400
@@ -15,3 +15,4 @@
     # Uncomment the next line to enable the admin:
     # (r'^admin/', include(admin.site.urls)),
 )
+# this comment is really lame

As you can see, we have a nicely unified diff indicating that we added a single line. If you installed a GUI package, you can probably use the GUI to bring up a much more nicely formatted GUI change viewer.

So, now we know what we changed. That’s pretty useful, but how do we get rid of that change? Again, really easy, simply use the

hg revert

command. If you don’t want to revert all your changes, you can run

hg revert urls.py

for instance, which will only revert changes to urls.py.

If you revert a file and then run hg status, you’ll note that the file is no longer marked as modified and there is a new ? urls.py.orig file which mercurial has nicely decided to keep in case you change your mind. I guess .orig would be a good suffix to add to your .hgignore file!

Obviously we’ve just barely begun to scratch the surface of mercurial and DVCSs in general, but there’s plenty of time for more learning. Even if you just use it for diffs and reverts, you’re getting great value from your DVCS and are ready to add in more functionality as you need it. Good luck!

DVCS: Modern Source Control aka the Programmer’s Safety Net

Revision control is a key tool for modern software engineers. It provides a safety net for the individual developer, and a collaborative framework that allows many developers to work on the same project without fear of stepping on each other’s toes.

Revision control isn’t a new idea. RCS and its descendant, CVS, date back to the early ’80s, and they in turn were based on even older systems. That said, many programmers still aren’t using it. Eric Sink blames it on lack of training. Ben Collins-Sussman thinks it’s because 80% of programmers aren’t "Alphas". Andrew Smith (the number one hit on Google, I might add) thinks it’s because it takes too long to learn and it’s hard to set up a server. I’ll plead the fifth and say I hope I can be a part of the solution instead of the problem!

In any case, up until the last few years, revision control systems were centralized. That is, there was a single central repository of code to which contributors connected, checking out code and checking in their changes. Subversion is the latest of these centralized systems. It was developed specifically to be CVS without the worst of the bugs, and to that end it is very successful. If you want great tool support, have a reasonably sized team, like non-mind-bending behavior, and you only work across a local network anyway, Subversion is a great system.

However, many developers have become frustrated with centralized version control. Nobody wants to be accused of ‘breaking the build’, so naturally the frequency of checkins decreases. To the same end, to avoid newbies breaking the build, project administrators don’t give out commit access lightly. The end result is that developers lose the safety-net aspect of revision control. I’ve been witness to developers making a copy of their source code, outside of revision control, because they’re so afraid they might check in something bad.

In addition, since core contributors are the only ones with commit access to the revision control system, most contributions must come as patches. These patches can be tricky to create in the best of times, but at scale this problem becomes untenable. Just check out the Linux kernel mailing list to get a sense of the problem.

The answer to these problems is called a Distributed Version Control System, or DVCS. There are quite a few of these animals out there. Most recently, it seems as if the open source playing field is being dominated by three: Bazaar, Git, and Mercurial. All of these systems have their pluses and minuses, but they are all open source and work well enough to get the job done.

Distributed version control systems share quite a few things in common. Instead of using a line or tree with named revision numbers to store the change history, distributed revision control systems use directed acyclic graphs. This basically means that you can have multiple valid lines or trees at the same time. Hence, distributed.
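To make that concrete, here’s a toy sketch in Python of how a DAG-shaped history hangs together (just an illustration with made-up names, not how any real DVCS stores its data):

```python
# Toy model of a DVCS history: every commit records its parent
# commits, so the full history forms a directed acyclic graph.
class Commit:
    def __init__(self, message, parents=()):
        self.message = message
        self.parents = list(parents)

# A shared root, then two divergent lines of work (your clone and
# the upstream repository), finally joined by a merge commit.
root = Commit("initial checkin")
a = Commit("add urls.py", parents=[root])
mine = Commit("my local work", parents=[a])
theirs = Commit("upstream work", parents=[a])
merge = Commit("merge upstream", parents=[mine, theirs])

def ancestors(commit):
    """Walk back through the DAG collecting every ancestor commit."""
    seen = set()
    stack = list(commit.parents)
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(c.parents)
    return seen

# The merge commit carries both lines of development in its history.
assert mine in ancestors(merge) and theirs in ancestors(merge)
```

Both “valid lines” coexist until the merge joins them, which is exactly what lets every developer keep a full local history.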

What this means to you (the developer) is that you get a local copy of the entire repository available to you at all times. That means you can check in, revert, merge, create branches, etc without a network connection.

It also means that you always have access to that revision control sandbox. It allows you to ‘check in early, check in often’, and still not live in fear of breaking the build or disrupting somebody else’s work with your bad code. When your code is good and ready, you can review its entire change history, merge in any changes, and submit the entire changeset directly to the central repository or to a core committer as a patch.

Having a local copy of the repository also means that you have a more complete copy of your source code at every developer location with a DVCS than you would with a traditional VCS.

I’ll get into the nitty-gritty of how to actually start using a DVCS (and how it’s arguably faster and easier than svn) in another post, but for now, just get out there and use something. Not using source control is like skydiving without a parachute.

Developers for Smarties: 5 Traits Of Great Programmers

While the wonderful folk in the world of information technology would have us all believe that all we need to do is buy their latest “solution”, the fact of the matter is that if you’re serious about leveraging technology, eventually you are going to need to go beyond what came free in the box.


Going beyond what came free in the box typically means hiring somebody calling themselves a Programmer, Developer, Software Engineer or even Software Architect. If they worked someplace interesting or for themselves, they might prefer terms like Hacker, Geek or Programming Rock Star. Unfortunately, none of this stuff really means anything! Having worked with and around these guys for a few years I can say with great confidence that their ability to program is in no way related to these accolades. Unfortunately, involvement with large companies or ostensibly successful high-profile projects doesn’t necessarily mean anything either. It’s not that hard to hitch your wagon to a train.

So what are you looking for?

  1. Engaged with Technology. I’ve never met a good programmer who didn’t get excited about almost any technical project. It could be simple like “hey, did you see the new Google fizzbuzz application?” or complicated like “hey, did you hear they added support for asynchronous transaction replication to the latest version of MySQL?” Either way, a good programmer will engage. Bear in mind, the first response is likely to be “That’s BS, MySQL sucks, use MS SQL Server”, but they’ll engage. If they sit there like a dead fish and don’t even ask you what you are talking about, be afraid. Caveat: You need to be sufficiently technical to engage in a discussion with a programmer. If you can’t, chances are you aren’t equipped to manage one either, so why try to hire one?
  2. Problem Solving. To a degree beyond most professions, programming is about problem solving. See my previous post on Levels of Work, but frankly, you can’t afford to have many Stratum I programmers on your staff unless you’re running an internship program. You want the guys that can engage problems head on and come out with solutions. Good programmers love to hear about your problems in an interview, and would love to offer ideas as to how to solve them. Give them the chance.
  3. Language Agnostic. Great programmers are familiar with more than one language. Actually, more like more than ten. It should be obvious, then, that looking for somebody with 10 years of C++ experience is a sure-fire way to eliminate great candidates. Bear in mind, if your goal is to create the next revision of the C++ standard, you really do need somebody who is deeply involved in the C++ community and has been for 10 years. Most of us, however, are much better off using the right tool for the job, and frankly, sometimes that means using a new programming language. Anybody who hasn’t even dabbled in another environment for 10 years probably won’t adapt well to the future. Computer languages aren’t that complicated, and even better, learning another language adds a ton of great perspective to every program you write. The obvious extension to this rule is that they should also be familiar with more than one development framework. Every Rails programmer should have at least checked out what Django has to offer by perusing its website. J2EE programmers should at least know that they aren’t just WebLogic or WebSphere programmers, but that they are also Glassfish or JBoss programmers.
  4. Understands Development Process. If you haven’t seen The Joel Test by now, you’re missing out. Ditto for any programmer that isn’t familiar with Specifications, Revision Control, Defect Tracking and Automated Build Systems. Good programmers whine about how they never get complete enough specs, or say they gave up on specs in favor of agile methods. Good programmers will happily bore you to tears with the reasons you should use distributed version control instead of that lame old CVS clone you’re using. If they say something dumb like “I don’t use revision control, I make backups to floppy discs”, run away. Quickly.
  5. Continuous Learning. Would you want a brain surgeon who last updated his technique in 1985? Maybe you would, but you definitely don’t want a programmer with skills that old. Technology is the fastest moving set of skills in the workplace today, and among technology, software is arguably the fastest moving of all. Programmers are the mechanics of software, and they need to be up to date. Good programmers are always looking for an excuse to try something new. Suggest a possible Rails app to somebody who’s been doing J2EE and, all things being equal, the correct response is “I’m familiar with it, but never had the chance to use it for a project”. Now, bear in mind she may be afraid to sidetrack her career by becoming the Junior Rails Programmer (TM), but you wouldn’t be so silly as to try to pull that stunt anyway, right?

In short, hiring a programmer is hiring somebody to think. They have to be able to think out of the box of convention while still staying in the box of what can actually be done using the technology available. They have to continually update their skills, and enjoy doing it. I’ve barely scratched the surface of what makes a great programmer, but here are some other opinions to consider as well:


Virtualization Technology Overview

One of the hottest buzzwords in technology today is virtualization. Unfortunately, virtualization by itself covers a vast array of potential technologies.

Let’s look at the word itself first. Virtualization implies taking something “real” and “virtualizing” it, or making it “virtual”. Typically, this is exactly what is going on with virtualization technologies, the differences lie in what exactly is being virtualized.

Storage Virtualization

At the very bottom of the technology stack lies storage virtualization. Unfortunately, the storage industry is probably the most arcane of all IT sectors, and the inability to agree on what truly comprises storage virtualization technology is a great case in point.


From a layman’s perspective, storage virtualization should simply mean that your OS images don’t have to worry about where their storage comes from; it’s just there. Virtualization should allow storage to be resized, moved to new physical hardware, and isolated from hardware failure, all without affecting the running operating system. In practice, vendors will call almost anything storage virtualization, so make sure you put on your skeptic’s hat when you hear them claiming to support it.

Coupled with server-side operating system virtualization, true storage virtualization delivers a one-two knockout punch to old fashioned IT. Expect to see this technology stabilize quickly as the industry gets excited about cloud computing.

Operating System Hardware Virtualization

Lately there’s been lots of news about OS virtualization. VMware, Xen, Parallels, Sun xVM, and Microsoft Hyper-V are all this sort of technology. Newer Intel and AMD processors even have native instructions to make this sort of virtualization more efficient. In short, OS virtualization allows you to run several complete virtual systems on a single virtual machine host.

Operating system virtualization is revolutionary when it comes to maintaining datacenters. Most server applications are not cpu, memory, or even I/O bound. Rather, they are IT staff bound, meaning that every application gets to run on its own server because IT doesn’t dare install it any other way.

OS virtualization separates the OS image from the underlying hardware, allowing you to provision an operating system as specified by a specific application, while migrating it from machine to machine as needs dictate.

There is a ton of R&D going into OS virtualization, both on the client and server side. Modern hypervisors allow you to migrate a running virtual machine from one physical host to another, allow for automatic failover and load balancing, and integrate with backup software. At this point, if you are provisioning new server systems without using virtualization, you ought to be asking yourself why.

On the client side, OS virtualization is pretty cool for tech people, but in my book still isn’t where it needs to be. There is no reason for most computer users to run their OS on native hardware, yet that hasn’t happened. Until it does, you’ll be stuck dealing with rebuilding your system every year or so to clear out the cruft or to handle a system migration. Technologies like VMware ACE are trying to tackle this problem, but the technology has a long way to go before it catches up with server virtualization with respect to hardware isolation.

Operating System Partitioning

Another very interesting technology in the operating system world is system partitioning. OpenSolaris Zones, FreeBSD Jails, IBM LPARs and Linux vServers are all variants of this technology. In short, the runtime environment of the operating system is partitioned off from the host system while continuing to share a kernel.

Since kernels tend to be pretty reliable, this allows you to have some of the principal benefits of hardware virtualization (application isolation) without many of the costs (performance and memory overhead).

Unfortunately, this type of technology hasn’t really made any headway in the Microsoft stack, so it’s not used very heavily. Nonetheless, it’s something to keep your eyes on as the world moves towards cloud computing etc.

Runtime Virtualization

Moving farther up the application stack is runtime virtualization. Microsoft .NET and Java are the 800 lb gorillas in this space. Runtime virtualization brings much of the application isolation benefit of OS-level virtualization schemes without incurring much of the complexity of maintaining multiple operating system images.

The downside is obviously that software has to be explicitly targeted at a virtualization-friendly runtime. However, you should understand the development and deployment benefits of targeting such an environment before choosing a development platform for a new project. They are by no means insignificant, and a lot of very smart people are working very hard at moving this technology forward.

Projects using runtime virtualization that I find particularly interesting are JRuby and Jython for Java, and IronRuby and IronPython for .NET. These projects combine the development benefits of modern scripting languages with the deployment benefits of runtime virtualization. Look for a lot more good stuff from these guys.



Vendor negotiation tips

Stephen Foskett and Martin Glassborow have written some great posts about pricing and negotiation lately, and I thought I’d jump into the fray. In my experience, there are a few other things to consider, especially if you’re new at this. Much of this is standard negotiation technique, but tech folks don’t necessarily come to the table with the same level of negotiation experience as some other disciplines, so I think it bears repeating.

  1. Everything is negotiable. The corollary of this is “you will not get what you do not ask for”. Bear in mind you may not get it even by asking, but you won’t know if you don’t ask.
  2. Keep your options open. Make sure you keep as broad a scope on the negotiation as possible. It is in the vendor’s best interest to lock you into thinking you need some aspect of his solution, which he can then use to lever you toward products that are out of your league. Instead, when you’re negotiating for a new internet pipe, negotiate from the stance that you could downgrade to a cable modem and co-locate your services.
  3. Focus on your business needs. Oftentimes you may think you need one thing, but you actually need another. While you need to know the specifics of what you’re buying, if you continually bring the negotiation back to the business problems you are trying to solve, you may find your vendors are able to offer you a better product from a different part of the house. This is another form of keeping your options open, but it’s very important, because it keeps the onus on the vendor to meet your needs rather than just sell you a bill of goods.
  4. Put time on your side. Simply put, start negotiations early and make sure your deadlines aren’t hard ones. Like Martin says, align your purchasing cycles with your vendor’s quarterly sales targets. The more time you can afford to take, the more time you have for the vendor to get desperate to close the deal and scare up some better pricing. Don’t overdo it, though; your long-term relationship will suffer if you are always taking vendors for a ride and never buying.
  5. Understand what you are buying. Unfortunately, at least at the scale I am usually buying, I know more about the technical details of the product class and even the specific products being discussed than the vendor reps. I’ve found if I don’t know more, I’m likely to get taken for a ride with things I don’t need. This plays heavily into the next point as well. You simply can’t afford to buy stuff that won’t meet your needs fully, and nobody but you can really ensure that.
  6. Demand a demo. When you’re spending big bucks, this usually isn’t too hard. But you can still get demos if you’re buying less, and frankly, especially with the economy as it is lately, if the vendor won’t take you seriously enough to get you in front of the kit they want to sell you, you don’t want to do business with them. There’s really no reason to purchase something you aren’t 110% confident is going to meet your needs, so prove it to yourself before signing anything.

In conclusion, if there’s a single rule I’ve found, it’s that there is no shortcut. A perfect RFP is still just the first step towards getting a good solution for a great price. You’ll still have to learn all there is to know about the product, build a relationship with a vendor, and make sure you’re buying the right thing. Don’t be seduced by the siren’s call of a “write the check” solution; the only problem it will solve is your vendor’s cash flow.


Level of Work: Choose the right person for the task

In the managerial world, at least in some circles, Elliott Jacques is well known for his theory of requisite organization. While his work runs against some of the conventional wisdom of organizational theory, his Level of Work concept has merit for anyone. Bear in mind that this is a very cursory overview of a complex theory, so in order to really put it to use you’ll need to do some more background research.

In short, Jacques found that a human’s ability to handle task complexity increases in a step-wise manner, each level of which he called a Level of Work. Each level of work can be defined in terms of its maximum Time Span of Discretion (or time span), and also by the problem-solving methodology required to operate effectively at that level. In addition, since the levels of work are discrete, the crux of requisite organization theory is that each level must be “stacked” appropriately within the organization for the organization to function effectively. Jacques uses the term “Cognitive Capability” to describe the level of work a given person is capable of.


Regardless, organizational theory aside, the concept of work levels has much to offer technical work as well. In a technical environment, Jacques’ theory provides a way to determine the best person for a given technical task. In addition, understanding a person’s cognitive capability makes it possible to ensure that you assign them tasks in chunks they are capable of handling.

Stratum I – Engineers

In a traditional environment, Stratum I is tasked with “getting the work done”. Their time-span is limited to less than 3 months, and they solve problems by trial and error. To put this into perspective, I don’t think I’ve ever seen a programming team that didn’t have at least one fabled Stratum I member. In technical terms, a person capable of Stratum I level of work could handle filling in some function definitions that were already defined, or installing a bunch of operating systems based on a procedure they were given.

The things a Stratum I worker can’t do are where they become fabled. This is the guy who was tasked with putting together a new OS image but forgot to check whether or not automatic updates were enabled. Or the gal tasked with creating a program to extract information from some production logs, who didn’t bother to check whether the parsing libraries she used were compatible with the production environment.

Bear in mind that this doesn’t make them incompetent. They may know more than most people about OS images or log parsing, but their limited time span makes it hard for them to see a bigger picture. They are focused on the immediate task, and nothing else.

Stratum II – Senior Engineers

In a traditional environment, Stratum II workers are called Supervisors and charged with “making sure the work gets done”. Their time span upper bound is 1 year, and they are capable of problem solving by information gathering. In technical terms, these folk are probably considered “Senior Engineers”. They are capable of working at a higher level, such as “design and implement a system to integrate product A and B”. However, it’s important to note their limitations.

A time span of 1 year is still not very long. It is long enough to implement a significant chunk of code, but probably not long enough to design a chunk of code that will underpin an entire system for many years. In addition, while they are capable of problem solving by gathering information from multiple sources, they are still not really capable of serial processing. This means they will still have trouble seeing the results of their actions on a larger scope.

Bringing us back to the previous example, assigning the OS image creation task to a Stratum II worker would probably result in properly functioning automatic updates, but would likely fail to anticipate the need to roll out the next service pack or hardware generation. Note that this may well be ok, we don’t always have a crystal ball for the future!

Stratum III – Architects

In a traditional environment, Stratum III workers are called Managers and charged with “creating systems”. They have a maximum time span of 2 years and can problem solve using serial processing. In technical terms, these folk are probably your system architects, principal engineers, and perhaps VP of engineering types.

A 2 year time-span combined with serial processing capability is a powerful combination. These are the guys that join your team and start creating systems to speed things up. If you don’t have revision control, defect tracking, or change management, they won’t rest until they’re in place because they can see the long term benefits of these systems. They are also likely to ask lots of big-picture questions and wonder where your specifications and test procedures are.

Unfortunately, the ability to see the big picture isn’t always a benefit. Task a Stratum III worker with a simple job like “refactor this class” and there is a good chance it will take him 3 times as long as necessary to finish as he creates the perfect system to solve your trivial problem and a whole host of other associated problems. On the other hand, these kinds of system builders are the reason we have automated refactoring tools!

Stratum IV – Integrators

Traditionally, Stratum IV workers are called integrators. They are often general managers, or C-Level execs with cross functional responsibilities such as operations, financial, and technical officers. They have a maximum time span of 5 years and can handle problem solving using parallel reasoning. In technical terms, you’d better hope your CTO can operate at this level. Many very senior engineers operate at this level and above as well. The defining trait of Stratum IV ability is following multiple lines of serial reasoning concurrently.

For instance, the choice of a development platform in a reasonably large company requires at least Stratum IV work level capability. There will be potentially dozens of ramifications of such a choice, from your ability to hire in talent, to the efficiency of development, to the ease of deployment, to the production environment security and finally to the usability of the final product. Each of these lines of reasoning will unfold for years, and the ability to “zoom out” in perspective enough to focus on what is important while ignoring the mass of irrelevant details is of paramount importance.

There are lots of other technical tasks that fall under Stratum IV and higher time span. Language design, API decisions and Enterprise Systems Architecture come to mind immediately. What others can you think of?

Conclusion

In conclusion, it’s important to recognize the work level capability of the folks on your team, and the complexity of the tasks you assign. It’s possible to break down a given task to any level necessary, and if the level of detail provided matches the capability of the worker, you will get much better results. As I said above, I highly recommend Jacques’ Capability Model to anyone involved in project management or working with teams.

Unfortunately, I’ve only been able to scratch the surface with this post. I’ve included a few links to other blogs that feature Jacques’ capability model, along with some Amazon links to Jacques’ original books (which are not for the faint-hearted).

Other Resources

  • Michelle Malay-Carter’s mission minded management blog has some great material about requisite organization and work levels. Check out her latest post about work levels, and this post talking about the dangers of over-hiring I mentioned under Stratum III.
  • Tom Foster’s management skills blog is another great resource for information about work levels and job strata. He’s also a management consultant and teacher. I’ve taken his online course and recommend it highly. Check out this post (or this more recent one) for one entry to his large catalog of material. Tom takes an interesting story-telling approach, but stick with it. Following the stories will pay off.
  • Jacques himself on Executive Leadership: A Practical Guide to Managing Complexity (Developmental Management)
