SimonHF's Blog

Just another WordPress.com site

libsxe, shared-memory, and parallel state-driven algorithms February 27, 2011

I recently came across the following paper: “Memory Models: A Case for Rethinking Parallel Languages and Hardware” by Sarita V. Adve and Hans-J. Boehm

The paper starts off:

The era of parallel computing for the masses is here, but writing correct parallel programs remains far more difficult than writing sequential programs. Aside from a few domains, most parallel programs are written using a shared-memory approach. The memory model, which specifies the meaning of shared variables, is at the heart of this programming model. Unfortunately, it has involved a tradeoff between programmability and performance, and has arguably been one of the most challenging and contentious areas in both hardware architecture and programming language specification. Recent broad  community-scale efforts have finally led to a convergence in this debate, with popular languages such as Java and C++ and most hardware vendors publishing compatible memory model specifications. Although this convergence is a dramatic improvement, it has exposed fundamental shortcomings in current popular languages and systems that prevent achieving the vision of structured and safe parallel programming. This paper discusses the path to the above convergence, the hard lessons learned, and their implications. …

And then introduces the idea of “disciplined shared-memory models”:

Moving forward, we believe a critical research agenda to enable “parallelism for the masses” is to develop and promote disciplined shared-memory models that:
• are simple enough to be easily teachable to undergraduates; i.e., minimally provide sequential consistency to programs that obey the required discipline;
• enable the enforcement of the discipline; i.e., violations of the discipline should not have undefined or horrendously complex semantics, but should be caught and returned back to the programmer as illegal;
• are general-purpose enough to express important parallel algorithms and patterns; and
• enable high and scalable performance.

This is interesting because libsxe already has a disciplined shared-memory model, in the form of the sxe pool library, which goes a long way towards fulfilling the criteria above. So what is a sxe pool and how does it offer us a disciplined shared-memory model?

The sxe pool library was not originally invented to offer a disciplined shared-memory model; sharing the pool memory was added later as a pool construction option. In short, sxe pools offer a way to create C arrays of structs with the following generic benefits:
• The size of the array is persisted
• Each element of the array gets its own state which is persisted outside the element struct in a linked list
• Each element of the array gets its own timestamp which is persisted outside the element struct in a linked list
• Each element is accessed using regular & concise C code, e.g. myarray[myindex].mystructmember

The sxe pool library caller can manipulate the state & timestamp element properties using the following API:

sxe_pool_get_number_in_state(void * array, unsigned state)
sxe_pool_index_to_state(void * array, unsigned id)
sxe_pool_set_indexed_element_state(void * array, unsigned id, unsigned old_state, unsigned new_state)
sxe_pool_set_oldest_element_state(void * array, unsigned old_state, unsigned new_state)
sxe_pool_get_oldest_element_index(void * array, unsigned state)
sxe_pool_get_oldest_element_time(void * array, unsigned state)
sxe_pool_get_element_time_by_index(void * array, unsigned element)
sxe_pool_touch_indexed_element(void * array, unsigned id)
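
To make the API above concrete, here is a minimal sketch of how a caller might drive pool elements through a tiny state machine. The element struct, the state enum, the header name, the return types, and the exact sxe_pool_new() construction call are assumptions made for illustration; only the state-manipulation calls mirror the prototypes listed above.

#include "sxe-pool.h"                            /* header name assumed */

typedef enum { STATE_FREE = 0, STATE_IN_USE, STATE_DONE, STATE_COUNT } EXAMPLE_STATE;

typedef struct { unsigned request_id; } EXAMPLE_ELEMENT;    /* caller supplied element struct */

static void
example(void)    /* called from the application, e.g. inside its event loop */
{
    /* construct a pool of 1000 elements; the exact sxe_pool_new() signature is assumed here */
    EXAMPLE_ELEMENT * pool = sxe_pool_new("example", 1000, sizeof(EXAMPLE_ELEMENT), STATE_COUNT);

    /* claim the element that has been FREE the longest and move it to IN_USE */
    unsigned id = sxe_pool_get_oldest_element_index(pool, STATE_FREE);
    sxe_pool_set_indexed_element_state(pool, id, STATE_FREE, STATE_IN_USE);

    pool[id].request_id = 42;                    /* regular, concise C array access to the element */

    /* finish the work; the old state must be correct or the pool will assert */
    sxe_pool_set_indexed_element_state(pool, id, STATE_IN_USE, STATE_DONE);

    /* query how many elements are currently DONE */
    unsigned done_count = sxe_pool_get_number_in_state(pool, STATE_DONE);
    (void)done_count;
}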

Converting the sxe pool library API to support shared memory was relatively simple. The sxe_pool_new() function got an option to share the pool memory, and the API functions that change a pool element’s state use atomic assembler instructions if the pool was constructed as a shared-memory pool. It’s also interesting to note that sxe pools can be shared between processes as well as between threads in the same process. This is because the sxe pool library’s internal implementation avoids absolute pointers, which is also something that I encourage from libsxe developers and C developers in general.
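
To see why requiring the old state is what makes a lock-free shared-memory implementation possible, here is a purely conceptual compare-and-swap sketch using GCC’s __sync builtin; this is not libsxe’s actual internal code, and the function and parameter names are invented for illustration:

#include <stdbool.h>

/* Conceptual sketch only: a state transition expressed as a compare-and-swap.
 * The swap succeeds only if the state word still holds the expected old state;
 * otherwise another thread or process got there first. */
static bool
try_change_state(unsigned * state_word, unsigned old_state, unsigned new_state)
{
    return __sync_bool_compare_and_swap(state_word, old_state, new_state);
}

A failed compare tells the caller that its idea of the element’s state was stale, which is the same condition that the sxe pool API turns into an assert.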

This API is “general-purpose enough to express important parallel algorithms and patterns” and, most interestingly, is the same API whether the algorithm is threaded or not. It’s also “simple enough to be easily teachable to undergraduates”, or even to junior developers, as we have found out at Sophos. The atomic sxe_pool_set_[indexed|oldest]_element_state() API functions “enable the enforcement of the discipline” by requiring both the old state and the new state of the array element; if the developer supplies the wrong old state then sxe pool will assert. Because the sxe pool library manages the element states itself, an assert is very unlikely when using a single pool. However, more complicated algorithms often make use of chains of pools in order to implement multiplexing and/or combining of parallel results, etc. In these cases, it is common to keep references to pool array element indexes and/or pool array element states in the caller-supplied pool element structs. Finally, by implementing algorithms using the sxe pool API, it is possible to “enable high and scalable performance” using a minimum of simple-to-understand C source code. The developer is forced into thinking about the algorithm as a state model, which often simplifies the hardest problems. And the generic parts of the implementation complexity (e.g. locking, shared memory, keeping state, doubly linked lists, timeout handling, and memory allocation) are all handled by the sxe pool library and backed by automated tests with 100% library code coverage. The resulting performance is excellent, as can be seen from the figures published in earlier blog entries: tens of thousands of network requests per second per core.
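
And here is a hedged sketch of the chained-pool pattern just described: a “reply” pool element remembers the index of the “request” pool element it belongs to, so a parallel result can be routed back to its originator. The structs, states, and names below are invented for illustration and are not libsxe API; only sxe_pool_set_indexed_element_state() matches a prototype listed above, and the sxe pool header from the earlier sketch is assumed to be included.

typedef enum { REQUEST_FREE = 0, REQUEST_AWAITING_REPLY, REQUEST_DONE, REQUEST_STATE_COUNT } REQUEST_STATE;

typedef struct { unsigned customer_id;        } REQUEST_ELEMENT;   /* element struct of the request pool */
typedef struct { unsigned request_pool_index; } REPLY_ELEMENT;     /* carries an index, not a pointer     */

/* Called when a worker has filled in a reply element: look up the originating
 * request by index and move it forward; a wrong old state here would assert. */
static void
reply_ready(REQUEST_ELEMENT * request_pool, REPLY_ELEMENT * reply_pool, unsigned reply_id)
{
    unsigned request_id = reply_pool[reply_id].request_pool_index;

    sxe_pool_set_indexed_element_state(request_pool, request_id,
                                       REQUEST_AWAITING_REPLY, REQUEST_DONE);
}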

As you can see, the sxe pool is an incredibly powerful, code-saving, and code-simplifying generic data structure. It’s a sort of Swiss Army knife for parallel, event-driven algorithms. In a future article I’ll show some of the implementation patterns.

 

Thought Experiment: Scaling Facebook January 30, 2011

I was recently asked the question:

“Let’s imagine it’s circa 2004 and we’ve just launched a website that is about to evolve into Facebook. We’re “pre-architecture” – we have a single instance of MySQL running on a linux machine out of a dorm room at Harvard. The site starts to take off. Status updates are hitting the database at ferocious pace. Now what???!!! If we don’t do something soon, the system will crash. What kinds of caching layers do we build, and how will they interact with the database backend? Do we transition to a colo, or keep everything in house? Let’s see if we can put our heads together and think through Facebook’s many scaling challenges, all the way from the ground up …”

Here’s my reply:

I love this thought experiment 🙂 It’s one of the things which I dream about when I go to bed at night and which my mind first thinks about when I wake up in the morning. No kidding. I also had to architect and lead the development of a smaller cloud system for my employer which scales up to the tens of millions of Sophos users… it would scale further but we don’t have e.g. one billion users! But I dream about much more complex clouds which scale to a billion or more users… even though such systems don’t exist yet. I believe that to create such systems we have to go back to computer science fundamentals and be prepared to think outside the box if necessary. For example, I love the emerging NoSQL technology… but all of the systems are flawed in some way, and performance varies enormously, by orders of magnitude. Few off-the-shelf technologies appear to come close to BerkeleyDB performance, which is a disappointment.

Back to scaling Facebook: one of the fundamental problems is striving to use the smallest number of servers while serving the largest population of users. The cost of the servers and the server traffic is going to play an enormous role in the business plan. Therefore I need to think in terms of creating servers, or groups of servers, which act as building blocks that can be duplicated in order to scale the cloud as large as required. Because we’ll be scaling up to hundreds or thousands of servers, it becomes cost effective to develop efficient technology which doesn’t exist yet. For example, the sexy new node.js technology is wonderful for prototyping cloud services or running clouds which don’t have to be scaled too big. However, node.js uses nearly ten times more memory than, say, C does (https://simonhf.wordpress.com/2010/10/01/node-js-versus-sxe-hello-world-complexity-speed-and-memory-usage/) for the same task… and this may mean having to use ten times more servers.

So for a monster cloud service like Facebook I’d forget about all the scripting language solutions like node.js and Ruby on Rails, etc. Instead I’d go for the most efficient solution, which also happens to be the same challenge that my employer gave me: to achieve the most efficient run-time performance per server while allowing as many servers as necessary to work in parallel. This can be done efficiently by using the C language mixed with kernel asynchronous events. However, in order to make working in C fast and productive, some changes need to be made. The C code needs to work with a fixed memory architecture, something almost unheard of in the computing world. This is really thinking outside the box. Without the overhead of constantly allocating memory and/or garbage collecting, the C code becomes faster than normally imaginable. The next thing is to make the development of the C code much faster… nearing the speed of development of scripting languages. Some of the things which make C development slow are the constant editing of header files and makefiles, so I designed a system where this is largely done automatically. Next, C pointers cause confusion for programmers young and old, so I removed much of the need to use pointers in regular code. Another problem is how to develop protocols, keep state, and debug massively parallel and asynchronous code while keeping the code as concise and readable as possible. These problems have also been solved.
In short, Facebook helped to solve their scaling problem by developing HipHop technology, which allowed them to program in PHP and compile the PHP to ‘highly optimized C++’. According to the blurb, the compiled PHP runs 50% faster, so in theory this also reduces the number of servers necessary by 50%. My approach is from the other direction: make programming in C so comfortable that using a scripting language isn’t necessary. Also, use C instead of C++, because C++ (generally) relies on dynamic memory allocation, which is an unnecessary overhead at run-time. Languages which support dynamic memory allocation are great for general purpose programs, which are generally not designed to use all the memory on a box. In contrast, in our cloud we will have clusters of servers running daemons which already have all the memory allocated at run-time that they will ever use. So there is no need, for example, to have any ‘swap’. If a box has, say, 16GB RAM then a particular daemon might be using 15.5GB of that RAM all the time. This technique also has some useful side-effects: we don’t have to worry about garbage collection or ever debug memory leaks, because the process memory does not change at run-time. Also, DDoS attacks will not send the servers into some sort of unstable, memory-swapping nightmare. Instead, as soon as the DDoS passes, everything immediately returns to business as usual without instability problems etc. So being able to rapidly develop the Facebook business logic in C using a fixed memory environment is going to enable the use of fewer servers (because of faster code using less memory) and result in a commercial advantage.

But there is still a problem: where do we store all the data that all the Facebook users are generating? Any SQL solution is not going to scale cost effectively. The NoSQL offerings also have their own limitations. Amazon storage looks good, but can I do it cheaper myself? Again, I’d create my own technology: a hierarchical, distributed, redundant hash table. But unlike a big virtual hard disk, the HDRHT can store both very small and very large files without being wasteful, and we can make it bigger as fast as we can hook up new drives and servers. The files would be compressed, chunked, and stored via SHA1 (similar-ish to Venti although more efficient; http://doc.cat-v.org/plan_9/4th_edition/papers/venti/) in order to avoid duplication of data. I’d probably take one of these (http://www.engadget.com/2009/07/23/cambrionix-49-port-usb-hub-for-professionals-nerds/) per server and connect 49 2TB external drives to it, giving 49TB of redundant data per server (although in deployment the data would be spread redundantly across different servers in different geographic locations). There would be 20 such servers to provide one petabyte of redundant data, 200 for ten petabytes, etc. Large parts of the system can be and should be in a colo in order to keep the costs minimal. Other parts need to be in-house. The easy way to build robust & scalable network programs without compromising run-time performance or memory usage is called SXE (https://simonhf.wordpress.com/2010/10/09/what-is-sxe/) and is a work-in-progress recently open-sourced by my employer, Sophos. Much of what I’ve written about exists right now. The other stuff is right around the corner… 🙂
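
To make the “fixed memory architecture” idea above concrete, here is a minimal, hypothetical C sketch; the sizes, names, and struct are illustrative only and are not part of SXE or Facebook’s design. The point is that the daemon carves out everything it will ever need at start-up and never calls malloc() or free() on the hot path afterwards:

#include <stdio.h>
#include <stdlib.h>

#define MAX_CONNECTIONS 10000                    /* fixed at start-up, never grown later */

typedef struct {
    int  fd;
    char read_buffer[16 * 1024];                 /* per-connection buffer, preallocated */
} connection;

static connection * connections;                 /* the daemon's entire working set */

int main(void)
{
    /* one up-front allocation; after this the process size never changes */
    connections = calloc(MAX_CONNECTIONS, sizeof(*connections));
    if (connections == NULL) {
        fprintf(stderr, "cannot preallocate %zu bytes; refusing to start\n",
                (size_t)MAX_CONNECTIONS * sizeof(*connections));
        return 1;
    }

    printf("preallocated %zu MB for %d connections\n",
           (size_t)MAX_CONNECTIONS * sizeof(*connections) / (1024 * 1024), MAX_CONNECTIONS);

    /* ... the event loop would run here, indexing into connections[] and never
     *     calling malloc()/free() per request ... */

    free(connections);
    return 0;
}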

 

node.js versus Lua “Hello World” October 13, 2010

Neil Watkiss (known among other things for many cool Perl modules) has created a non-optimized, experimental version of SXE (pronounced ‘sexy’) with embedded Lua, called SXELua. So I thought it would be fun to redo the ‘Hello World’ benchmark, familiar to readers of this blog, using SXELua. And here is the Lua source code:

do
    local connect = function (sxe) end
    local read = function (sxe, content)
        if content:match("\r\n\r\n", -4) then sxe_write(sxe,"HTTP/1.0 200 OK\r\nConnection: Close\r\nContent-Type: text/html\r\nContent-Length: 14\r\n\r\nHello World\n\r\n")
        end
    end
    local close = function (sxe) end
    sxe_register(10001, function () sxe_listen(sxe_new_tcp("127.0.0.1", 8000, connect, read, close)) end)
end

Compare this with the slightly longer node.js equivalent from the last blog:

var net = require('net');
var server = net.createServer(function (stream) {
  stream.on('connect', function () {});
  stream.on('data', function (data) {
    var l = data.length;
    if (l >= 4 && data[l - 4] == 0xd && data [l - 3] == 0xa && data[l - 2] == 0xd && data[l - 1] == 0xa) {
      stream.write('HTTP/1.0 200 OK\r\nConnection: Keep-Alive\r\nContent-Type: text/html\r\nContent-Length: 13\r\n\r\nHello World\r\n');
    }
  });
  stream.on('end', function () {stream.end();});
});
server.listen(8000, 'localhost');

And now the updated results:

“Hello World”    Queries/ % Speed
Server           Second   of SXE
---------------- -------- -------
node.js+http     12,344    16%
node.js+net+crcr 23,224    30% <-- *1
node.js+net      28,867    37%
SXELua           66,731    85%
SXE              78,437   100%

In conclusion, calling Lua functions from C and vice versa is very fast; close to the speed of C itself. I am very excited by how well Lua performed in the benchmark. The Lua “Hello World” program performed 3.6 times better than the node.js equivalent. After a quick Google it looks like this isn’t the first time that JavaScript V8 has gone up against Lua, and those results suggest that SXELua could get even faster after optimization. It looks like Lua will become part of SXE soon. Lua seems ideal for creating tests for SXE & SXELua programs alike, and for prototyping programs. Stay tuned…!

*1 Update: Somebody who knows JavaScript better than I do offered faster code to detect the “\r\n\r\n”. I updated the script above and the resulting queries per second and % speed of SXE.

 

Nginx versus SXE “Hello World” October 2, 2010

After my last post, a colleague offered the criticism that comparing C to a scripting language is a bit like shooting fish in a barrel 🙂 I think the colleague missed the point: often the main reason for choosing a scripting language in the first place is to achieve rapid application development at the expense of run-time performance and memory usage, and the purpose of the post was to dispel the idea that this trade-off is unavoidable by showing how few lines of C source code are necessary to achieve ultimate performance. However, in order to keep the colleague happy, here is a similar head-to-head between nginx and SXE. What is nginx? Here’s what Wikipedia says about nginx: “Nginx quickly delivers static content with efficient use of system resources.” Now on with the “Hello World” comparison…

Here is the nginx.conf:

# cat /etc/nginx/nginx.conf
worker_processes  1;
events {
    worker_connections  10240;
}
http {
    server {
        listen 8000;
        access_log off;
        server_name  localhost;
        location / {
            root   html;
            index  index.html index.htm;
        }
    }
}

And here is the index.html file:

# cat /usr/html/index.html
Hello World

I use the same http.c from the previous post in order to load test nginx. Here are the results:

# ./http -i 127.0.0.1 -p 8000 -n 50 -c 10000
20101002 181142.250 P00006a5f ------ 1 - connecting via ramp 10000 sockets to peer 127.0.0.1:8000
20101002 181142.290 P00006a5f    999 1 - connected: 1000
20101002 181142.328 P00006a5f   1999 1 - connected: 2000
20101002 181142.367 P00006a5f   2999 1 - connected: 3000
20101002 181142.406 P00006a5f   3999 1 - connected: 4000
20101002 181142.445 P00006a5f   4999 1 - connected: 5000
20101002 181142.484 P00006a5f   5999 1 - connected: 6000
20101002 181142.523 P00006a5f   6999 1 - connected: 7000
20101002 181142.562 P00006a5f   7999 1 - connected: 8000
20101002 181142.602 P00006a5f   8999 1 - connected: 9000
20101002 181142.641 P00006a5f   9999 1 - connected: 10000
20101002 181142.641 P00006a5f ------ 1 - starting writes: 500000 (= 10000 sockets * 50 queries/socket) queries
20101002 181142.641 P00006a5f ------ 1 - using query of 199 bytes:
20101002 181142.641 P00006a5f ------ 1 - 080552a0 47 45 54 20 2f 31 32 33 34 35 36 37 38 39 2f 31 GET /123456789/1
20101002 181142.641 P00006a5f ------ 1 - 080552b0 32 33 34 35 36 37 38 39 2f 31 32 33 34 35 36 37 23456789/1234567
20101002 181142.641 P00006a5f ------ 1 - 080552c0 38 39 2f 31 32 33 34 35 36 37 38 39 2f 31 32 33 89/123456789/123
20101002 181142.641 P00006a5f ------ 1 - 080552d0 34 35 36 37 38 39 2f 31 32 33 34 35 36 37 38 39 456789/123456789
20101002 181142.641 P00006a5f ------ 1 - 080552e0 2f 31 32 33 34 35 36 37 38 39 2f 31 32 33 34 35 /123456789/12345
20101002 181142.641 P00006a5f ------ 1 - 080552f0 36 37 2e 68 74 6d 20 48 54 54 50 2f 31 2e 31 0d 67.htm HTTP/1.1.
20101002 181142.641 P00006a5f ------ 1 - 08055300 0a 43 6f 6e 6e 65 63 74 69 6f 6e 3a 20 4b 65 65 .Connection: Kee
20101002 181142.641 P00006a5f ------ 1 - 08055310 70 2d 41 6c 69 76 65 0d 0a 48 6f 73 74 3a 20 31 p-Alive..Host: 1
20101002 181142.641 P00006a5f ------ 1 - 08055320 32 37 2e 30 2e 30 2e 31 3a 38 30 30 30 0d 0a 55 27.0.0.1:8000..U
20101002 181142.641 P00006a5f ------ 1 - 08055330 73 65 72 2d 41 67 65 6e 74 3a 20 53 58 45 2d 68 ser-Agent: SXE-h
20101002 181142.641 P00006a5f ------ 1 - 08055340 74 74 70 2d 6c 6f 61 64 2d 6b 65 65 70 61 6c 69 ttp-load-keepali
20101002 181142.641 P00006a5f ------ 1 - 08055350 76 65 2f 31 2e 30 0d 0a 41 63 63 65 70 74 3a 20 ve/1.0..Accept:
20101002 181142.641 P00006a5f ------ 1 - 08055360 2a 2f 2a 0d 0a 0d 0a                            */*....
20101002 181202.794 P00006a5f   9128 1 - read all expected http responses
20101002 181202.794 P00006a5f   9128 1 - time for all connections: 0.391057 seconds or 25571.718778 per second
20101002 181202.794 P00006a5f   9128 1 - time for all queries    : 20.152358 seconds or 24810.992567 per second
20101002 181202.794 P00006a5f   9128 1 - time for all            : 20.543415 seconds or 24338.699486 per second

Where nginx manages 25,571 connections per second, the SXE implementation manages 25,009 connections per second; a performance tie. Further, where nginx manages 24,810 queries per second, the SXE implementation manages 59,171 queries per second; a 2.4-fold increase. This is an especially great result for SXE because there is still scope for optimizing its code further.

During the test I also monitored memory usage of both the client and server processes:

# top -b -d1 | egrep "(nginx|http)"
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    25   0 44712 5160  692 S    0  0.1   0:18.46 nginx
27231 root      15   0 18064  16m  516 R   79  0.4   0:00.79 http
27216 nobody    16   0 57468  17m  692 R   67  0.4   0:19.13 nginx
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    17   0 60064  20m  692 R   98  0.5   0:20.12 nginx
27231 root      15   0 18064  16m  516 S   58  0.4   0:01.37 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    19   0 60064  20m  692 R   96  0.5   0:21.09 nginx
27231 root      15   0 18064  16m  516 R   68  0.4   0:02.05 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    20   0 60064  20m  692 R   97  0.5   0:22.07 nginx
27231 root      15   0 18064  16m  516 R   64  0.4   0:02.69 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    24   0 60064  20m  692 R   97  0.5   0:23.05 nginx
27231 root      15   0 18064  16m  516 R   66  0.4   0:03.35 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    25   0 60064  20m  692 R   86  0.5   0:23.91 nginx
27231 root      15   0 18064  16m  516 R   42  0.4   0:03.77 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    25   0 60064  20m  692 R   99  0.5   0:24.90 nginx
27231 root      15   0 18064  16m  516 R   50  0.4   0:04.27 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    25   0 60064  20m  692 R   99  0.5   0:25.90 nginx
27231 root      15   0 18064  16m  516 R   50  0.4   0:04.77 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    25   0 60064  20m  692 R  100  0.5   0:26.91 nginx
27231 root      15   0 18064  16m  516 R   50  0.4   0:05.27 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    25   0 60064  20m  692 R   99  0.5   0:27.91 nginx
27231 root      15   0 18064  16m  516 R   54  0.4   0:05.81 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    25   0 60064  20m  692 R   99  0.5   0:28.91 nginx
27231 root      15   0 18064  16m  516 R   53  0.4   0:06.34 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    25   0 60064  20m  692 R   99  0.5   0:29.91 nginx
27231 root      15   0 18064  16m  516 R   50  0.4   0:06.84 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    25   0 60064  20m  692 R   99  0.5   0:30.90 nginx
27231 root      15   0 18064  16m  516 R   51  0.4   0:07.35 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    25   0 60064  20m  692 R  100  0.5   0:31.91 nginx
27231 root      15   0 18064  16m  516 R   50  0.4   0:07.85 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    25   0 60064  20m  692 D   54  0.5   0:32.45 nginx
27231 root      15   0 18064  16m  516 S   28  0.4   0:08.13 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    19   0 60064  20m  692 R   89  0.5   0:33.35 nginx
27231 root      15   0 18064  16m  516 S   61  0.4   0:08.74 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    21   0 60064  20m  692 R   97  0.5   0:34.33 nginx
27231 root      15   0 18064  16m  516 R   66  0.4   0:09.40 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    22   0 60064  20m  692 R   68  0.5   0:35.01 nginx
27231 root      15   0 18064  16m  516 R   34  0.4   0:09.74 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    25   0 60064  20m  692 R   98  0.5   0:36.00 nginx
27231 root      15   0 18064  16m  516 S   66  0.4   0:10.40 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    25   0 60064  20m  692 R   97  0.5   0:36.98 nginx
27231 root      15   0 18064  16m  516 R   52  0.4   0:10.92 http
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27216 nobody    25   0 44712 5160  692 S   63  0.1   0:37.61 nginx
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx
27215 root      25   0 40788  836  344 S    0  0.0   0:00.00 nginx

Unlike SXE, nginx uses a dynamic memory model, and top shows that its peak memory usage is only 20MB, which is very similar to SXE’s peak memory usage of 16MB; another tie.

In conclusion, if you’re planning to serve static content and CPU is your bottleneck, then using nginx could cause you to employ up to 2.4 times as many servers as if you had implemented with SXE. It would be interesting to create a real static content delivery system using SXE and post a more realistic head-to-head comparison. If anybody has ideas on what that more realistic head-to-head comparison might look like then please comment below.

 

node.js versus SXE “Hello World”; complexity, speed, and memory usage October 1, 2010

A new technology that has been given a lot of press lately is node.js, which describes itself as “an easy way to build scalable network programs”. Since I designed, and have for some time been working with some talented colleagues on, similar technology (read: SXE) that uses plain old C instead of JavaScript, I thought it might be good to do a head-to-head comparison in terms of quantity & complexity of source code, run-time performance, and memory usage. What is SXE? SXE is “an easy way to build scalable network programs” but without compromising run-time performance or memory usage. One of the goals of SXE is to make developing in C almost as easy as developing in a high-level scripting language, but how this is achieved is a subject for another time. Now on with the “Hello World” comparison…

Here is the source code for the node.js example “Hello World” HTTP server:


var http = require('http');
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(8000, "127.0.0.1");

And here is the equivalent source code implemented using C and SXE:

#include <errno.h>
#include <string.h>
#include "ev.h"
#include "sxe.h"
#include "sxe-log.h"
#include "sxe-util.h"
SXE * listener;
static char canned_reply_keep_alive[] = "HTTP/1.0 200 OK\r\n" "Connection: Keep-Alive\r\n" "Content-Type: text/html\r\nContent-Length: 14\r\n\r\nHello World\n\r\n";
static void event_read(SXE * this, int length) {
    SXE_UNUSED_ARGUMENT(length);
    if (! SXE_BUF_STRNSTR(this,"\r\n\r\n")) { goto SXE_EARLY_OUT; }
    sxe_write(this, (void *)&canned_reply_keep_alive[0], sizeof(canned_reply_keep_alive) - 1);
    SXE_BUF_CLEAR(this);
    SXE_EARLY_OR_ERROR_OUT:
}
int main(int argc, char *argv[]) {
    sxe_register(10100, 0);
    sxe_init();
    listener = sxe_new_tcp(NULL, "127.0.0.1", 8000, NULL, event_read, NULL);
    sxe_listen(listener);
    ev_loop(ev_default_loop(EVFLAG_AUTO), 0);
    return 0;
}

And here is the instrumented version of the code used in the test below. Unlike with JavaScript, an advantage of C is that it can offer a release build and a more heavily instrumented debug build of the same code, without any run-time performance penalty for the release build; a generic sketch of how that works follows the listing.

# cat httpd.c
/* Copyright (c) 2010 Simon Hardy-Francis.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 */

#include <errno.h>
#include <string.h>

#include "ev.h"
#include "sxe.h"
#include "sxe-log.h"
#include "sxe-util.h"

/**
 * - Example http session (using keep-alive):
 *   - http -i 127.0.0.1 -p 8000 -n 50 -c 10000
 * - Example ab sessions:
 *   - Without keep-alive: ab -n 50000 -c 500    <a href="http://localhost:9090/">http://localhost:9090/</a>
 *   - With    keep-alive: ab -n 50000 -c 500 -k <a href="http://localhost:9090/">http://localhost:9090/</a>
 */

SXE * listener;

static char canned_reply_no_keep_alive[] = "HTTP/1.0 200 OK\r\n" "Connection: Close\r\n"      "Content-Type: text/html\r\nContent-Length: 14\r\n\r\nHello World\n\r\n";
static char canned_reply____keep_alive[] = "HTTP/1.0 200 OK\r\n" "Connection: Keep-Alive\r\n" "Content-Type: text/html\r\nContent-Length: 14\r\n\r\nHello World\n\r\n";

static void
event_close(SXE * this)
{
    SXEE60I("httpd::event_close()");
    SXE_UNUSED_ARGUMENT(this);
    SXEL60I("Peer disconnected; do nothing");
    SXER60I("return");
} /* event_close() */

static void
event_read(SXE * this, int length)
{
    SXEE61I("httpd::event_read(length=%d)", length);
    SXE_UNUSED_ARGUMENT(length);

    if (! SXE_BUF_STRNSTR(this,"\r\n\r\n")) {
        SXEL10I("Read partial header; waiting for remainder to be appended");
        goto SXE_EARLY_OUT;
    }

    if (SXE_BUF_STRNCASESTR(this,"Connection: Keep-Alive")) {
        (void)sxe_write(this, (void *)&canned_reply____keep_alive[0], sizeof(canned_reply____keep_alive) - 1);
        SXEL60I("Connection: Keep-Alive: found");
    }
    else {
        (void)sxe_write(this, (void *)&canned_reply_no_keep_alive[0], sizeof(canned_reply_no_keep_alive) - 1);
        SXEL60I("Connection: Keep-Alive: not found; closing");
        sxe_close(this);
    }

    SXE_BUF_CLEAR(this);

    SXE_EARLY_OR_ERROR_OUT:

    SXER60I("return");
} /* event_read() */

int
main(int argc, char *argv[]) {
    SXE_RETURN result;

    SXE_UNUSED_ARGUMENT(argc);
    SXE_UNUSED_ARGUMENT(argv);
    SXEL60("httpd starting");

    sxe_register(10100, 0);
    SXEA10((result = sxe_init()) == SXE_RETURN_OK, "sxe_init failed");
    SXEA10((listener  = sxe_new_tcp(NULL, "127.0.0.1", 8000, NULL, event_read, event_close)) != NULL, "sxe_new_tcp failed");
    SXEA10((result = sxe_listen(listener)) == SXE_RETURN_OK, "sxe_listen failed");

    SXEL60("httpd calling ev_loop()");
    ev_loop(ev_default_loop(EVFLAG_AUTO), 0);

    SXEL60("httpd exiting");
    return 0;
} /* main() */
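
As an aside, the “no run-time penalty in the release build” property mentioned above is typically achieved with logging macros that compile away entirely; the following is a generic sketch of that technique, not libsxe’s actual SXEL/SXEE/SXER macro definitions:

#include <stdio.h>

/* Generic illustration: in a debug build the macro expands to a printf-style
 * call; in a release build it expands to nothing, so the compiled release
 * code pays no cost at all for the instrumentation. */
#ifdef MY_DEBUG
#define MY_LOG(...) printf(__VA_ARGS__)
#else
#define MY_LOG(...) ((void)0)
#endif

int main(void)
{
    MY_LOG("starting up\n");    /* disappears completely unless built with -DMY_DEBUG */
    return 0;
}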

When comparing quantity of source code, node.js wins. However, many readers may be surprised at how little C source code is necessary. And if I were to create an SXE deployment service similar to, e.g., Joyent or Heroku, then the C source code’s main() function and #include statements would disappear, leaving the event handler, which is barely more lines of code than the node.js counterpart.

Now let’s find out about performance and memory usage. I decided to create a simple HTTP load generator using C and SXE. On the command line I can specify which IP and port to connect to, how many simultaneous TCP sessions to connect, and how many queries to send over each connection. The HTTP load generator first creates all its connections, and then starts sending the queries. Here is the source code for http.c:

/* Copyright (c) 2010 Simon Hardy-Francis.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 */

#include <errno.h>
#include <string.h>
#include <getopt.h>
#include <stdlib.h>

#include "ev.h"
#include "sxe.h"
#include "sxe-log.h"
#include "sxe-util.h"

#define SXE_CONCURRENCY_MAX 10000
#define SXE_WRITE_RAMP      10
#define SXE_CONNECTION_RAMP 16

char     source_ip_address[] = "127.0.0.1";
char     peer_ip_default[] = "127.0.0.1";
char   * peer_ip = peer_ip_default;
int      peer_port = 9090;
SXE    * sender[SXE_CONCURRENCY_MAX];
int      sender_index = 0;
int      connection_index = 0;
int      per_connection_sender_writes[SXE_CONCURRENCY_MAX];
double   per_connection_time_at_connect[SXE_CONCURRENCY_MAX];
double   per_connection_time_at_connected[SXE_CONCURRENCY_MAX];
int      response_count = 0;
int      connect_count = 0;
int      connect_batch_count = 0;
int      sxe_concurrency = 1000;
int      sxe_writes_per_connection = 1;
double   time_at_start;
double   time_at_all_connected;
double   time_at_all_responses;

/*                                              0         10        20        30        40        50        60        70        80 */
static char canned_query____keep_alive[] = "GET /123456789/123456789/123456789/123456789/123456789/123456789/123456789/1234567.htm HTTP/1.1\r\nConnection: Keep-Alive\r\nHost: 127.0.0.1:8000\r\nUser-Agent: SXE-http-load-keepalive/1.0\r\nAccept: */*\r\n\r\n";

static void
event_close(SXE * this)
{
    SXEE60I("http::event_close()");
    SXE_UNUSED_ARGUMENT(this);
    SXEL60I("peer disconnected; do nothing");
    SXER60I("return");
} /* event_close() */

static void event_connect(SXE * this);
static void event_read(SXE * this, int length);

static void
connect_ramp(SXE * this)
{
    int i;

    SXE_UNUSED_ARGUMENT(this);

    SXEE60I("http::connect_ramp()");

    SXEL63I("connecting max %d http connects starting with connection %d of %d", SXE_CONNECTION_RAMP, 1 + connection_index, sxe_concurrency);
    for (i = 0; i < SXE_CONNECTION_RAMP; i++)
    {
        if (connection_index == sxe_concurrency) {
            goto SXE_EARLY_OUT;
        }
        per_connection_sender_writes[connection_index] = 0;
        sender[connection_index] = sxe_new_tcp(NULL, &source_ip_address[0], 0, event_connect, event_read, event_close);
        per_connection_time_at_connect[connection_index] = sxe_get_time_in_seconds();
        sxe_connect(sender[connection_index], peer_ip, peer_port);
        connection_index ++;
    }

    SXE_EARLY_OR_ERROR_OUT:

    SXER60I("return");
} /* connect_ramp() */

static void
write_ramp(SXE * this)
{
    int i;

    SXEE60I("http::write_ramp()");
    SXE_UNUSED_ARGUMENT(this);

    SXEL63I("writing max %d http queries starting with query %d of %d", SXE_WRITE_RAMP, 1 + sender_index, sxe_concurrency);
    for (i = 0; i < SXE_WRITE_RAMP; i++)
    {
        if (sender_index == sxe_concurrency) {
            goto SXE_EARLY_OUT;
        }
        per_connection_sender_writes[sender_index] ++;
        sxe_write(sender[sender_index], canned_query____keep_alive, sizeof(canned_query____keep_alive) - 1);
        sender_index ++;
    }

    SXE_EARLY_OR_ERROR_OUT:

    SXER60I("return");
} /* write_ramp() */

static void
event_connect(SXE * this)
{
    SXEE60I("http::event_connect()");

    per_connection_time_at_connected[SXE_ID(this)] = sxe_get_time_in_seconds();
    double per_connection_seconds_to_connect = per_connection_time_at_connected[SXE_ID(this)] - per_connection_time_at_connect[SXE_ID(this)];
    if (per_connection_seconds_to_connect > 1.0) {
        SXEL11I("finished connection to peer in %f seconds (suspiciously long time)", per_connection_seconds_to_connect);
    }
    else {
        SXEL61I("finished connection to peer in %f seconds", per_connection_seconds_to_connect);
    }
    connect_count ++;
    if ((connect_count % 1000) == 0) {
        SXEL11I("connected: %d", connect_count);
    }
    if (connect_count == sxe_concurrency) {
        time_at_all_connected = sxe_get_time_in_seconds();
        SXEL13("starting writes: %d (= %d sockets * %d queries/socket) queries", sxe_concurrency * sxe_writes_per_connection, sxe_concurrency, sxe_writes_per_connection);
        SXEL11("using query of %d bytes:", strlen(canned_query____keep_alive));
        SXED10(canned_query____keep_alive, strlen(canned_query____keep_alive));
        write_ramp (NULL);
    }

    connect_batch_count ++;
    if (connect_batch_count == SXE_CONNECTION_RAMP) {
        connect_batch_count = 0;
        connect_ramp (NULL);
    }

    SXE_EARLY_OR_ERROR_OUT:

    SXER60I("return");
} /* event_connect() */
static void
event_read(SXE * this, int length)
{
    SXEE61I("http::event_read(length=%d)", length);
    SXE_UNUSED_ARGUMENT(length);

    if (sender_index < sxe_concurrency) {
        write_ramp(this);
    }

    if (! SXE_BUF_STRNSTR(this,"\r\n\r\n")) {
        SXEL10I("read partial header; waiting for remainder to be appended");
        goto SXE_EARLY_OUT;
    }

    if (per_connection_sender_writes[SXE_ID(this)] < sxe_writes_per_connection) {
        per_connection_sender_writes[SXE_ID(this)] ++;
        sxe_write(this, canned_query____keep_alive, sizeof(canned_query____keep_alive) - 1);
    }

    response_count ++;

    if (response_count == (sxe_concurrency * sxe_writes_per_connection)) {
        SXEL10I("read all expected http responses");
        time_at_all_responses = sxe_get_time_in_seconds();
        double seconds_for_connections = (time_at_all_connected - time_at_start        );
        double seconds_for_responses   = (time_at_all_responses - time_at_all_connected);
        double seconds_for_all         = seconds_for_connections + seconds_for_responses;
        SXEL12I("time for all connections: %f seconds or %f per second", seconds_for_connections, (sxe_concurrency                            ) / seconds_for_connections);
        SXEL12I("time for all queries    : %f seconds or %f per second", seconds_for_responses  , (sxe_concurrency * sxe_writes_per_connection) / seconds_for_responses  );
        SXEL12I("time for all            : %f seconds or %f per second", seconds_for_all        , (sxe_concurrency * sxe_writes_per_connection) / seconds_for_all        );
        exit(0);
    }

    SXE_BUF_CLEAR(this);

    SXE_EARLY_OR_ERROR_OUT:

    SXER60I("return");
} /* event_read() */

static void
usage(void)
{
    fprintf(stderr, "Usage   : http [-i ip] [-p port] [-n queries per socket] [-c sockets]\n");
    fprintf(stderr, "Defaults: http -i %s -p %d -n %d -c %d\n", peer_ip, peer_port, sxe_writes_per_connection, sxe_concurrency);
    exit(2);
} /* usage() */

int
main(int argc, char *argv[])
{
    int c;
    (void) argc;
    (void) argv;

    SXEL60("http starting");

    if (argc == 1) {
        usage();
    }

    while ((c = getopt(argc, argv, "i:p:n:c:")) != -1) {
        switch (c) {
        case 'i': peer_ip = optarg;
                  break;
        case 'p': peer_port = atoi(optarg);
                  if (peer_port < 1    ) { SXEL10("ERROR: -p must be >= 1"   ); }
                  if (peer_port > 65535) { SXEL10("ERROR: -p must be < 65536"); }
                  break;
        case 'n': sxe_writes_per_connection = atoi(optarg);
                  if (sxe_writes_per_connection < 1) { SXEL10("ERROR: -n must be >= 1"); }
                  break;
        case 'c': sxe_concurrency = atoi(optarg);
                  if (sxe_concurrency < 1                  ) { SXEL10("ERROR: -c must be >= 1"                     ); }
                  if (sxe_concurrency > SXE_CONCURRENCY_MAX) { SXEL11("ERROR: -c must be < %d", SXE_CONCURRENCY_MAX); }
                  break;
        default:  usage();
        }
    }

    sxe_register(1 + SXE_CONCURRENCY_MAX, 0);

    sxe_init();

    SXEL13("connecting via ramp %d sockets to peer %s:%d", sxe_concurrency, peer_ip, peer_port);
    time_at_start = sxe_get_time_in_seconds();
    connect_ramp(NULL);

    SXEL60("http calling ev_loop()");
    ev_loop(ev_default_loop(EVFLAG_AUTO), 0);

    SXEL60("http exiting");
    return 0;
} /* main() */

So I compiled http.c and used it to test the node.js example “Hello World” server. Here are the results:

# ./http -i 127.0.0.1 -p 8000 -n 50 -c 10000
20100929 215440.709 P00002d94 ------ 1 - connecting via ramp 10000 sockets to peer 127.0.0.1:8000
20100929 215443.718 P00002d94    184 1 - finished connection to peer in 3.000193 seconds (suspiciously long time)
20100929 215443.718 P00002d94    185 1 - finished connection to peer in 3.000282 seconds (suspiciously long time)
20100929 215443.718 P00002d94    187 1 - finished connection to peer in 3.000287 seconds (suspiciously long time)
20100929 215443.718 P00002d94    188 1 - finished connection to peer in 3.000316 seconds (suspiciously long time)
20100929 215443.718 P00002d94    190 1 - finished connection to peer in 3.000316 seconds (suspiciously long time)
20100929 215446.725 P00002d94    446 1 - finished connection to peer in 2.999944 seconds (suspiciously long time)
20100929 215449.741 P00002d94    756 1 - finished connection to peer in 3.000473 seconds (suspiciously long time)
20100929 215449.741 P00002d94    758 1 - finished connection to peer in 3.000475 seconds (suspiciously long time)
20100929 215449.741 P00002d94    760 1 - finished connection to peer in 3.000446 seconds (suspiciously long time)
20100929 215449.741 P00002d94    761 1 - finished connection to peer in 3.000442 seconds (suspiciously long time)
20100929 215449.741 P00002d94    763 1 - finished connection to peer in 3.000420 seconds (suspiciously long time)
20100929 215449.741 P00002d94    765 1 - finished connection to peer in 3.000389 seconds (suspiciously long time)
20100929 215449.741 P00002d94    767 1 - finished connection to peer in 3.000359 seconds (suspiciously long time)
20100929 215449.748 P00002d94    999 1 - connected: 1000
20100929 215452.751 P00002d94   1132 1 - finished connection to peer in 2.999929 seconds (suspiciously long time)
20100929 215452.751 P00002d94   1134 1 - finished connection to peer in 2.999924 seconds (suspiciously long time)
20100929 215452.751 P00002d94   1135 1 - finished connection to peer in 2.999931 seconds (suspiciously long time)
20100929 215455.763 P00002d94   1396 1 - finished connection to peer in 2.999883 seconds (suspiciously long time)
20100929 215455.763 P00002d94   1397 1 - finished connection to peer in 2.999903 seconds (suspiciously long time)
20100929 215455.764 P00002d94   1398 1 - finished connection to peer in 2.999897 seconds (suspiciously long time)
20100929 215455.764 P00002d94   1399 1 - finished connection to peer in 2.999888 seconds (suspiciously long time)
20100929 215455.764 P00002d94   1400 1 - finished connection to peer in 2.999875 seconds (suspiciously long time)
20100929 215455.764 P00002d94   1401 1 - finished connection to peer in 2.999867 seconds (suspiciously long time)
20100929 215455.764 P00002d94   1402 1 - finished connection to peer in 2.999858 seconds (suspiciously long time)
20100929 215455.764 P00002d94   1403 1 - finished connection to peer in 2.999850 seconds (suspiciously long time)
20100929 215455.764 P00002d94   1404 1 - finished connection to peer in 2.999842 seconds (suspiciously long time)
20100929 215455.764 P00002d94   1405 1 - finished connection to peer in 2.999832 seconds (suspiciously long time)
20100929 215455.764 P00002d94   1406 1 - finished connection to peer in 2.999825 seconds (suspiciously long time)
20100929 215455.764 P00002d94   1407 1 - finished connection to peer in 2.999818 seconds (suspiciously long time)
20100929 215458.772 P00002d94   1686 1 - finished connection to peer in 3.000733 seconds (suspiciously long time)
20100929 215458.772 P00002d94   1688 1 - finished connection to peer in 3.000710 seconds (suspiciously long time)
20100929 215458.772 P00002d94   1689 1 - finished connection to peer in 3.000703 seconds (suspiciously long time)
20100929 215458.772 P00002d94   1691 1 - finished connection to peer in 3.000672 seconds (suspiciously long time)
20100929 215458.772 P00002d94   1693 1 - finished connection to peer in 3.000650 seconds (suspiciously long time)
20100929 215458.772 P00002d94   1694 1 - finished connection to peer in 3.000642 seconds (suspiciously long time)
20100929 215458.781 P00002d94   1999 1 - connected: 2000
20100929 215501.784 P00002d94   2091 1 - finished connection to peer in 3.000436 seconds (suspiciously long time)
20100929 215501.784 P00002d94   2093 1 - finished connection to peer in 3.000445 seconds (suspiciously long time)
20100929 215501.784 P00002d94   2094 1 - finished connection to peer in 3.000451 seconds (suspiciously long time)
20100929 215504.793 P00002d94   2397 1 - finished connection to peer in 3.000772 seconds (suspiciously long time)
20100929 215504.793 P00002d94   2399 1 - finished connection to peer in 3.000767 seconds (suspiciously long time)
20100929 215507.803 P00002d94   2598 1 - finished connection to peer in 2.999878 seconds (suspiciously long time)
20100929 215507.803 P00002d94   2599 1 - finished connection to peer in 2.999888 seconds (suspiciously long time)
20100929 215507.803 P00002d94   2600 1 - finished connection to peer in 2.999882 seconds (suspiciously long time)
20100929 215507.803 P00002d94   2601 1 - finished connection to peer in 2.999873 seconds (suspiciously long time)
20100929 215507.803 P00002d94   2602 1 - finished connection to peer in 2.999866 seconds (suspiciously long time)
20100929 215507.803 P00002d94   2603 1 - finished connection to peer in 2.999860 seconds (suspiciously long time)
20100929 215507.803 P00002d94   2604 1 - finished connection to peer in 2.999851 seconds (suspiciously long time)
20100929 215507.803 P00002d94   2605 1 - finished connection to peer in 2.999848 seconds (suspiciously long time)
20100929 215507.803 P00002d94   2606 1 - finished connection to peer in 2.999843 seconds (suspiciously long time)
20100929 215507.803 P00002d94   2607 1 - finished connection to peer in 2.999832 seconds (suspiciously long time)
20100929 215510.812 P00002d94   2952 1 - finished connection to peer in 2.999587 seconds (suspiciously long time)
20100929 215510.812 P00002d94   2954 1 - finished connection to peer in 2.999570 seconds (suspiciously long time)
20100929 215510.812 P00002d94   2956 1 - finished connection to peer in 2.999538 seconds (suspiciously long time)
20100929 215510.812 P00002d94   2958 1 - finished connection to peer in 2.999508 seconds (suspiciously long time)
20100929 215510.816 P00002d94   2999 1 - connected: 3000
20100929 215513.823 P00002d94   3304 1 - finished connection to peer in 2.999448 seconds (suspiciously long time)
20100929 215513.824 P00002d94   3306 1 - finished connection to peer in 3.000435 seconds (suspiciously long time)
20100929 215513.824 P00002d94   3308 1 - finished connection to peer in 3.000412 seconds (suspiciously long time)
20100929 215513.824 P00002d94   3309 1 - finished connection to peer in 3.000419 seconds (suspiciously long time)
20100929 215513.824 P00002d94   3311 1 - finished connection to peer in 3.000398 seconds (suspiciously long time)
20100929 215516.834 P00002d94   3606 1 - finished connection to peer in 2.999794 seconds (suspiciously long time)
20100929 215516.834 P00002d94   3607 1 - finished connection to peer in 2.999819 seconds (suspiciously long time)
20100929 215516.834 P00002d94   3608 1 - finished connection to peer in 2.999822 seconds (suspiciously long time)
20100929 215516.834 P00002d94   3609 1 - finished connection to peer in 2.999820 seconds (suspiciously long time)
20100929 215516.834 P00002d94   3610 1 - finished connection to peer in 2.999812 seconds (suspiciously long time)
20100929 215516.834 P00002d94   3611 1 - finished connection to peer in 2.999810 seconds (suspiciously long time)
20100929 215516.834 P00002d94   3612 1 - finished connection to peer in 2.999814 seconds (suspiciously long time)
20100929 215516.834 P00002d94   3613 1 - finished connection to peer in 2.999805 seconds (suspiciously long time)
20100929 215516.834 P00002d94   3614 1 - finished connection to peer in 2.999794 seconds (suspiciously long time)
20100929 215516.834 P00002d94   3615 1 - finished connection to peer in 2.999796 seconds (suspiciously long time)
20100929 215519.844 P00002d94   3978 1 - finished connection to peer in 2.999932 seconds (suspiciously long time)
20100929 215519.844 P00002d94   3979 1 - finished connection to peer in 2.999958 seconds (suspiciously long time)
20100929 215519.844 P00002d94   3982 1 - finished connection to peer in 2.999906 seconds (suspiciously long time)
20100929 215519.844 P00002d94   3983 1 - finished connection to peer in 2.999900 seconds (suspiciously long time)
20100929 215519.845 P00002d94   3999 1 - connected: 4000
20100929 215522.853 P00002d94   4287 1 - finished connection to peer in 2.999896 seconds (suspiciously long time)
20100929 215525.861 P00002d94   4590 1 - finished connection to peer in 2.999847 seconds (suspiciously long time)
20100929 215528.870 P00002d94   4901 1 - finished connection to peer in 3.000493 seconds (suspiciously long time)
20100929 215528.870 P00002d94   4902 1 - finished connection to peer in 3.000520 seconds (suspiciously long time)
20100929 215528.870 P00002d94   4904 1 - finished connection to peer in 3.000509 seconds (suspiciously long time)
20100929 215528.870 P00002d94   4906 1 - finished connection to peer in 3.000480 seconds (suspiciously long time)
20100929 215528.870 P00002d94   4907 1 - finished connection to peer in 3.000472 seconds (suspiciously long time)
20100929 215528.870 P00002d94   4909 1 - finished connection to peer in 3.000441 seconds (suspiciously long time)
20100929 215528.870 P00002d94   4911 1 - finished connection to peer in 3.000409 seconds (suspiciously long time)
20100929 215528.872 P00002d94   4999 1 - connected: 5000
20100929 215531.878 P00002d94   5211 1 - finished connection to peer in 3.000253 seconds (suspiciously long time)
20100929 215531.878 P00002d94   5212 1 - finished connection to peer in 3.000276 seconds (suspiciously long time)
20100929 215531.878 P00002d94   5213 1 - finished connection to peer in 3.000270 seconds (suspiciously long time)
20100929 215531.878 P00002d94   5214 1 - finished connection to peer in 3.000265 seconds (suspiciously long time)
20100929 215531.878 P00002d94   5215 1 - finished connection to peer in 3.000269 seconds (suspiciously long time)
20100929 215534.889 P00002d94   5582 1 - finished connection to peer in 3.000186 seconds (suspiciously long time)
20100929 215537.898 P00002d94   5887 1 - finished connection to peer in 3.000733 seconds (suspiciously long time)
20100929 215537.901 P00002d94   5999 1 - connected: 6000
20100929 215540.907 P00002d94   6200 1 - finished connection to peer in 3.000404 seconds (suspiciously long time)
20100929 215540.907 P00002d94   6202 1 - finished connection to peer in 3.000402 seconds (suspiciously long time)
20100929 215540.907 P00002d94   6204 1 - finished connection to peer in 3.000378 seconds (suspiciously long time)
20100929 215540.907 P00002d94   6206 1 - finished connection to peer in 3.000350 seconds (suspiciously long time)
20100929 215540.907 P00002d94   6207 1 - finished connection to peer in 3.000347 seconds (suspiciously long time)
20100929 215543.916 P00002d94   6505 1 - finished connection to peer in 3.000724 seconds (suspiciously long time)
20100929 215543.916 P00002d94   6507 1 - finished connection to peer in 3.000727 seconds (suspiciously long time)
20100929 215543.916 P00002d94   6509 1 - finished connection to peer in 3.000703 seconds (suspiciously long time)
20100929 215543.916 P00002d94   6511 1 - finished connection to peer in 3.000674 seconds (suspiciously long time)
20100929 215546.922 P00002d94   6762 1 - finished connection to peer in 2.999642 seconds (suspiciously long time)
20100929 215546.922 P00002d94   6763 1 - finished connection to peer in 2.999665 seconds (suspiciously long time)
20100929 215546.922 P00002d94   6764 1 - finished connection to peer in 2.999662 seconds (suspiciously long time)
20100929 215546.922 P00002d94   6765 1 - finished connection to peer in 2.999657 seconds (suspiciously long time)
20100929 215546.922 P00002d94   6766 1 - finished connection to peer in 2.999656 seconds (suspiciously long time)
20100929 215546.922 P00002d94   6767 1 - finished connection to peer in 2.999651 seconds (suspiciously long time)
20100929 215546.929 P00002d94   6999 1 - connected: 7000
20100929 215549.933 P00002d94   7135 1 - finished connection to peer in 3.000380 seconds (suspiciously long time)
20100929 215552.944 P00002d94   7524 1 - finished connection to peer in 3.000323 seconds (suspiciously long time)
20100929 215552.944 P00002d94   7526 1 - finished connection to peer in 3.000320 seconds (suspiciously long time)
20100929 215552.944 P00002d94   7528 1 - finished connection to peer in 3.000304 seconds (suspiciously long time)
20100929 215552.944 P00002d94   7530 1 - finished connection to peer in 3.000274 seconds (suspiciously long time)
20100929 215552.944 P00002d94   7532 1 - finished connection to peer in 3.000242 seconds (suspiciously long time)
20100929 215552.944 P00002d94   7533 1 - finished connection to peer in 3.000236 seconds (suspiciously long time)
20100929 215552.944 P00002d94   7535 1 - finished connection to peer in 3.000204 seconds (suspiciously long time)
20100929 215555.955 P00002d94   7931 1 - finished connection to peer in 2.999945 seconds (suspiciously long time)
20100929 215555.955 P00002d94   7933 1 - finished connection to peer in 2.999942 seconds (suspiciously long time)
20100929 215555.955 P00002d94   7935 1 - finished connection to peer in 2.999919 seconds (suspiciously long time)
20100929 215555.957 P00002d94   7999 1 - connected: 8000
20100929 215558.966 P00002d94   8328 1 - finished connection to peer in 3.000006 seconds (suspiciously long time)
20100929 215558.966 P00002d94   8330 1 - finished connection to peer in 3.000006 seconds (suspiciously long time)
20100929 215558.966 P00002d94   8331 1 - finished connection to peer in 3.000007 seconds (suspiciously long time)
20100929 215558.966 P00002d94   8333 1 - finished connection to peer in 2.999987 seconds (suspiciously long time)
20100929 215558.966 P00002d94   8335 1 - finished connection to peer in 2.999958 seconds (suspiciously long time)
20100929 215601.978 P00002d94   8724 1 - finished connection to peer in 3.000369 seconds (suspiciously long time)
20100929 215601.978 P00002d94   8726 1 - finished connection to peer in 3.000387 seconds (suspiciously long time)
20100929 215601.978 P00002d94   8728 1 - finished connection to peer in 3.000369 seconds (suspiciously long time)
20100929 215601.978 P00002d94   8730 1 - finished connection to peer in 3.000336 seconds (suspiciously long time)
20100929 215601.978 P00002d94   8732 1 - finished connection to peer in 3.000306 seconds (suspiciously long time)
20100929 215601.978 P00002d94   8734 1 - finished connection to peer in 3.000278 seconds (suspiciously long time)
20100929 215601.985 P00002d94   8999 1 - connected: 9000
20100929 215604.986 P00002d94   9006 1 - finished connection to peer in 3.000625 seconds (suspiciously long time)
20100929 215604.986 P00002d94   9007 1 - finished connection to peer in 3.000642 seconds (suspiciously long time)
20100929 215607.997 P00002d94   9390 1 - finished connection to peer in 3.000640 seconds (suspiciously long time)
20100929 215611.003 P00002d94   9579 1 - finished connection to peer in 3.000221 seconds (suspiciously long time)
20100929 215611.003 P00002d94   9580 1 - finished connection to peer in 3.000249 seconds (suspiciously long time)
20100929 215611.003 P00002d94   9581 1 - finished connection to peer in 3.000256 seconds (suspiciously long time)
20100929 215611.003 P00002d94   9582 1 - finished connection to peer in 3.000249 seconds (suspiciously long time)
20100929 215611.003 P00002d94   9583 1 - finished connection to peer in 3.000238 seconds (suspiciously long time)
20100929 215614.014 P00002d94   9966 1 - finished connection to peer in 3.000609 seconds (suspiciously long time)
20100929 215614.015 P00002d94   9999 1 - connected: 10000
20100929 215614.015 P00002d94 ------ 1 - starting writes: 500000 (= 10000 sockets * 50 queries/socket) queries
20100929 215614.015 P00002d94 ------ 1 - using query of 198 bytes:
20100929 215614.015 P00002d94 ------ 1 - 080562c0 47 45 54 20 2f 31 32 33 34 35 36 37 38 39 2f 31 GET /123456789/1
20100929 215614.015 P00002d94 ------ 1 - 080562d0 32 33 34 35 36 37 38 39 2f 31 32 33 34 35 36 37 23456789/1234567
20100929 215614.015 P00002d94 ------ 1 - 080562e0 38 39 2f 31 32 33 34 35 36 37 38 39 2f 31 32 33 89/123456789/123
20100929 215614.015 P00002d94 ------ 1 - 080562f0 34 35 36 37 38 39 2f 31 32 33 34 35 36 37 38 39 456789/123456789
20100929 215614.015 P00002d94 ------ 1 - 08056300 2f 31 32 33 34 35 36 37 38 39 2f 31 32 33 34 35 /123456789/12345
20100929 215614.015 P00002d94 ------ 1 - 08056310 36 37 2e 68 74 6d 20 48 54 54 50 2f 31 2e 31 0d 67.htm HTTP/1.1.
20100929 215614.015 P00002d94 ------ 1 - 08056320 0a 43 6f 6e 6e 65 63 74 69 6f 6e 3a 20 4b 65 65 .Connection: Kee
20100929 215614.015 P00002d94 ------ 1 - 08056330 70 2d 41 6c 69 76 65 0d 0a 48 6f 73 74 3a 20 31 p-Alive..Host: 1
20100929 215614.015 P00002d94 ------ 1 - 08056340 32 37 2e 30 2e 30 2e 31 3a 38 30 30 30 0d 0a 55 27.0.0.1:8000..U
20100929 215614.015 P00002d94 ------ 1 - 08056350 73 65 72 2d 41 67 65 6e 74 3a 20 53 58 45 2d 68 ser-Agent: SXE-h
20100929 215614.015 P00002d94 ------ 1 - 08056360 74 74 70 2d 6c 6f 61 64 2d 6b 65 65 70 61 6c 69 ttp-load-keepali
20100929 215614.015 P00002d94 ------ 1 - 08056370 76 65 2f 31 2e 30 0d 0a 41 63 63 65 70 74 3a 20 ve/1.0..Accept:
20100929 215614.015 P00002d94 ------ 1 - 08056380 2a 2f 2a 0d 0a 0d                               */*...
20100929 215654.519 P00002d94   4010 1 - read all expected http responses
20100929 215654.519 P00002d94   4010 1 - time for all connections: 93.305586 seconds or 107.174719 per second
20100929 215654.519 P00002d94   4010 1 - time for all queries    : 40.504647 seconds or 12344.262617 per second
20100929 215654.519 P00002d94   4010 1 - time for all            : 133.810233 seconds or 3736.634997 per second

On the positive side, the node.js example “Hello World” server managed a respectable 12,344 queries per second at a concurrency of 10,000 connections. On the negative side, there seems to be some kind of bug in node.js’s connection handling, because node managed only 107 connections per second. Also, 134 of the 10,000 connections took about 3 seconds to connect.

During the test I also monitored memory usage of both the client and server processes:

# top -b -d1 | egrep "(node|http)"
11665 root      18   0  628m 9020 5100 S    0  0.2   0:00.05 node
11665 root      18   0  628m 9020 5100 S    0  0.2   0:00.05 node
11665 root      18   0  628m 9020 5100 S    0  0.2   0:00.05 node
11665 root      18   0  629m  10m 5108 S    1  0.3   0:00.06 node
11668 root      17   0 18032  16m  524 S    1  0.4   0:00.01 http
11665 root      18   0  629m  10m 5108 S    0  0.3   0:00.06 node
11668 root      17   0 18032  16m  524 S    0  0.4   0:00.01 http
11665 root      18   0  629m  10m 5108 S    0  0.3   0:00.06 node
11668 root      17   0 18032  16m  524 S    0  0.4   0:00.01 http
11665 root      15   0  629m  12m 5108 S    2  0.3   0:00.08 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.02 http
11665 root      15   0  629m  12m 5108 S    0  0.3   0:00.08 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.02 http
11665 root      15   0  629m  12m 5108 S    0  0.3   0:00.08 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.02 http
11665 root      15   0  633m  17m 5108 S    2  0.4   0:00.10 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.03 http
11665 root      15   0  633m  17m 5108 S    0  0.4   0:00.10 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.03 http
11665 root      15   0  633m  17m 5108 S    0  0.4   0:00.10 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.03 http
11665 root      15   0  633m  18m 5108 S    1  0.5   0:00.11 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.04 http
11665 root      15   0  633m  18m 5108 S    0  0.5   0:00.11 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.04 http
11665 root      15   0  633m  18m 5108 S    0  0.5   0:00.11 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.04 http
11665 root      16   0  635m  21m 5108 S    3  0.5   0:00.14 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.04 http
11665 root      16   0  635m  21m 5108 S    0  0.5   0:00.14 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.04 http
11665 root      16   0  635m  21m 5108 S    0  0.5   0:00.14 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.04 http
11665 root      15   0  636m  24m 5108 S    2  0.6   0:00.16 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.05 http
11665 root      15   0  636m  24m 5108 S    0  0.6   0:00.16 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.05 http
11665 root      15   0  636m  24m 5108 S    0  0.6   0:00.16 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.05 http
11665 root      15   0  636m  25m 5108 S    2  0.6   0:00.18 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.06 http
11665 root      15   0  636m  25m 5108 S    0  0.6   0:00.18 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.06 http
11665 root      15   0  636m  25m 5108 S    0  0.6   0:00.18 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.06 http
11665 root      15   0  636m  27m 5108 S    1  0.7   0:00.19 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.07 http
11665 root      15   0  636m  27m 5108 S    0  0.7   0:00.19 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.07 http
11665 root      15   0  636m  27m 5108 S    0  0.7   0:00.19 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.07 http
11665 root      15   0  643m  35m 5108 S    3  0.9   0:00.22 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.08 http
11665 root      15   0  643m  35m 5108 S    0  0.9   0:00.22 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.08 http
11665 root      15   0  643m  35m 5108 S    0  0.9   0:00.22 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.08 http
11665 root      15   0  643m  35m 5108 S    3  0.9   0:00.25 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.09 http
11665 root      15   0  643m  35m 5108 S    0  0.9   0:00.25 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.09 http
11665 root      15   0  643m  35m 5108 S    0  0.9   0:00.25 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.09 http
11665 root      16   0  643m  38m 5108 S    1  1.0   0:00.26 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.10 http
11665 root      16   0  643m  38m 5108 S    0  1.0   0:00.26 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.10 http
11665 root      16   0  643m  38m 5108 S    0  1.0   0:00.26 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.10 http
11665 root      15   0  644m  40m 5108 S    2  1.0   0:00.28 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.10 http
11665 root      15   0  644m  40m 5108 S    0  1.0   0:00.28 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.10 http
11665 root      15   0  644m  40m 5108 S    0  1.0   0:00.28 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.10 http
11665 root      16   0  644m  40m 5108 S    2  1.0   0:00.30 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.11 http
11665 root      15   0  644m  40m 5108 S    0  1.0   0:00.30 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.11 http
11665 root      15   0  644m  40m 5108 S    0  1.0   0:00.30 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.11 http
11665 root      15   0  644m  43m 5108 S    2  1.1   0:00.32 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.12 http
11665 root      15   0  644m  43m 5108 S    0  1.1   0:00.32 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.12 http
11665 root      15   0  644m  43m 5108 S    0  1.1   0:00.32 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.12 http
11665 root      15   0  644m  45m 5108 S    1  1.1   0:00.33 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.13 http
11665 root      15   0  644m  45m 5108 S    0  1.1   0:00.33 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.13 http
11665 root      15   0  644m  45m 5108 S    0  1.1   0:00.33 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.13 http
11665 root      15   0  644m  47m 5108 S    2  1.2   0:00.35 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.14 http
11665 root      15   0  644m  47m 5108 S    0  1.2   0:00.35 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.14 http
11665 root      15   0  644m  47m 5108 S    0  1.2   0:00.35 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.14 http
11665 root      15   0  652m  56m 5108 S    4  1.4   0:00.39 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.15 http
11665 root      15   0  652m  56m 5108 S    0  1.4   0:00.39 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.15 http
11665 root      15   0  652m  56m 5108 S    0  1.4   0:00.39 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.15 http
11665 root      15   0  652m  57m 5108 S    1  1.4   0:00.40 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.15 http
11665 root      15   0  652m  57m 5108 S    0  1.4   0:00.40 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.15 http
11665 root      15   0  652m  57m 5108 S    0  1.4   0:00.40 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.15 http
11665 root      15   0  652m  59m 5108 S    2  1.5   0:00.42 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.16 http
11665 root      15   0  652m  59m 5108 S    0  1.5   0:00.42 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.16 http
11665 root      15   0  652m  59m 5108 S    0  1.5   0:00.42 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.16 http
11665 root      15   0  652m  61m 5108 S    1  1.6   0:00.43 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.17 http
11665 root      15   0  652m  61m 5108 S    0  1.6   0:00.43 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.17 http
11665 root      15   0  652m  61m 5108 S    0  1.6   0:00.43 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.17 http
11665 root      15   0  652m  64m 5108 S    2  1.6   0:00.45 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.18 http
11665 root      15   0  652m  64m 5108 S    0  1.6   0:00.45 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.18 http
11665 root      15   0  652m  64m 5108 S    0  1.6   0:00.45 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.18 http
11665 root      16   0  653m  64m 5108 S    7  1.6   0:00.52 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.18 http
11665 root      15   0  653m  64m 5108 S    0  1.6   0:00.52 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.18 http
11665 root      15   0  653m  64m 5108 S    0  1.6   0:00.52 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.18 http
11665 root      16   0  665m  76m 5108 S    5  1.9   0:00.57 node
11668 root      18   0 18032  16m  552 S    1  0.4   0:00.19 http
11665 root      15   0  665m  76m 5108 S    0  1.9   0:00.57 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.19 http
11665 root      15   0  665m  76m 5108 S    0  1.9   0:00.57 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.19 http
11665 root      15   0  665m  76m 5108 S    2  1.9   0:00.59 node
11668 root      18   0 18032  16m  552 S    2  0.4   0:00.21 http
11665 root      15   0  665m  76m 5108 S    0  1.9   0:00.59 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.21 http
11665 root      15   0  665m  76m 5108 S    0  1.9   0:00.59 node
11668 root      18   0 18032  16m  552 S    0  0.4   0:00.21 http
11665 root      16   0  665m  77m 5108 S    1  1.9   0:00.60 node
11668 root      18   0 18220  16m  552 S    1  0.4   0:00.22 http
11665 root      15   0  665m  77m 5108 S    0  1.9   0:00.60 node
11668 root      18   0 18220  16m  552 S    0  0.4   0:00.22 http
11665 root      15   0  665m  77m 5108 S    0  1.9   0:00.60 node
11668 root      18   0 18220  16m  552 S    0  0.4   0:00.22 http
11665 root      15   0  666m  77m 5108 S    3  2.0   0:00.63 node
11668 root      18   0 18220  16m  552 S    1  0.4   0:00.23 http
11665 root      15   0  666m  77m 5108 S    0  2.0   0:00.63 node
11668 root      18   0 18220  16m  552 S    0  0.4   0:00.23 http
11665 root      15   0  666m  77m 5108 S    0  2.0   0:00.63 node
11668 root      18   0 18220  16m  552 S    0  0.4   0:00.23 http
11665 root      15   0  666m  77m 5108 S    1  2.0   0:00.64 node
11668 root      18   0 18220  16m  552 S    1  0.4   0:00.24 http
11665 root      15   0  666m  77m 5108 S    0  2.0   0:00.64 node
11668 root      18   0 18220  16m  552 S    0  0.4   0:00.24 http
11665 root      15   0  666m  77m 5108 S    0  2.0   0:00.64 node
11668 root      18   0 18220  16m  552 S    0  0.4   0:00.24 http
11665 root      16   0  666m  77m 5108 S    8  2.0   0:00.72 node
11668 root      18   0 18220  16m  552 S    0  0.4   0:00.24 http
11665 root      15   0  666m  77m 5108 S    0  2.0   0:00.72 node
11668 root      18   0 18220  16m  552 S    0  0.4   0:00.24 http
11665 root      15   0  666m  77m 5108 S    0  2.0   0:00.72 node
11668 root      18   0 18220  16m  552 S    0  0.4   0:00.24 http
11665 root      15   0  666m  77m 5108 S    2  2.0   0:00.74 node
11668 root      18   0 18220  16m  552 S    1  0.4   0:00.25 http
11665 root      15   0  666m  77m 5108 S    0  2.0   0:00.74 node
11668 root      18   0 18220  16m  552 S    0  0.4   0:00.25 http
11665 root      15   0  666m  77m 5108 S    0  2.0   0:00.74 node
11668 root      18   0 18220  16m  552 S    0  0.4   0:00.25 http
11665 root      15   0  678m  89m 5108 S    5  2.3   0:00.79 node
11668 root      18   0 18220  16m  552 S    2  0.4   0:00.27 http
11665 root      15   0  678m  89m 5108 S    0  2.3   0:00.79 node
11668 root      18   0 18220  16m  552 S    0  0.4   0:00.27 http
11665 root      15   0  678m  89m 5108 S    0  2.3   0:00.79 node
11668 root      18   0 18220  16m  552 S    0  0.4   0:00.27 http
11665 root      15   0  678m  89m 5108 S    1  2.3   0:00.80 node
11668 root      18   0 18220  16m  552 S    0  0.4   0:00.27 http
11665 root      15   0  678m  89m 5108 S    0  2.3   0:00.80 node
11668 root      18   0 18220  16m  552 S    0  0.4   0:00.27 http
11665 root      15   0  678m  89m 5108 S    0  2.3   0:00.80 node
11668 root      18   0 18220  16m  552 S    0  0.4   0:00.27 http
11665 root      16   0  695m 105m 5120 R   66  2.7   0:01.46 node
11668 root      15   0 18220  16m  552 S   21  0.4   0:00.48 http
11665 root      17   0  711m 121m 5120 R  100  3.1   0:02.47 node
11668 root      15   0 18220  16m  552 S   24  0.4   0:00.72 http
11665 root      19   0  728m 138m 5120 R  100  3.5   0:03.48 node
11668 root      15   0 18220  16m  552 S   26  0.4   0:00.98 http
11665 root      21   0  736m 146m 5120 R   99  3.7   0:04.48 node
11668 root      15   0 18220  16m  552 S   13  0.4   0:01.11 http
11665 root      25   0  736m 146m 5120 R  100  3.7   0:05.49 node
11668 root      15   0 18220  16m  552 S   13  0.4   0:01.24 http
11665 root      25   0  737m 147m 5120 R   99  3.7   0:06.49 node
11668 root      15   0 18220  16m  552 S   11  0.4   0:01.35 http
11665 root      25   0  737m 147m 5120 R   99  3.7   0:07.49 node
11668 root      15   0 18220  16m  552 R   12  0.4   0:01.47 http
11665 root      25   0  717m 127m 5124 R   99  3.2   0:08.48 node
11668 root      15   0 18220  16m  552 S   13  0.4   0:01.60 http
11665 root      25   0  734m 143m 5124 R   99  3.6   0:09.48 node
11668 root      15   0 18220  16m  552 S   12  0.4   0:01.72 http
11665 root      25   0  737m 146m 5124 R  100  3.7   0:10.49 node
11668 root      15   0 18220  16m  552 S    8  0.4   0:01.80 http
11665 root      25   0  737m 146m 5124 R   99  3.7   0:11.49 node
11668 root      15   0 18220  16m  552 S   13  0.4   0:01.93 http
11665 root      25   0  737m 146m 5124 R   99  3.7   0:12.49 node
11668 root      15   0 18220  16m  552 R   11  0.4   0:02.04 http
11665 root      25   0  737m 146m 5124 R  101  3.7   0:13.50 node
11668 root      15   0 18220  16m  552 R   10  0.4   0:02.14 http
11665 root      25   0  737m 146m 5124 R   96  3.7   0:14.46 node
11668 root      15   0 18220  16m  552 R   16  0.4   0:02.30 http
11665 root      25   0  737m 146m 5124 R   84  3.7   0:15.30 node
11668 root      15   0 18220  16m  552 S   17  0.4   0:02.47 http
11665 root      25   0  737m 146m 5124 R   82  3.7   0:16.12 node
11668 root      15   0 18220  16m  552 S   18  0.4   0:02.65 http
11665 root      25   0  738m 147m 5124 R   94  3.7   0:17.06 node
11668 root      15   0 18220  16m  552 R   18  0.4   0:02.83 http
11665 root      25   0  738m 147m 5124 R   98  3.7   0:18.05 node
11668 root      15   0 18220  16m  552 S   21  0.4   0:03.04 http
11665 root      25   0  738m 148m 5124 R   97  3.7   0:19.03 node
11668 root      15   0 18220  16m  552 S   17  0.4   0:03.21 http
11665 root      25   0  738m 148m 5124 R   97  3.7   0:20.00 node
11668 root      15   0 18220  16m  552 S   20  0.4   0:03.41 http
11665 root      25   0  738m 147m 5124 R  100  3.7   0:21.01 node
11668 root      15   0 18220  16m  552 R   18  0.4   0:03.59 http
11665 root      25   0  738m 148m 5124 R  100  3.7   0:22.02 node
11668 root      15   0 18220  16m  552 S   29  0.4   0:03.88 http
11665 root      25   0  738m 148m 5124 R   99  3.7   0:23.02 node
11668 root      15   0 18220  16m  552 S   26  0.4   0:04.14 http
11665 root      25   0  738m 148m 5124 R  100  3.7   0:24.03 node
11668 root      15   0 18220  16m  552 R   32  0.4   0:04.46 http
11665 root      25   0  738m 148m 5124 R  100  3.7   0:25.04 node
11668 root      15   0 18220  16m  552 R   27  0.4   0:04.73 http
11665 root      25   0  738m 148m 5124 R   99  3.7   0:26.04 node
11668 root      15   0 18220  16m  552 S   28  0.4   0:05.01 http
11665 root      25   0  737m 147m 5124 R  100  3.7   0:27.05 node
11668 root      15   0 18220  16m  552 S   28  0.4   0:05.29 http
11665 root      25   0  737m 147m 5124 R  100  3.7   0:28.06 node
11668 root      15   0 18220  16m  552 S   30  0.4   0:05.59 http
11665 root      25   0  738m 148m 5124 R   99  3.7   0:29.06 node
11668 root      15   0 18220  16m  552 S   28  0.4   0:05.87 http
11665 root      25   0  719m 129m 5124 R  100  3.3   0:30.07 node
11668 root      15   0 18220  16m  552 R   21  0.4   0:06.08 http
11665 root      25   0  735m 145m 5124 R   99  3.7   0:31.07 node
11668 root      15   0 18220  16m  552 S   30  0.4   0:06.38 http
11665 root      25   0  735m 145m 5124 R   99  3.7   0:32.07 node
11668 root      15   0 18220  16m  552 S   26  0.4   0:06.64 http
11665 root      25   0  735m 145m 5124 R  100  3.7   0:33.08 node
11668 root      15   0 18220  16m  552 S   32  0.4   0:06.96 http
11665 root      25   0  737m 147m 5124 R  100  3.7   0:34.09 node
11668 root      15   0 18220  16m  552 R   27  0.4   0:07.23 http
11665 root      25   0  737m 147m 5124 R   99  3.7   0:35.09 node
11668 root      15   0 18220  16m  552 R   31  0.4   0:07.54 http
11665 root      25   0  737m 147m 5124 R  100  3.7   0:36.10 node
11668 root      15   0 18220  16m  552 S   28  0.4   0:07.82 http
11665 root      25   0  737m 147m 5124 R   96  3.7   0:37.07 node
11668 root      15   0 18220  16m  552 R   21  0.4   0:08.03 http
11665 root      25   0  737m 147m 5124 R   99  3.7   0:38.07 node
11668 root      15   0 18220  16m  552 S   25  0.4   0:08.28 http
11665 root      25   0  737m 147m 5124 R  100  3.7   0:39.08 node
11668 root      15   0 18220  16m  552 S   27  0.4   0:08.55 http
11665 root      25   0  738m 147m 5124 R   99  3.7   0:40.08 node
11668 root      15   0 18220  16m  552 R   24  0.4   0:08.79 http
11665 root      25   0  735m 117m 5124 R   99  3.0   0:41.08 node
11665 root      25   0  678m  57m 5124 S   29  1.5   0:41.37 node
11665 root      25   0  678m  57m 5124 S    0  1.5   0:41.37 node
11665 root      25   0  678m  57m 5124 S    0  1.5   0:41.37 node

During the connection part of the load test there was little CPU used (the possible node.js bug?) but the memory of the node process steadily increased from about 9MB to about 89MB. This means that node allocates about 8KB of dynamic memory per connection. During the query part of the load test the node process gets close to 100% CPU, which is good. Also, the memory of the node process peaks at about 148MB. This means that, in addition to the 8KB of dynamic memory already allocated per connection, node allocates roughly a further 6KB per connection upon receiving the query; so about 14KB per connection in total.
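The rough per-connection arithmetic behind those figures:

    ( 89MB -  9MB) / 10,000 connections ≈  8KB per connection (connect phase)
    (148MB - 89MB) / 10,000 connections ≈  6KB per connection (query phase)
                                  total ≈ 14KB per connection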

Now I test the speed of the SXE example “Hello World” server:

# ./http -i 127.0.0.1 -p 8000 -n 50 -c 10000
20100929 220002.920 P00002dce ------ 1 - connecting via ramp 10000 sockets to peer 127.0.0.1:8000
20100929 220002.964 P00002dce    999 1 - connected: 1000
20100929 220003.003 P00002dce   1999 1 - connected: 2000
20100929 220003.043 P00002dce   2999 1 - connected: 3000
20100929 220003.082 P00002dce   3999 1 - connected: 4000
20100929 220003.122 P00002dce   4999 1 - connected: 5000
20100929 220003.161 P00002dce   5999 1 - connected: 6000
20100929 220003.201 P00002dce   6999 1 - connected: 7000
20100929 220003.240 P00002dce   7999 1 - connected: 8000
20100929 220003.281 P00002dce   8999 1 - connected: 9000
20100929 220003.320 P00002dce   9999 1 - connected: 10000
20100929 220003.320 P00002dce ------ 1 - starting writes: 500000 (= 10000 sockets * 50 queries/socket) queries
20100929 220003.320 P00002dce ------ 1 - using query of 198 bytes:
20100929 220003.320 P00002dce ------ 1 - 080562c0 47 45 54 20 2f 31 32 33 34 35 36 37 38 39 2f 31 GET /123456789/1
20100929 220003.320 P00002dce ------ 1 - 080562d0 32 33 34 35 36 37 38 39 2f 31 32 33 34 35 36 37 23456789/1234567
20100929 220003.320 P00002dce ------ 1 - 080562e0 38 39 2f 31 32 33 34 35 36 37 38 39 2f 31 32 33 89/123456789/123
20100929 220003.320 P00002dce ------ 1 - 080562f0 34 35 36 37 38 39 2f 31 32 33 34 35 36 37 38 39 456789/123456789
20100929 220003.320 P00002dce ------ 1 - 08056300 2f 31 32 33 34 35 36 37 38 39 2f 31 32 33 34 35 /123456789/12345
20100929 220003.320 P00002dce ------ 1 - 08056310 36 37 2e 68 74 6d 20 48 54 54 50 2f 31 2e 31 0d 67.htm HTTP/1.1.
20100929 220003.320 P00002dce ------ 1 - 08056320 0a 43 6f 6e 6e 65 63 74 69 6f 6e 3a 20 4b 65 65 .Connection: Kee
20100929 220003.320 P00002dce ------ 1 - 08056330 70 2d 41 6c 69 76 65 0d 0a 48 6f 73 74 3a 20 31 p-Alive..Host: 1
20100929 220003.320 P00002dce ------ 1 - 08056340 32 37 2e 30 2e 30 2e 31 3a 38 30 30 30 0d 0a 55 27.0.0.1:8000..U
20100929 220003.320 P00002dce ------ 1 - 08056350 73 65 72 2d 41 67 65 6e 74 3a 20 53 58 45 2d 68 ser-Agent: SXE-h
20100929 220003.320 P00002dce ------ 1 - 08056360 74 74 70 2d 6c 6f 61 64 2d 6b 65 65 70 61 6c 69 ttp-load-keepali
20100929 220003.320 P00002dce ------ 1 - 08056370 76 65 2f 31 2e 30 0d 0a 41 63 63 65 70 74 3a 20 ve/1.0..Accept:
20100929 220003.320 P00002dce ------ 1 - 08056380 2a 2f 2a 0d 0a 0d                               */*...
20100929 220011.770 P00002dce   3165 1 - read all expected http responses
20100929 220011.770 P00002dce   3165 1 - time for all connections: 0.399857 seconds or 25008.937931 per second
20100929 220011.770 P00002dce   3165 1 - time for all queries    : 8.450056 seconds or 59171.206630 per second
20100929 220011.770 P00002dce   3165 1 - time for all            : 8.849913 seconds or 56497.731297 per second

Where the node.js implementation manages 107 connections per second, the SXE implementation manages 25,009 connections per second; a 233.7-fold increase. I discount this particular result because it is so lopsided that it almost certainly reflects the suspected node.js connection-handling bug rather than a fair comparison. Further, where the node.js implementation manages 12,344 queries per second, the SXE implementation manages 59,171 queries per second; a 4.8-fold increase.

Likewise, I also monitor memory usage:

# top -b -d1 | egrep "(node|http)"
11715 root      17   0 17788  16m  380 S    0  0.4   0:00.01 httpd
11715 root      17   0 17788  16m  380 S    0  0.4   0:00.01 httpd
11726 root      18   0 18224  16m  524 R   84  0.4   0:00.84 http
11715 root      16   0 18180  16m  392 R   61  0.4   0:00.62 httpd
11715 root      17   0 18180  16m  392 R  100  0.4   0:01.63 httpd
11726 root      17   0 18224  16m  524 R   95  0.4   0:01.79 http
11715 root      18   0 18180  16m  392 R   99  0.4   0:02.63 httpd
11726 root      17   0 18224  16m  524 R   94  0.4   0:02.74 http
11715 root      20   0 18180  16m  392 R   99  0.4   0:03.63 httpd
11726 root      17   0 18224  16m  524 R   94  0.4   0:03.69 http
11715 root      23   0 18180  16m  392 R  100  0.4   0:04.64 httpd
11726 root      17   0 18224  16m  524 R   95  0.4   0:04.65 http
11715 root      25   0 18180  16m  392 R   99  0.4   0:05.64 httpd
11726 root      17   0 18224  16m  524 R   94  0.4   0:05.60 http
11715 root      25   0 18180  16m  392 R  101  0.4   0:06.65 httpd
11726 root      17   0 18224  16m  524 R   95  0.4   0:06.55 http
11715 root      25   0 18180  16m  392 R  100  0.4   0:07.65 httpd
11726 root      17   0 18224  16m  524 R   95  0.4   0:07.50 http
11715 root      25   0 18180  16m  396 R   99  0.4   0:08.65 httpd
11726 root      15   0     0    0    0 R   87  0.0   0:08.37 http
11715 root      25   0 18320  16m  396 S   15  0.4   0:08.80 httpd
11715 root      25   0 18320  16m  396 S    0  0.4   0:08.80 httpd
11715 root      25   0 18320  16m  396 S    0  0.4   0:08.80 httpd

While the node.js implementation peaks at 148MB of memory usage, the SXE implementation stays at a constant 16MB, which is 9.25 times smaller.

In conclusion, if you’re planning to build scalable network programs and memory is your bottleneck, then implementing with node.js will mean deploying 9.25 times as many servers as if you had implemented with SXE. Similarly, if CPU is your bottleneck, then implementing with node.js will mean deploying 4.8 times as many servers.

Update: I minimized the amount of work the SXE “Hello World” server does when examining the HTTP headers of each query. Previously it searched the buffer for the “Connection: Keep-Alive” header and obeyed it; I removed that code so the server now always assumes the connection is keep-alive. It also searched for the end of the HTTP headers; now it simply checks whether the last four bytes of the data accumulated on the socket mark the end of the headers, so there is no searching at all. The updated source code looks like this:

    //ignore headers if (! SXE_BUF_STRNSTR(this,"\r\n\r\n")) {
    //ignore headers     SXEL10I("Read partial header; waiting for remainder to be appended");
    //ignore headers     goto SXE_EARLY_OUT;
    //ignore headers }
    //detect end of HTTP headers without searching
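    // i.e. proceed only once the last four buffered bytes are 0x0d 0x0a 0x0d 0x0a ("\r\n\r\n")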
    if ((SXE_BUF(this)[SXE_BUF_USED(this)-4] != 0xd)
    ||  (SXE_BUF(this)[SXE_BUF_USED(this)-3] != 0xa)
    ||  (SXE_BUF(this)[SXE_BUF_USED(this)-2] != 0xd)
    ||  (SXE_BUF(this)[SXE_BUF_USED(this)-1] != 0xa)) {
        SXEL60I("Read partial header; waiting for remainder to be appended");
        goto SXE_EARLY_OUT;
    }
    //ignore headers if (SXE_BUF_STRNCASESTR(this,"Connection: Keep-Alive")) {
    //ignore headers     (void)sxe_write(this, (void *)&canned_reply____keep_alive[0], sizeof(canned_reply____keep_alive) - 1);
    //ignore headers     SXEL60I("Connection: Keep-Alive: found");
    //ignore headers }
    //ignore headers else {
    //ignore headers     (void)sxe_write(this, (void *)&canned_reply_no_keep_alive[0], sizeof(canned_reply_no_keep_alive) - 1);
    //ignore headers     SXEL60I("Connection: Keep-Alive: not found; closing");
    //ignore headers     sxe_close(this);
    //ignore headers }
    //assume HTTP request is always keep-alive
    (void)sxe_write(this, (void *)&canned_reply____keep_alive[0], sizeof(canned_reply____keep_alive) - 1);

After these optimizations the SXE “Hello World” server handles 78,437 queries per second, which is now 6.4 times faster than node.js (instead of only 4.8 times faster without the optimizations) 🙂
