Saturday, April 10, 2010

More Information about Heartbeats and Timeouts Than I Really Care to Know

For the past week or so, I have been exploring the CouchDB changes API. I have been using node.couch.js, a node.js library built on top of changes. Specifically, it makes requests for a continuous feed of CouchDB changes and then acts on changes functions defined in CouchDB design documents. The URL requesting the changes feed looks something like:
curl 'http://localhost:5984/test/_changes?heartbeat=20000&feed=continuous&since=706'
For testing purposes, the change handling that I am doing simply prints out the change itself.
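To make the pieces of that URL explicit, here is a minimal sketch that assembles it in node.js. The query parameters (feed, heartbeat, since) are the ones from the URL above; the helper name changesUrl and its options object are made up for illustration and are not part of node.couch.js.

```javascript
// Hypothetical helper (not node.couch.js API): build the URL for a
// continuous _changes feed on one database.
function changesUrl(server, db, options = {}) {
  const url = new URL(`${server}/${encodeURIComponent(db)}/_changes`);
  url.searchParams.set("feed", "continuous");
  // heartbeat is in milliseconds; since is the sequence number to resume from.
  if (options.heartbeat !== undefined) {
    url.searchParams.set("heartbeat", String(options.heartbeat));
  }
  if (options.since !== undefined) {
    url.searchParams.set("since", String(options.since));
  }
  return url.toString();
}

console.log(changesUrl("http://localhost:5984", "test", { heartbeat: 20000, since: 706 }));
// prints: http://localhost:5984/test/_changes?feed=continuous&heartbeat=20000&since=706
```

Building the query string with URLSearchParams also sidesteps the shell quoting problem that a bare & causes on the command line.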

Reading (in more detail) the CouchDB book on the changes API, I now understand that the heartbeat is an empty line (a bare newline) that the server sends back at the requested interval to keep the connection alive. That tiny bit of data prevents the network from believing the connection has closed and dropping it. This is different from the timeout parameter on the changes API, which instructs the CouchDB server to end the connection after the specified time.
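From the listener's side, the difference shows up in what arrives on the wire. The sketch below sorts the lines of a continuous feed into three kinds: heartbeats (empty lines), changes, and the closing last_seq line that the server sends when it ends the connection. The function name classifyFeedLine and the sample data are invented for illustration; this is not how node.couch.js is written.

```javascript
// Hypothetical helper: classify one line of a continuous _changes feed.
function classifyFeedLine(line) {
  const text = line.trim();
  // A heartbeat is just an empty line: nothing to parse, nothing to act on.
  if (text === "") return { type: "heartbeat" };
  const data = JSON.parse(text);
  // The server's final line before closing the feed carries last_seq.
  if ("last_seq" in data) return { type: "end", lastSeq: data.last_seq };
  return { type: "change", change: data };
}

// Made-up sample: one change, two heartbeats, then the server ends the feed.
const sample =
  '{"seq":78,"id":"doc-a","changes":[{"rev":"1-abc"}]}\n\n\n{"last_seq":78}\n';
// Drop the empty string left after the trailing newline before classifying.
const events = sample.split("\n").slice(0, -1).map(classifyFeedLine);

console.log(events.map((e) => e.type).join(","));
// prints: change,heartbeat,heartbeat,end
```

A real listener would also need to buffer partial lines, since a network chunk can end in the middle of a JSON object.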

I had been confused by the behavior that I had seen when I was first playing with node.couch.js. When I started the changes listener, then waited for 60 seconds, I found:
cstrom@whitefall:~/repos/node.couch.js$ node ./changes/lib/service.js http://localhost:5984
{"last_seq":77}
After that, no more changes would be noticed by the listener. I surmise that this is the timeout behavior manifesting itself. I can verify that by setting the timeout to 2 seconds and observing the same behavior.

Nowadays, when I run the same command, I get:
cstrom@whitefall:~/repos/node.couch.js$ node ./changes/lib/service.js http://localhost:5984

Nothing. To be more specific, I get no timeout. This is because I added a heartbeat to node.couch.js the other day, which prevents network and/or server timeouts.

On a lark, I remove the heartbeat to verify that nothing has changed. After running the node.couch.js service again, I find:
cstrom@whitefall:~/repos/node.couch.js$ node ./changes/lib/service.js http://localhost:5984

Nothing?! No timeout...

Ugh. I have upgraded to Ubuntu 10.04 (Lucid) in the interim. Could that be the problem? It seems unlikely since I am still using CouchDB version 0.10.0:
cstrom@whitefall:~/repos/node.couch.js$ curl http://localhost:5984
{"couchdb":"Welcome","version":"0.10.0"}
I have CouchDB 0.11 installed in a VM. What happens when I try the non-heartbeat node.couch.js against a 0.11 version? Again, nothing:
cstrom@whitefall:~/repos/node.couch.js$ node ./changes/lib/service.js http://couch-011a.local:5984

Ugh.

After a while I realize that in both 0.10 and 0.11, changes are no longer being sent back to the node.couch.js listener. The connections have not timed out, but changes are no longer reported. In both, I find Erlang stacktraces like this:
[Sun, 11 Apr 2010 03:16:32 GMT] [error] [<0.17615.0>] Uncaught error in HTTP request: {exit,normal}

[Sun, 11 Apr 2010 03:16:32 GMT] [info] [<0.17615.0>] Stacktrace: [{mochiweb_request,send,2},
{couch_httpd,send_chunk,2},
{couch_httpd,end_json_response,1},
{couch_httpd_db,handle_changes_req,2},
{couch_httpd_db,do_db_req,2},
{couch_httpd,handle_request,5},
{mochiweb_http,headers,5},
{proc_lib,init_p_do_apply,3}]

[Sun, 11 Apr 2010 03:16:32 GMT] [debug] [<0.17615.0>] httpd 500 error response:
{"error":"unknown_error","reason":"normal"}
Bah! I doubt I will be able to work out this particular error. My best guess is that a network timeout is occurring before the server can officially time out.

What I do know is that explicitly setting a timeout when requesting a continuous feed of the CouchDB changes API works, though it seems best to keep the timeout to less than a minute. Explicitly setting a heartbeat also works as expected, seemingly keeping the connection open indefinitely. Since this is what I expect when playing with the feed, I'll leave that setting in my forked copy of node.couch.js.

Tomorrow, I am definitely moving on to something else.

Day #69

2 comments:

  1. Glad I just found this post.

    I thought I was going crazy as I observed similar behaviour with my newly written continuous _changes feed. In a heartbeat my problem was solved

    Thanks

  2. Cool! Glad it helped. I can't believe how long I spent trying to track this down. Good to know that documenting my many struggles helped someone else :)
