ZooKeeper leader election and timeouts

My cluster of 3 nodes was running fine for a while until one of the nodes died. This node was the LEADER. I assumed the cluster would still be fine, since 2 of the 3 nodes were still healthy. However, it looked like it was unable to elect a leader and set up a quorum properly.

Here’s what I was getting:

2014-11-11 12:09:36,101 [myid:1] - WARN  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when following the leader
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:382)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:241)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:228)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:365)
        at java.net.Socket.connect(Socket.java:527)
        at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:225)
        at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)

and

2014-11-11 12:09:36,102 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
java.lang.Exception: shutdown Follower
        at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:790)

There’s a configuration setting, initLimit, which defines the amount of time (in ticks) that the initial synchronization phase can take. The sample zoo.cfg shipped with ZooKeeper sets it to 10. However, it turns out that my cluster had enough data that the initial sync took longer than the initLimit allowed. Increasing initLimit to about 50 fixed the issue. However, I wonder what side effects a much higher initLimit value has on the cluster.
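Since initLimit is counted in ticks, the actual time budget depends on tickTime. Here is a minimal sketch of that arithmetic. It assumes tickTime=2000 ms, which is the value in the sample zoo.cfg; the tickTime of the cluster described here is not stated, and the function name is just for illustration.

```python
def init_timeout_seconds(tick_time_ms: int, init_limit: int) -> float:
    """Time allowed for the initial sync: initLimit ticks of tickTime ms each."""
    return tick_time_ms * init_limit / 1000.0


# Assumption: tickTime=2000 ms, as in the sample zoo.cfg.
TICK_TIME_MS = 2000

for init_limit in (10, 50):
    seconds = init_timeout_seconds(TICK_TIME_MS, init_limit)
    print(f"initLimit={init_limit}: {seconds:.0f} s")
```

Under that assumption, going from 10 to 50 raises the sync budget from 20 seconds to 100 seconds.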

More details after searching the net:

What happened here was that the server being elected as leader did go through the leader election process successfully. It then started to send a snapshot of its state to its follower. However, before that transfer could complete and the follower could finish syncing to the leader's state, the initLimit timeout was reached and the leader thread gave up. So increasing initLimit to a value that allowed the snapshot transfer to complete fixed this problem.
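To pick a value more deliberately than trial and error, one rough approach is to estimate how long the snapshot transfer takes and convert that into ticks. The sketch below is only a back-of-the-envelope estimate: the snapshot size, the transfer rate, the safety factor and the function name are all hypothetical inputs, not measurements from this cluster, and it ignores everything in the sync phase other than the raw transfer.

```python
import math


def min_init_limit(snapshot_bytes: float,
                   bytes_per_second: float,
                   tick_time_ms: int = 2000,
                   safety_factor: float = 2.0) -> int:
    """Estimate the smallest initLimit (in ticks) that covers a snapshot transfer.

    safety_factor pads the estimate, since the follower also has to load
    the snapshot and the network rate is rarely constant.
    """
    transfer_seconds = snapshot_bytes / bytes_per_second
    budget_ms = transfer_seconds * safety_factor * 1000
    # Round up to whole ticks; never return less than one tick.
    return max(1, math.ceil(budget_ms / tick_time_ms))


# Hypothetical example: a 2 GiB snapshot over a link sustaining 50 MiB/s.
print(min_init_limit(2 * 1024**3, 50 * 1024**2))
```

With these made-up numbers the transfer alone takes about 41 seconds, so the default-sized budget of 10 ticks would not be enough, which matches the behaviour described above.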