Class FD

Direct Known Subclasses:
FD_ICMP, FD_PING

public class FD extends Protocol
Failure detection based on a simple heartbeat protocol. Regularly polls members for liveness and multicasts SUSPECT messages when a member is not reachable. The algorithm works as follows: the membership is known and ordered. Each heartbeat (HB) protocol instance periodically sends an 'are-you-alive' message to its *neighbor*. The neighbor is the next member in rank in the membership list, which is recomputed upon a view change. When no response has been received after n milliseconds and m tries, the corresponding member is suspected (and eventually excluded if faulty).
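The neighbor selection described above can be sketched as follows. This is a minimal illustration and not the FD source: the class name `NeighborSketch`, the method `pingDest`, and the use of plain strings for addresses are hypothetical, and wrapping around from the last member to the first is an assumption that the description does not state explicitly.

```java
import java.util.Arrays;
import java.util.List;

public class NeighborSketch {

    /**
     * Returns the member that follows 'self' in the ordered membership list.
     * Wrap-around from the last member to the first is an assumption.
     * Returns null when 'self' is not a member or has nobody else to monitor.
     */
    public static String pingDest(List<String> members, String self) {
        int idx = members.indexOf(self);
        if (idx < 0 || members.size() < 2) {
            return null; // fewer than 2 members: nothing to monitor
        }
        return members.get((idx + 1) % members.size());
    }

    public static void main(String[] args) {
        List<String> view = Arrays.asList("A", "B", "C");
        System.out.println(pingDest(view, "A")); // B
        System.out.println(pingDest(view, "C")); // A (wraps around)
        System.out.println(pingDest(Arrays.asList("A"), "A")); // null
    }
}
```

Recomputing the result on every view change keeps each member monitoring exactly one other member, so the number of heartbeats grows linearly with the group size.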

FD starts when it detects (in a view change notification) that there are at least 2 members in the group. It stops running when the membership drops below 2.

When a message is received from the monitored neighbor, the pinger thread skips sending the next are-you-alive message, which reduces traffic.
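The interplay between missed responses and the skip optimization might look like the sketch below. It is a simplified model under stated assumptions, not the actual pinger thread: `HeartbeatSketch`, `tick`, and `messageReceivedFromNeighbor` are hypothetical names, `tick` is assumed to be called once per timeout interval, and `maxTries` is assumed to correspond to the m tries mentioned in the class description.

```java
public class HeartbeatSketch {
    private final int maxTries;        // assumed to match the "m tries" of the description
    private int numTries;              // unanswered are-you-alive messages so far
    private int heartbeatsSent;        // total are-you-alive messages sent
    private boolean heardFromNeighbor; // set when any message arrives from the neighbor

    public HeartbeatSketch(int maxTries) {
        this.maxTries = maxTries;
    }

    /** Any message from the monitored neighbor proves liveness and resets the counter. */
    public void messageReceivedFromNeighbor() {
        heardFromNeighbor = true;
        numTries = 0;
    }

    /**
     * Called once per timeout interval (an assumption of this sketch).
     * Returns true when the neighbor should be suspected.
     */
    public boolean tick() {
        if (heardFromNeighbor) {
            heardFromNeighbor = false; // skip this round: traffic already proved liveness
            return false;
        }
        if (numTries >= maxTries) {
            return true;               // all tries went unanswered: suspect the neighbor
        }
        heartbeatsSent++;              // stands in for sending an are-you-alive message
        numTries++;
        return false;
    }

    public int getHeartbeatsSent() {
        return heartbeatsSent;
    }

    public static void main(String[] args) {
        HeartbeatSketch fd = new HeartbeatSketch(2);
        fd.tick();                     // first are-you-alive message
        fd.tick();                     // second are-you-alive message
        System.out.println(fd.tick()); // true: both tries went unanswered
    }
}
```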

When a ping is received from a member that is not in the membership list, that member is shunned by sending it a NOT_MEMBER message. The shunned member will then leave the group (and possibly rejoin). This is only done if shun is true.
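The shunning rule reduces to a single predicate, sketched here with hypothetical names (`ShunSketch`, `shouldSendNotMember`) and strings standing in for member addresses. How a non-member ping is handled when shun is false is not described above, so the sketch only decides whether a NOT_MEMBER message would be sent.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ShunSketch {

    /**
     * Decides whether the sender of a ping should receive a NOT_MEMBER message:
     * only when shunning is enabled and the sender is absent from the membership.
     */
    public static boolean shouldSendNotMember(Set<String> members, String sender, boolean shun) {
        return shun && sender != null && !members.contains(sender);
    }

    public static void main(String[] args) {
        Set<String> members = new HashSet<>(Arrays.asList("A", "B"));
        System.out.println(shouldSendNotMember(members, "C", true));  // true
        System.out.println(shouldSendNotMember(members, "C", false)); // false
        System.out.println(shouldSendNotMember(members, "A", true));  // false
    }
}
```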

Version:
$Id: FD.java,v 1.58.2.3 2008/05/22 13:23:06 belaban Exp $
Author:
Bela Ban
  • Field Details

    • lock

      protected final Lock lock
    • num_heartbeats

      protected int num_heartbeats
    • num_suspect_events

      protected int num_suspect_events
    • bcast_task

      protected final FD.Broadcaster bcast_task
      Transmits SUSPECT messages until a view change or an UNSUSPECT is received.
  • Constructor Details

    • FD

      public FD()
  • Method Details

    • getName

      public String getName()
      Specified by:
      getName in class Protocol
    • getLocalAddress

      public String getLocalAddress()
    • getMembers

      public String getMembers()
    • getPingableMembers

      public String getPingableMembers()
    • getPingDest

      public String getPingDest()
    • getNumberOfHeartbeatsSent

      public int getNumberOfHeartbeatsSent()
    • getNumSuspectEventsGenerated

      public int getNumSuspectEventsGenerated()
    • getTimeout

      public long getTimeout()
    • setTimeout

      public void setTimeout(long timeout)
    • getMaxTries

      public int getMaxTries()
    • setMaxTries

      public void setMaxTries(int max_tries)
    • getCurrentNumTries

      public int getCurrentNumTries()
    • isShun

      public boolean isShun()
    • setShun

      public void setShun(boolean flag)
    • printSuspectHistory

      public String printSuspectHistory()
    • setProperties

      public boolean setProperties(Properties props)
      Description copied from class: Protocol
      Configures the protocol initially. A configuration string consists of name=value items, separated by a ';' (semicolon), e.g.:
       "loopback=false;unicast_inport=4444"
       
      Overrides:
      setProperties in class Protocol
    • resetStats

      public void resetStats()
      Overrides:
      resetStats in class Protocol
    • init

      public void init() throws Exception
      Description copied from class: Protocol
      Called after the instance has been created (no-argument constructor) and before the protocol is started. Properties are already set. Other protocols are not yet connected and events cannot yet be sent.
      Overrides:
      init in class Protocol
      Throws:
      Exception - Thrown if the protocol cannot be initialized successfully. This will cause the ProtocolStack to fail, so the channel constructor will throw an exception.
    • stop

      public void stop()
      Description copied from class: Protocol
      This method is called on Channel.disconnect(). Stops work (e.g. by closing the multicast socket). Will be called from top to bottom, which means that at the time of the method invocation the neighbor protocol below is still working. This method will replace the STOP, STOP_OK, CLEANUP and CLEANUP_OK events. The ProtocolStack guarantees that when this method is called, all messages in the down queue will have been flushed.
      Overrides:
      stop in class Protocol
    • up

      public Object up(Event evt)
      Description copied from class: Protocol
      An event was received from the layer below. Usually the current layer will want to examine the event type and - depending on its type - perform some computation (e.g. removing headers from a MSG event type, or updating the internal membership list when receiving a VIEW_CHANGE event). Finally the event is either a) discarded, or b) an event is sent down the stack using down_prot.down() or c) the event (or another event) is sent up the stack using up_prot.up().
      Overrides:
      up in class Protocol
    • down

      public Object down(Event evt)
      Description copied from class: Protocol
      An event is to be sent down the stack. The layer may want to examine its type and perform some action on it, depending on the event's type. If the event is a message MSG, then the layer may need to add a header to it (or do nothing at all) before sending it down the stack using down_prot.down(). In case of a GET_ADDRESS event (which tries to retrieve the stack's address from one of the bottom layers), the layer may need to send a new response event back up the stack using up_prot.up().
      Overrides:
      down in class Protocol
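
The configuration string format described under setProperties (name=value items separated by semicolons) can be illustrated by parsing it into a java.util.Properties object. This is a hedged sketch only: `ConfigStringSketch` and `parse` are hypothetical names, and the code does not claim to reproduce how the protocol stack itself parses its configuration.

```java
import java.util.Properties;

public class ConfigStringSketch {

    /** Splits "name=value;name=value" into a Properties object. */
    public static Properties parse(String config) {
        Properties props = new Properties();
        for (String item : config.split(";")) {
            String trimmed = item.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate a trailing or doubled semicolon
            }
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("expected name=value, got: " + trimmed);
            }
            props.setProperty(trimmed.substring(0, eq).trim(),
                              trimmed.substring(eq + 1).trim());
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse("loopback=false;unicast_inport=4444");
        System.out.println(props.getProperty("loopback"));       // false
        System.out.println(props.getProperty("unicast_inport")); // 4444
    }
}
```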