Package org.jgroups.protocols
Class FD
java.lang.Object
org.jgroups.stack.Protocol
org.jgroups.protocols.FD
Failure detection based on a simple heartbeat protocol. Regularly polls members for
liveness. Multicasts SUSPECT messages when a member is not reachable. The simple
algorithm works as follows: the membership is known and ordered. Each HB protocol
periodically sends an 'are-you-alive' message to its *neighbor*. A neighbor is the next in
rank in the membership list, which is recomputed upon a view change. When a response hasn't
been received for n milliseconds and m tries, the corresponding member is suspected (and
eventually excluded if faulty).
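The neighbor rule described above can be sketched as follows. This is a minimal illustration, not FD's actual code: the class `NeighborSketch` and its methods are hypothetical, members are represented as plain strings, and wrapping from the last member back to the first is an assumption (the text only says "next in rank").

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of JGroups.
final class NeighborSketch {
    private NeighborSketch() {}

    /**
     * Returns the member that follows {@code local} in the ordered membership,
     * or null if there is nobody to ping.
     * Assumption: the last member wraps around to the first one.
     */
    static String pingDest(List<String> members, String local) {
        int idx = members.indexOf(local);
        if (idx < 0 || members.size() < 2) {
            return null; // not in the view, or alone in the group
        }
        return members.get((idx + 1) % members.size());
    }

    /**
     * Assumption: a member becomes suspected once the number of consecutive
     * unanswered heartbeats (each waited on for the timeout) reaches maxTries.
     */
    static boolean shouldSuspect(int missedReplies, int maxTries) {
        return missedReplies >= maxTries;
    }
}
```

With the view `[A, B, C]`, `pingDest(view, "A")` yields `B` and, under the wrap-around assumption, `pingDest(view, "C")` yields `A`. The result would be recomputed on every view change.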
FD starts when it detects (in a view change notification) that there are at least 2 members in the group. It stops running when the membership drops below 2.
When a message is received from the monitored neighbor member, it causes the pinger thread to 'skip' sending the next are-you-alive message. Thus, traffic is reduced.
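One way to picture the 'skip' behavior is a flag that is set whenever traffic from the neighbor arrives and cleared by the pinger on its next run. The sketch below assumes exactly that; `HeartbeatSkipSketch` and its method names are hypothetical and do not reflect FD's internals.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration of skipping a heartbeat after recent traffic.
final class HeartbeatSkipSketch {
    // Set when any message from the monitored neighbor has been seen
    // since the last pinger run.
    private final AtomicBoolean trafficSeen = new AtomicBoolean(false);
    private int heartbeatsSent;

    /** Called when a message from the monitored neighbor is received. */
    void messageReceivedFromNeighbor() {
        trafficSeen.set(true);
    }

    /**
     * Called once per timeout interval by the pinger.
     * Returns true if an are-you-alive message was sent, false if it was skipped.
     */
    boolean tick() {
        if (trafficSeen.getAndSet(false)) {
            return false; // neighbor is evidently alive; skip this round
        }
        heartbeatsSent++;
        return true;
    }

    int heartbeatsSent() {
        return heartbeatsSent;
    }
}
```

Because the flag is cleared on every run, a single received message suppresses only the next heartbeat, which matches the description above.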
When we receive a ping from a member that's not in the membership list, we shun it by sending it a
NOT_MEMBER message. That member will then leave the group (and possibly rejoin). This is only done if
shun is true.
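The shunning condition reduces to a two-part check, sketched here under the assumption that members are comparable by simple equality. `ShunSketch` is a hypothetical name used only for illustration.

```java
import java.util.List;

// Hypothetical illustration of the shun decision; not FD's actual code.
final class ShunSketch {
    private ShunSketch() {}

    /**
     * Returns true if a NOT_MEMBER reply should be sent to the sender of a ping:
     * shunning must be enabled and the sender must be absent from the current view.
     */
    static boolean shouldSendNotMember(List<String> members, String pingSender, boolean shun) {
        return shun && !members.contains(pingSender);
    }
}
```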
- Version:
- $Id: FD.java,v 1.58.2.3 2008/05/22 13:23:06 belaban Exp $
- Author:
- Bela Ban
-
Nested Class Summary
Nested Classes
Modifier and Type  Class  Description
protected final class
Task that periodically broadcasts a list of suspected members to the group.
protected final class
static class
protected class
-
Field Summary
Fields
Modifier and Type  Field  Description
protected final FD.Broadcaster bcast_task
Transmits SUSPECT message until view change or UNSUSPECT is received
protected final Lock lock
protected int num_heartbeats
protected int num_suspect_events
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type  Method  Description
down
An event is to be sent down the stack.
int getCurrentNumTries()
getLocalAddress
int getMaxTries()
getMembers
getName()
int getNumberOfHeartbeatsSent()
int getNumSuspectEventsGenerated()
getPingableMembers
getPingDest
long getTimeout()
void init()
Called after instance has been created (null constructor) and before protocol is started.
boolean isShun()
printSuspectHistory
void resetStats()
void setMaxTries(int max_tries)
boolean setProperties(Properties props)
Configures the protocol initially.
void setShun(boolean flag)
void setTimeout(long timeout)
void stop()
This method is called on a Channel.disconnect().
up
An event was received from the layer below.
Methods inherited from class org.jgroups.stack.Protocol
destroy, downThreadEnabled, dumpStats, enableStats, getDownProtocol, getProperties, getProtocolStack, getThreadFactory, getTransport, getUpProtocol, printStats, providedDownServices, providedUpServices, requiredDownServices, requiredUpServices, setDownProtocol, setPropertiesInternal, setProtocolStack, setUpProtocol, start, statsEnabled, upThreadEnabled
-
Field Details
-
lock
-
num_heartbeats
protected int num_heartbeats -
num_suspect_events
protected int num_suspect_events -
bcast_task
Transmits SUSPECT message until view change or UNSUSPECT is received
-
-
Constructor Details
-
FD
public FD()
-
-
Method Details
-
getName
-
getLocalAddress
-
getMembers
-
getPingableMembers
-
getPingDest
-
getNumberOfHeartbeatsSent
public int getNumberOfHeartbeatsSent() -
getNumSuspectEventsGenerated
public int getNumSuspectEventsGenerated() -
getTimeout
public long getTimeout() -
setTimeout
public void setTimeout(long timeout) -
getMaxTries
public int getMaxTries() -
setMaxTries
public void setMaxTries(int max_tries) -
getCurrentNumTries
public int getCurrentNumTries() -
isShun
public boolean isShun() -
setShun
public void setShun(boolean flag) -
printSuspectHistory
-
setProperties
Description copied from class: Protocol
Configures the protocol initially. A configuration string consists of name=value items, separated by a ';' (semicolon), e.g.: "loopback=false;unicast_inport=4444"
- Overrides:
setProperties in class Protocol
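To make the configuration-string format concrete, here is a small sketch that splits such a string into a `java.util.Properties` object. It is not the parser JGroups uses; `PropertyStringSketch` is a hypothetical helper, and trimming whitespace and rejecting items without `=` are assumptions.

```java
import java.util.Properties;

// Hypothetical parser for "name=value;name=value" strings; illustration only.
final class PropertyStringSketch {
    private PropertyStringSketch() {}

    static Properties parse(String config) {
        Properties props = new Properties();
        for (String item : config.split(";")) {
            String trimmed = item.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate a trailing or doubled semicolon
            }
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("expected name=value, got: " + trimmed);
            }
            // Everything before the first '=' is the name, the rest is the value.
            props.setProperty(trimmed.substring(0, eq).trim(),
                              trimmed.substring(eq + 1).trim());
        }
        return props;
    }
}
```

Parsing the example string from the description gives two entries, `loopback` and `unicast_inport`, which a protocol could then read in its `setProperties(Properties props)` implementation.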
-
resetStats
public void resetStats()
- Overrides:
resetStats in class Protocol
-
init
Description copied from class: Protocol
Called after instance has been created (null constructor) and before protocol is started. Properties are already set. Other protocols are not yet connected and events cannot yet be sent. -
stop
public void stop()
Description copied from class: Protocol
This method is called on a Channel.disconnect(). Stops work (e.g. by closing multicast socket). Will be called from top to bottom. This means that at the time of the method invocation the neighbor protocol below is still working. This method will replace the STOP, STOP_OK, CLEANUP and CLEANUP_OK events. The ProtocolStack guarantees that when this method is called all messages in the down queue will have been flushed. -
up
Description copied from class: Protocol
An event was received from the layer below. Usually the current layer will want to examine the event type and, depending on its type, perform some computation (e.g. removing headers from a MSG event type, or updating the internal membership list when receiving a VIEW_CHANGE event). Finally the event is either a) discarded, or b) an event is sent down the stack using down_prot.down(), or c) the event (or another event) is sent up the stack using up_prot.up(). -
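The three possible outcomes for an event travelling up can be sketched with a toy layer. Everything here is hypothetical: `LayerSketch`, its `Type` enum, and the two lists standing in for `down_prot.down()` and `up_prot.up()` are illustrative stand-ins rather than JGroups classes, and the heartbeat event types are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy layer showing the discard / send-down / pass-up outcomes; not JGroups code.
final class LayerSketch {
    // MSG and VIEW_CHANGE mirror names used in the description;
    // HEARTBEAT and HEARTBEAT_ACK are made up for this example.
    enum Type { MSG, VIEW_CHANGE, HEARTBEAT, HEARTBEAT_ACK }

    final List<String> sentDown = new ArrayList<>(); // stands in for down_prot.down()
    final List<String> passedUp = new ArrayList<>(); // stands in for up_prot.up()
    int viewChanges;

    void up(Type type, String payload) {
        switch (type) {
            case HEARTBEAT_ACK:
                // a) consumed by this layer and discarded
                break;
            case HEARTBEAT:
                // b) answered by sending a new event down the stack
                sentDown.add("ack:" + payload);
                break;
            case VIEW_CHANGE:
                // update internal state, then c) pass the event up
                viewChanges++;
                passedUp.add(payload);
                break;
            default:
                // c) anything else is passed up unchanged
                passedUp.add(payload);
        }
    }
}
```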
down
Description copied from class: Protocol
An event is to be sent down the stack. The layer may want to examine its type and perform some action on it, depending on the event's type. If the event is a message MSG, then the layer may need to add a header to it (or do nothing at all) before sending it down the stack using down_prot.down(). In case of a GET_ADDRESS event (which tries to retrieve the stack's address from one of the bottom layers), the layer may need to send a new response event back up the stack using up_prot.up().
-