
RFC 4601

Protocol Independent Multicast - Sparse Mode (PIM-SM): Protocol Specification (Revised)

Pages: 150
Obsoletes:  2362
Obsoleted by:  7761
Updated by:  5059, 5796, 6226


4. Protocol Specification

   The specification of PIM-SM is broken into several parts:

   o Section 4.1 details the protocol state stored.

   o Section 4.2 specifies the data packet forwarding rules.

   o Section 4.3 specifies Designated Router (DR) election and the rules
     for sending and processing Hello messages.

   o Section 4.4 specifies the PIM Register generation and processing
     rules.

   o Section 4.5 specifies the PIM Join/Prune generation and processing
     rules.

   o Section 4.6 specifies the PIM Assert generation and processing
     rules.

   o Section 4.7 specifies the RP discovery mechanisms.

   o The subset of PIM required to support Source-Specific Multicast,
     PIM-SSM, is described in Section 4.8.

   o PIM packet formats are specified in Section 4.9.

   o A summary of PIM-SM timers and their default values is given in
     Section 4.10.

   o Appendix A specifies the PIM Multicast Border Router behavior.

4.1. PIM Protocol State

   This section specifies all the protocol state that a PIM
   implementation should maintain in order to function correctly.  We
   term this state the Tree Information Base (TIB), as it holds the
   state of all the multicast distribution trees at this router.  In
   this specification, we define PIM mechanisms in terms of the TIB.
   However, only a very simple implementation would actually implement
   packet forwarding operations in terms of this state.  Most
   implementations will use this state to build a multicast forwarding
   table, which would then be updated when the relevant state in the TIB
   changes.

   Although we specify precisely the state to be kept, this does not
   mean that an implementation of PIM-SM needs to hold the state in this
   form.  This is actually an abstract state definition, which is needed
   in order to specify the router's behavior.  A PIM-SM implementation
   is free to hold whatever internal state it requires and will still be
   conformant with this specification so long as it results in the same
   externally visible protocol behavior as an abstract router that holds
   the following state.

   We divide TIB state into four sections:

   (*,*,RP) state
      State that maintains per-RP trees, for all groups served by a
      given RP.

   (*,G) state
      State that maintains the RP tree for G.

   (S,G) state
      State that maintains a source-specific tree for source S and group
      G.

   (S,G,rpt) state
      State that maintains source-specific information about source S on
      the RP tree for G.  For example, if a source is being received on
      the source-specific tree, it will normally have been pruned off
      the RP tree.  This prune state is (S,G,rpt) state.

   The state that should be kept is described below.  Of course,
   implementations will only maintain state when it is relevant to
   forwarding operations; for example, the "NoInfo" state might be
   assumed from the lack of other state information rather than being
   held explicitly.
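
   As an informal illustration only, the four TIB sections could be held
   in a structure such as the one sketched below.  The class name, field
   names, and the use of string keys are assumptions made for this
   sketch; they are not part of the specification, and an implementation
   may use any representation that yields the same externally visible
   behavior.  The lookup helper shows "NoInfo" being inferred from the
   absence of state rather than stored.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, Tuple

# Per-interface downstream Join/Prune state, keyed by interface name.
IfaceStates = Dict[str, str]


@dataclass
class TreeInformationBase:
    """Hypothetical container with one mapping per TIB section."""
    star_star_rp: Dict[str, IfaceStates] = field(default_factory=dict)      # key: RP
    star_g: Dict[str, IfaceStates] = field(default_factory=dict)            # key: G
    s_g: Dict[Tuple[str, str], IfaceStates] = field(default_factory=dict)   # key: (S, G)
    s_g_rpt: Dict[Tuple[str, str], IfaceStates] = field(default_factory=dict)

    @staticmethod
    def downstream_state(section: Dict[Hashable, IfaceStates],
                         key: Hashable, interface: str) -> str:
        """Return the stored state, or "NoInfo" when nothing is stored."""
        return section.get(key, {}).get(interface, "NoInfo")
```

   Only entries that matter for forwarding need to exist in such a
   structure; everything else reads back as "NoInfo".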

4.1.1. General Purpose State

   A router holds the following non-group-specific state:

        For each interface:

             o Effective Override Interval

             o Effective Propagation Delay

             o Suppression state: One of {"Enable", "Disable"}

             Neighbor State:

                  For each neighbor:

                       o Information from neighbor's Hello

                       o Neighbor's GenID.

                       o Neighbor Liveness Timer (NLT)

             Designated Router (DR) State:

                  o Designated Router's IP Address

                  o DR's DR Priority

   The Effective Override Interval, the Effective Propagation Delay and
   the Interface suppression state are described in Section 4.3.3.
   Designated Router state is described in Section 4.3.

4.1.2. (*,*,RP) State

   For every RP, a router keeps the following state:

   (*,*,RP) state:

        For each interface:

             PIM (*,*,RP) Join/Prune State:

                  o State: One of {"NoInfo" (NI), "Join" (J),
                    "Prune-Pending" (PP)}

                  o Prune-Pending Timer (PPT)

                  o Join/Prune Expiry Timer (ET)

        Not interface specific:

             Upstream (*,*,RP) Join/Prune State:

                  o State: One of {"NotJoined(*,*,RP)",
                    "Joined(*,*,RP)"}

             o Upstream Join/Prune Timer (JT)

             o Last RPF Neighbor towards RP that was used

   PIM (*,*,RP) Join/Prune state is the result of receiving PIM (*,*,RP)
   Join/Prune messages on this interface and is specified in Section
   4.5.1.

   The upstream (*,*,RP) Join/Prune State reflects the state of the
   upstream (*,*,RP) state machine described in Section 4.5.5.

   The upstream (*,*,RP) Join/Prune Timer is used to send out periodic
   Join(*,*,RP) messages, and to override Prune(*,*,RP) messages from
   peers on an upstream LAN interface.

   The last RPF neighbor towards the RP is stored because if the MRIB
   changes, then the RPF neighbor towards the RP may change.  If it does
   so, then we need to trigger a new Join(*,*,RP) to the new upstream
   neighbor and a Prune(*,*,RP) to the old upstream neighbor.
   Similarly, if a router detects through a changed GenID in a Hello
   message that the upstream neighbor towards the RP has rebooted, then
   it should re-instantiate state by sending a Join(*,*,RP).  These
   mechanisms are specified in Section 4.5.5.

4.1.3. (*,G) State

   For every group G, a router keeps the following state:

   (*,G) state:

        For each interface:

             Local Membership:
                  State: One of {"NoInfo", "Include"}

             PIM (*,G) Join/Prune State:

                  o State: One of {"NoInfo" (NI), "Join" (J),
                    "Prune-Pending" (PP)}

                  o Prune-Pending Timer (PPT)

                  o Join/Prune Expiry Timer (ET)

             (*,G) Assert Winner State

                  o State: One of {"NoInfo" (NI), "I lost Assert" (L),
                    "I won Assert" (W)}

                  o Assert Timer (AT)

                  o Assert winner's IP Address (AssertWinner)

                  o Assert winner's Assert Metric (AssertWinnerMetric)

        Not interface specific:

             Upstream (*,G) Join/Prune State:

                  o State: One of {"NotJoined(*,G)", "Joined(*,G)"}

             o Upstream Join/Prune Timer (JT)

             o Last RP Used

             o Last RPF Neighbor towards RP that was used

   Local membership is the result of the local membership mechanism
   (such as IGMP or MLD) running on that interface.  It need not be kept
   if this router is not the DR on that interface unless this router won
   a (*,G) assert on this interface for this group, although
   implementations may optionally keep this state in case they become
   the DR or assert winner.  We recommend storing this information if
   possible, as it reduces latency converging to stable operating
   conditions after a failure causing a change of DR.  This information
   is used by the pim_include(*,G) macro described in Section 4.1.6.

   PIM (*,G) Join/Prune state is the result of receiving PIM (*,G)
   Join/Prune messages on this interface and is specified in Section
   4.5.2.  The state is used by the macros that calculate the outgoing
   interface list in Section 4.1.6, and in the JoinDesired(*,G) macro
   (defined in Section 4.5.6) that is used in deciding whether a
   Join(*,G) should be sent upstream.

   (*,G) Assert Winner state is the result of sending or receiving (*,G)
   Assert messages on this interface.  It is specified in Section 4.6.2.

   The upstream (*,G) Join/Prune State reflects the state of the
   upstream (*,G) state machine described in Section 4.5.6.

   The upstream (*,G) Join/Prune Timer is used to send out periodic
   Join(*,G) messages, and to override Prune(*,G) messages from peers on
   an upstream LAN interface.

   The last RP used must be stored because if the RP-Set changes
   (Section 4.7), then state must be torn down and rebuilt for groups
   whose RP changes.

   The last RPF neighbor towards the RP is stored because if the MRIB
   changes, then the RPF neighbor towards the RP may change.  If it does
   so, then we need to trigger a new Join(*,G) to the new upstream
   neighbor and a Prune(*,G) to the old upstream neighbor.  Similarly,
   if a router detects through a changed GenID in a Hello message that
   the upstream neighbor towards the RP has rebooted, then it should
   re-instantiate state by sending a Join(*,G).  These mechanisms are
   specified in Section 4.5.6.
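
   The reason for remembering the last RPF neighbor can be sketched as
   follows.  This is an illustration only: the function names and the
   tuple-based message representation are assumptions of this sketch,
   and it deliberately ignores the timers and the Joined/NotJoined
   precondition that Section 4.5.6 specifies for the real state machine.

```python
from typing import List, Optional, Tuple

# (message type, neighbor the message is addressed to); hypothetical shape.
Message = Tuple[str, str]


def on_rpf_neighbor_change(old_nbr: Optional[str],
                           new_nbr: Optional[str]) -> List[Message]:
    """Messages triggered when the RPF neighbor towards the RP changes."""
    if old_nbr == new_nbr:
        return []
    msgs: List[Message] = []
    if new_nbr is not None:
        msgs.append(("Join(*,G)", new_nbr))    # build state via the new path
    if old_nbr is not None:
        msgs.append(("Prune(*,G)", old_nbr))   # tear down state on the old path
    return msgs


def on_upstream_hello(stored_genid: Optional[int], hello_genid: int,
                      upstream_nbr: str) -> List[Message]:
    """A changed GenID means the upstream neighbor restarted: re-send Join."""
    if stored_genid is not None and stored_genid != hello_genid:
        return [("Join(*,G)", upstream_nbr)]
    return []
```

   Without the stored neighbor, the router would have no address to send
   the Prune(*,G) to once the MRIB has already moved on to the new next
   hop.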

4.1.4. (S,G) State

   For every source/group pair (S,G), a router keeps the following
   state:

   (S,G) state:

        For each interface:

             Local Membership:
                  State: One of {"NoInfo", "Include"}

             PIM (S,G) Join/Prune State:

                  o State: One of {"NoInfo" (NI), "Join" (J), "Prune-
                    Pending" (PP)}

                  o Prune-Pending Timer (PPT)

                  o Join/Prune Expiry Timer (ET)

             (S,G) Assert Winner State

                  o State: One of {"NoInfo" (NI), "I lost Assert" (L),
                    "I won Assert" (W)}

                  o Assert Timer (AT)

                  o Assert winner's IP Address (AssertWinner)

                  o Assert winner's Assert Metric (AssertWinnerMetric)

        Not interface specific:

             Upstream (S,G) Join/Prune State:

                  o State: One of {"NotJoined(S,G)", "Joined(S,G)"}

             o Upstream (S,G) Join/Prune Timer (JT)

             o Last RPF Neighbor towards S that was used

             o SPTbit (indicates (S,G) state is active)

             o (S,G) Keepalive Timer (KAT)


             Additional (S,G) state at the DR:

                  o Register state: One of {"Join" (J), "Prune" (P),
                    "Join-Pending" (JP), "NoInfo" (NI)}

                  o Register-Stop timer

             Additional (S,G) state at the RP:

                  o PMBR: the first PMBR to send a Register for this
                    source with the Border bit set.

   Local membership is the result of the local source-specific
   membership mechanism (such as IGMP version 3) running on that
   interface and specifying that this particular source should be
   included.  As stored here, this state is the resulting state after
   any IGMPv3 inconsistencies have been resolved.  It need not be kept
   if this router is not the DR on that interface unless this router won
   a (S,G) assert on this interface for this group.  However, we
   recommend storing this information if possible, as it reduces latency
   converging to stable operating conditions after a failure causing a
   change of DR.  This information is used by the pim_include(S,G) macro
   described in Section 4.1.6.

   PIM (S,G) Join/Prune state is the result of receiving PIM (S,G)
   Join/Prune messages on this interface and is specified in Section
   4.5.3.  The state is used by the macros that calculate the outgoing
   interface list in Section 4.1.6, and in the JoinDesired(S,G) macro
   (defined in Section 4.5.7) that is used in deciding whether a
   Join(S,G) should be sent upstream.

   (S,G) Assert Winner state is the result of sending or receiving (S,G)
   Assert messages on this interface.  It is specified in Section 4.6.1.

   The upstream (S,G) Join/Prune State reflects the state of the
   upstream (S,G) state machine described in Section 4.5.7.

   The upstream (S,G) Join/Prune Timer is used to send out periodic
   Join(S,G) messages, and to override Prune(S,G) messages from peers on
   an upstream LAN interface.

   The last RPF neighbor towards S is stored because if the MRIB
   changes, then the RPF neighbor towards S may change.  If it does so,
   then we need to trigger a new Join(S,G) to the new upstream neighbor
   and a Prune(S,G) to the old upstream neighbor.  Similarly, if the
   router detects through a changed GenID in a Hello message that the
   upstream neighbor towards S has rebooted, then it should re-
   instantiate state by sending a Join(S,G).  These mechanisms are
   specified in Section 4.5.7.

   The SPTbit is used to indicate whether forwarding is taking place on
   the (S,G) Shortest Path Tree (SPT) or on the (*,G) tree.  A router
   can have (S,G) state and still be forwarding on (*,G) state during
   the interval when the source-specific tree is being constructed.
   When SPTbit is FALSE, only (*,G) forwarding state is used to forward
   packets from S to G.  When SPTbit is TRUE, both (*,G) and (S,G)
   forwarding state are used.

   The (S,G) Keepalive Timer is updated by data being forwarded using
   this (S,G) forwarding state.  It is used to keep (S,G) state alive in
   the absence of explicit (S,G) Joins.  Amongst other things, this is
   necessary for the so-called "turnaround rules" -- when the RP uses
   (S,G) joins to stop encapsulation, and then (S,G) prunes to prevent
   traffic from unnecessarily reaching the RP.

   On a DR, the (S,G) Register State is used to keep track of whether to
   encapsulate data to the RP on the Register Tunnel; the (S,G)
   Register-Stop timer tracks how long before encapsulation begins again
   for a given (S,G).

   On an RP, the PMBR value must be cleared when the Keepalive Timer
   expires.

4.1.5. (S,G,rpt) State

   For every source/group pair (S,G) for which a router also has (*,G)
   state, it also keeps the following state:

   (S,G,rpt) state:

        For each interface:

             Local Membership:
                  State: One of {"NoInfo", "Exclude"}

             PIM (S,G,rpt) Join/Prune State:

                  o State: One of {"NoInfo", "Pruned", "Prune-Pending"}

                  o Prune-Pending Timer (PPT)

                  o Join/Prune Expiry Timer (ET)

        Not interface specific:

             Upstream (S,G,rpt) Join/Prune State:

                  o State: One of {"RPTNotJoined(G)",
                    "NotPruned(S,G,rpt)", "Pruned(S,G,rpt)"}

                  o Override Timer (OT)

   Local membership is the result of the local source-specific
   membership mechanism (such as IGMPv3) running on that interface and
   specifying that although there is (*,G) Include state, this
   particular source should be excluded.  As stored here, this state is
   the resulting state after any IGMPv3 inconsistencies between LAN
   members have been resolved.  It need not be kept if this router is
   not the DR on that interface unless this router won a (*,G) assert on
   this interface for this group.  However, we recommend storing this
   information if possible, as it reduces latency converging to stable
   operating conditions after a failure causing a change of DR.  This
   information is used by the pim_exclude(S,G) macro described in
   Section 4.1.6.

   PIM (S,G,rpt) Join/Prune state is the result of receiving PIM
   (S,G,rpt) Join/Prune messages on this interface and is specified in
   Section 4.5.4.  The state is used by the macros that calculate the
   outgoing interface list in Section 4.1.6, and in the rules for adding
   Prune(S,G,rpt) messages to Join(*,G) messages specified in Section
   4.5.8.

   The upstream (S,G,rpt) Join/Prune state is used along with the
   Override Timer to send the correct override messages in response to
   Join/Prune messages sent by upstream peers on a LAN.  This state and
   behavior are specified in Section 4.5.9.

4.1.6. State Summarization Macros

   Using this state, we define the following "macro" definitions, which
   we will use in the descriptions of the state machines and pseudocode
   in the following sections.

   The most important macros are those that define the outgoing
   interface list (or "olist") for the relevant state.  An olist can be
   "immediate" if it is built directly from the state of the relevant
   type.  For example, the immediate_olist(S,G) is the olist that would
   be built if the router only had (S,G) state and no (*,G) or (S,G,rpt)
   state.  In contrast, the "inherited" olist inherits state from other
   types.  For example, the inherited_olist(S,G) is the olist that is
   relevant for forwarding a packet from S to G using both
   source-specific and group-specific state.

   There is no immediate_olist(S,G,rpt) as (S,G,rpt) state is negative
   state; it removes interfaces in the (*,G) olist from the olist that
   is actually used to forward traffic.  The inherited_olist(S,G,rpt) is
   therefore the olist that would be used for a packet from S to G
   forwarding on the RP tree.  It is a strict subset of
   (immediate_olist(*,*,RP) (+) immediate_olist(*,G)).

   Generally speaking, the inherited olists are used for forwarding, and
   the immediate_olists are used to make decisions about state
   maintenance.

   immediate_olist(*,*,RP) =
       joins(*,*,RP)

   immediate_olist(*,G) =
       joins(*,G) (+) pim_include(*,G) (-) lost_assert(*,G)

   immediate_olist(S,G) =
       joins(S,G) (+) pim_include(S,G) (-) lost_assert(S,G)

   inherited_olist(S,G,rpt) =
           ( joins(*,*,RP(G)) (+) joins(*,G) (-) prunes(S,G,rpt) )
       (+) ( pim_include(*,G) (-) pim_exclude(S,G))
       (-) ( lost_assert(*,G) (+) lost_assert(S,G,rpt) )

   inherited_olist(S,G) =
       inherited_olist(S,G,rpt) (+)
       joins(S,G) (+) pim_include(S,G) (-) lost_assert(S,G)
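
   The olist macros above map directly onto set operations.  The sketch
   below is illustrative only: it takes each input set as an argument
   rather than deriving it from TIB state, the function and parameter
   names are assumptions, and it assumes that (+) and (-) are applied
   left to right where no parentheses are given.  Note that in Python
   "-" binds more tightly than "|", so the grouping has to be written
   out explicitly.

```python
from typing import Set

Ifaces = Set[str]


def immediate_olist(joins: Ifaces, pim_include: Ifaces,
                    lost_assert: Ifaces) -> Ifaces:
    """joins (+) pim_include (-) lost_assert, applied left to right."""
    return (joins | pim_include) - lost_assert


def inherited_olist_sg_rpt(joins_rp: Ifaces, joins_g: Ifaces,
                           prunes_sg_rpt: Ifaces, pim_include_g: Ifaces,
                           pim_exclude_sg: Ifaces, lost_assert_g: Ifaces,
                           lost_assert_sg_rpt: Ifaces) -> Ifaces:
    """olist for a packet from S to G forwarded on the RP tree."""
    rp_tree = (joins_rp | joins_g) - prunes_sg_rpt   # downstream PIM joins
    local = pim_include_g - pim_exclude_sg           # local members
    lost = lost_assert_g | lost_assert_sg_rpt        # interfaces lost to asserts
    return (rp_tree | local) - lost


def inherited_olist_sg(olist_sg_rpt: Ifaces, joins_sg: Ifaces,
                       pim_include_sg: Ifaces,
                       lost_assert_sg: Ifaces) -> Ifaces:
    """inherited_olist(S,G,rpt) (+) joins(S,G) (+) pim_include(S,G) (-) ..."""
    return (olist_sg_rpt | joins_sg | pim_include_sg) - lost_assert_sg
```

   For example, an interface with a (*,G) Join and an (S,G,rpt) Prune is
   absent from inherited_olist(S,G,rpt) but reappears in
   inherited_olist(S,G) if it also has an (S,G) Join.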

   The macros pim_include(*,G) and pim_include(S,G) indicate the
   interfaces to which traffic might be forwarded because of hosts that
   are local members on that interface.  Note that normally only the DR
   cares about local membership, but when an assert happens, the assert
   winner takes over responsibility for forwarding traffic to local
   members that have requested traffic on a group or source/group pair.

   pim_include(*,G) =
      { all interfaces I such that:
        ( ( I_am_DR( I ) AND lost_assert(*,G,I) == FALSE )
          OR AssertWinner(*,G,I) == me )
        AND  local_receiver_include(*,G,I) }

   pim_include(S,G) =
       { all interfaces I such that:
         ( (I_am_DR( I ) AND lost_assert(S,G,I) == FALSE )
           OR AssertWinner(S,G,I) == me )
          AND  local_receiver_include(S,G,I) }

   pim_exclude(S,G) =
       { all interfaces I such that:
         ( (I_am_DR( I ) AND lost_assert(*,G,I) == FALSE )
           OR AssertWinner(*,G,I) == me )
          AND  local_receiver_exclude(S,G,I) }

   The clause "local_receiver_include(S,G,I)" is true if the IGMP/MLD
   module or other local membership mechanism has determined that local
   members on interface I desire to receive traffic sent specifically by
   S to G.  "local_receiver_include(*,G,I)" is true if the IGMP/MLD
   module or other local membership mechanism has determined that local
   members on interface I desire to receive all traffic sent to G
   (possibly excluding traffic from a specific set of sources).
   "local_receiver_exclude(S,G,I)" is true if
   "local_receiver_include(*,G,I)" is true but none of the local members
   desire to receive traffic from S.
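
   As a sketch of how pim_include(*,G) could be evaluated, the code
   below filters interfaces on the four conditions used in the macro.
   The IfaceFlags record and its field names are assumptions of this
   sketch; a real implementation would derive these values from the DR
   election, the Assert state machine, and the local membership
   mechanism.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class IfaceFlags:
    """Hypothetical per-interface inputs for one group G."""
    i_am_dr: bool                  # I_am_DR( I )
    lost_assert: bool              # lost_assert(*,G,I)
    assert_winner_is_me: bool      # AssertWinner(*,G,I) == me
    local_receiver_include: bool   # local_receiver_include(*,G,I)


def pim_include_star_g(flags: Dict[str, IfaceFlags]) -> Set[str]:
    """Interfaces on which this router forwards for local (*,G) members."""
    return {
        name
        for name, f in flags.items()
        if ((f.i_am_dr and not f.lost_assert) or f.assert_winner_is_me)
        and f.local_receiver_include
    }
```

   The same shape applies to pim_include(S,G) and pim_exclude(S,G), with
   the corresponding assert state and local membership predicate
   substituted.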

   The set "joins(*,*,RP)" is the set of all interfaces on which the
   router has received (*,*,RP) Joins:

   joins(*,*,RP) =
       { all interfaces I such that
         DownstreamJPState(*,*,RP,I) is either Join or
             Prune-Pending }

   DownstreamJPState(*,*,RP,I) is the state of the finite state machine
   in Section 4.5.1.

   The set "joins(*,G)" is the set of all interfaces on which the router
   has received (*,G) Joins:

   joins(*,G) =
       { all interfaces I such that
         DownstreamJPState(*,G,I) is either Join or Prune-Pending }

   DownstreamJPState(*,G,I) is the state of the finite state machine in
   Section 4.5.2.

   The set "joins(S,G)" is the set of all interfaces on which the router
   has received (S,G) Joins:

   joins(S,G) =
       { all interfaces I such that
         DownstreamJPState(S,G,I) is either Join or Prune-Pending }

   DownstreamJPState(S,G,I) is the state of the finite state machine in
   Section 4.5.3.
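
   The three joins() sets share one shape, sketched below under the
   assumption that the downstream state machine for the relevant state
   type is available as a mapping from interface name to state name.
   The function name and the mapping are illustrative only.

```python
from typing import Dict, Set


def joins(downstream_jp_state: Dict[str, str]) -> Set[str]:
    """Interfaces whose DownstreamJPState is Join or Prune-Pending."""
    return {
        iface
        for iface, state in downstream_jp_state.items()
        if state in ("Join", "Prune-Pending")
    }
```

   An interface in Prune-Pending state still counts as joined, so
   traffic continues to flow there until the Prune-Pending Timer
   expires.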

   The set "prunes(S,G,rpt)" is the set of all interfaces on which the
   router has received (*,G) joins and (S,G,rpt) prunes.

   prunes(S,G,rpt) =
       { all interfaces I such that
         DownstreamJPState(S,G,rpt,I) is Prune or PruneTmp }

   DownstreamJPState(S,G,rpt,I) is the state of the finite state machine
   in Section 4.5.4.

   The set "lost_assert(*,G)" is the set of all interfaces on which the
   router has received (*,G) joins but has lost a (*,G) assert.  The
   macro lost_assert(*,G,I) is defined in Section 4.6.5.

   lost_assert(*,G) =
       { all interfaces I such that
         lost_assert(*,G,I) == TRUE }

   The set "lost_assert(S,G,rpt)" is the set of all interfaces on which
   the router has received (*,G) joins but has lost an (S,G) assert.
   The macro lost_assert(S,G,rpt,I) is defined in Section 4.6.5.

   lost_assert(S,G,rpt) =
       { all interfaces I such that
         lost_assert(S,G,rpt,I) == TRUE }

   The set "lost_assert(S,G)" is the set of all interfaces on which the
   router has received (S,G) joins but has lost an (S,G) assert.  The
   macro lost_assert(S,G,I) is defined in Section 4.6.5.

   lost_assert(S,G) =
       { all interfaces I such that
         lost_assert(S,G,I) == TRUE }

   The following pseudocode macro definitions are also used in many
   places in the specification.  Basically, RPF' is the RPF neighbor
   towards an RP or source unless a PIM-Assert has overridden the normal
   choice of neighbor.

     neighbor RPF'(*,G) {
         if ( I_Am_Assert_Loser(*, G, RPF_interface(RP(G))) ) {
              return AssertWinner(*, G, RPF_interface(RP(G)) )
         } else {
              return NBR( RPF_interface(RP(G)), MRIB.next_hop( RP(G) ) )
         }
     }

     neighbor RPF'(S,G,rpt) {
         if( I_Am_Assert_Loser(S, G, RPF_interface(RP(G)) ) ) {
              return AssertWinner(S, G, RPF_interface(RP(G)) )
         } else {
              return RPF'(*,G)
         }
     }

     neighbor RPF'(S,G) {
         if ( I_Am_Assert_Loser(S, G, RPF_interface(S) )) {
              return AssertWinner(S, G, RPF_interface(S) )
         } else {
              return NBR( RPF_interface(S), MRIB.next_hop( S ) )
         }
     }

   RPF'(*,G) and RPF'(S,G) indicate the neighbor from which data packets
   should be coming and to which joins should be sent on the RP tree and
   SPT, respectively.

   RPF'(S,G,rpt) is basically RPF'(*,G) modified by the result of an
   Assert(S,G) on RPF_interface(RP(G)).  In such a case, packets from S
   will be originating from a different router than RPF'(*,G).  If we
   only have active (*,G) Join state, we need to accept packets from
   RPF'(S,G,rpt) and add a Prune(S,G,rpt) to the periodic Join(*,G)
   messages that we send to RPF'(*,G) (see Section 4.5.8).

   The function MRIB.next_hop( S ) returns an address of the next-hop
   PIM neighbor toward the host S, as indicated by the current MRIB.  If
   S is directly adjacent, then MRIB.next_hop( S ) returns NULL.  At the
   RP for G, MRIB.next_hop( RP(G)) returns NULL.

   The function NBR( I, A ) uses information gathered through PIM Hello
   messages to map the IP address A of a directly connected PIM neighbor
   router on interface I to the primary IP address of the same router
   (Section 4.3.4).  The primary IP address of a neighbor is the address
   that it uses as the source of its PIM Hello messages.  Note that a
   neighbor's IP address may be non-unique within the PIM neighbor
   database due to scope issues.  The address must, however, be unique
   amongst the addresses of all the PIM neighbors on a specific
   interface.

   I_Am_Assert_Loser(S, G, I) is true if the Assert state machine (in
   Section 4.6.1) for (S,G) on Interface I is in "I am Assert Loser"
   state.

   I_Am_Assert_Loser(*, G, I) is true if the Assert state machine (in
   Section 4.6.2) for (*,G) on Interface I is in "I am Assert Loser"
   state.
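
   The RPF' macros all follow one pattern: use the assert winner if this
   router lost an assert on the RPF interface, otherwise fall back to
   the normal choice.  The sketch below shows that pattern with the
   inputs passed in directly; the function and parameter names are
   assumptions of this sketch, and None stands in for the NULL that
   NBR/MRIB.next_hop would yield for a directly adjacent source or at
   the RP.

```python
from typing import Optional


def rpf_prime(i_am_assert_loser: bool, assert_winner: Optional[str],
              mrib_neighbor: Optional[str]) -> Optional[str]:
    """Common shape of RPF'(*,G) and RPF'(S,G)."""
    return assert_winner if i_am_assert_loser else mrib_neighbor


def rpf_prime_sg_rpt(lost_sg_assert_on_rp_iface: bool,
                     sg_assert_winner: Optional[str],
                     rpf_prime_star_g: Optional[str]) -> Optional[str]:
    """RPF'(S,G,rpt): RPF'(*,G) unless an (S,G) assert overrides it."""
    if lost_sg_assert_on_rp_iface:
        return sg_assert_winner
    return rpf_prime_star_g
```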

4.2. Data Packet Forwarding Rules

   The PIM-SM packet forwarding rules are defined below in pseudocode.

      iif is the incoming interface of the packet.
      S is the source address of the packet.
      G is the destination address of the packet (group address).
      RP is the address of the Rendezvous Point for this group.
      RPF_interface(S) is the interface the MRIB indicates would be used
      to route packets to S.
      RPF_interface(RP) is the interface the MRIB indicates would be
      used to route packets to RP, except at the RP when it is the
      decapsulation interface (the "virtual" interface on which register
      packets are received).

   First, we restart (or start) the Keepalive Timer if the source is on
   a directly connected subnet.

   Second, we check to see if the SPTbit should be set because we've now
   switched from the RP tree to the SPT.

   Next, we check to see whether the packet should be accepted based on
   TIB state and the interface that the packet arrived on.

   If the packet should be forwarded using (S,G) state, we then build an
   outgoing interface list for the packet.  If this list is not empty,
   then we restart the (S,G) state Keepalive Timer.

   If the packet should be forwarded using (*,*,RP) or (*,G) state, then
   we just build an outgoing interface list for the packet.  We also
   check if we should initiate a switch to start receiving this source
   on a shortest path tree.

   Finally we remove the incoming interface from the outgoing interface
   list we've created, and if the resulting outgoing interface list is
   not empty, we forward the packet out of those interfaces.

   On receipt of data from S to G on interface iif:
   if( DirectlyConnected(S) == TRUE AND iif == RPF_interface(S) ) {
        set KeepaliveTimer(S,G) to Keepalive_Period
        # Note: a register state transition or UpstreamJPState(S,G)
        # transition may happen as a result of restarting
        # KeepaliveTimer, and must be dealt with here.
   }

   if( iif == RPF_interface(S) AND UpstreamJPState(S,G) == Joined AND
      inherited_olist(S,G) != NULL ) {
          set KeepaliveTimer(S,G) to Keepalive_Period
   }

   Update_SPTbit(S,G,iif)
   oiflist = NULL

   if( iif == RPF_interface(S) AND SPTbit(S,G) == TRUE ) {
      oiflist = inherited_olist(S,G)
   } else if( iif == RPF_interface(RP(G)) AND SPTbit(S,G) == FALSE) {
     oiflist = inherited_olist(S,G,rpt)
     CheckSwitchToSpt(S,G)
   } else {
      # Note: RPF check failed
      # A transition in an Assert FSM may cause an Assert(S,G)
      # or Assert(*,G) message to be sent out interface iif.
      # See section 4.6 for details.
      if ( SPTbit(S,G) == TRUE AND iif is in inherited_olist(S,G) ) {
         send Assert(S,G) on iif
      } else if ( SPTbit(S,G) == FALSE AND
                  iif is in inherited_olist(S,G,rpt) ) {
         send Assert(*,G) on iif
      }
   }

   oiflist = oiflist (-) iif
   forward packet on all interfaces in oiflist

   This pseudocode employs several "macro" definitions:

   DirectlyConnected(S) is TRUE if the source S is on any subnet that is
   directly connected to this router (or for packets originating on this
   router).

   inherited_olist(S,G) and inherited_olist(S,G,rpt) are defined in
   Section 4.1.6.

   Basically, inherited_olist(S,G) is the outgoing interface list for
   packets forwarded on (S,G) state, taking into account (*,*,RP) state,
   (*,G) state, asserts, etc.

   inherited_olist(S,G,rpt) is the outgoing interface list for packets
   forwarded on (*,*,RP) or (*,G) state, taking into account (S,G,rpt)
   prune state, asserts, etc.

   Update_SPTbit(S,G,iif) is defined in Section 4.2.2.

   CheckSwitchToSpt(S,G) is defined in Section 4.2.1.

   UpstreamJPState(S,G) is the state of the finite state machine in
   Section 4.5.7.

   Keepalive_Period is defined in Section 4.10.

   Data-triggered PIM-Assert messages sent from the above forwarding
   code should be rate-limited in an implementation-dependent manner.
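
   The interface-selection part of the forwarding pseudocode can be
   sketched as below.  This is an illustration under stated assumptions:
   the function name and its return shape are invented for the sketch,
   the SPTbit is taken as already updated, and the Keepalive Timer
   handling, CheckSwitchToSpt, and assert rate-limiting are left out.

```python
from typing import Optional, Set, Tuple


def select_oiflist(iif: str, rpf_iface_s: str, rpf_iface_rp: str,
                   sptbit: bool, olist_sg: Set[str],
                   olist_sg_rpt: Set[str]) -> Tuple[Set[str], Optional[str]]:
    """Return (interfaces to forward on, assert to send on iif or None)."""
    oiflist: Set[str] = set()
    assert_msg: Optional[str] = None
    if iif == rpf_iface_s and sptbit:
        oiflist = set(olist_sg)          # forwarding on the SPT
    elif iif == rpf_iface_rp and not sptbit:
        oiflist = set(olist_sg_rpt)      # forwarding on the RP tree
    else:
        # RPF check failed: the packet arrived on an interface we would
        # forward it out of, so an assert may be needed.
        if sptbit and iif in olist_sg:
            assert_msg = "Assert(S,G)"
        elif not sptbit and iif in olist_sg_rpt:
            assert_msg = "Assert(*,G)"
    oiflist.discard(iif)                 # never forward back out of iif
    return oiflist, assert_msg
```

   A packet that fails the RPF check is never forwarded; at most it
   causes an assert on the interface it arrived on.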

4.2.1. Last-Hop Switchover to the SPT

   In Sparse-Mode PIM, last-hop routers join the shared tree towards the
   RP.  Once traffic from sources to joined groups arrives at a last-hop
   router, it has the option of switching to receive the traffic on a
   shortest path tree (SPT).

   The decision for a router to switch to the SPT is controlled as
   follows:

     void
     CheckSwitchToSpt(S,G) {
       if ( ( pim_include(*,G) (-) pim_exclude(S,G)
              (+) pim_include(S,G) != NULL )
            AND SwitchToSptDesired(S,G) ) {
              # Note: Restarting the KAT will result in the SPT switch
              set KeepaliveTimer(S,G) to Keepalive_Period
       }
     }

   SwitchToSptDesired(S,G) is a policy function that is implementation
   defined.  An "infinite threshold" policy can be implemented by making
   SwitchToSptDesired(S,G) return false all the time.  A "switch on
   first packet" policy can be implemented by making
   SwitchToSptDesired(S,G) return true once a single packet has been
   received for the source and group.
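
   The two example policies and the switchover check could look like the
   sketch below.  The packet counter argument, the function names, and
   the boolean return value (standing in for "restart
   KeepaliveTimer(S,G)") are assumptions of this sketch; the set
   expression assumes (-) and (+) are applied left to right.

```python
from typing import Set


def infinite_threshold(packets_seen: int) -> bool:
    """Never switch: stay on the RP tree regardless of traffic."""
    return False


def switch_on_first_packet(packets_seen: int) -> bool:
    """Switch once a single packet from S to G has been received."""
    return packets_seen >= 1


def check_switch_to_spt(pim_include_g: Set[str], pim_exclude_sg: Set[str],
                        pim_include_sg: Set[str],
                        switch_desired: bool) -> bool:
    """Return True when KeepaliveTimer(S,G) should be restarted."""
    # Only a router with local members for this source may initiate
    # the switch; a pure transit router never does.
    local_members = (pim_include_g - pim_exclude_sg) | pim_include_sg
    return bool(local_members) and switch_desired
```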

4.2.2. Setting and Clearing the (S,G) SPTbit

   The (S,G) SPTbit is used to distinguish whether to forward on
   (*,*,RP)/(*,G) or on (S,G) state.  When switching from the RP tree to
   the source tree, there is a transition period when data is arriving
   due to upstream (*,*,RP)/(*,G) state while upstream (S,G) state is
   being established, during which time a router should continue to
   forward only on (*,*,RP)/(*,G) state.  This prevents temporary
   black-holes that would be caused by sending a Prune(S,G,rpt) before
   the upstream (S,G) state has finished being established.

   Thus, when a packet arrives, the (S,G) SPTbit is updated as follows:

     void
     Update_SPTbit(S,G,iif) {
       if ( iif == RPF_interface(S)
             AND JoinDesired(S,G) == TRUE
             AND ( DirectlyConnected(S) == TRUE
                   OR RPF_interface(S) != RPF_interface(RP(G))
                   OR inherited_olist(S,G,rpt) == NULL
                   OR ( ( RPF'(S,G) == RPF'(*,G) ) AND
                        ( RPF'(S,G) != NULL ) )
                   OR ( I_Am_Assert_Loser(S,G,iif) ) ) ) {
          Set SPTbit(S,G) to TRUE
       }
     }

   Additionally, a router can set SPTbit(S,G) to TRUE in other cases,
   such as when it receives an Assert(S,G) on RPF_interface(S) (see
   Section 4.6.1).

   JoinDesired(S,G) is defined in Section 4.5.7 and indicates whether we
   have the appropriate (S,G) Join state to wish to send a Join(S,G)
   upstream.

   Basically, Update_SPTbit will set the SPTbit if we have the
   appropriate (S,G) join state, and if the packet arrived on the
   correct upstream interface for S, and if one or more of the following
   conditions applies:

   1.  The source is directly connected, in which case the switch to the
       SPT is a no-op.

   2.  The RPF interface to S is different from the RPF interface to the
       RP.  The packet arrived on RPF_interface(S), and so the SPT must
       have been completed.

   3.  No one wants the packet on the RP tree.

   4.  RPF'(S,G) == RPF'(*,G).  In this case, the router will never be
       able to tell if the SPT has been completed, so it should just
       switch immediately.

   In the case where the RPF interface is the same for the RP and for S,
   but RPF'(S,G) and RPF'(*,G) differ, we wait for an Assert(S,G), which
   indicates that the upstream router with (S,G) state believes the SPT
   has been completed.  However, item (3) above is needed because there
   may not be any (*,G) state to trigger an Assert(S,G) to happen.

   The SPTbit is cleared in the (S,G) upstream state machine (see
   Section 4.5.7) when JoinDesired(S,G) becomes FALSE.
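
   The Update_SPTbit test can be sketched as a pure function over its
   inputs.  The function signature and parameter names below are
   assumptions of this sketch; None stands in for a NULL neighbor and an
   empty set for a NULL olist.  The function only ever sets the bit;
   clearing is left to the (S,G) upstream state machine.

```python
from typing import Optional, Set


def update_sptbit(
    sptbit: bool,
    *,
    iif: str,
    rpf_iface_s: str,
    rpf_iface_rp: str,
    join_desired: bool,
    directly_connected: bool,
    olist_sg_rpt: Set[str],
    rpf_prime_sg: Optional[str],
    rpf_prime_star_g: Optional[str],
    i_am_assert_loser: bool,
) -> bool:
    """Return the new value of SPTbit(S,G) after a packet arrives on iif."""
    spt_complete = (
        directly_connected                       # switch is a no-op
        or rpf_iface_s != rpf_iface_rp           # arrived via the SPT
        or not olist_sg_rpt                      # nobody on the RP tree
        or (rpf_prime_sg == rpf_prime_star_g and rpf_prime_sg is not None)
        or i_am_assert_loser                     # an Assert(S,G) was seen
    )
    if iif == rpf_iface_s and join_desired and spt_complete:
        return True
    return sptbit
```

   When the RP and S share an RPF interface but have different upstream
   neighbors, and none of the other conditions holds, the function
   leaves the bit unchanged, which corresponds to waiting for an
   Assert(S,G).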

4.3. Designated Routers (DR) and Hello Messages

   A shared-media LAN like Ethernet may have multiple PIM-SM routers
   connected to it.  A single one of these routers, the DR, will act on
   behalf of directly connected hosts with respect to the PIM-SM
   protocol.  Because the distinction between LANs and point-to-point
   interfaces can sometimes be blurred, and because routers may also
   have multicast host functionality, the PIM-SM specification makes no
   distinction between the two.  Thus, DR election will happen on all
   interfaces, LAN or otherwise.

   DR election is performed using Hello messages.  Hello messages are
   also the way that option negotiation takes place in PIM, so that
   additional functionality can be enabled, or parameters tuned.

4.3.1. Sending Hello Messages

   PIM Hello messages are sent periodically on each PIM-enabled
   interface.  They allow a router to learn about the neighboring PIM
   routers on each interface.  Hello messages are also the mechanism
   used to elect a Designated Router (DR), and to negotiate additional
   capabilities.  A router must record the Hello information received
   from each PIM neighbor.

   Hello messages MUST be sent on all active interfaces, including
   physical point-to-point links, and are multicast to the
   'ALL-PIM-ROUTERS' group address ('224.0.0.13' for IPv4 and 'ff02::d'
   for IPv6).

     We note that some implementations do not send Hello messages on
     point-to-point interfaces.  This is non-compliant behavior.  A
     compliant PIM router MUST send Hello messages, even on
     point-to-point interfaces.

   A per-interface Hello Timer (HT(I)) is used to trigger sending Hello
   messages on each active interface.  When PIM is enabled on an
   interface or a router first starts, the Hello Timer of that interface
   is set to a random value between 0 and Triggered_Hello_Delay.  This
   prevents synchronization of Hello messages if multiple routers are
   powered on simultaneously.  After the initial randomized interval,
   Hello messages must be sent every Hello_Period seconds.  The Hello
   Timer should not be reset except when it expires.
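
   The Hello Timer behavior described above can be sketched as follows.
   This is an illustration only, not part of the specification: the
   function name hello_send_times is hypothetical, and the constants are
   assumed to be the defaults listed in Section 4.10.

```python
import random

HELLO_PERIOD = 30.0            # seconds; assumed default (see Section 4.10)
TRIGGERED_HELLO_DELAY = 5.0    # seconds; assumed default (see Section 4.10)


def hello_send_times(start, count, rng=random):
    """Return the first `count` Hello transmission times for one interface.

    The first Hello goes out after a random delay in
    [0, TRIGGERED_HELLO_DELAY]; each later Hello follows HELLO_PERIOD
    seconds after the previous one.  The timer is never reset early.
    """
    first = start + rng.uniform(0.0, TRIGGERED_HELLO_DELAY)
    return [first + n * HELLO_PERIOD for n in range(count)]
```

   Two routers powered on at the same instant draw independent initial
   delays, so their periodic Hello messages stay offset from each other.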

   Note that neighbors will not accept Join/Prune or Assert messages
   from a router unless they have first heard a Hello message from that
   router.  Thus, if a router needs to send a Join/Prune or Assert
   message on an interface on which it has not yet sent a Hello message
   with the currently configured IP address, then it MUST immediately
   send the relevant Hello message without waiting for the Hello Timer
   to expire, followed by the Join/Prune or Assert message.
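
   One possible way to enforce the "Hello first" rule above is to track,
   per interface, the address most recently announced in a Hello.  The
   class HelloGate and its tuple-based message representation are
   hypothetical and exist only for this sketch; a real implementation
   would also record Hello messages sent by the periodic Hello Timer.

```python
class HelloGate:
    """Tracks, per interface, the address last announced in a Hello."""

    def __init__(self):
        # interface name -> IP address used in the last Hello sent there
        self._announced = {}

    def messages_to_send(self, interface, address, message):
        """Return the messages to transmit, in order, for `message`.

        A Hello is prepended when no Hello has yet been sent on this
        interface with the currently configured address.
        """
        out = []
        if self._announced.get(interface) != address:
            out.append(("Hello", interface, address))
            self._announced[interface] = address
        out.append((message, interface, address))
        return out
```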

   The DR_Priority Option allows a network administrator to give
   preference to a particular router in the DR election process by
   giving it a numerically larger DR Priority.  The DR_Priority Option
   SHOULD be included in every Hello message, even if no DR Priority is
   explicitly configured on that interface.  This is necessary because
   priority-based DR election is only enabled when all neighbors on an
   interface advertise that they are capable of using the DR_Priority
   Option.  The default priority is 1.

   The Generation_Identifier (GenID) Option SHOULD be included in all
   Hello messages.  The GenID option contains a randomly generated
   32-bit value that is regenerated each time PIM forwarding is started
   or restarted on the interface, including when the router itself
   restarts.  When a Hello message with a new GenID is received from a
   neighbor, any old Hello information about that neighbor SHOULD be
   discarded and superseded by the information from the new Hello
   message.  This may cause a new DR to be chosen on that interface.

   The LAN Prune Delay Option SHOULD be included in all Hello messages
   sent on multi-access LANs.  This option advertises a router's
   capability to use values other than the defaults for the
   Propagation_Delay and Override_Interval, which affect the setting of
   the Prune-Pending, Upstream Join, and Override Timers (defined in
   Section 4.10).

   The Address List Option advertises all the secondary addresses
   associated with the source interface of the router originating the
   message.  The option MUST be included in all Hello messages if there
   are secondary addresses associated with the source interface and MAY
   be omitted if no secondary addresses exist.

   To allow new or rebooting routers to learn of PIM neighbors quickly,
   when a Hello message is received from a new neighbor, or a Hello
   message with a new GenID is received from an existing neighbor, a new
   Hello message should be sent on this interface after a randomized
   delay between 0 and Triggered_Hello_Delay.  This triggered message
   need not change the timing of the scheduled periodic message.  If a
   router needs to send a Join/Prune to the new neighbor or send an
   Assert message in response to an Assert message from the new neighbor
   before this randomized delay has expired, then it MUST immediately
   send the relevant Hello message without waiting for the Hello Timer
   to expire, followed by the Join/Prune or Assert message.  If it does
   not do this, then the new neighbor will discard the Join/Prune or
   Assert message.

   Before an interface goes down or changes primary IP address, a Hello
   message with a zero HoldTime should be sent immediately (with the old
   IP address if the IP address changed).  This will cause PIM neighbors
   to remove this neighbor (or its old IP address) immediately.  After
   an interface has changed its IP address, it MUST send a Hello message
   with its new IP address.  If an interface changes one of its
   secondary IP addresses, a Hello message with an updated Address_List
   option and a non-zero HoldTime should be sent immediately.  This will
   cause PIM neighbors to update this neighbor's list of secondary
   addresses immediately.

4.3.2. DR Election

   When a PIM Hello message is received on interface I, the following
   information about the sending neighbor is recorded:

     neighbor.interface
          The interface on which the Hello message arrived.

     neighbor.primary_ip_address
          The IP address that the PIM neighbor used as the source
          address of the Hello message.

     neighbor.genid
          The Generation ID of the PIM neighbor.

     neighbor.dr_priority
          The DR Priority field of the PIM neighbor, if it is present in
          the Hello message.

     neighbor.dr_priority_present
          A flag indicating if the DR Priority field was present in the
          Hello message.

     neighbor.timeout
          A timer value to time out the neighbor state when it becomes
          stale, also known as the Neighbor Liveness Timer.

          The Neighbor Liveness Timer (NLT(N,I)) is reset to
          Hello_Holdtime (from the Hello Holdtime option) whenever a
          Hello message is received containing a Holdtime option, or to
          Default_Hello_Holdtime if the Hello message does not contain
          the Holdtime option.

          Neighbor state is deleted when the neighbor timeout expires.
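
   The neighbor bookkeeping described so far (Holdtime handling, GenID
   changes, and expiry of the Neighbor Liveness Timer) might be modeled
   as below.  This is a simplified sketch: the names Neighbor,
   process_hello, and expire_neighbors are hypothetical, time is passed
   in explicitly as seconds, and the value of Default_Hello_Holdtime is
   assumed from Section 4.10.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_HELLO_HOLDTIME = 105.0   # seconds; assumed default (see Section 4.10)


@dataclass
class Neighbor:
    primary_ip_address: str
    genid: Optional[int] = None
    expires_at: float = 0.0      # when the Neighbor Liveness Timer fires


def process_hello(neighbors, src, now, holdtime=None, genid=None):
    """Update `neighbors` (dict keyed by source address) for one Hello."""
    if holdtime == 0:
        # A zero HoldTime asks receivers to remove the sender immediately.
        neighbors.pop(src, None)
        return None
    nbr = neighbors.get(src)
    if nbr is None or nbr.genid != genid:
        # New neighbor, or new GenID: discard any old Hello information.
        nbr = Neighbor(primary_ip_address=src, genid=genid)
        neighbors[src] = nbr
    # Reset the liveness timer from the Holdtime option, or the default.
    nbr.expires_at = now + (DEFAULT_HELLO_HOLDTIME if holdtime is None else holdtime)
    return nbr


def expire_neighbors(neighbors, now):
    """Delete neighbor state whose liveness timer has expired."""
    for src in [s for s, n in neighbors.items() if n.expires_at <= now]:
        del neighbors[src]
```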

   The function for computing the DR on interface I is:

     host
     DR(I) {
         dr = me
         for each neighbor on interface I {
             if ( dr_is_better( neighbor, dr, I ) == TRUE ) {
                 dr = neighbor
             }
         }
         return dr
     }

   The function used for comparing DR "metrics" on interface I is:

     bool
     dr_is_better(a,b,I) {
         if( there is a neighbor n on I for which n.dr_priority_present
                 is false ) {
             return a.primary_ip_address > b.primary_ip_address
         } else {
             return ( a.dr_priority > b.dr_priority ) OR
                    ( a.dr_priority == b.dr_priority AND
                      a.primary_ip_address > b.primary_ip_address )
         }
     }

   The trivial function I_am_DR(I) is defined to aid readability:

     bool
     I_am_DR(I) {
        return DR(I) == me
     }

   The DR Priority is a 32-bit unsigned number, and the numerically
   larger priority is always preferred.  A router's idea of the current
   DR on an interface can change when a PIM Hello message is received,
   when a neighbor times out, or when a router's own DR Priority
   changes.  If the router becomes the DR or ceases to be the DR, this
   will normally cause the DR Register state machine to change state.
   Subsequent actions are determined by that state machine.

     We note that some PIM implementations do not send Hello messages on
     point-to-point interfaces and thus cannot perform DR election on
     such interfaces.  This is non-compliant behavior.  DR election MUST
     be performed on ALL active PIM-SM interfaces.
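
   For illustration, the DR(I) and dr_is_better() functions above can be
   transcribed into runnable form roughly as follows.  The Router class
   and the name elect_dr are hypothetical; the logic is intended to
   mirror the pseudocode, not to replace it.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Router:
    primary_ip_address: str
    dr_priority: int = 1              # the default priority is 1
    dr_priority_present: bool = True


def dr_is_better(a, b, neighbors):
    """True if router `a` is a better DR candidate than router `b`."""
    addr_a = ipaddress.ip_address(a.primary_ip_address)
    addr_b = ipaddress.ip_address(b.primary_ip_address)
    if any(not n.dr_priority_present for n in neighbors):
        # Some neighbor lacks the DR_Priority option: compare addresses only.
        return addr_a > addr_b
    # Higher priority wins; the higher address breaks ties.
    return (a.dr_priority, addr_a) > (b.dr_priority, addr_b)


def elect_dr(me, neighbors):
    """Return the DR among `me` and the neighbors on one interface."""
    dr = me
    for neighbor in neighbors:
        if dr_is_better(neighbor, dr, neighbors):
            dr = neighbor
    return dr
```

   Addresses are parsed with the ipaddress module so that they compare
   numerically; comparing dotted-decimal strings would order
   "192.0.2.10" before "192.0.2.9".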

4.3.3. Reducing Prune Propagation Delay on LANs

   In addition to the information recorded for the DR Election, the
   following per neighbor information is obtained from the LAN Prune
   Delay Hello option:

     neighbor.lan_prune_delay_present
          A flag indicating if the LAN Prune Delay option was present in
          the Hello message.

     neighbor.tracking_support
          A flag storing the value of the T bit in the LAN Prune Delay
          option if it is present in the Hello message.  This indicates
          the neighbor's capability to disable Join message suppression.

     neighbor.propagation_delay
          The Propagation Delay field of the LAN Prune Delay option (if
          present) in the Hello message.

     neighbor.override_interval
          The Override_Interval field of the LAN Prune Delay option (if
          present) in the Hello message.

   The additional state described above is deleted along with the DR
   neighbor state when the neighbor timeout expires.

   Just like the DR_Priority option, the information provided in the LAN
   Prune Delay option is not used unless all neighbors on a link
   advertise the option.  The function below computes this state:

     bool
     lan_delay_enabled(I) {
         for each neighbor on interface I {
             if ( neighbor.lan_prune_delay_present == false ) {
                 return false
             }
         }
         return true
     }

   The Propagation Delay inserted by a router in the LAN Prune Delay
   option expresses the expected message propagation delay on the link
   and should be configurable by the system administrator.  It is used
   by upstream routers to figure out how long they should wait for a
   Join override message before pruning an interface.

   PIM implementers should enforce a lower bound on the permitted values
   for this delay to allow for scheduling and processing delays within
   their router.  Such delays may cause received messages to be
   processed later as well as triggered messages to be sent later than
   intended.  Setting this Propagation Delay to too low a value may
   result in temporary forwarding outages because a downstream router
   will not be able to override a neighbor's Prune message before the
   upstream neighbor stops forwarding.

   When all routers on a link are in a position to negotiate a
   Propagation Delay different from the default, the largest value from
   those advertised by each neighbor is chosen.  The function for
   computing the Effective_Propagation_Delay of interface I is:

     time_interval
     Effective_Propagation_Delay(I) {
         if ( lan_delay_enabled(I) == false ) {
             return Propagation_delay_default
         }
         delay = Propagation_Delay(I)
         for each neighbor on interface I {
             if ( neighbor.propagation_delay > delay ) {
                 delay = neighbor.propagation_delay
             }
         }
         return delay
     }
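
   As a worked illustration, the negotiation above reduces to "use the
   default unless every neighbor advertises the option, otherwise take
   the maximum".  The sketch below assumes delays expressed in seconds
   and a default of 0.5 seconds (Section 4.10); the LanNeighbor class is
   hypothetical.

```python
from dataclasses import dataclass

PROPAGATION_DELAY_DEFAULT = 0.5   # seconds; assumed default (see Section 4.10)


@dataclass
class LanNeighbor:
    lan_prune_delay_present: bool = False
    propagation_delay: float = 0.0


def lan_delay_enabled(neighbors):
    """True only if every neighbor sent the LAN Prune Delay option."""
    return all(n.lan_prune_delay_present for n in neighbors)


def effective_propagation_delay(my_delay, neighbors):
    """Largest advertised delay, or the default if negotiation is off."""
    if not lan_delay_enabled(neighbors):
        return PROPAGATION_DELAY_DEFAULT
    return max([my_delay] + [n.propagation_delay for n in neighbors])
```

   For example, with a local value of 0.6 and neighbors advertising 0.8
   and 0.3, the effective delay is 0.8; if any neighbor omits the
   option, the result falls back to the default regardless of the
   advertised values.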

   To avoid synchronization of override messages when multiple
   downstream routers share a multi-access link, sending of such
   messages is delayed by a small random amount of time.  The period of
   randomization should represent the size of the PIM router population
   on the link.  Each router expresses its view of the amount of
   randomization necessary in the Override Interval field of the LAN
   Prune Delay option.

   When all routers on a link are in a position to negotiate an Override
   Interval different from the default, the largest value from those
   advertised by each neighbor is chosen.  The function for computing
   the Effective Override Interval of interface I is:

     time_interval
     Effective_Override_Interval(I) {
         if ( lan_delay_enabled(I) == false ) {
             return t_override_default
         }
         delay = Override_Interval(I)
         for each neighbor on interface I {
             if ( neighbor.override_interval > delay ) {
                 delay = neighbor.override_interval
             }
         }
         return delay
     }

   Although the mechanisms are not specified in this document, it is
   possible for upstream routers to explicitly track the join membership
   of individual downstream routers if Join suppression is disabled.  A
   router can advertise its willingness to disable Join suppression by
   using the T bit in the LAN Prune Delay Hello option.  Unless all PIM
   routers on a link negotiate this capability, explicit tracking and
   the disabling of the Join suppression mechanism are not possible.
   The function for computing the state of Suppression on interface I
   is:

     bool
     Suppression_Enabled(I) {
         if ( lan_delay_enabled(I) == false ) {
             return true
         }
         for each neighbor on interface I {
             if ( neighbor.tracking_support == false ) {
                 return true
             }
         }
         return false
     }

   Note that the setting of Suppression_Enabled(I) affects the value of
   t_suppressed (see Section 4.10).
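
   The relationship between Suppression_Enabled(I) and t_suppressed can
   be sketched as follows.  The t_suppressed formula and the value of
   t_periodic are not defined in this section; they are assumed here
   from Section 4.10 and should be checked against it.  The LanNeighbor
   class is hypothetical.

```python
import random
from dataclasses import dataclass

T_PERIODIC = 60.0   # seconds; assumed default Join/Prune period


@dataclass
class LanNeighbor:
    lan_prune_delay_present: bool = False
    tracking_support: bool = False    # value of the T bit


def suppression_enabled(neighbors):
    """Join suppression stays on unless every neighbor sets the T bit."""
    if not all(n.lan_prune_delay_present for n in neighbors):
        return True
    return any(not n.tracking_support for n in neighbors)


def t_suppressed(neighbors, rng=random):
    """Suppression period: randomized when enabled, zero otherwise."""
    if suppression_enabled(neighbors):
        return rng.uniform(1.1 * T_PERIODIC, 1.4 * T_PERIODIC)
    return 0.0
```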

4.3.4. Maintaining Secondary Address Lists

   Communication of a router's interface secondary addresses to its PIM
   neighbors is necessary to provide the neighbors with a mechanism for
   mapping next_hop information obtained through their MRIB to a primary
   address that can be used as a destination for Join/Prune messages.
   The mapping is performed through the NBR macro.  The primary address
   of a PIM neighbor is obtained from the source IP address used in its
   PIM Hello messages.  Secondary addresses are carried within the Hello
   message in an Address List Hello option.  The primary address of the
   source interface of the router MUST NOT be listed within the Address
   List Hello option.

   In addition to the information recorded for the DR Election, the
   following per neighbor information is obtained from the Address List
   Hello option:

     neighbor.secondary_address_list
          The list of secondary addresses used by the PIM neighbor on
          the interface through which the Hello message was transmitted.

   When processing a received PIM Hello message containing an Address
   List Hello option, the list of secondary addresses in the message
   completely replaces any previously associated secondary addresses for
   that neighbor.  If a received PIM Hello message does not contain an
   Address List Hello option, then all secondary addresses associated
   with the neighbor must be deleted.  If a received PIM Hello message
   contains an Address List Hello option that includes the primary
   address of the sending router in the list of secondary addresses
   (although this is not expected), then the addresses listed in the
   message, excluding the primary address, are used to update the
   associated secondary addresses for that neighbor.

   All the advertised secondary addresses in received Hello messages
   must be checked against those previously advertised by all other PIM
   neighbors on that interface.  If there is a conflict and the same
   secondary address was previously advertised by another neighbor, then
   only the most recently received mapping MUST be maintained, and an
   error message SHOULD be logged to the administrator in a rate-limited
   manner.

   Within one Address List Hello option, all the addresses MUST be of
   the same address family.  It is not permitted to mix IPv4 and IPv6
   addresses within the same message.  In addition, the address family
   of the fields in the message SHOULD be the same as the IP source and
   destination addresses of the packet header.
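
   The secondary address rules above might be applied per received Hello
   as in the following sketch.  The function name
   update_secondary_addresses and the dictionary layout are
   hypothetical; address family checks and the rate-limited logging
   itself are left to the caller.

```python
def update_secondary_addresses(table, sender, advertised):
    """Apply one received Hello to `table`.

    `table` maps a neighbor's primary address to its set of secondary
    addresses on one interface.  `advertised` is the address list from
    the Hello, or None when the Address List option is absent.  Returns
    the set of conflicting addresses so the caller can log them.
    """
    # An absent option deletes all secondaries for the sender.  The
    # sender's own primary address is ignored if it appears in the list.
    new = set(advertised or ()) - {sender}
    conflicts = set()
    for other, addrs in table.items():
        if other != sender:
            stolen = addrs & new
            conflicts |= stolen
            addrs -= stolen          # keep only the most recent mapping
    table[sender] = new              # full replacement, not a merge
    return conflicts
```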

