RFC 4601

Protocol Independent Multicast - Sparse Mode (PIM-SM): Protocol Specification (Revised)

Pages: 150
Obsoletes: 2362
Obsoleted by: 7761
Updated by: 5059 5796 6226

noToC RFC4601 - Page 1

Network Working Group                                          B. Fenner
Request for Comments: 4601                          AT&T Labs - Research
Obsoletes: 2362                                               M. Handley
Category: Standards Track                                            UCL
                                                             H. Holbrook
                                                                 Arastra
                                                             I. Kouvelas
                                                                   Cisco
                                                             August 2006


         Protocol Independent Multicast - Sparse Mode (PIM-SM):
                    Protocol Specification (Revised)

Status of This Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   This document specifies Protocol Independent Multicast - Sparse Mode
   (PIM-SM).  PIM-SM is a multicast routing protocol that can use the
   underlying unicast routing information base or a separate multicast-
   capable routing information base.  It builds unidirectional shared
   trees rooted at a Rendezvous Point (RP) per group, and optionally
   creates shortest-path trees per source.

   This document obsoletes RFC 2362, an Experimental version of PIM-SM.

Table of Contents

   1. Introduction ....................................................5
   2. Terminology .....................................................5
      2.1. Definitions ................................................5
      2.2. Pseudocode Notation ........................................7
   3. PIM-SM Protocol Overview ........................................7
      3.1. Phase One: RP Tree .........................................8
      3.2. Phase Two: Register-Stop ...................................8
      3.3. Phase Three: Shortest-Path Tree ............................9
      3.4. Source-Specific Joins .....................................10
      3.5. Source-Specific Prunes ....................................11
      3.6. Multi-Access Transit LANs .................................11
      3.7. RP Discovery ..............................................12
   4. Protocol Specification .........................................12
      4.1. PIM Protocol State ........................................13
           4.1.1. General Purpose State ..............................14
           4.1.2. (*,*,RP) State .....................................15
           4.1.3. (*,G) State ........................................16
           4.1.4. (S,G) State ........................................17
           4.1.5. (S,G,rpt) State ....................................20
           4.1.6. State Summarization Macros .........................21
      4.2. Data Packet Forwarding Rules ..............................26
           4.2.1. Last-Hop Switchover to the SPT .....................28
           4.2.2. Setting and Clearing the (S,G) SPTbit ..............29
      4.3. Designated Routers (DR) and Hello Messages ................30
           4.3.1. Sending Hello Messages .............................30
           4.3.2. DR Election ........................................32
           4.3.3. Reducing Prune Propagation Delay on LANs ...........34
           4.3.4. Maintaining Secondary Address Lists ................37
      4.4. PIM Register Messages .....................................38
           4.4.1. Sending Register Messages from the DR ..............38
           4.4.2. Receiving Register Messages at the RP ..............43
      4.5. PIM Join/Prune Messages ...................................45
           4.5.1. Receiving (*,*,RP) Join/Prune Messages .............45
           4.5.2. Receiving (*,G) Join/Prune Messages ................49
           4.5.3. Receiving (S,G) Join/Prune Messages ................53
           4.5.4. Receiving (S,G,rpt) Join/Prune Messages ............56
           4.5.5. Sending (*,*,RP) Join/Prune Messages ...............62
           4.5.6. Sending (*,G) Join/Prune Messages ..................66
           4.5.7. Sending (S,G) Join/Prune Messages ..................71
           4.5.8. (S,G,rpt) Periodic Messages ........................76
           4.5.9. State Machine for (S,G,rpt) Triggered Messages .....77
           4.5.10. Background: (*,*,RP) and (S,G,rpt) Interaction ....82
      4.6. PIM Assert Messages .......................................83
           4.6.1. (S,G) Assert Message State Machine .................83
           4.6.2. (*,G) Assert Message State Machine .................91
           4.6.3. Assert Metrics .....................................98
           4.6.4. AssertCancel Messages ..............................99
           4.6.5. Assert State Macros ...............................100
      4.7. PIM Bootstrap and RP Discovery ...........................103
           4.7.1. Group-to-RP Mapping ...............................104
           4.7.2. Hash Function .....................................105
      4.8. Source-Specific Multicast ................................106
           4.8.1. Protocol Modifications for SSM Destination
                  Addresses .........................................106
           4.8.2. PIM-SSM-Only Routers ..............................107
      4.9. PIM Packet Formats .......................................108
           4.9.1. Encoded Source and Group Address Formats ..........110
           4.9.2. Hello Message Format ..............................113
           4.9.3. Register Message Format ...........................116
           4.9.4. Register-Stop Message Format ......................119
           4.9.5. Join/Prune Message Format .........................119
                  4.9.5.1. Group Set Source List Rules ..............122
                  4.9.5.2. Group Set Fragmentation ..................126
           4.9.6. Assert Message Format .............................126
      4.10. PIM Timers ..............................................128
      4.11. Timer Values ............................................129
   5. IANA Considerations ...........................................135
      5.1. PIM Address Family .......................................135
      5.2. PIM Hello Options ........................................136
   6. Security Considerations .......................................136
      6.1. Attacks Based on Forged Messages .........................136
           6.1.1. Forged Link-Local Messages ........................136
           6.1.2. Forged Unicast Messages ...........................137
      6.2. Non-Cryptographic Authentication Mechanisms ..............137
      6.3. Authentication Using IPsec ...............................138
           6.3.1. Protecting Link-Local Multicast Messages ..........138
           6.3.2. Protecting Unicast Messages .......................139
                  6.3.2.1. Register Messages ........................139
                  6.3.2.2. Register-Stop Messages ...................139
      6.4. Denial-of-Service Attacks ................................140
   7. Acknowledgements ..............................................140
   8. Normative References ..........................................141
   9. Informative References ........................................141
   Appendix A. PIM Multicast Border Router Behavior .................143
      A.1. Sources External to the PIM-SM Domain ....................143
      A.2. Sources Internal to the PIM-SM Domain ....................144
   Appendix B. Index ................................................146

List of Figures

   Figure 1. Per-(S,G) register state machine at a DR ................38
   Figure 2. Downstream per-interface (*,*,RP) state machine .........46
   Figure 3. Downstream per-interface (*,G) state machine ............50
   Figure 4. Downstream per-interface (S,G) state machine ............53
   Figure 5. Downstream per-interface (S,G,rpt) state machine ........57
   Figure 6. Upstream (*,*,RP) state machine .........................62
   Figure 7. Upstream (*,G) state machine ............................67
   Figure 8. Upstream (S,G) state machine ............................71
   Figure 9. Upstream (S,G,rpt) state machine for triggered
             messages ................................................77
   Figure 10. Per-interface (S,G) Assert State machine ...............84
   Figure 11. Per-interface (*,G) Assert State machine ...............92

1.  Introduction

   This document specifies a protocol for efficiently routing multicast
   groups that may span wide-area (and inter-domain) internets.  This
   protocol is called Protocol Independent Multicast - Sparse Mode
   (PIM-SM) because, although it may use the underlying unicast routing
   to provide reverse-path information for multicast tree building, it
   is not dependent on any particular unicast routing protocol.

   PIM-SM version 2 was originally specified in RFC 2117 and was revised
   in RFC 2362, both Experimental RFCs.  This document is intended to
   obsolete RFC 2362, to correct a number of deficiencies that have been
   identified with the way PIM-SM was previously specified, and to bring
   PIM-SM onto the IETF Standards Track.  As far as possible, this
   document specifies the same protocol as RFC 2362 and only diverges
   from the behavior intended by RFC 2362 when the previously specified
   behavior was clearly incorrect.  Routers implemented according to the
   specification in this document will be able to interoperate
   successfully with routers implemented according to RFC 2362.

2.  Terminology

   In this document, the key words "MUST", "MUST NOT", "REQUIRED",
   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
   and "OPTIONAL" are to be interpreted as described in RFC 2119 [1] and
   indicate requirement levels for compliant PIM-SM implementations.

2.1.  Definitions

   The following terms have special significance for PIM-SM:

   Rendezvous Point (RP):
         An RP is a router that has been configured to be used as the
         root of the non-source-specific distribution tree for a
         multicast group.  Join messages from receivers for a group are
         sent towards the RP, and data from senders is sent to the RP so
         that receivers can discover who the senders are and start to
         receive traffic destined for the group.

   Designated Router (DR):
         A shared-media LAN like Ethernet may have multiple PIM-SM
         routers connected to it.  A single one of these routers, the
         DR, will act on behalf of directly connected hosts with respect
         to the PIM-SM protocol.  A single DR is elected per interface
         (LAN or otherwise) using a simple election process.

   MRIB  Multicast Routing Information Base.  This is the multicast
         topology table, which is typically derived from the unicast
         routing table, or routing protocols such as Multiprotocol BGP
         (MBGP) that carry multicast-specific topology information.  In
         PIM-SM, the MRIB is used to decide where to send Join/Prune
         messages.  A secondary function of the MRIB is to provide
         routing metrics for destination addresses; these metrics are
         used when sending and processing Assert messages.

   RPF Neighbor
         RPF stands for "Reverse Path Forwarding".  The RPF Neighbor of
         a router with respect to an address is the neighbor that the
         MRIB indicates should be used to forward packets to that
         address.  In the case of a PIM-SM multicast group, the RPF
         neighbor is the router that a Join message for that group would
         be directed to, in the absence of modifying Assert state.

   TIB   Tree Information Base.  This is the collection of state at a
         PIM router that has been created by receiving PIM Join/Prune
         messages, PIM Assert messages, and Internet Group Management
         Protocol (IGMP) or Multicast Listener Discovery (MLD)
         information from local hosts.  It essentially stores the state
         of all multicast distribution trees at that router.

   MFIB  Multicast Forwarding Information Base.  The TIB holds all the
         state that is necessary to forward multicast packets at a
         router.  However, although this specification defines
         forwarding in terms of the TIB, to actually forward packets
         using the TIB is very inefficient.  Instead, a real router
         implementation will normally build an efficient MFIB from the
         TIB state to perform forwarding.  How this is done is
         implementation-specific and is not discussed in this document.

   Upstream
         Towards the root of the tree.  The root of the tree may be
         either the source or the RP, depending on the context.

   Downstream
         Away from the root of the tree.

   GenID Generation Identifier, used to detect reboots.

   PMBR  PIM Multicast Border Router, joining a PIM domain with another
         multicast domain.

2.2.  Pseudocode Notation

   We use set notation in several places in this specification.

   A (+) B is the union of two sets, A and B.

   A (-) B is the elements of set A that are not in set B.

   NULL    is the empty set or list.

   In addition, we use C-like syntax:

   =       denotes assignment of a variable.

   ==      denotes a comparison for equality.

   !=      denotes a comparison for inequality.

   Braces { and } are used for grouping.
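
   As a non-normative illustration, the notation above maps directly
   onto set operations in a general-purpose language.  The sketch below
   uses Python; the interface names and variable names are hypothetical
   and are not drawn from this specification.

```python
# Hypothetical interface names; each set stands in for an interface list.
joined = {"eth0", "eth1"}
local_members = {"eth1", "eth2"}
pruned = {"eth1"}

# A (+) B: the union of two sets.
union = joined | local_members

# A (-) B: the elements of A that are not in B.
difference = union - pruned

# NULL: the empty set or list.
NULL = set()

# "=" assigns; "==" and "!=" compare for equality and inequality.
is_empty = (difference == NULL)
has_members = (difference != NULL)
```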

3.  PIM-SM Protocol Overview

   This section provides an overview of PIM-SM behavior.  It is intended
   as an introduction to how PIM-SM works, and it is NOT definitive.
   For the definitive specification, see Section 4.

   PIM relies on an underlying topology-gathering protocol to populate a
   routing table with routes.  This routing table is called the
   Multicast Routing Information Base (MRIB).  The routes in this table
   may be taken directly from the unicast routing table, or they may be
   different and provided by a separate routing protocol such as MBGP
   [10].  Regardless of how it is created, the primary role of the MRIB
   in the PIM protocol is to provide the next-hop router along a
   multicast-capable path to each destination subnet.  The MRIB is used
   to determine the next-hop neighbor to which any PIM Join/Prune
   message is sent.  Data flows along the reverse path of the Join
   messages.  Thus, in contrast to the unicast RIB, which specifies the
   next hop that a data packet would take to get to some subnet, the
   MRIB gives reverse-path information and indicates the path that a
   multicast data packet would take from its origin subnet to the router
   that has the MRIB.
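
   The MRIB lookup described above can be sketched as a longest-prefix
   match that returns the next-hop neighbor to which a Join/Prune for a
   given address would be sent.  This is a minimal illustration, not an
   implementation requirement; the table contents, addresses, and the
   function name rpf_lookup are assumptions made for the example.

```python
import ipaddress

# Hypothetical MRIB: prefix -> (next-hop neighbor, interface).
MRIB = {
    ipaddress.ip_network("10.1.0.0/16"): ("192.0.2.1", "eth0"),
    ipaddress.ip_network("10.1.2.0/24"): ("192.0.2.9", "eth1"),
    ipaddress.ip_network("0.0.0.0/0"): ("192.0.2.254", "eth2"),
}


def rpf_lookup(address):
    """Return (neighbor, interface) for the longest matching prefix.

    Returns None when no MRIB entry covers the address.
    """
    addr = ipaddress.ip_address(address)
    matches = [net for net in MRIB if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return MRIB[best]
```

   A Join for a tree rooted at 10.1.2.7 (a source or an RP) would be
   sent to the neighbor returned by rpf_lookup("10.1.2.7"), and data
   from that root is expected to arrive on the returned interface.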

   Like all multicast routing protocols that implement the service model
   from RFC 1112 [3], PIM-SM must be able to route data packets from
   sources to receivers without either the sources or receivers knowing
   a priori of the existence of the others.  This is essentially done in
   three phases, although as senders and receivers may come and go at
   any time, all three phases may occur simultaneously.

3.1.  Phase One: RP Tree

   In phase one, a multicast receiver expresses its interest in
   receiving traffic destined for a multicast group.  Typically, it does
   this using IGMP [2] or MLD [4], but other mechanisms might also serve
   this purpose.  One of the receiver's local routers is elected as the
   Designated Router (DR) for that subnet.  On receiving the receiver's
   expression of interest, the DR then sends a PIM Join message towards
   the RP for that multicast group.  This Join message is known as a
   (*,G) Join because it joins group G for all sources to that group.
   The (*,G) Join travels hop-by-hop towards the RP for the group, and
   in each router it passes through, multicast tree state for group G is
   instantiated.  Eventually, the (*,G) Join either reaches the RP or
   reaches a router that already has (*,G) Join state for that group.
   When many receivers join the group, their Join messages converge on
   the RP and form a distribution tree for group G that is rooted at the
   RP.  This is known as the RP Tree (RPT), and is also known as the
   shared tree because it is shared by all sources sending to that
   group.  Join messages are resent periodically so long as the receiver
   remains in the group.  When all receivers on a leaf-network leave the
   group, the DR will send a PIM (*,G) Prune message towards the RP for
   that multicast group.  However, if the Prune message is not sent for
   any reason, the state will eventually time out.
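
   The hop-by-hop behavior of a (*,G) Join can be sketched as a walk
   towards the RP that stops at the RP or at the first router that
   already holds (*,G) state.  The sketch is deliberately simplified: it
   tracks only whether a router has state for the group, not the
   per-interface state or timers that Section 4 defines.  The function
   name and the dictionary-based topology are assumptions.

```python
def propagate_star_g_join(start, group, next_hop_to_rp, rp, state):
    """Walk a (*,G) Join hop-by-hop from `start` towards `rp`.

    next_hop_to_rp: dict mapping each router to its upstream router
                    towards the RP (hypothetical MRIB result).
    state:          dict mapping each router to the set of groups for
                    which it holds (*,G) join state; updated in place.
    Returns the routers at which new state was instantiated, in order.
    """
    created = []
    router = start
    while True:
        groups = state.setdefault(router, set())
        if group in groups:
            break  # reached a router that already has (*,G) state
        groups.add(group)
        created.append(router)
        if router == rp:
            break  # reached the RP
        router = next_hop_to_rp[router]
    return created
```

   With two DRs behind a common router A, the first Join instantiates
   state at the DR, at A, and at the RP, while the second Join only
   instantiates state at its own DR before merging at A.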

   A multicast data sender just starts sending data destined for a
   multicast group.  The sender's local router (DR) takes those data
   packets, unicast-encapsulates them, and sends them directly to the
   RP.  The RP receives these encapsulated data packets, decapsulates
   them, and forwards them onto the shared tree.  The packets then
   follow the (*,G) multicast tree state in the routers on the RP Tree,
   being replicated wherever the RP Tree branches, and eventually
   reaching all the receivers for that multicast group.  The process of
   encapsulating data packets to the RP is called registering, and the
   encapsulation packets are known as PIM Register packets.

   At the end of phase one, multicast traffic is flowing encapsulated to
   the RP, and then natively over the RP tree to the multicast
   receivers.

3.2.  Phase Two: Register-Stop

   Register-encapsulation of data packets is inefficient for two
   reasons:

   o Encapsulation and decapsulation may be relatively expensive
     operations for a router to perform, depending on whether or not the
     router has appropriate hardware for these tasks.

   o Traveling all the way to the RP, and then back down the shared tree
     may result in the packets traveling a relatively long distance to
     reach receivers that are close to the sender.  For some
     applications, this increased latency or bandwidth consumption is
     undesirable.

   Although Register-encapsulation may continue indefinitely, for these
   reasons, the RP will normally choose to switch to native forwarding.
   To do this, when the RP receives a register-encapsulated data packet
   from source S on group G, it will normally initiate an (S,G) source-
   specific Join towards S.  This Join message travels hop-by-hop
   towards S, instantiating (S,G) multicast tree state in the routers
   along the path.  (S,G) multicast tree state is used only to forward
   packets for group G if those packets come from source S.  Eventually
   the Join message reaches S's subnet or a router that already has
   (S,G) multicast tree state, and then packets from S start to flow
   following the (S,G) tree state towards the RP.  These data packets
   may also reach routers with (*,G) state along the path towards the
   RP; if they do, they can shortcut onto the RP tree at this point.

   While the RP is in the process of joining the source-specific tree
   for S, the data packets will continue being encapsulated to the RP.
   When packets from S also start to arrive natively at the RP, the RP
   will be receiving two copies of each of these packets.  At this
   point, the RP starts to discard the encapsulated copy of these
   packets, and it sends a Register-Stop message back to S's DR to
   prevent the DR from unnecessarily encapsulating the packets.

   At the end of phase two, traffic will be flowing natively from S
   along a source-specific tree to the RP, and from there along the
   shared tree to the receivers.  Where the two trees intersect, traffic
   may transfer from the source-specific tree to the RP tree and thus
   avoid taking a long detour via the RP.

   Note that a sender may start sending before or after a receiver joins
   the group, and thus phase two may happen before the shared tree to
   the receiver is built.
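
   The RP behavior outlined in this phase can be sketched as a small
   state holder that reacts to Register packets and to natively
   arriving packets.  This is an overview-level illustration only; the
   class name, method names, and action strings are hypothetical, and
   the definitive behavior is given by the state machines in Section
   4.4.

```python
class RendezvousPoint:
    """Overview-level sketch of RP handling for one or more (S,G)."""

    def __init__(self):
        self.joined_sources = set()  # (S,G) pairs joined towards S
        self.native_seen = set()     # (S,G) pairs arriving natively

    def on_native_packet(self, source, group):
        # Native traffic from S has reached the RP over the (S,G) tree.
        self.native_seen.add((source, group))
        return ["forward-on-shared-tree"]

    def on_register(self, source, group):
        key = (source, group)
        if key in self.native_seen:
            # Two copies are arriving: drop the encapsulated one and
            # tell S's DR to stop encapsulating.
            return ["discard-encapsulated", "send-register-stop"]
        actions = ["decapsulate", "forward-on-shared-tree"]
        if key not in self.joined_sources:
            # First Register for (S,G): start joining towards S.
            self.joined_sources.add(key)
            actions.append("send-sg-join")
        return actions
```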

3.3.  Phase Three: Shortest-Path Tree

   Although having the RP join back towards the source removes the
   encapsulation overhead, it does not completely optimize the
   forwarding paths.  For many receivers, the route via the RP may
   involve a significant detour when compared with the shortest path
   from the source to the receiver.

   To obtain lower latencies or more efficient bandwidth utilization, a
   router on the receiver's LAN, typically the DR, may optionally
   initiate a transfer from the shared tree to a source-specific
   shortest-path tree (SPT).  To do this, it issues an (S,G) Join
   towards S.  This instantiates state in the routers along the path to
   S.  Eventually, this join either reaches S's subnet or reaches a
   router that already has (S,G) state.  When this happens, data packets
   from S start to flow following the (S,G) state until they reach the
   receiver.

   At this point, the receiver (or a router upstream of the receiver)
   will be receiving two copies of the data: one from the SPT and one
   from the RPT.  When the first traffic starts to arrive from the SPT,
   the DR or upstream router starts to drop the packets for G from S
   that arrive via the RP tree.  In addition, it sends an (S,G) Prune
   message towards the RP.  This is known as an (S,G,rpt) Prune.  The
   Prune message travels hop-by-hop, instantiating state along the path
   towards the RP indicating that traffic from S for G should NOT be
   forwarded in this direction.  The prune is propagated until it
   reaches the RP or a router that still needs the traffic from S for
   other receivers.

   By now, the receiver will be receiving traffic from S along the
   shortest-path tree between the receiver and S.  In addition, the RP
   is receiving the traffic from S, but this traffic is no longer
   reaching the receiver along the RP tree.  As far as the receiver is
   concerned, this is the final distribution tree.
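
   The switchover at the last-hop router can be sketched as follows.
   The sketch assumes the router has already issued its (S,G) Join and
   only models what happens as data arrives; the class name, the
   "SPT"/"RPT" labels, and the action strings are hypothetical.  The
   definitive rules, including the SPTbit, are in Section 4.2.

```python
class LastHopRouter:
    """Overview-level sketch of the shared-tree to SPT transfer."""

    def __init__(self):
        self.spt_active = set()  # (S,G) pairs seen arriving via the SPT

    def on_data(self, source, group, arrived_via):
        key = (source, group)
        if arrived_via == "SPT":
            if key not in self.spt_active:
                # First packet from the SPT: start using it and prune
                # S off the shared tree with an (S,G,rpt) Prune.
                self.spt_active.add(key)
                return ["forward", "send-sg-rpt-prune"]
            return ["forward"]
        # Packet arrived via the RP tree.
        if key in self.spt_active:
            return ["drop"]  # duplicate of traffic already on the SPT
        return ["forward"]
```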

3.4.  Source-Specific Joins

   IGMPv3 permits a receiver to join a group and specify that it only
   wants to receive traffic for a group if that traffic comes from a
   particular source.  If a receiver does this, and no other receiver on
   the LAN requires all the traffic for the group, then the DR may omit
   performing a (*,G) join to set up the shared tree, and instead issue
   a source-specific (S,G) join only.

   The range of multicast addresses from 232.0.0.0 to 232.255.255.255 is
   currently set aside for source-specific multicast in IPv4.  For
   groups in this range, receivers should only issue source-specific
   IGMPv3 joins.  If a PIM router receives a non-source-specific join
   for a group in this range, it should ignore it, as described in
   Section 4.8.
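
   A membership check against the IPv4 range given above might look
   like the following sketch.  The range 232.0.0.0 to 232.255.255.255 is
   the prefix 232.0.0.0/8; the function names and the convention of
   passing source=None for a non-source-specific join are assumptions
   made for illustration.

```python
import ipaddress

# 232.0.0.0 through 232.255.255.255, as stated in the text.
SSM_RANGE_V4 = ipaddress.ip_network("232.0.0.0/8")


def is_ssm_group(group):
    """True if the group address falls in the IPv4 SSM range."""
    return ipaddress.ip_address(group) in SSM_RANGE_V4


def accept_join(group, source=None):
    """Ignore non-source-specific joins for groups in the SSM range."""
    if is_ssm_group(group) and source is None:
        return False
    return True
```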

3.5.  Source-Specific Prunes

   IGMPv3 also permits a receiver to join a group and to specify that it
   only wants to receive traffic for a group if that traffic does not
   come from a specific source or sources.  In this case, the DR will
   perform a (*,G) join as normal, but may combine this with an
   (S,G,rpt) prune for each of the sources the receiver does not wish to
   receive.
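
   Taken together with Section 3.4, the DR's choice of PIM messages for
   a single IGMPv3 receiver can be sketched as below.  The sketch
   assumes that no other receiver on the LAN wants all traffic for the
   group and that the DR takes the optional shortcut of omitting the
   (*,G) join in the include case.  The function name, the "include"
   and "exclude" mode labels, and the message tuples are hypothetical.

```python
def dr_messages(group, mode, sources):
    """Return the PIM messages a DR might originate for one receiver.

    mode "include": the receiver wants traffic only from `sources`.
    mode "exclude": the receiver wants all traffic except `sources`.
    """
    if mode == "include":
        # Source-specific joins only; no shared tree is set up.
        return [("join", (s, group)) for s in sorted(sources)]
    if mode == "exclude":
        # Join the shared tree, then prune each unwanted source off it.
        msgs = [("join", ("*", group))]
        msgs += [("prune-rpt", (s, group)) for s in sorted(sources)]
        return msgs
    raise ValueError("mode must be 'include' or 'exclude'")
```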

3.6.  Multi-Access Transit LANs

   The overview so far has concerned itself with point-to-point transit
   links.  However, using multi-access LANs such as Ethernet for transit
   is not uncommon.  This can cause complications for three reasons:

   o Two or more routers on the LAN may issue (*,G) Joins to different
     upstream routers on the LAN because they have inconsistent MRIB
     entries regarding how to reach the RP.  Both paths on the RP tree
     will be set up, causing two copies of all the shared tree traffic
     to appear on the LAN.

   o Two or more routers on the LAN may issue (S,G) Joins to different
     upstream routers on the LAN because they have inconsistent MRIB
     entries regarding how to reach source S.  Both paths on the source-
     specific tree will be set up, causing two copies of all the traffic
     from S to appear on the LAN.

   o A router on the LAN may issue a (*,G) Join to one upstream router
     on the LAN, and another router on the LAN may issue an (S,G) Join
     to a different upstream router on the same LAN.  Traffic from S may
     reach the LAN over both the RPT and the SPT.  If the receiver
     behind the downstream (*,G) router doesn't issue an (S,G,rpt)
     prune, then this condition would persist.

   All of these problems are caused by there being more than one
   upstream router with join state for the group or source-group pair.
   PIM does not prevent such duplicate joins from occurring; instead,
   when duplicate data packets appear on the LAN from different routers,
   these routers notice this and then elect a single forwarder.  This
   election is performed using PIM Assert messages, which resolve the
   problem in favor of the upstream router that has (S,G) state; or, if
   neither router or both routers have (S,G) state, then the problem is
   resolved in favor of the router with the best metric to the RP for RP
   trees, or the best metric to the source for source-specific trees.

   These Assert messages are also received by the downstream routers on
   the LAN, and these cause subsequent Join messages to be sent to the
   upstream router that won the Assert.
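
   The forwarder election summarized above can be sketched as a ranking
   over the competing upstream routers.  This is a simplification: it
   models only the preference for (S,G) state and the route metric
   named in this overview, and it breaks ties on the numerically
   highest address, which is an assumption carried over from the assert
   metric comparison in Section 4.6.3.  The class and function names
   are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AssertCandidate:
    address: str        # the router's address on the LAN
    has_sg_state: bool  # True if forwarding on (S,G) state
    metric: int         # route metric to the source or RP; lower wins


def elect_forwarder(candidates):
    """Pick the single forwarder among routers sending duplicates."""
    def rank(c):
        return (
            c.has_sg_state,                       # (S,G) state preferred
            -c.metric,                            # then the best metric
            int(ipaddress.ip_address(c.address)), # then highest address
        )
    return max(candidates, key=rank)
```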

3.7.  RP Discovery

   PIM-SM routers need to know the address of the RP for each group for
   which they have (*,G) state.  This address is obtained automatically
   (e.g., embedded-RP), through a bootstrap mechanism, or through static
   configuration.

   One dynamic way to do this is to use the Bootstrap Router (BSR)
   mechanism [11].  One router in each PIM domain is elected the
   Bootstrap Router through a simple election process.  All the routers
   in the domain that are configured to be candidates to be RPs
   periodically unicast their candidacy to the BSR.  From the
   candidates, the BSR picks an RP-Set, and periodically announces this
   set in a Bootstrap message.  Bootstrap messages are flooded hop-by-
   hop throughout the domain until all routers in the domain know the
   RP-Set.

   To map a group to an RP, a router hashes the group address into the
   RP-Set using an order-preserving hash function (one that minimizes
   changes if the RP-Set changes).  The resulting RP is the one that it
   uses as the RP for that group.
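
   The property described above (few mappings change when the RP-Set
   changes) can be sketched with a hash in which each candidate RP is
   scored independently and the highest score wins.  The constants
   below are patterned on the hash function of Section 4.7.2, which
   remains the normative definition.  The sketch ignores group-range
   matching and RP priority, and the default mask value and function
   names are assumptions made for the example.

```python
import ipaddress


def hash_value(group, mask, rp):
    """Score one candidate RP for a group (IPv4 addresses as strings)."""
    g = int(ipaddress.ip_address(group)) & mask
    c = int(ipaddress.ip_address(rp))
    return (1103515245 * ((1103515245 * g + 12345) ^ c) + 12345) % (2 ** 31)


def select_rp(group, rp_set, mask=0xFFFFFFFC):
    """Return the candidate with the highest score.

    Ties are broken on the numerically highest RP address.  The default
    mask (a 30-bit prefix) is a hypothetical choice for this sketch.
    """
    return max(
        rp_set,
        key=lambda rp: (hash_value(group, mask, rp),
                        int(ipaddress.ip_address(rp))),
    )
```

   Because each candidate is scored without reference to the others,
   removing an RP from the set changes the mapping only for the groups
   that had selected that RP; every other group keeps its RP.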
