Tech Notes

Networking: Multicast

Multicast: one copy of a packet is sent to all the receivers who are “interested” in it.

https://www.youtube.com/watch?v=BC8MfzMSRhY&list=PLVND-cRwt9SNw9_EIK4GGDBAT0wtz0xSC&index=5

Layer 3 multicast:

  • The source IP in a multicast packet remains a unicast address
  • The destination IP is the address of the multicast group
  • The destination IP belongs to the Class D IPv4 range:
    • 224.0.0.0 – 239.255.255.255
  • Every single IP in this range represents a multicast group

Reserved ranges:

CIDR            Details
224.0.0.0/24    For routing protocols. Link-local scope. TTL 1 or 2
232.0.0.0/8     For multicast streams for which the source is known
239.0.0.0/8     Similar to RFC 1918 private addressing. Used within an ASN
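
A quick way to check which of these ranges a group falls into is the standard `ipaddress` module. This is only a sketch: the labels are my own paraphrase of the table above, and `classify_group` is a hypothetical helper name.

```python
import ipaddress

# Ranges from the table above; the labels are paraphrased, not official names.
RESERVED_RANGES = [
    (ipaddress.ip_network("224.0.0.0/24"), "link-local (routing protocols)"),
    (ipaddress.ip_network("232.0.0.0/8"), "source known (source-specific)"),
    (ipaddress.ip_network("239.0.0.0/8"), "private (administratively scoped)"),
]
ALL_MULTICAST = ipaddress.ip_network("224.0.0.0/4")  # Class D


def classify_group(address: str) -> str:
    """Return a label for the multicast range that contains `address`."""
    ip = ipaddress.ip_address(address)
    if ip not in ALL_MULTICAST:
        return "not multicast"
    for network, label in RESERVED_RANGES:
        if ip in network:
            return label
    return "other multicast"
```

For example, `classify_group("224.0.0.13")` lands in the link-local range, while `classify_group("239.1.1.1")` lands in the private range.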

Layer 2 multicast:

  • The source MAC is the sender’s MAC
  • Since the destination MAC is a multicast address, the sender and receiver have to agree on a single destination MAC
  • The receiver should know the expected destination MAC for a stream
  • A receiver knows which multicast group it is listening to from the application layer: the application tells the network stack to listen to a particular multicast group

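We map the layer 3 multicast IP to a layer 2 MAC: the MAC address is 01-00-5E (the first 24 bits), followed by a single 0 bit, followed by the last 23 bits of the layer 3 multicast IP. Because only 23 of the 28 variable bits of the group address are copied, 32 different multicast IPs share the same MAC.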
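
The mapping can be sketched in a few lines of Python. `multicast_mac` is a hypothetical helper name; the bit layout is the one described above.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its layer 2 destination MAC."""
    ip = ipaddress.IPv4Address(group)
    if not ip.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    low23 = int(ip) & 0x7FFFFF        # keep only the last 23 bits of the IP
    mac = 0x01005E000000 | low23      # 01-00-5E prefix, 25th bit stays 0
    return "-".join(f"{(mac >> shift) & 0xFF:02X}" for shift in range(40, -8, -8))
```

`multicast_mac("239.1.1.1")` and `multicast_mac("239.129.1.1")` both return `01-00-5E-01-01-01`, which shows the overlap: the dropped high-order bits differ, but the last 23 bits are the same.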

IP multicast routers:

Concept of Reverse path forwarding(RPF):

  • In multicast, we are taking the packet away from a source towards potentially multiple destinations
  • We will have an upstream interface:
    • The interface which is closest to the source
    • In terms of IGP, this is the best path to the source
  • We will have downstream interfaces:
    • We will have one or multiple downstream interfaces, based on the interested receivers

Multicast routing table:

S,G: where S is the source and G is the multicast group

*,G: where * is any or unknown source and G is the multicast group

The multicast routing table is also called the forwarding state. It holds the following information for each S,G or *,G:

  • IIF: incoming interface. Only one incoming interface is selected per entry
  • OIL: outgoing interface list. Multiple outgoing interfaces are selected based on the interested receivers. The OIL can also be empty
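
As a rough mental model, the forwarding state can be pictured as a dictionary keyed by (source, group), where the source may be the wildcard `*`. The class and interface names below are illustrative only, and preferring an S,G entry over a *,G entry is how I assume the lookup works.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set, Tuple


@dataclass
class MrouteEntry:
    iif: Optional[str]                           # single incoming interface
    oil: Set[str] = field(default_factory=set)   # may be empty


class MrouteTable:
    def __init__(self) -> None:
        self.entries: Dict[Tuple[str, str], MrouteEntry] = {}

    def add(self, source: str, group: str, iif: Optional[str],
            oil: Iterable[str] = ()) -> None:
        self.entries[(source, group)] = MrouteEntry(iif, set(oil))

    def lookup(self, source: str, group: str) -> Optional[MrouteEntry]:
        """Prefer the specific (S,G) entry, then fall back to (*,G)."""
        entry = self.entries.get((source, group))
        if entry is None:
            entry = self.entries.get(("*", group))
        return entry
```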

Multicast routing protocol

Job of the multicast routing protocol

  • select incoming interface: identify the upstream interface for a source. Usually the unicast routing table is used
  • identify downstream interfaces: the routing protocol has a way for “interested” receivers to signal that they are interested in a multicast group
  • dynamic multicast tree: the routing protocol maintains the multicast tree as sources come up or go down, and as receivers show interest in a group or leave it
  • avoid loops:
    • usually done using the RPF (reverse path forwarding) check:
      • When a multicast packet is received, check the source IP of the packet
        • if the interface on which the packet arrived is not the IIF for that S,G (the interface on the best path back to the source), drop the packet
      • usually the IIF is computed via the IGP routing protocol
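
The RPF check can be sketched as a longest-prefix match on the unicast table followed by an interface comparison. The route table format (prefix string to interface name) and the function names are assumptions made for this example.

```python
import ipaddress
from typing import Dict, Optional


def rpf_interface(unicast_routes: Dict[str, str], source: str) -> Optional[str]:
    """Longest-prefix match: the interface on the best path back to the source."""
    ip = ipaddress.ip_address(source)
    best = None  # (prefix length, interface)
    for prefix, interface in unicast_routes.items():
        network = ipaddress.ip_network(prefix)
        if ip in network and (best is None or network.prefixlen > best[0]):
            best = (network.prefixlen, interface)
    return best[1] if best else None


def rpf_check(unicast_routes: Dict[str, str], source: str, arrival_interface: str) -> bool:
    """Accept the packet only if it arrived on the RPF interface."""
    expected = rpf_interface(unicast_routes, source)
    return expected is not None and expected == arrival_interface
```

With routes `{"172.16.1.0/24": "eth1", "0.0.0.0/0": "eth2"}`, a packet from 172.16.1.1 passes only when it arrives on `eth1`; the same packet on any other interface is dropped.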

Over-ride RPF check:

  • use static routes
  • use MultiProtocol BGP
  • Create separate tables for unicast routing and multicast (RPF) routing

So, we have 3 components in the multicast:

  • A source, which tells the multicast core network that it is streaming
  • A receiver, which tells the multicast core network that it is interested in the source
  • The multicast core network, which carries the packets from the source to the receiver for a particular group

Multicast Receiver signalling(IGMP)

Once a receiver knows which groups it is interested in, it has to tell the multicast network that it is interested in a particular group.

Layer 2 multicast flows: if we have a single layer 2 switch, the multicast flow depends on the switch type:

  • Basic switch which doesn’t understand MC:
    • the switch sends the multicast frame to all the ports except the one on which it was received
    • all the receivers process the frame
    • the receivers which are not interested drop it
    • effectively the same as a broadcast
  • Switch which understands MC:
    • inspects the multicast signalling between receivers and routers
    • creates the proper MC state for each port of the switch
    • only interested receivers get the packet
    • this is called IGMP snooping
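
The difference between the two switch types can be sketched as below. This is a toy model with hypothetical names; among other simplifications, it ignores the fact that a real snooping switch also forwards to the ports where multicast routers sit.

```python
from collections import defaultdict
from typing import Iterable, Set


class SnoopingSwitch:
    def __init__(self, ports: Iterable[str]) -> None:
        self.ports = set(ports)
        self.members = defaultdict(set)   # group -> ports with interested receivers

    def on_report(self, group: str, port: str) -> None:
        """An IGMP membership report was seen on `port`."""
        self.members[group].add(port)

    def on_leave(self, group: str, port: str) -> None:
        """An IGMP leave was seen on `port`."""
        self.members[group].discard(port)

    def egress_ports(self, group: str, ingress_port: str, snooping: bool = True) -> Set[str]:
        if not snooping:
            # basic switch: flood like a broadcast
            return self.ports - {ingress_port}
        return set(self.members.get(group, set())) - {ingress_port}
```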

Multicast with Last hop router(LHR)

  • The receiver has to tell the multicast router that it is interested in receiving traffic for an MC group
  • The receiver sends an IGMP message to the multicast router, which is reachable over layer 2
  • The multicast router creates state with the group it received from the receiver and the interface on which the router received the IGMP message
    • Based on the group and interface, the LHR adds that interface to the outgoing interface list (OIL) for that particular group
    • This is the start of creating the tree in the multicast core
  • The role of IGMP is to signal the LHR about receivers who are interested in a particular group
  • IGMP is IP protocol 2

IGMP message types:

  • IGMP membership report(IGMP join)
    • receiver informs the MC router about joining a Group
    • Packet structure:
      • source IP: source IP of the receiver
      • destination IP: the multicast group (e.g. 239.1.2.3) which the receiver wants to join
      • IP protocol for IGMP is 2
        • IGMP header will have group address too(239.1.2.3)
      • TTL = 1
  • Leave Group(for v2): receiver can tell MC router that it is leaving the group
    • Receiver will send this message
      • source IP: receiver IP
      • destination IP: 224.0.0.2 (ALL-ROUTERS)
      • IGMP header will have the group address which the receiver wants to leave(239.1.2.3)
  • Membership query:
    • The MC router queries whether the receivers are still interested in getting the MC streams.
    • two types
      • General query: a query for all MC groups, to make sure receivers are still interested
      • Group-specific query: sent when a leave group message is received by the MC router, to make sure other receivers for the same group are still interested.
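
To make the message types concrete, here is a sketch that builds the 8-byte IGMPv2 message (type, max response time, checksum, group address). The type codes 0x11, 0x16 and 0x17 are the IGMPv2 values for query, report and leave; the function names are my own, and this only builds the IGMP payload, not the IP header around it.

```python
import socket
import struct

IGMP_QUERY = 0x11       # membership query
IGMP_V2_REPORT = 0x16   # membership report (join)
IGMP_V2_LEAVE = 0x17    # leave group


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_igmpv2(msg_type: int, group: str, max_resp: int = 0) -> bytes:
    """Build an IGMPv2 message carrying `group` in its header."""
    group_bytes = socket.inet_aton(group)
    unchecked = struct.pack("!BBH4s", msg_type, max_resp, 0, group_bytes)
    checksum = internet_checksum(unchecked)
    return struct.pack("!BBH4s", msg_type, max_resp, checksum, group_bytes)
```

A report for 239.1.2.3 would go to the group address itself, while a leave for the same group would go to 224.0.0.2, as described above.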

PIM: Dense Mode

PIM: Protocol independent multicast

PIM doesn’t maintain its own routing database to locate sources. Instead, PIM uses the unicast routing table, whichever routing protocol populated it (hence “protocol independent”). It also does the RPF check.

PIM packets:

PIM Hello:

  • Use IP protocol 103
  • send hello on the multicast address: 224.0.0.13
  • TTL = 1
  • use TLVs to carry most of the information
    • TLV can carry below information:
      • Hold time
      • DR priority
        • Designated router selection for ethernet.
        • PIM Dense Mode doesn’t need a DR. PIM-SM needs a DR
        • The DR priority from the Hello is used. If the priority is the same, the highest IP wins
      • Generation ID
      • State refresh (for Dense mode only)
  • Since Hello packets don’t distinguish between PIM DM and SM, a DM router and an SM router can become neighbours, but the result will be a disaster.
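
The TLV layout of the Hello options (2-byte type, 2-byte length, then the value) can be walked with a small parser. This is a sketch: it expects only the option bytes that follow the 4-byte PIM header, and the option numbers 1, 2, 19 and 20 are the ones that appear in the tcpdump capture later in these notes.

```python
import struct

OPTION_NAMES = {1: "hold_time", 2: "lan_prune_delay", 19: "dr_priority", 20: "generation_id"}


def parse_hello_options(payload: bytes) -> dict:
    """Parse PIM Hello options (the bytes after the 4-byte PIM header)."""
    options = {}
    offset = 0
    while offset + 4 <= len(payload):
        opt_type, length = struct.unpack_from("!HH", payload, offset)
        value = payload[offset + 4: offset + 4 + length]
        if len(value) < length:
            raise ValueError("truncated option")
        offset += 4 + length
        name = OPTION_NAMES.get(opt_type, f"option_{opt_type}")
        if opt_type == 1:
            options[name] = struct.unpack("!H", value)[0]        # seconds
        elif opt_type == 2:
            delay, override = struct.unpack("!HH", value)
            options[name] = {
                "t_bit": delay >> 15,
                "lan_delay_ms": delay & 0x7FFF,
                "override_ms": override,
            }
        elif opt_type in (19, 20):
            options[name] = struct.unpack("!I", value)[0]
        else:
            options[name] = value                                # unknown: keep raw bytes
    return options
```

Feeding it the option bytes from that capture (`0001 0002 0069 ...`) gives a hold time of 105 seconds, a LAN delay of 500 ms, an override interval of 2500 ms and a DR priority of 1.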

PIM neighbours:

We can use the show ip pim neighbor command to list the PIM neighbours.

PIM Join/Prune message:

  • used in both dense mode and sparse mode
  • can do both join and prune
    • Join is mostly used in sparse mode, along with prune
    • dense mode mostly uses Prune, as joining is automatic
  • Join: sent when a router (e.g. the LHR, on behalf of a receiver) wants to join the S,G tree
  • Prune: sent when a router (e.g. the LHR) wants to leave the S,G tree. SPT pruning will start.
  • It’s a multicast packet with layer 3 destination 224.0.0.13
    • the header of the PIM packet carries the upstream neighbor address, based on the PIM neighborship
    • then it carries the number of groups it is signalling
      • then for each group (S,G)
        • it has the details of the Join and Prune entries
        • along with the IP of the neighbor

PIM: Selective mode override

On an ethernet interface, a prune from a dis-interested downstream neighbour may affect the traffic being received by an interested neighbour. The prune message does carry the RPF neighbour in its header, though.

Since the prune message is sent to 224.0.0.13, it is received by all MC-speaking routers on the segment. This gives us a way to make sure interested neighbours keep getting the MC stream.

Because the interested routers also receive the prune message from the dis-interested neighbour, an interested neighbour can send a join message to override the prune.

Keep in mind that the RPF neighbour doesn’t prune the OIL immediately; it waits for 3 seconds before pruning the interface from the OIL.

The dis-interested neighbour will now receive the traffic again, and it has to wait for the 3-minute hold timer before sending another prune message.

PIM: Assert mechanism

On an ethernet, we can have multiple MC routers and one receiver in the same broadcast domain. If the receiver is subscribed to an MC group, it will receive 2 copies of the stream – one from the 1st MC router and another one from the 2nd MC router.

To solve this issue, one of the MC routers is chosen as the designated forwarder in the broadcast domain.

If an MC router receives an MC packet for an S,G on an interface that is in its own OIL for that S,G (so the RPF check fails), it responds with an assert message. Keep in mind that a prune is sent on point-to-point links, but on ethernet we use the assert message once the RPF check fails.

inside assert message:

  • S,G which we want to signal
  • Metric preference(iOS: Administrative distance of the route to the source)
  • metric of the route to the source

The lower metric preference wins, then the lower metric; if all of the above are the same, the highest IP wins. Once one router wins the assert election, all the other routers send a prune message.
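
The election can be sketched as a single comparison key. The `Assert` tuple and the example values are made up for illustration; the ordering assumed here is lower metric preference first, then lower metric, then highest IP as the tie-breaker.

```python
import ipaddress
from typing import Iterable, NamedTuple


class Assert(NamedTuple):
    router_ip: str
    metric_preference: int   # e.g. administrative distance of the route to the source
    metric: int              # metric of the route to the source


def assert_winner(candidates: Iterable[Assert]) -> Assert:
    """Pick the router that keeps forwarding on the shared segment."""
    return min(
        candidates,
        key=lambda a: (
            a.metric_preference,
            a.metric,
            -int(ipaddress.ip_address(a.router_ip)),   # highest IP wins the tie
        ),
    )
```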

PIM: Graft

We keep the state information in all the MC routers. This makes sure that if a router has pruned itself and later gets an interested receiver, it still knows the details of the S,G and the upstream router.

Since it has the state details, all it has to do is tell the upstream router that it wants to join the tree again.

To do this, the router sends a graft message to the upstream router. This message is unicast. The upstream router replies with a graft-ack.

The graft message carries the upstream neighbor IP.

PIM: State refresh

State refresh capability has to be advertised in the hello packet. If we are not using state refresh, flood and prune repeats every 3 minutes.

  • The FHR sends State Refresh messages to all the downstream routers

PIM: Dense Mode (How it works)

  • PIM-DM creates shortest path trees
  • the root is at the source of the multicast
  • PIM-DM assumes that all the routers need the multicast feed
    • It means all (S,G) multicast traffic is forwarded on every interface in the OIL, i.e. every interface except the RPF interface (IIF)
  • A router (e.g. the LHR) that receives S,G multicast traffic it has no receivers for prunes itself from the tree. Basically it tells its upstream neighbour that it doesn’t need that particular S,G feed

PIM: Dense Mode Brief

Basically, Dense mode creates shortest path tree automatically

  • A router receives a multicast packet on the IIF
    • the router performs the RPF check on the IIF
    • the router adds all other multicast-enabled interfaces to the OIL
      • any interface which has PIM DM enabled is considered a multicast interface
    • the router transmits the packet on every interface in the OIL
  • This process starts at the first hop router (FHR)
    • It repeats until the packet reaches the LHR
  • If the LHR has receivers registered for that S,G multicast stream via IGMP, no action is taken and the packets are sent to the receivers
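
The per-router flood decision can be sketched as one function. The interface names and the `pruned` parameter are assumptions for the example; a real router keeps this state per S,G.

```python
from typing import FrozenSet, Iterable, List


def dense_mode_forward(
    pim_interfaces: Iterable[str],
    rpf_iface: str,
    arrival_iface: str,
    pruned: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Return the interfaces a dense-mode router would flood the packet to."""
    if arrival_iface != rpf_iface:
        return []   # RPF check failed: drop the packet
    # OIL = every PIM DM interface except the RPF interface and pruned ones
    return sorted(i for i in pim_interfaces if i != rpf_iface and i not in pruned)
```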

On Cisco routers, we can enable PIM per interface:

type: ip pim dense-mode

Check the MC routing table:

show ip mroute

Details of PIM: Dense Mode

Step 0: Routers form PIM neighbourships

  • All the MC-speaking routers send Hello packets to each other on the multicast-enabled interfaces
    • we configure the interfaces for PIM dense mode explicitly
  • routers become MC neighbours
  • DR selection also takes place
  • IGP is already running

Step 1: Source sends unsolicited traffic to multicast address

  • FHR gets the multicast packet (say 239.1.1.1) from the source (say 172.16.1.1). FHR makes a note of the S,G
  • FHR does the RPF check to confirm that the arrival interface matches the IGP route to the source
  • RPF neighbor: usually identified as the upstream MC neighbor. For the FHR it’s 0.0.0.0
  • FHR creates the OIL from all the DM-enabled MC interfaces except the RPF interface
  • FHR forwards on all the interfaces in the OIL. Keep in mind that all these interfaces have an MC neighborship

Step 2: All the routers send the traffic to downstream

  • FHR sends the traffic to the downstream interfaces
  • Downstream routers will check for RPF
  • Add the S,G details in the MC routing table with OIL
  • Send the traffic further

Step 3: traffic reaches LHR

  • Now, If the LHR has receiver registered, all well and good. LHR will send the traffic to the receivers
    • check the receivers:
      • show ip igmp groups
  • if not, follow step 4

Step 4: Reactive SPT pruning

  • if the LHR has no receivers registered via IGMP, the LHR sends a prune message on the IIF
  • the RPF neighbour’s address is carried in the prune message
  • the RPF neighbour receives the prune message and prunes the interface from the OIL. It basically marks the interface as pruned so that traffic is no longer forwarded on it
  • If it doesn’t have any interface left in the OIL, the router itself sends a prune message upstream
  • what if there are 2 equal-cost neighbours, because the IGP cost to reach the source is the same via both of them?
    • PIM DM will select the neighbor with the highest IP as the RPF neighbor
    • we can check the RPF neighbor using: show ip rpf <source-ip>
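
The way prunes ripple upstream can be sketched on a small tree. This models only the end result (which downstream neighbours stay in each OIL), not the prune messages or timers; the router names and the `children` map are hypothetical.

```python
from typing import Dict, List, Set


def pruned_oil(
    children: Dict[str, List[str]],
    has_receivers: Set[str],
    router: str,
    oil: Dict[str, List[str]],
) -> bool:
    """Fill oil[router] with the downstream routers that still want the stream.

    Returns True if `router` stays on the tree, i.e. it has local receivers
    or at least one downstream neighbour that did not prune.
    """
    wanted = [
        child for child in children.get(router, [])
        if pruned_oil(children, has_receivers, child, oil)
    ]
    oil[router] = wanted
    return bool(wanted) or router in has_receivers
```

With `children = {"FHR": ["R2", "R3"], "R2": ["LHR1"], "R3": ["LHR2"]}` and receivers only behind LHR1, LHR2 prunes first, which empties R3's OIL, so R3 prunes as well and the FHR is left forwarding only towards R2.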

PIM: Sparse Mode Introduction

Features: You ask and you shall receive

  • In general, multicast works when a source sends MC traffic as S,G and a receiver indicates that it is interested in receiving that MC traffic as *,G
  • the source tells the first hop router on the data plane (simply by sending traffic), whereas the receiver has to tell the last hop router(s)
  • it is then the job of the FHR, the LHR and all the MC-enabled routers to make sure the traffic gets from the source to the receiver

comparison with PIM DM:

  • Unlike PIM DM, a receiver must explicitly ask to join an MC stream
  • a totally different protocol compared to PIM DM

Rendezvous point (RP)

  • The RP is a centralised registry router which has a special role
    • RP should receive MC state information from both LHR and FHR
    • it should compare this information and work to connect them if need be
  • The FHR, the LHR, the RP and every other MC router in the network must agree on who the RP is. They all have to know it, including the RP itself

Connect LHR to the RP

  • The LHR has S,G or *,G state based on the interest of the receiver
  • The LHR sends a join message towards the RP, where the RP is the root of the tree
  • The LHR follows the RPF interface to reach the RP, as the RP address is announced via IGP
  • The join travels hop by hop: since its TTL is 1, each router processes the PIM Join message, does the RPF check towards the RP, creates state for the *,G and sends its own join upstream
  • PIM state in each router
    • The interface on which a router receives the PIM JOIN message is added to the OIL
    • The interface on which the router sends the JOIN message upstream, chosen by the RPF check, becomes the IIF
    • Once the RP receives the JOIN message, it creates *,G state with the interface on which it received the JOIN message in the OIL
    • Since the RP is the root, the JOIN process is complete at this point
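
The hop-by-hop state creation can be sketched as a walk from the LHR to the RP. The topology encoding (each router maps to its interface towards the RP, the upstream router, and the interface on which that upstream router hears the join) and all router and interface names are assumptions for the example; the sketch also assumes the path to the RP is loop-free.

```python
from typing import Dict, Tuple

# router -> (interface towards the RP, upstream router, upstream router's arrival interface)
RpfToRp = Dict[str, Tuple[str, str, str]]


def propagate_join(rpf_to_rp: RpfToRp, rp: str, lhr: str, receiver_iface: str) -> dict:
    """Return the (*,G) state each router creates as the join moves towards the RP."""
    state = {}
    router, arrival_iface = lhr, receiver_iface
    while True:
        entry = state.setdefault(router, {"iif": None, "oil": set()})
        entry["oil"].add(arrival_iface)     # where the join (or IGMP report) came in
        if router == rp:
            return state                    # the RP is the root: nothing upstream
        iif, upstream, upstream_iface = rpf_to_rp[router]
        entry["iif"] = iif                  # RPF interface towards the RP
        router, arrival_iface = upstream, upstream_iface
```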

Connect FHR to the RP

  • The concept here is that the FHR sends a PIM Register message to the RP as a unicast packet. This way, the FHR can tell the RP about a source.
  • If there are any active receivers, the RP builds the tree towards the FHR, with the source side as the root
  • The RP sends a PIM Join message towards the FHR. Each of the middle routers now creates a state entry for that S,G
  • PIM register message:
    • Source IP: FHR IP
    • Destination IP: RP IP
    • Contents: info about the S,G

Now the RP has details about the receivers and the source for a group. The RP can start creating the tree towards the source by doing the RPF check. The RP has to correlate the *,G from the receiver with the S,G signal it received from the source to create the end-to-end tree.

PIM: Sparse Mode Detailed

PIM Designated router

DR plays two roles:

  • On the first hop segment, when multiple routers are connected to the source in the same broadcast domain, the DR is responsible for sending the PIM Register message to the RP
  • On the last hop segment, when multiple routers are connected to the receiver in the same broadcast domain, the DR is responsible for sending the PIM JOIN message

R5#show ip pim neighbor
PIM Neighbor Table
Mode: B - Bidir Capable, DR - Designated Router, N - Default DR Priority,
      S - State Refresh Capable
Neighbor          Interface                Uptime/Expires    Ver   DR
Address                                                            Prio/Mode
172.26.56.6       FastEthernet0/0          01:24:39/00:01:19 v2    1 / DR S P
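
The DR election rule from the Hello section (highest DR priority, then highest IP) can be sketched as below. The function name and the local address 172.26.56.5 are illustrative assumptions; only the neighbour 172.26.56.6 appears in the output above.

```python
import ipaddress
from typing import Iterable, Tuple


def elect_dr(routers: Iterable[Tuple[str, int]]) -> str:
    """Pick the DR from (ip, dr_priority) pairs on one segment."""
    winner = max(routers, key=lambda r: (r[1], int(ipaddress.ip_address(r[0]))))
    return winner[0]
```

With equal priorities, `elect_dr([("172.26.56.5", 1), ("172.26.56.6", 1)])` picks 172.26.56.6, which is consistent with the neighbour being flagged as DR.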

PIM Join/prune message:

  • we can have S,G and *,G join/prune messages
  • JOIN messages are processed hop by hop
    • source IP: IP of the outgoing interface
    • destination IP: 224.0.0.13
    • TTL: 1
  • the JOIN message carries the upstream neighbor address, based on the RPF check
  • the JOIN message also carries the details of the groups and sources it wants to join

*,G message:

  • Used by the LHR to signal the RP
  • Upstream neighbor is the MC RPF neighbor for the RP IP
  • Multicast group: Group
  • Joined Source: the RP’s IP address
  • Each middle hop router (MHR) does the RPF check

R5#show ip mroute sparse
(*, 239.1.1.1), 00:20:47/00:02:25, RP 3.3.3.3, flags: SJC
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    FastEthernet0/0, Forward/Sparse, 00:20:47/00:02:25

R3#show ip pim rp mapping
PIM Group-to-RP Mappings

Group(s): 224.0.0.0/4, Static
    RP: 3.3.3.3 (?)
R3#

PIM register message:

The PIM register message’s primary job is to inform the RP of an active (S,G) state.

We basically tunnel the MC packet inside a unicast packet so that it can be delivered from the FHR to the RP.

This is a unicast packet with the FHR’s outgoing interface as the source and the RP IP as the destination.

  • source IP: IP of the FHR
  • Destination IP: IP of the RP

The IP packet will have the payload of the MC:

  • Source IP as S
  • G as the destination IP
  • MC payload

Each MHR simply routes/forwards the packet as it would any unicast packet. The packet is not processed by the MHR.

FHR will basically take the initial MC packet and tunnel it:

final packet: outer unicast IP header + PIM Register header + original MC packet (its IP header and payload).
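
The PIM part of that encapsulation can be sketched as follows. This builds only the Register message (PIM header, flags word, then the original packet), not the outer IP header. The layout assumed here is the PIMv2 one: version 2 and type 1 in the first byte, a checksum computed over the first 8 bytes only, and a Null-Register bit in the flags word. `build_register` is a hypothetical helper and the inner packet is treated as opaque bytes.

```python
import struct

PIM_VERSION = 2
PIM_TYPE_REGISTER = 1
NULL_REGISTER_BIT = 0x40000000   # second-highest bit of the 32-bit flags word


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_register(inner_packet: bytes, null_register: bool = False) -> bytes:
    """PIM Register = PIM header + flags + the original multicast IP packet."""
    ver_type = (PIM_VERSION << 4) | PIM_TYPE_REGISTER
    flags = NULL_REGISTER_BIT if null_register else 0
    head = struct.pack("!BBHI", ver_type, 0, 0, flags)
    checksum = internet_checksum(head)          # covers the first 8 bytes only
    return struct.pack("!BBHI", ver_type, 0, checksum, flags) + inner_packet
```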

PIM register-stop message:

Sent from the RP to the FHR as a unicast packet. No tunnelling is done. The S,G information is carried in the message.

The register-stop message is used by the RP to tell the FHR to stop sending register messages once steady state is established. The FHR then holds off for some hold time before it sends a register message again.

RP and PIM register:

case 1: the RP has no (*,G) state for the G in the register message received from the FHR.

  • The RP sends a register-stop message, as it doesn’t have any listeners.
  • The FHR will not send register messages for some fixed time
  • The RP keeps the S,G state, but no further action is taken.

R1#
*Aug  9 15:33:16.790: PIM(0): Send v2 Data-header Register to 3.3.3.3 for 172.26.1.10, group 239.1.1.1
*Aug  9 15:33:16.814: PIM(0): Received v2 Register-Stop on GigabitEthernet1/0 from 3.3.3.3
*Aug  9 15:33:16.814: PIM(0):   for source 172.26.1.10, group 239.1.1.1
*Aug  9 15:33:16.814: PIM(0): Clear Registering flag to 3.3.3.3 for (172.26.1.10/32, 239.1.1.1)
R1#

case 2: the RP has *,G state for the particular group and this is the very first register message for the G. So we have receivers

  • The RP de-capsulates the MC packet and sends it down the *,G tree (keep in mind that the register message, which is unicast, has the MC payload encapsulated)
  • The RP initiates an S,G JOIN towards the source
  • The SPT is created
  • Once the RP starts receiving the MC packets over the S,G tree, it sends a register-stop message
  • From now on, null register messages are sent and the RP responds with register-stop
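
The two cases can be condensed into a small decision function. This is a sketch of the logic as described above, with hypothetical names; the action strings are just labels, and real RPs also track timers that are ignored here.

```python
from typing import List, Set, Tuple


def rp_handle_register(
    star_g_groups: Set[str],
    native_sg: Set[Tuple[str, str]],
    source: str,
    group: str,
) -> List[str]:
    """Decide what the RP does with a Register for (source, group).

    star_g_groups: groups for which the RP has (*,G) state (interested receivers)
    native_sg:     (S,G) pairs already arriving natively over the S,G tree
    """
    if group not in star_g_groups:
        # case 1: nobody is listening
        return ["keep (S,G) state", "send Register-Stop"]
    if (source, group) in native_sg:
        # traffic already flows over the SPT, so registers are no longer needed
        return ["send Register-Stop"]
    # case 2: receivers exist and this source is new to the RP
    return [
        "decapsulate and forward down the (*,G) tree",
        "send (S,G) Join towards the source",
    ]
```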

Configuration on Arista:

PIM enabled on an interface:

show ip pim interface ethernet 2 detail 
Interface Ethernet2 address is 172.26.12.1
Vif number is 1
PIM: enabled
PIM version: 2, mode: sparse
PIM neighbor count: 1
PIM Effective DR: 172.26.12.2 
PIM Effective DR Priority: 1
PIM Effective Propagation Delay: 500 milliseconds
PIM Effective Override Interval: 2500 milliseconds
PIM Effective Tracking Support: disabled
PIM Hello Interval: 30 seconds
PIM Hello Hold Time: 105 seconds
PIM Hello Priority: 1
PIM Hello Lan Delay: 500 milliseconds
PIM Hello Override Interval: 2500 milliseconds
PIM Hello Tracking Support: disabled
PIM Hello Generation ID: 0x5fe97ccd
PIM Hello Generation ID is not required
PIM Triggered Hello Delay: 1 seconds
PIM Join-Prune Interval: 60 seconds
PIM Join-Prune Hold Time: 210 seconds
PIM Assert Timeout: 180 seconds
PIM Assert Override Interval: 3 seconds
PIM Interface Secondary Addresses: 

tcpdump -i ethernet2 verbose filter pim
tcpdump: listening on et2, link-type EN10MB (Ethernet), snapshot length 262144 bytes
15:25:39.500082 50:42:c0:7d:1e:89 > 01:00:5e:00:00:0d, ethertype IPv4 (0x0800), length 80: (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto PIM (103), length 66)
    172.26.12.1 > 224.0.0.13: PIMv2, length 46
        Hello, cksum 0x17b2 (correct)
          Hold Time Option (1), length 2, Value: 1m45s
            0x0000:  0069
          LAN Prune Delay Option (2), length 4, Value: 
            T-bit=0, LAN delay 500ms, Override interval 2500ms
            0x0000:  01f4 09c4
          DR Priority Option (19), length 4, Value: 1
            0x0000:  0000 0001
          Generation ID Option (20), length 4, Value: 0x5fe97ccd
            0x0000:  5fe9 7ccd
          Unknown Option (31), length 8, Value: 
            0x0000:  c07d 1e89 0000 000f
15:26:04.335182 50:34:e1:08:18:fb > 01:00:5e:00:00:0d, ethertype IPv4 (0x0800), length 80: (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto PIM (103), length 66)
    172.26.12.2 > 224.0.0.13: PIMv2, length 46
        Hello, cksum 0xfcb5 (correct)
          Hold Time Option (1), length 2, Value: 1m45s
            0x0000:  0069
          LAN Prune Delay Option (2), length 4, Value: 
            T-bit=0, LAN delay 500ms, Override interval 2500ms
            0x0000:  01f4 09c4
          DR Priority Option (19), length 4, Value: 1
            0x0000:  0000 0001
          Generation ID Option (20), length 4, Value: 0x5fe97ccd
            0x0000:  5fe9 7ccd
          Unknown Option (31), length 8, Value: 
            0x0000:  e108 18fb 0000 000e

SPT switchover

Initially the LHR doesn’t know about the source. The LHR only knows *,G, based on IGMP. But once the LHR receives an S,G packet via the RP, it has the source’s address. So the LHR can now send a JOIN message for the source itself and get the most optimal path to the source, bypassing the RP.

