The previous equation can determine the number of packets lost globally in the monitored network, exploiting only the data provided by the counters in the input and output nodes.
In addition, it is also possible to leverage the data provided by the other counters in the network to converge on the smallest identifiable subnetworks where the losses occur. These subnetworks are named "clusters".
A cluster graph is a subnetwork of the entire monitoring network graph that still satisfies the packet loss equation (introduced in the previous section), where PL in this case is the number of packets lost in the cluster. As for the entire monitoring network graph, the cluster is defined on a per-flow basis.
For this reason, a cluster should contain all the arcs emanating from its input nodes and all the arcs terminating at its output nodes. This ensures that we can count all the packets (and only those) exiting an input node again at the output node, whatever path they follow.
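The closure condition above can be checked mechanically. The following is a minimal sketch, not part of the method itself: the function name `is_closed_cluster`, the representation of arcs as `(start, end)` pairs, and the nodes A to D are all assumptions made for illustration.

```python
from typing import Iterable, Tuple

Link = Tuple[str, str]  # (starting node, ending node) of a unidirectional arc


def is_closed_cluster(all_links: Iterable[Link], candidate: Iterable[Link]) -> bool:
    """Return True if `candidate` contains every arc leaving its input nodes
    and every arc entering its output nodes."""
    all_links = set(all_links)
    candidate = set(candidate)
    if not candidate or not candidate <= all_links:
        return False
    input_nodes = {src for src, _ in candidate}
    output_nodes = {dst for _, dst in candidate}
    # Every arc of the graph that touches an input node (as its start) or an
    # output node (as its end) must belong to the candidate subnetwork.
    required = {
        (src, dst)
        for src, dst in all_links
        if src in input_nodes or dst in output_nodes
    }
    return required <= candidate


# Hypothetical four-node graph: A and B are inputs, C and D are outputs.
HYPOTHETICAL_LINKS = [("A", "C"), ("A", "D"), ("B", "D")]

whole_graph_ok = is_closed_cluster(HYPOTHETICAL_LINKS, HYPOTHETICAL_LINKS)
# Leaving out B->D breaks the condition: packets from B would be counted at D
# without having been counted at any input node of the candidate.
partial_ok = is_closed_cluster(HYPOTHETICAL_LINKS, [("A", "C"), ("A", "D")])
```

In this sketch, the partial candidate fails because an arc terminating at one of its output nodes is missing, which is exactly the situation the condition is meant to exclude.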
In a completely monitored unidirectional network (a network where every network interface is monitored), each network device corresponds to a cluster, and each physical link corresponds to two clusters (one for each device).
Clusters can have different sizes depending on the flow-filtering criteria adopted.
Moreover, clusters can sometimes be simplified. For example, when two monitored interfaces are separated by a single router (one is the input interface, the other is the output interface, and the router has only these two interfaces), it is possible to consider a single measurement point instead of counting the packets twice, upon entering and upon leaving. In this case, the internal packet loss of the router is not taken into account.
It is worth highlighting that it might also be convenient to define clusters based on the topological information so that they are applicable to all the possible flows in the monitored network.
A simple algorithm can be applied in order to split our monitoring network into clusters. This can be done for each direction separately. The clusters partition is based on the monitoring network graph, which can be valid for a specific flow or can also be general and valid for the entire network topology.
It is a two-step algorithm:
- Group the links where there is the same starting node;
- Join the grouped links with at least one ending node in common.
Because the links are unidirectional, the first step lists every link as a connection between two nodes and groups the links that have the same starting node. The procedure works regardless of which link is processed first. The second step then joins, where applicable, the groups formed in the first step by looking at their ending nodes: if different groups have at least one ending node in common, they are merged into the same set. After both steps, each resulting set of links, together with its endpoint nodes, constitutes a cluster.
In our monitoring network graph example, it is possible to identify the clusters partition by applying this two-step algorithm.
The first step identifies the following groups:
- Group 1: (R1-R2), (R1-R3), (R1-R10)
- Group 2: (R2-R4), (R2-R5)
- Group 3: (R3-R5), (R3-R9)
- Group 4: (R4-R6), (R4-R7)
- Group 5: (R5-R8)
The second step then builds the clusters partition (note that Groups 2 and 3 are joined because they share the ending node R5):
- Cluster 1: (R1-R2), (R1-R3), (R1-R10)
- Cluster 2: (R2-R4), (R2-R5), (R3-R5), (R3-R9)
- Cluster 3: (R4-R6), (R4-R7)
- Cluster 4: (R5-R8)
The flow direction considered here is from left to right. The same reasoning can be applied to the opposite direction, and in this example, it yields the same clusters partition.
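The two-step algorithm and the example above can be sketched as follows. This is one possible rendering under stated assumptions, not a reference implementation: the function names `build_clusters` and `undirected` and the representation of links as `(start, end)` pairs are chosen here for illustration. The joining step is repeated until no two groups share an ending node.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Link = Tuple[str, str]  # (starting node, ending node)


def build_clusters(links: Iterable[Link]) -> List[FrozenSet[Link]]:
    # Step 1: group the links that have the same starting node.
    by_start: Dict[str, Set[Link]] = defaultdict(set)
    for src, dst in links:
        by_start[src].add((src, dst))
    groups: List[Set[Link]] = list(by_start.values())

    # Step 2: join groups that have at least one ending node in common.
    # Restart after every join, because a joined group may share an ending
    # node with a group that was already examined.
    joined = True
    while joined:
        joined = False
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                ends_i = {dst for _, dst in groups[i]}
                ends_j = {dst for _, dst in groups[j]}
                if ends_i & ends_j:
                    groups[i] |= groups.pop(j)
                    joined = True
                    break
            if joined:
                break
    return [frozenset(group) for group in groups]


def undirected(clusters: Iterable[FrozenSet[Link]]) -> Set[FrozenSet[FrozenSet[str]]]:
    # Drop the direction of each link so partitions can be compared.
    return {frozenset(frozenset(link) for link in cluster) for cluster in clusters}


# Links of the monitoring network graph example, left to right.
EXAMPLE_LINKS = [
    ("R1", "R2"), ("R1", "R3"), ("R1", "R10"),
    ("R2", "R4"), ("R2", "R5"),
    ("R3", "R5"), ("R3", "R9"),
    ("R4", "R6"), ("R4", "R7"),
    ("R5", "R8"),
]

forward = build_clusters(EXAMPLE_LINKS)
# Opposite direction: every link is reversed before running the same steps.
reverse = build_clusters([(dst, src) for src, dst in EXAMPLE_LINKS])
same_partition = undirected(forward) == undirected(reverse)
```

Running the sketch on the example links produces the four clusters listed above, and the reversed links produce the same partition once link direction is ignored.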
In the end, the following 4 clusters are obtained:
Cluster 1
                +------+
                <> R2 <>---
              / +------+
             /
   +------+ / +------+
---<> R1 <>---<> R3 <>---
   +------+ \ +------+
             \
              \
               \
                \
                 \
                  \
                   \
                    \
                     \ +------+
                       <> R10 <>---
                       +------+
Cluster 2
   +------+     +------+
---<> R2 <>-----<> R4 <>---
   +------+ \   +------+
             \
   +------+   \ +------+
---<> R3 <>-----<> R5 <>---
   +------+ \   +------+
             \
              \
               \
                \
                 \ +------+
                   <> R9 <>---
                   +------+
Cluster 3
               +------+
               <> R6 <>---
             / +------+
   +------+ /
---<> R4 <>
   +------+ \
             \ +------+
               <> R7 <>---
               +------+
Cluster 4
   +------+
---<> R5 <>
   +------+ \
             \ +------+
               <> R8 <>---
               +------+
Some clusters have exactly two nodes, and others have more. In a two-node cluster (Cluster 4), the loss can be attributed to the single link. In a cluster with more than two nodes (Clusters 1, 2, and 3), the loss can be attributed to the cluster, but it is not possible to know on which link it occurred.
In this way, the calculation of packet loss can be made on a cluster basis. Note that the packet counters for each marking period permit calculating the packet rate on a cluster basis, so Committed Information Rate (CIR) and Excess Information Rate (EIR) could also be deduced on a cluster basis.
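As an illustration of the per-cluster calculation, the sketch below assumes that the packet loss equation takes the form "sum of the input counters minus sum of the output counters" for one marking period. The counter values, the 10-second period, the function names, and the use of node names as measurement-point keys are all hypothetical.

```python
from typing import Iterable, Mapping


def cluster_packet_loss(
    counters: Mapping[str, int],
    input_points: Iterable[str],
    output_points: Iterable[str],
) -> int:
    # Assumed form of the loss equation: packets counted entering the cluster
    # minus packets counted leaving it, for the same marking period.
    entered = sum(counters[p] for p in input_points)
    left = sum(counters[p] for p in output_points)
    return entered - left


def packet_rate(packets: int, period_seconds: float) -> float:
    # Average packet rate over one marking period, in packets per second.
    return packets / period_seconds


# Hypothetical counter values for one marking period.
COUNTERS = {"R4": 1000, "R6": 600, "R7": 395, "R5": 800, "R8": 800}
MARKING_PERIOD_SECONDS = 10.0  # assumed period length

loss_cluster_3 = cluster_packet_loss(COUNTERS, ["R4"], ["R6", "R7"])
loss_cluster_4 = cluster_packet_loss(COUNTERS, ["R5"], ["R8"])
rate_into_cluster_3 = packet_rate(COUNTERS["R4"], MARKING_PERIOD_SECONDS)
```

With these made-up values, Cluster 3 loses 5 packets somewhere on its two links, Cluster 4 loses none, and the traffic entering Cluster 3 averages 100 packets per second over the period.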
By combining some clusters into a new connected subnetwork (called a "super cluster"), the packet loss rule still holds.
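A short worked case may help show why the rule survives the combination. It assumes the simplified setting with a single measurement point per node, writes C_{Rx} for the (hypothetical) counter at node Rx during one marking period, and assumes the loss equation has the form "input counters minus output counters". Combining Cluster 2 and Cluster 3 of the example, the counter at the shared node R4 cancels out:

```latex
\begin{align*}
PL_{2}   &= C_{R2} + C_{R3} - C_{R4} - C_{R5} - C_{R9} \\
PL_{3}   &= C_{R4} - C_{R6} - C_{R7} \\
PL_{2+3} &= C_{R2} + C_{R3} - C_{R5} - C_{R9} - C_{R6} - C_{R7} = PL_{2} + PL_{3}
\end{align*}
```

Under these assumptions, the super cluster has inputs R2 and R3 and outputs R5, R9, R6, and R7, and its loss is the sum of the losses of the clusters that compose it.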
In this way, in a very large network, there is no need to configure detailed filter criteria to inspect the traffic. You can check a multipoint network and, in case of problems, go deep with a step-by-step cluster analysis, but only for the cluster or combination of clusters where the problem happens.
In summary, once a flow is defined, the algorithm to build the clusters partition is based on topological information; therefore, it considers all the possible links and nodes crossed by the given flow, even if there is no traffic. So, if the flow does not enter or traverse all the nodes, the counters have a nonzero value for the involved nodes and a zero value for the other nodes without traffic; but in the end, all the formulas are still valid.
The algorithm described above is an iterative clustering algorithm, but it is also possible to apply a recursive clustering algorithm by using the node-node adjacency matrix representation [IEEE-ACM-ToN-MPNPM].
The complete and mathematical analysis of the possible algorithms for clusters partition, including the considerations in terms of efficiency and a comparison between the different methods, is in the paper [IEEE-ACM-ToN-MPNPM].