Volume 2024, Issue 1 9696548
Research Article
Open Access

Using MSPG to Suppress Flooding in the Data-Link Layer for SDN

Guangjia Song

Corresponding Author

Guangjia Song

School of Engineering and Technology , Jiyang College of Zhejiang A&F University , Zhuji , 311800 , China

Search for more papers by this author
Hui Wang

Hui Wang

National Computer Network Emergency Response Technical Team , Coordination Center of China , Beijing , 100029 , China , cert.org.cn

Search for more papers by this author
Jianhua Hu

Jianhua Hu

School of Engineering and Technology , Jiyang College of Zhejiang A&F University , Zhuji , 311800 , China

Search for more papers by this author
First published: 13 September 2024
Academic Editor: Helena Rifà-Pous

Abstract

The media access control (MAC) address is the basis of frame forwarding in data-link layer. When MAC missing occurs in the software-defined network (SDN), the processing method is the same as that of Ethernet, which is also broadcast. The broadcast will spread and cover part or even the entire network, result in a large number of frames, which leads to a sharp increase in network traffic, even cause congestion. To suppress the layer-2 flooding, a new frame forwarding method is proposed, namely, multidevice search and path generation (MSPG). New approach will search multiple devices when MAC missing occurs and calculate a forwarding path; a set of flow table is generated to restrain the switch from broadcasting. Experiments and analysis show that in comparison with the existing frame forwarding methods in SDN, MSPG can effectively suppress flooding, greatly reduce the network load, minimize the traffic between data plane and control plane when MAC missing occurs, and reduce energy consumption significantly. The MSPG’s path calculation and flow dispatch method also decrease the forwarding delay and are more suitable for large-scale network.

1. Introduction

In local area network (LAN), many services, such as address resolution protocol (ARP), neighbor discover protocol (NDP), and dynamic host configuration protocol (DHCP and DHCPv6), are based on the same technology, namely, broadcast. Due to a single node lacking relevant knowledge of the other nodes, broadcasting messages to all the other nodes is the premise of address resolution, duplicating address detection, and so on, which will generate a large amount of network load. Whether in data center environments, or in network environments such as the Internet of Things, sensor networks, mobile IPv6 networks, and the Internet of Vehicles, nodes are always in a moving state. Due to regional changes, their network parameters also change accordingly. Therefore, processes such as DHCP, DAD, and ARP become more frequent, and broadcast traffic also increases dramatically [13]. Broadcast and network devices are inseparable. Broadcast needs the support of network devices, and the network device itself will also produce a large number of broadcasts, such as the MAC learning of the switches.

Broadcast seriously affects the performance of the network. Literature [4] shows that the traffic generated by ARP accounts for more than 86% of the total traffic in LAN. Broadcasting is also a main source of network congestion, which will also deteriorate the performance of data centers and cause considerable pressure to cloud service providers [5]. Most of the existing researches on broadcast suppression mainly focus on proxy, such as adding proxy servers in LAN or adding specific modules in software-defined network (SDN) to sniff and respond to the broadcast frames generated by the protocols. These methods can reduce the broadcast traffic, but with limitations, only applicable to specific protocols, without universality. Their deployment complexity also is high and cannot solve the flooding caused by MAC missing in switches.

The SDN provides new options for network evolution. The separation of data plane and control plane allows network devices to no longer be closed, making data forwarding and scheduling more flexible. The deployment of multiple controllers is beneficial for achieving load balancing and handling link failures, facilitating network management, and significantly improving network reliability [6]. In this paper, a new frame forwarding model for SDN is designed which is called multidevice search and path generation (MSPG). New method uses multidevice search and path generation as a resolution, which greatly reduces the flooding caused by MAC missing and effectively reduces the broadcasting traffic in LAN. The major contributions of the paper are as follows:
  • (1)

    A new flooding-suppressing model based on graph theory is proposed, and relevant theorems have been proved

  • (2)

    A novel approach for handling MAC missing in SDN is proposed

  • (3)

    The new approach is universal and not limited to specific protocols

The remainder of this paper is organized as follows: Section 2 introduces the typical research work related to LAN broadcast suppression. Section 3 introduces the detailed architecture and related algorithms used in the MSPG system. Section 4 compares and analyzes MSPG by using typical frame forwarding methods through experiments and comparisons. Finally, the last part summarizes the study.

2. Research Background

2.1. MAC Missing and Flooding

The channel of the Internet’s data-link layer has two working modes: point-to-point and broadcast. Broadcast, sometimes also known as multicast, is generally completed by a switch, which forwards a frame from all ports (excluding the frame’s entry port). This forwarding mode often causes chain reactions from other switches, resulting in flooding. Flooding can bring a large amount of load to the network, limit network expansion, and worsen network performance.

Technically, there are two main reasons for flooding: MAC missing and protocol requirements. MAC missing refers to the switch being unable to find the corresponding MAC address of the frame in its forwarding table, unable to perform point-to-point forwarding, and can only broadcast, resulting in flooding, which is called passive flooding. Another reason for flooding is that a certain protocol generates a broadcast message, requiring the switch to broadcast the message, resulting in network flooding, which we call protocol flooding.

Protocol flooding is the prerequisite of protocol operation, and the switch must complete this action, not only cannot eliminate it but also fully guarantee it. For passive flooding, it is essentially caused by a lack of knowledge, i.e., a lack of relevant entries in the switch’s forwarding table. If these entries can be supplemented, the impact of flooding can be reduced or even eliminated. This paper mainly discussed the use of multitable search and path generation in the SDN environment to compensate for missing entries in switches and reduce the harm of passive flooding to the network.

2.1.1. Frame Forwarding in Ethernet

Flooding is a data stream transfer technology used by the switch to send data to all interfaces. One switch’s flooding will often cause the simultaneous flooding of other switches, resulting in a large number of broadcast messages in LAN and subsequently increasing the network burden. MAC missing occurs when a MAC address, which is requested, does not exist in the MAC table of the switch. The disadvantage of the Ethernet switch is the production of considerable flooding when MAC missing occurs.

In the following derivation, assuming the network topology is a tree and the transmission delay between switches is negligible. Therefore, a wired network with arbitrary topology and layer-2 interoperability can be transformed to a tree G by spanning tree protocol (STP), where G = 〈V,  E,  M,  F〉.

V represents the node set, which is also known as the switch set, V = {v1, v2, ⋯, vn};

E denotes the edge set, which is also known as the links between switches, E = {e1, e2, ⋯, en−1};

M refers to the MAC address set, M = {mv1, mv2, …, mvn}, where mvi is the MAC address set owned by switch vi, 1 ≤ in;

F indicates the frame count set, F = {fv1, fv2, …, fvn}, where fvi denotes the number of frames generated when flooding occurs in switch vi, 1 ≤ in.

If a switch v receives a frame, then the destination address of the frame is MACX. If MACXmv, then v will broadcast the frame to all ports except the entry port in accordance with the forwarding principle of the Ethernet switch, that is, flooding occurs. We record the flooding in switch v triggered by MACX as missing(v, MACX) and using frame(v, MACX) to represent the number of frames generated by missing(v, MACX).

For the convenience of subsequent discussion, the meanings of related concepts are shown in Figure 1.
  • Set V is {v1,  v2, …,  v7}

  • Set E is {e1,  e2, …,  e6}

  • Set M is {mv1,  m, …,  mv7}

Details are in the caption following the image
Concepts demonstration of MSPG.

Support v1, v2, and v3 does not have MACX, but v4, v5, v6, and v7 do, and then, missing(v1, MACX) covers set {v1, v2, v3}.

Definition 1. MAC missing cover. If missing(v, MACX) occurs and results in all nodes in set V(VV and vV) flooding, then it means missing(v, MACX) covers V, and the total number of frames caused by missing(v, MACX) is flood(V).

Theorem 2. In tree G, if nodes u and v satisfy the following conditions:

  • (1)

    MACXmv

  • (2)

    and MACXmu

Then, the MAC set of each node in path p, which connects nodes u and v, should contain MACX. Hence, where vp.

Proof. On the basis of MAC learning, if f is forwarded from switch v to target switch u for frame f with the source address MACX and the forwarding path p, then f will pass through all switches on path p, and all nodes on path p will learn the MACX. Hence, where vp.

Theorem 3. In tree G, if missing(v, MACX) covers V and |V| > 1, then all nodes in V can form a subtree of G.

Proof. Suppose that nodes uV and zV and no path is available between u and z. As flooding is generated by missing(v, MACX), u and z must have their own paths to node v. Suppose their separate paths are p1 and p2, then a path p to connect u and z can be constructed, where p = p1 + p2, that is, a path exists between u and z.

Hence, the hypothesis is invalid. A path exists between all nodes in V, that is, V can form a subtree of G.

Theorem 4. In tree G, if missing(v, MACX) and missing(u, MACY) cover the same subtree g, then the number of flood frames generated by the two is equal.

Proof. For node v, where vg, when flooding occurs, the number of frames it generates is , which has nothing to do with the MAC address that triggered the flooding.

Hence, when flooding occurs at all nodes in g, the total number of frames is fixed as follows:

(1)

Theorem 5. In network G, if MACX satisfies , then missing(v, MACX) covers G.

Proof. All nodes in G are connected because G is a tree. For any node v or u, a path will exist between them. Suppose that the path between u and v is p, where |p| = l and p = vieivi+1ei+l−1u, then vi+1 on path p will receive the flood frame because the flooding frame propagates along all the edges of v when missing(v, MACX) occurs. As MACXmvi+1, then missing(vi+1, MACX) will occur and cause the flooding frame to continue propagating to the next node on the path p with the process repeating itself. The flooding frame eventually propagates along the path p to u because MACXmu. Hence, missing(u, MACX) occurs. As u is arbitrary, then flooding occurs at all nodes, indicating that missing(v, MACX) will cover G.

Theorem 6. In network G, if missing (v, MACX) occurs, flooding stops at node u, and u is a leaf, then u is removed from G to form subtree G. If missing(v, MACX) occurs and vG, then missing(v, MACX) will cover G.

Proof. Suppose that missing(v, MACX) does not cover G, then a node u exists in G that satisfies the following conditions:

  • (1)

    A path p exists between u and v, and u is the last node on p

  • (2)

As the flooding terminates at switch u, then MACXmu. Therefore, MACX appears on switches u and u. Hence, a path p exists in G that connects u and u, and vp. On the basis of Theorem 2, MACX exists in the address set of all nodes on p. Hence, MACXfv exists. That is, missing(v, MACX) will not occur, thereby contradicting the premise. Hence, missing(v, MACX) will cover G.

Corollary 7. In tree G, assume missing(v, MACX) occurs and flooding terminates at u.

If u is a branch node, taking u as the cut point, after u is deleted, G is divided into set G = (g1, g2⋯). Let node v belongs to gi, and giG. If missing(v, MACX) occurs and vgi, then missing(v, MACX) will cover gi.

Proof. If |gi| = 1, it means gi contains only one node. Hence, v is v. so if missing(v, MACX) occurs, it would cover gi.

If |gi| > 1, then gi includes multiple nodes. On the basis of Theorem 3, gi is a subtree of G. Suppose that missing(v, MACX) does not cover gi, then u must exist, satisfying ugi and . On the basis of Theorem 2, a path p exists between u and u because and MACXmu and ugi and uG. As missing(v, MACX) covers gi and terminates at node u, then vp must exist. On the basis of Theorem 2, MACXmv must exist, that is, missing(v, MACX) does not occur in v, which is contradictory to the premise. Hence, missing(v, MACX) covers gi.

Corollary 8. The flood frames generated by missing(v, MACX) in gi are equal to the flood frames generated by missing(v, MACX) in G.

Proof. Corollary 7 shows that missing(v, MACX) and missing(v, MACX) cover the same subtree gi. Hence, on the basis of Theorem 4, they produce the same amount of frames.

2.1.2. Frame Forward in SDN

SDN is mainly composed of SDN controllers, OpenFlow channels, and OpenFlow switches. OpenFlow switches generally have two working models, namely, proactive and reactive. In the proactive model, the switch can automatically process the MAC frame forwarding through preset flows without the controller’s intervention and simulates the behaviors of the Ethernet switch. In the reactive model, the switch cannot automatically process the transfer of MAC frames because it must interact with the controller [7]. For example, sending PacketIn messages to the controller will command the switch via the dispatching flows. For the SDN, many switches exist in its topology. Each switch may work in a proactive or reactive mode. Therefore, when switches with different modes exist in the SDN, frame forwarding becomes further complex.

The working mode of proactive switch is the same as that of the Ethernet switch and also completes the MAC frame forwarding via MAC learning and flooding.

When switch working in reactive mode, it has to cooperate with the controller. Generally, the controller has to maintain a table to store the dpip < MAC: inport > mappings. When a frame is received, the switch forwards it as follows:
  • Step 1: After the frame enters the switch, if frame matches the flow entries, the corresponding instructions are executed.

  • Step 2: If table missing occurs, then the switch sends PacketIn to the controller.

  • Step 3: After receiving the package message, the controller updates its dpid < MAC: in_port > mapping table on the basis of the src_MAC, in_port, and dpid fields in the PackageIn message.

  • Step 4: The controller searches on the basis of the dst_MAC field in the frame. Assuming that the dpid of the switch is X, then the controller looks up entries in the <MAC: inport > mapping table corresponding to X.

  • Step 5: If the controller finds a matching entry, then it first dispatches the forwarding flow table to the corresponding switch X and then sends the PacketOut message to the switch X. Figure 2(a) shows these processes.

If the controller does not find any entry matches with the dst_MAC, then it sends a PacketOut message requiring the switch X to flood, which may cause a chain reaction on other switches, as shown in Figure 2(b).

Details are in the caption following the image
MAC address learning and frame forwarding in SDN. (a) Normal unicast. (b) MAC missing.
Details are in the caption following the image
MAC address learning and frame forwarding in SDN. (a) Normal unicast. (b) MAC missing.

Definition 9. MAC missing. For node v in G, if v received a frame whose target MAC address is MACX and no flow entry matches the frame, then v eventually forwards the frame by flooding. Hence, MAC missing occurs at v, which is denoted by missing (v, MACX).

As the mode of frame forwarding in the proactive switch simulates the Ethernet switch, the frames generated by both switches are the same when MAC missing occurs. Hence, Theorem 10 exists.

Theorem 10. For the proactive switch, the number of frames generated by missing (v, MACX) is equal to that of the Ethernet switch.

Theorem 11. For the reactive switch, the number of frames generated by missing (v, MACX) is larger than that of the proactive switch.

Proof. In reactive switch X, if missing (v, MACX) occurs, then v will send a PacketIn message to the controller, and the controller could not find any entry corresponding to MACX in <MAC: inport > table related to X, then the controller will send a PacketOut message to v and command v to forward the frame by flooding. When MAC missing occurs in the reactive switches, besides the flooding frames, one PacketIn and one PacketOut message will also be generated in the network at least.

Hence, for reactive switch v, missing (v, MACX) produces more frames than those running under the proactive mode.

Theorem 12. Regardless of the edges between the controller and switch for Ethernet G and SDN G with the same topology, if missing (v, MACX) covers the same subtree g, then the frames generated in G is no less than those in G.

Proof. If missing (v, MACX) covers g, on the basis of Theorem 4, the number of frames generated by missing (v, MACX) in G is as follows:

(2)

In G, if {v|vg and vVp} exists, that is, G is composed entirely of proactive switches, then, on the basis of Theorem 10, the number of frames generated by missing (v, MACX) is the same as those occurring in G and is presented as follows:

(3)

If ∃v satisfies vg and vVr, because the number of frames generated by v when MAC missing is larger than that generated by the node in the same topology position in G (at least more than two frames), then the number of frames generated by missing (v, MACX) at this time is at least flood(g) + 2. Therefore, in SDN, the number of frames increases instead of decreasing.

2.2. Related Work

Traditional methods, such as subnet and virtual local area network (VLAN) division, can alleviate the harm of broadcasting, but these methods are limited in the data centers because the migration of virtual machines requires business continuity and layer-2 consistency [8, 9]. In addition, research on broadcast suppression could be divided into three categories: using new network architectures, combining SDN for proxies, and re-encode MAC address.

2.2.1. Using New Network Architecture

Many new architectures have been proposed to enhance network scalability and suppress broadcast, such as Global Environment for Network Innovation (GENI) [10], Future Internet Research and Experimentation (FIRE) [11], Japan Gigabit Network II plus (JGN2plus) [12], SOFIA [13], SDN, and Address Driven Network (ADN) [14].

Scalable Ethernet Architecture for Large Enterprises (SEATTLE) aims to expand the scale of Ethernet [15, 16]. As Spanning Tree Protocol (STP), ARP, and other service discovery protocols rely on broadcasting, the number of LAN nodes is limited by the broadcast domain and is difficult to expand, failing to meet the needs of modern enterprise applications. SEATTLE uses one-hop Distributed Hash Table (DHT) technology to store various network information (e.g., MAC, IP, and network location) in different nodes and uses unicast forwarding instead of broadcasting. SEATTLE eliminates the dependence of many protocols on broadcasting, reduces the broadcasting traffic, and optimizes redundant links. However, SEATTLE must transform the entire network. Although SEATTLE does not need to change the protocol stack on the host side, the network switches must support the link state protocol and one-hop DHT technology.

The SEATTLE has been improved in [17, 18]. As SEATTLE is incompatible with existing protocols and the distributed deployment will increase the difficulty of data-plane control, a new method, namely, floodless service discovery model based on software-defined network (FSDM), is proposed to centralize the DHCP discovery and ARP request messages, which are the most common in LAN.

DHT-enhanced RBridge (DBridge) is the combination of Transparent Interconnection of Lots of Links (TRILL) and SEATTLE [19]. DBridge has the advantage of overcoming the backward compatibility of routing bridges (RBridge) and supporting incremental deployment. DBridge can be used in a mixed environment where RBridge and Ethernet switch coexist [2022].

To suppress the broadcast traffic generated by ARP and Internet Group Management Protocol (IGMP), literature [23] proposed an adaptive broadcast and multicast traffic cutting (ABC) framework. New architecture includes four modules, namely, dispatch, Ethernet handling, ARP handling, and IGMP handling. ABC utilizes the knowledge of control plane to control the change of switch’s flow table and convert the messages that must be broadcast into unicast messages, thereby reducing the communication traffic. However, for messages that cannot be matched, the broadcast will still be performed and a copy will be sent to the controller, thereby increasing the communication overhead without any reduction.

Smart address learning (SAL) considers that in data center, the hosts of different tenants do not communicate with each other, so switches can learn selectively in the process of MAC learning, which greatly reduces the size of forwarding tables of switches and is conducive to the expansion of Ethernet [24]. But the disadvantage is that in order to carry tenant information, the frame in layer-2 will be larger than normal.

2.2.2. Combining SDN for Proxies

SDN has progressed considerably in recent years, attracting considerable attention from researchers [25]. Many studies focus on utilizing the global view of SDN controller for protocol proxy in order to reduce network traffic.

Literature [26] designed an application (App) on the basis of the Ryu controller in SDN. The Ryu controller provides network IP and MAC mapping information to the App through a shared memory. The App has ARP and DHCP modules. By monitoring the communication, when the Ryu receives PacketIn messages related to the ARP or DHCP, it forwards these messages to the corresponding module, and then the App provides the response. However, for ARP messages that cannot find their corresponding mapping, discarding is adopted, leading to a certain degree of misjudgment.

Literature [27] proposed the switch-based proxy ARP (SProxy), which aims to enable the OpenFlow switches to respond the ARP request in SDN. SProxy is an improvement of the controller-agent method, but it must maintain a global address information database in the controller. In SProxy, the switch needs reply the ARP message by using unicast. To achieve this goal, the OpenFlow switch must monitor the ARP messages and compare them with the global address information and then modify some specific fields in accordance with the flow table to construct a unicast ARP message.

Literature [28] also adopted the software proxy process, and SDN controller is also used to reduce the broadcasting generated by ARP, but if the controller does not know the target to be resolved, the whole network-wide broadcasting will occur.

Literature [29] proposed that the layer-2 should be divided into control and decision planes. The control plane is responsible for the statistics of network information, and the data plane makes forwarding decisions on the basis of this information, thereby completely eliminating broadcasting. However, no specific implementation method exists. Literature [30] presents an agent-based approach, which employs two representative technologies: frame encapsulation and broadcast processing. The agent will re-encapsulate the unicast frames between hosts and converts them into unicast frames between proxy and proxy. Broadcasting processing technology enables agent to directly answer ARP requests, thus greatly reducing the scope of broadcasting impact. The disadvantage of this method is that secondary encapsulation will lead to a lager frame, which may result in too large maximum transmission unit (MTU) and other problems.

The authors of [31] proposed a centralized ARP method for the multitenant environment in OpenStack. As OpenStack possesses the IP and MAC mapping knowledge of all tenants, the new method adds several new modules, such as ARP table and flow engine, in SDN. The controller to monitor and reply to the ARP requests to eliminate broadcasting completely.

For host-generated broadcasting messages, such as ARP request and DHCP discovery messages, scalable Ethernet architecture using software-defined networking (SEASDN) uses the controller to relay them, thereby eliminating broadcasting [32]. However, if a table missing occurs, then SEASDN will drop the message, which has certain risks.

Literature [33] redefines the relationship between server and SDN architecture by using the idea of multinode to one node. That is, servers provide services, while SDN architecture is responsible for the best path. For servers registered in SDN, the controller automatically calculates the best path and sends entries to the access switch. In this way, the server discovery message (such as ARP) in the network will be forwarded directly to the server through the best path, thus avoiding broadcasting. The disadvantage is that additional servers need to be deployed. Moreover, due to the need to monitor the running state of the server, the controller detects all servers periodically, thus incurring additional overhead.

Literature [34] proposed a new method called DR-ARP (distributed responder ARP) that divides a large-scale area into several small areas. DR-ARP has capability of broadcasting filtering and ARP messages replying, thereby reducing the broadcasting coverage. Literature [35] uses a bloom filter to monitor network flow changes, thereby preventing MITM attacks and reducing broadcast traffic caused by malicious attacks. Literature [36] uses GCN to train routing models, making packet forwarding more intelligent and accurate. But the training cost cannot be ignored, and there are no application examples at the link layer yet.

2.2.3. Re-Encode MAC Address

Some researches re-encode the 48 bits MAC address to make it meaningful and providing support for MAC routing. We re-encode the MAC to include additional information, such as localization, tenant, virtual machine, or access port information. In literature [37], its MAC is still 48 bits, and the difference is that pseudo MAC (PMAC) encodes the location information of host. Host accesses to the same switch have the same 16 bits prefix PMAC, and the real MAC of host and PMAC form a mapping on the access switch, so the switch only needs a small amount of flow to complete the forwarding of ARP packets. Switches in different zones use different PMAC prefixes, which allows MAC routing and significantly reduces broadcast traffic. However, PMAC needs centralized fabric manager and MAC table to respond to ARP requests in network. For the host whose query fails in fabric manager, ARP process needs to be done again, which will lead to longer time in address resolution.

MAC address leakage is a prerequisite for many attacks. In literature [38], virtual MAC (VMAC) is composed of real MAC, access port, and a specific network prefix, which are kept confidential through hash calculation. The real MAC of the host is only known by host itself and the access switch. Only the VMAC can be learned between switches, and the VMAC changes constantly during frame forwarding. Therefore, VMAC can effectively prevent MAC spoofing, Man in The Middle, and DoS and can effectively reduce malicious network traffic.

3. MSPG

According to Definitions 1 and 9, it can be inferred from Theorems 1012 that when missing(v, MACX) occurs at node v, it will trigger flooding. If these floods cannot be handled properly, the SDN environment will generate more traffic than Ethernet. According to Theorems 26, multiple devices may exist in the network with the address MACX, if a switch v with an address MACX nearest to switch v can be found, then a path from v to v can be built, and unicast forwarding can be used instead of flooding to reduce the network disturbance caused by missing(v, MACX). Therefore, to suppress flooding, a new MAC missing processing method in SDN is designed, namely, MSPG. The working principle of MSPG is aimed at layer-2 switches, processing broadcast traffic at the frame forwarding level, and can capture MAC missing caused by any protocol, not limited to specific protocols, thus having universality. Figure 3 shows the structure of MSPG, and its modules are described as follows:
  • (1)

    Topology Module. This module uses the Link Layer Discovery Protocol (LLDP) to learn the network topology and is responsible for monitoring network topology changes.

  • The major monitoring events include switch add and remove, link addition and deletion, and switch port state change.

  • (2)

    Flow Table Statistical Module. This module mainly includes two functions.

    • (i)

      On the basis of the network topology, the module periodically states the flow table related to MAC frame forwarding in the switches

    • (ii)

      Handling the flow table changes events in the switches

  • (4)

    MDST Module: When missing(v, MACX) occurs in switch v, this module will find the switch with MACX closest to switch v

  • (5)

    Path Generation (PG) Module: For a given start switch u and end switch v, a path p connecting u and v is generated on the basis of the network topology

  • (6)

    Flow Table Dispatching Module. This module generates and dispatches a set of flow tables to avoid MAC missing diffusion

Details are in the caption following the image
Relationship between the algorithms.

In MSPG, the controller also maintains a single-device searching table (SDST) for each OpenFlow switch.

3.1. Workflow of MSPG

For ease of description, the abbreviations that will be used later are described as follows:
  • (1)

    SDST: Single-device searching table which contains three fields: Dpid, MAC, and Port. The SDST of node i is denoted as Nodei.SDST.

  • (2)

    MDST: Multidevice searching table which contains five fields: Dpid, MAC, InPort, IdleTime, and CreateTime.

  • (3)

    Nodes: A switch (node) set, Nodes = {node1, node2, …, nodem}, where nodei = {dpid : node_obj}. dpid is the ID of the switch, and node_obj represents the switch object.

  • (4)

    Links: A set of links (edges) between the switches, Links = {link1, link2, …, linkn}, where linki = {(srcdpid, srcport) : (dstdpid, dstport)}. (srcdpid, srcport) refers to the switch ID and port number of the starting point of the edge. (dstdpid, dstport) denotes the switch ID and port number of the end switch of the edge.

  • (5)

    Path p: It represents a path from switch v to switch u, where |p| = m, p = {link1, link2, …, linkm} and ∀link ∈ Links.

  • Step 1: The controller generates the topology of network G through LLDP, where G = (Nodes, Links). Nodes are the set of switches, and Links are the set of edges. The controller dispatches the flow tables with priority of zero to all switches in G, requiring them to send a PacketIn message to the controller when flow missing occurs.

  • Step 2: If a switch (supposed as S_START) generates missing(S_START, MACX) and the corresponding frame is FrameX, then it sends a PacketIn message to the controller. After the controller receives the message, proceed to step 3.

  • Step 3: The controller searches for MACX in S_START.SDST. If an entry corresponding to MACX exists, assuming that the entry is (MACX, PortX), then it initially sends a FlowMod message to S_START and then requests S_START to forward FrameX from PortX through a PacketOut message. If no entry corresponding to MACX exists, then proceed to Step 4.

  • Step 4: The controller searches for MACX in MDST. If n(n ≥ 1) entries are related to MACX, then find an entry with the smallest CreateTime field among them. Then, we determine the switch (supposed as S_END) and port corresponding to the entry and proceed to Step 5. If n is zero, then go to Step 6.

  • Step 5: The controller generates the forwarding path p between S_START and S_END according to the network topology. Then, it generates a flow set (F − set) on the basis of the forwarding path p and sends each flow in the F − set to the corresponding switch. Finally, it requests S_START to forward FrameX.

  • Step 6: The controller sends a PacketOut message to S_START and commands S_START to flood FrameX.

3.2. Main Algorithms

For Ethernet switch running in non-SDN environment, when missing (S_START, MACX) occurs, flooding is inevitable. On the basis of Theorem 5, missing (S_START, MACX) will cover at least subgraph G and will cover at worst the entire network G. If a switch S_END exists in G, and ∃entry ∈ S_END.SDST, and entry.MAC = MACX, then flooding will terminate at S_END. On the basis of Theorem 6 and Corollary 7, missing (S_START, MACX) will cover subgraph G, which includes S_START. We can construct a path between S_START and S_END and use unicast instead of flooding because G is a tree. Therefore, in step 3, MSPG uses a multitable searching algorithm. When the search fails in S_START.SDST, the algorithm will search MACX in MDST, and if multiple entries match, then the nearest entry to MACX is selected. In this way, unicast forwarding can be used to replace flooding to reduce the network load. The workflow of this process is described in Figure 3, and Algorithms 1–4 shows the details of it.

    Algorithm 1: Multitable lookup.
  • Input: S_START, MACX

  • Output: S_END, Outport

  • If MACX >> 24 ∧ 0 × 03 = = 0 × 3:

  • Return (S_START, Flood)

  • For each entry in S_START.SDST:

  • If entry.MAC = = MACX:

  •   Out_port = entry.Port

  •   Return S_START, Outport

  • Flist = []

  • For each entry in MDST:

  • If entry. MAC = = MACX:

  •   Flist.append(entry.Dpid, entry.inport, entry.Createtime)

  • timemax = 0; S_END = None; port = 0

  • For each item in Flist:

  •  If item.Createtime > timemax:

  •   Timemax = item.Createtime

  •   Nearest = item.Dpid

  •   Port = item.inport

  • If timemax > 0:

  •  S_END = nearest

  •  Outport = port

  • Return S_END, Outport

  • Else:

  • Return S_START, FLOOD

    Algorithm 2: Path generation.
  • Input: S_START, S_END

  • Output: Path

  • If S_START.Dpid = = S_END.Dpid:

  •  Path = [S_START.Dpid]

  • Return Path

  • Else:

  •  Path 1 = []

  • While True:

  •   path 1.append (S_START.Dpid)

  •    If S_START.ancestor! = None:

  •     S_START = S_START.ancestor.Dpid

  •    Else:

  •     break

  •  path 2 = []

  • While True:

  •   path 2.append (S_END.Dpid)

  •   If S_END.ancestor! = None:

  •    S_END = S_END.ancestor.Dpid

  •   Else:

  •    break

  • For each Dpid in path 1:

  •   If Dpid in path 2:

  •    junction = Dpid

  •    break

  •  path 2.reverse()

  •  jindex 1 = path 1.index (junction)

  •  jindex 2 = path 2.index (junction)

  •  Path = path 1[0 : jindex1] + path 2[jindex2:]

  •   return Path

    Algorithm 3: Data update.
  • Input: OpenFlow_Event (ev)

  • Output: Null

  • While True:

  •  Idle and waiting for Event (ev):

  • If Switch.EnterEvent (ev):

  •   Nodes.add (ev.switch.dpid Dpid,ev.switch)

  • If link.AddEvent (ev):

  •   Links.add (ev.link)

  • If link.AddDel (ev):

  •   Links.remove (ev.link)

  • If Switch.EnterLeave (ev):

  •   Dpid = ev.switch.dpid

  •   For each node in Nodes:

  •    If node.dpid = Dpid:

  •     Nodes.remove (Dpid)

  •   For each link in Links:

  •    If link.srcdpid = = Dpid or link.dstdpid = = Dpid:

  •     Links.remove (link)

  •   For each entry in MDST:

  •    If entry.dpid = = Dpid:

  •     MDST.remove (entry)

  • If statReplyEvent (ev):

  •   For each stat in (ev.msg.body):

  •    For each entry in MDST:

  •     If stat.msg.dpid = = entry.dpid:

  •      Entry.update ()

    Algorithm 4: Generation and dispatch algorithm of F − set.
  • Input: Path, MACX

  • Output: F − set

  • N = Len (Path)

  • For each i in range (0, N):

  •  Srcdp = Path [i]

  •  Dstdp = Path [i + 1]

  • For each link in Links:

  •   If link.srcdpid = srcdp and link.dstdpid = = dstdp:

  •    dpid = Srcdp

  •    match = Match (Ether = MACX)

  •    action = Output (link.srcport)

  •    idle_timeout = 300

  •    priority = 1

  •    Flow = OFPFlowMod (dpid, idle_timeout = 180, priority, Match, Action)

  • F-set.add (Flow)

  • Return F-set

In step 4, if S_END satisfies the conditions because network G is interconnected, then a path p must exist between S_START and S_END. The purpose of Algorithm 1 is to construct the path p. In Algorithm 1, if MACX is a broadcast address, there is no need to calculate. Otherwise, it searches the SDST of the S_START and, if the corresponding entry for MACX is found, then returns the corresponding port. Otherwise, we traverse the MDST, if a switch contains an entry corresponding to MACX and adds the device to the Flist. Finally, it traverses Flist to find the switch closest to S_START.

Algorithm 2 describes the process of constructing a path p from S_START to S_END in G. In Algorithm 2, if S_START and S_END are the same, then S_START is the only node on path. Otherwise, it constructs path1 from S_START to root and then constructs path 2 from S_END to root. Finally, it identifies the intersection of path 1 and path 2 and constructs a complete path from S_START to S_END;

When the switch enters or leaves or when the switch state or the flow table changes, the maintenance module of MSPG maintains and updates the related important data, such as SDST, MDST, Nodes, and Links. In Algorithm, if a switch entry event occurs, it updates Nodes and Links. If a switch leave event occurs, it first deletes the corresponding device from Nodes, then removes all edges related to the switch in Links, and then deletes the device in MDST. If a statistical event occurs, it updates the entry corresponding to that event in MDST.

When path p is generated between S_START and S_END, the Flow_Dispatch module is responsible for generating the flow set (F − set) and dispatches each flow in the F − set to the corresponding switches, thereby avoiding the flooding of switches in path p. Algorithm 4 shows the F − set generation and dispatch algorithm. It first identifies all edges between adjacent switches on path p, then constructs a forwarding flow table, and adds it to F − set.

The relationship between the above four algorithms is shown in Figure 3. With the help of these algorithms, MSPG can effectively deal with MAC missing, complete the transition from broadcast to unicast, and reduce the broadcast overhead in LAN.

4. Experiments and Comparative Analysis

4.1. Experiments

We conducted comparative experiments between MSPG, FSDM, and Portland to test the function and effect of the new method (test platform: Ubuntu 18.04, AMD Ryzen Threadripper 3960X CPU, 32 GB Corsair DDR4 memory, Ryu Controller Ver. 4.2, Open vSwitch Ver. 2.5.2, and Mininet Ver. 2.30d1). The network topology is a typical data center fat tree structure, and the fanout is four. The experiment was conducted in multiple scenarios to test the network’s communication status when MAC missing occurred. The main metrics recorded and analyzed are as follows:
  • (i)

    Controller’s CPU and Memory overhead

  • (ii)

    Forwarding delay

  • (iii)

    Data plane’s network traffic

  • (iv)

    Data plane’s energy consumption

The experimental results are shown in Figures 4, 5, 6, 7, and 8. Figure 4 shows that the forwarding delay increases with the increase in the tree depth. However, the increments of the three methods vary. The delay increases in FSDM and Portland are significant, while that in MSPG is relatively mild. The reason is that when MAC missing occurs in SDN, the switch will send frames that cannot be processed to the controller through packet-in message. In FSDM and Portland, the controller will use a packet-out message to request the switch to broadcast. The process of sending packet-in and receiving packet-out increases the number of repetitions with the increase in the length of the forwarding path. Consequently, the delay increases.

Details are in the caption following the image
Comparison of forwarding delay.
Details are in the caption following the image
Comparison of CPU overhead.
Details are in the caption following the image
Comparison of memory overhead.
Details are in the caption following the image
Comparison of network traffic.
Details are in the caption following the image
Comparison of energy consumption.

Meanwhile, in MSPG, the process of sending packet-in and receiving packet-out does not repeat with the increase in the path length. The controller will use a path generation module to prevent the spread of MAC missing. The packet-in and packet-out processes will only occur on the first switch.

The comparison of the three methods in terms of CPU overhead is shown in Figure 5. The CPU overhead of processing MAC missing is slightly high in MSPG. When MAC missing occurs, the switch will send packet-in to the controller. The controller will traverse SDST and MDST to find devices with missing MAC and generate a path and a set of flow tables through calculation. Thereafter, these flow tables are deployed to the corresponding switches to reduce the interference of MAC missing on the network, resulting in a higher CPU overhead.

In FSDM, although the switch will send a packet-in message to the controller when MAC missing occurs, the controller only looks up its own host-information table and guides the switch to broadcast. Accordingly, the overhead is lower than MSPG. The method of handling MAC missing in Portland is similar to that in FSDM, with a lower CPU overhead than that of the MSPG. However, the FSDM’s hash calculation results in a slightly higher CPU overhead than in Portland.

In Figure 6, MSPG has a slightly higher memory overhead. MSPG has to maintain MDST and multiple SDST to accelerate the completion of MAC address traversal and path generation. Moreover, the size of the tables increases with the increase in the network topology depth, resulting in a higher memory usage for handling MAC missing. Portland must maintain a PMAC table with entry length lowers than the host-info table in FSDM; hence, its memory overhead is slightly lower than that in FSDM. The data structures that FSDM must maintain include the host-info-table and the location-info table. The consistency of these two tables must be maintained at all times. FSDM also needs to store the hash of the IP address, and its memory overhead is lower than that of MSPG and higher than that of Portland.

In Figure 7, repeated experiments are conducted at different topology depths, and the number of frames generated by MAC missing between MSPG, Portland, and FSDM is tested and compared. It shows that the number of frames generated by Portland and FSDM increase remarkably with the network scale. Due to the lack of a mechanism to handle MAC missing, the numbers of broadcast frames generated by Portland and FSDM are close to the number of switch ports. In contrast, only a slight traffic increase occurs in MSPG. When MAC missing occurs and covers subgraph G, flooding will occur in all nodes of G. However, MSPG will construct a path by searching MDST, generate a flow set, and dispatch it. In most cases, only a small number of PacketIn, PacketOut, and FlowMod messages are generated. Thus, MSPG effectively suppresses the generation of broadcast frames and greatly reduces the broadcasting traffic.

The energy consumption of the three methods when MAC missing occurs is shown in Figure 8, with MSPG forwarding energy consumption as a benchmark. The switch’s energy consumption mainly lies in frame forwarding and cyclic redundancy check (CRC). The energy consumption of forwarding is 70 times that of CRC, according to the actual measurements. One switch’s MAC missing will trigger flooding, causing a chain reaction of other switches, resulting in a surge in data plane energy consumption, which is positively correlated with an increase in traffic. When MAC missing occurs in MSPG, the controller will suppress flooding through path calculation and dispatch flow tables; thus, its energy consumption is much lower than those in FSDM and Portland. From Figures 7 and 8, it can be seen that MSPG is suitable for large-scale networks in terms of energy consumption and network traffic control.

4.2. Comparison and Analysis

We compare MSPG with several typical approaches in the aspects of add/modify protocols, additional facility, compatible with switch, topology traversal ability, path generation ability, and redeploy network. The compared results are presented in Table 1.

Table 1. Comparison of classic broadcast suppress approaches.
Approach Add/modify protocol Additional facility Compatible with switch Topology traversal ability Path generation ability Redeploy network
FSDM [18] n n y n n n
SSED [33] n y y y y y
SEATTLE [15] y n n n y y
Portland [37] n y y n n y
SAL [24] y n n n n y
EtherAgent [30] n y y n n y
MSPG n n y y y n

In terms of protocol modification: SEATTLE uses one-hop DHT for frame forwarding, and it needs to add a variety of new protocols in layer-2 and layer-3 and not compatible with Ethernet switch or OpenFlow enabled switch. In SAL, switch needs to encapsulate application and location address information in the frame for other switches to learn, so it is necessary to redesign its layer-2 protocol.

In terms of network facility: Servers under software-defined network architectures to eliminate discovery messages (SSED) redefine the relationship between servers and SDN architecture. In order to reduce the broadcasting caused by service discovery protocols, SSED deploys a variety of dedicated servers in the network, such as ARP servers, and DHCP servers. In addition, it also needs to consider the deployment location of the servers. Portland also needs to add a server to deploy fabric manager to manage global address information. EtherAgent needs to add agents to support the response of ARP messages and the relay of DHCP messages. Floodless service discovery model (FSDM) adds both ARP module and DHCP module to the controller, and it completes address registration by monitoring the DHCP message and host’s ARP behavior, so that controller can complete ARP message forgery response and DHCP message relay. However, FSDM lacks global topology traversal and path generation capabilities.

In terms of compatibility and network deployment, EtherAgent needs to redesign the network and divide it into several regions which are represented by different agents. In order to transfer of application and location information, SAL needs to design new protocols in layer-2, and the address resolution table of switch also needs to be modified to store the virtual machine information of tenant. These are all different from SDN switches, so the network needs to be upgraded. Because SEATTLE uses one-hop DHT for forwarding and routing, the existing switches cannot support it, and it also needs to divide the network into multiple edge-areas and one backbone-area, which need to redesign the network.

In terms of topology traversal and path generation, SSED has the ability of topology awareness. When forwarding ARP messages, different actions can be taken according to the type of the message and the sender. If the reply message is sent by the ARP server, the forwarding path between the starting point and the ending point can be calculated according to topology, and then unicast forwarding can be carried out.

Compared with the above methods, MSPG does not need to modify the network protocols, add network devices or servers, and redeploy the network, compatible with OpenFlow enabled switches. MSPG also has the ability of topology traversal and path generation; it is an efficient and easy-to-deploy forwarding method in layer-2.

5. Conclusion

The creative architecture of SDN makes it different from the previous network. In the traditional network, the physical hardware of the device is remarkably coupled with the system software, making the device an independent and closed object that supports the limited centralized monitoring and management. The three-tier architecture of SDN allows changes in the network working mechanism and protocol design. To reduce flooding caused by MAC missing, MSPG is designed in this study. The advantage of MSPG is the use of the controller to coordinate the processing of MAC missing from a global perspective and transforms the flooding to unicast among switches. When the network size is small, the forwarding delay of MSPG is close to Ethernet. However, the increase in network scale increases the coverage of MAC missing and forwarding path, thereby showing the advantages of MSPG. When MAC missing occurs, MSPG significantly reduces not only the number of frames but also the traffic between the data and control planes. The forwarding delay of MSPG is also lower than normal SDN. MSPG is compatible with existing OpenFlow networks and does not need to upgrade the protocol stack of hosts and network devices, and it is an efficient plug-and-play flood suppress solution. However, MSPG maintains Nodes, Links, and other data on the control plane, thereby causing the increase in memory overhead, multitable search, and path calculation and subsequently increasing the CPU overhead. Nevertheless, MSPG remarkably reduces the overall network load and energy consumption, so it is more suitable for large-scale networks. The future work will focus on three aspects: (1) optimize multitable lookup and path generation algorithms to reduce CPU overhead; (2) expand the OpenFlow by using MSPG as a supplementary module to standardize the flooding processing mechanism; (3) implement hardware prototypes for MSPG controllers and switches.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

This work was supported by the National Social Science Foundation of China (Grant no. 17BGL232) and Basic Public Welfare Research Program of Zhejiang Province (Grant no. LGG22F020002).

    Data Availability

    The data used to support the findings of this study are included within the article.

      The full text of this article hosted at iucr.org is unavailable due to technical difficulties.