RHEL - Network Teaming

A good technique to support fault tolerance in production server environments is to bond/team multiple physical interfaces together. This way, if one interface fails, another is ready (with the same IP) to provide connectivity. Another benefit is higher performance, using the combined bandwidth of the member interfaces. The result is a single logical link (at layer 2).

RHEL and CentOS support two methods to aggregate multiple links: bonding and teaming. Bonding is the somewhat legacy method, whereas teaming is the shiny new one. At a very high level, the bonding driver has limited flexibility (everything happens at kernel level), whereas the team driver was developed to improve on this limitation. Side by side, the team driver comes out ahead of the bond driver, so we’ll be configuring link aggregation with the team driver. More info on the comparison of the drivers here: https://rhelblog.redhat.com/2014/06/23/team-driver/

If you have existing bonded interfaces and wish to migrate them to teamed interfaces, RHEL and CentOS provide a tool for exactly this requirement.

[root@server ~]# man -f bond2team
bond2team (1)        - Converts bonding configuration to team
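
A rough sketch of what a conversion might look like, assuming a hypothetical existing bond named bond0 (check the bond2team man page for the full option list before running this against a real config):

[root@server ~]# bond2team --master bond0 --rename team0   # bond0/team0 are placeholders, not part of this setup
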
Let’s get started:

The first few steps below are just to ensure all the necessary packages are installed and the driver is loaded. This should be done by default, but it’s best to double-check.

Check that the ‘teamd’ package is installed:

[root@server ~]# yum list installed teamd
Installed Packages
teamd.x86_64                         1.9-15.el7                  @anaconda

If not, go ahead and install it:

[root@server ~]# yum install -y teamd

Load the team module.

[root@server ~]# modprobe team
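
If you want to confirm the module actually loaded, a quick check with plain old lsmod does the job:

[root@server ~]# lsmod | grep team   # the team module should be listed once loaded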

You can get some info on the module with ‘modinfo’.

[root@server ~]# modinfo team

filename:       /lib/modules/3.10.0-123.el7.x86_64/kernel/drivers/net/team/team.ko
alias:          rtnl-link-team
description:    Ethernet team device driver
author:         Jiri Pirko
license:        GPL v2
srcversion:     39F7B52A85A880B5099D411
depends:
intree:         Y
vermagic:       3.10.0-123.el7.x86_64 SMP mod_unload modversions
signer:         Red Hat Enterprise Linux kernel signing key
sig_key:        00:AA:5F:56:C5:87:BD:82:F2:F9:9D:64:BA:83:DD:1E:9E:0D:33:4A
sig_hashalgo:   sha256

Ok, so… Some background. My virtual machine has three interfaces:

eno16777736 is my management interface, with an IP of 192.168.188.132/24 already assigned to it. This is the interface we’re using to connect to the device, and other than providing that connectivity, it won’t be used in this tutorial.

eno33554960 and eno50332184 are the two interfaces we’re going to team together. The team’s IP address will be added to the new team interface we’ll create shortly. As a matter of interest, take note of these two interfaces’ MAC addresses; I’ll point out something about them once they’ve been added to the team.

[root@server ~]# ip address show
[...]
2: eno16777736:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:21:dc:fb brd ff:ff:ff:ff:ff:ff
inet 192.168.188.132/24 brd 192.168.188.255 scope global dynamic eno16777736
valid_lft 1715sec preferred_lft 1715sec
inet6 fe80::20c:29ff:fe21:dcfb/64 scope link
valid_lft forever preferred_lft forever
3: eno33554960:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:21:dc:05 brd ff:ff:ff:ff:ff:ff
4: eno50332184:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:21:dc:0f brd ff:ff:ff:ff:ff:ff

The official Red Hat documentation states that an active link cannot be added to a team, although in my experience I’ve never had an issue with this. If you do run into a problem, try shutting those interfaces down before adding them to the team.

[root@server ~]# ip link set eno33554960 down
[root@server ~]# ip link set eno50332184 down
[root@server ~]# ip address show
[...]
2: eno16777736:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:21:dc:fb brd ff:ff:ff:ff:ff:ff
inet 192.168.188.132/24 brd 192.168.188.255 scope global dynamic eno16777736
valid_lft 1496sec preferred_lft 1496sec
inet6 fe80::20c:29ff:fe21:dcfb/64 scope link
valid_lft forever preferred_lft forever
3: eno33554960:  mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
link/ether 00:0c:29:21:dc:05 brd ff:ff:ff:ff:ff:ff
4: eno50332184:  mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
link/ether 00:0c:29:21:dc:0f brd ff:ff:ff:ff:ff:ff

Before we configure the team, let’s select a runner. A runner is essentially the team ‘mode’: the method the team uses to distribute traffic across its ports, chosen based on your requirements. We can find a list of available runners, with a description of each, in the teamd.conf man page. Man pages are your friends; they are extremely helpful, especially in certification exams. If you don’t use them already, you will learn to love them.

[root@server ~]# man teamd.conf
broadcast -- Simple runner which directs the team device to transmit packets via all ports.

roundrobin -- Simple runner which directs the team device to transmit packets in a round-robin fashion.

activebackup -- Watches for link changes and selects active port to be used for data transfers.

loadbalance -- To do passive load balancing, runner only sets up BPF hash function which will determine port for packet transmit. To do active load balancing, runner moves hashes among available ports trying to reach perfect balance.

lacp -- Implements 802.3ad LACP protocol. Can use same Tx port selection possibilities as loadbalance runner.

For this tutorial we’re going to go with the ‘activebackup’ runner.

We’re going to turn to the man pages again to get examples of the commands we need to execute. In the ‘nmcli-examples’ man page, Example 7 shows how to configure a team interface. Copy these example commands to a text editor, and make a few minor changes to match our setup.

[root@server ~]# man nmcli-examples
[…snippet…]
Example 7. Adding a team master and two slave connection profiles
$ nmcli con add type team con-name Team1 ifname Team1 config team1-master-json.conf
$ nmcli con add type team-slave con-name Team1-slave1 ifname em1 master Team1
$ nmcli con add type team-slave con-name Team1-slave2 ifname em2 master Team1

Configuring the team:

First we’re going to add the team interface itself. Don’t worry about assigning it an IP address; we’ll do that shortly. Instead of referencing a file for the JSON config, I just add the runner config directly in the command. If you struggle to remember the syntax for the interface config part, you can find examples in /usr/share/doc/teamd-X/example_configs/

[root@server ~]# nmcli con add type team con-name team1 ifname team1 config '{"runner": {"name": "activebackup"}}'
Connection 'team1' (8010bf0f-51b1-4dc3-9321-2ff0548bf844) successfully added.

Note the space after every colon in the JSON when defining the runner:

'{"runner": {"name": "activebackup"}}'

Use the examples in the path above if you’d rather define a file to reference.
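
For reference, a minimal activebackup config file could look something along these lines, with the file then passed to the config argument instead of the inline JSON (the file name below is purely illustrative, not part of this setup):

{
    "runner": {"name": "activebackup"},
    "link_watch": {"name": "ethtool"}
}

[root@server ~]# nmcli con add type team con-name team1 ifname team1 config /root/team1-activebackup.conf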

Next, we’ll add the first interface to the team (referencing the master team1 from the previous command), giving it the name team1-slave1.

[root@server ~]# nmcli con add type team-slave con-name team1-slave1 ifname eno33554960 master team1
Connection 'team1-slave1' (e81a7abe-32ea-4cb3-afdd-9aa16f4d88e1) successfully added.

Then the last interface, giving it the name team1-slave2, once again referencing the master team interface.

[root@server ~]# nmcli con add type team-slave con-name team1-slave2 ifname eno50332184 master team1
Connection 'team1-slave2' (3ce37264-58e7-4f87-9bbe-b009e193686c) successfully added.

Let’s look at the connections we added. For cleanliness and clarity, you might want to give your slave connections names that include the device they belong to, e.g. team1-slave1-eno33554960 (there’s a quick sketch of this after the listing below).

[root@server ~]# nmcli con sh
NAME          UUID                                  TYPE            DEVICE
team1-slave2  3ce37264-58e7-4f87-b009e193686c  802-3-ethernet  eno50332184
team1-slave1  e81a7abe-32ea-4cb3-9aa16f4d88e1  802-3-ethernet  eno33554960
team1         8010bf0f-51b1-4dc3-2ff0548bf844  team            team1
eno16777736   271159d5-37d0-4bd9-23f96c591bca  802-3-ethernet  eno16777736
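
As a sketch of that naming convention, the first slave could have been added like this instead; the longer connection name is purely cosmetic and functionally identical to what we did above:

[root@server ~]# nmcli con add type team-slave con-name team1-slave1-eno33554960 ifname eno33554960 master team1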

We can now bring these new connections up: first the slaves, then the master.

[root@server ~]# nmcli con up team1-slave1
Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/8)

[root@server ~]# nmcli con up team1-slave2
Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/9)

[root@server ~]# nmcli con up team1
Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/21)

An interesting observation: once the team interfaces (master and slaves) are added and everything is brought up, you will notice that the team and its member interfaces (three interfaces in our case, counting team1 itself) all display the same MAC address. This is part of the fault tolerance functionality built into the teaming driver.

[root@server ~]# ip a s
[...]
2: eno16777736: mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:21:dc:fb brd ff:ff:ff:ff:ff:ff
inet 192.168.188.132/24 brd 192.168.188.255 scope global dynamic eno16777736
valid_lft 980sec preferred_lft 980sec
inet6 fe80::20c:29ff:fe21:dcfb/64 scope link
valid_lft forever preferred_lft forever
3: eno33554960: mtu 1500 qdisc pfifo_fast master team1 state UP qlen 1000
link/ether 00:0c:29:21:dc:05 brd ff:ff:ff:ff:ff:ff
4: eno50332184: mtu 1500 qdisc pfifo_fast master team1 state UP qlen 1000
link/ether 00:0c:29:21:dc:05 brd ff:ff:ff:ff:ff:ff
6: team1: mtu 1500 qdisc noqueue state UP
link/ether 00:0c:29:21:dc:05 brd ff:ff:ff:ff:ff:ff
inet6 fe80::20c:29ff:fe21:dc05/64 scope link
valid_lft forever preferred_lft forever
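
If you ever want to check the team’s MAC address without wading through the full ip output, reading it straight from sysfs works too (standard sysfs path, nothing teaming-specific):

[root@server ~]# cat /sys/class/net/team1/address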

Configuring an IP address for team1 is done exactly like for a normal interface. I’m going to give it an address from the same range as the management interface. Once the address is added, we can set the method to manual and ensure that the interface comes up when the server boots.

[root@server ~]# nmcli con mod team1 ipv4.addresses "192.168.188.111/24 192.168.188.2"
[root@server ~]# nmcli con mod team1 ipv4.method manual
[root@server ~]# nmcli con mod team1 connection.autoconnect yes
[root@server ~]# systemctl restart network
[root@server ~]# ip a s
[...]
2: eno16777736: mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:21:dc:fb brd ff:ff:ff:ff:ff:ff
inet 192.168.188.132/24 brd 192.168.188.255 scope global dynamic eno16777736
valid_lft 1797sec preferred_lft 1797sec
inet6 fe80::20c:29ff:fe21:dcfb/64 scope link
valid_lft forever preferred_lft forever
3: eno33554960: mtu 1500 qdisc pfifo_fast master team1 state UP qlen 1000
link/ether 00:0c:29:21:dc:05 brd ff:ff:ff:ff:ff:ff
4: eno50332184: mtu 1500 qdisc pfifo_fast master team1 state UP qlen 1000
link/ether 00:0c:29:21:dc:05 brd ff:ff:ff:ff:ff:ff
7: team1: mtu 1500 qdisc noqueue state UP
link/ether 00:0c:29:21:dc:05 brd ff:ff:ff:ff:ff:ff
inet 192.168.188.111/24 brd 192.168.188.255 scope global team1
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe21:dc05/64 scope link tentative dadfailed
valid_lft forever preferred_lft forever
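
A quick note on the address syntax above: on newer NetworkManager releases the gateway is set through its own property rather than being appended to ipv4.addresses, so if the combined form is rejected on your system, something along these lines should work instead:

[root@server ~]# nmcli con mod team1 ipv4.addresses 192.168.188.111/24 ipv4.gateway 192.168.188.2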

The teamdctl command can be used to check the state of the team interface; you’ll notice the runner type is also displayed in this output. It’s worth noting that this command can also take the -v flag for verbose output. Try it…

[root@server ~]# teamdctl team1 state
setup:
  runner: activebackup
ports:
  eno33554960
    link watches:
      link summary: up
      instance[link_watch_0]:
        name: ethtool
        link: up
  eno50332184
    link watches:
      link summary: up
      instance[link_watch_0]:
        name: ethtool
        link: up
runner:
  active port: eno33554960
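
teamdctl can also dump the JSON configuration the daemon is actually running with, which is handy for confirming your runner config really took effect (a quick sketch, output not shown here):

[root@server ~]# teamdctl team1 config dump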

Testing:

Cool. So that’s it… Let’s test this bad boy! From another node in the same network (or the host machine if you’re running a VM), we’re going to ping the IP address that we assigned to our team interface.

[root@client ~]# ping 192.168.188.111
PING 192.168.188.111 (192.168.188.111) 56(84) bytes of data.
64 bytes from 192.168.188.111: icmp_seq=1 ttl=64 time=0.654 ms
64 bytes from 192.168.188.111: icmp_seq=2 ttl=64 time=0.561 ms
64 bytes from 192.168.188.111: icmp_seq=3 ttl=64 time=0.417 ms
64 bytes from 192.168.188.111: icmp_seq=4 ttl=64 time=0.747 ms
[...]
64 bytes from 192.168.188.111: icmp_seq=21 ttl=64 time=0.537 ms
64 bytes from 192.168.188.111: icmp_seq=22 ttl=64 time=0.559 ms
64 bytes from 192.168.188.111: icmp_seq=23 ttl=64 time=0.571 ms
64 bytes from 192.168.188.111: icmp_seq=24 ttl=64 time=0.568 ms
64 bytes from 192.168.188.111: icmp_seq=25 ttl=64 time=2.45 ms

With the continuous ping running on the other node, we’re going to simulate a failure of one of the interfaces. With the ‘ip’ command we take team1-slave1’s underlying interface down. Keep an eye on those pings (and, if you like, on the team state as shown in the sketch below).
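
If you want to watch the failover happen in real time from a second terminal on the server, wrapping teamdctl in plain old watch does the trick (nothing teaming-specific here):

[root@server ~]# watch -n 1 teamdctl team1 state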

[root@server ~]# ip link set eno33554960 down

With the teamdctl command we can confirm that the team now only has one active interface.

[root@server ~]# teamdctl team1 state
setup:
  runner: activebackup
ports:
  eno50332184
    link watches:
      link summary: up
      instance[link_watch_0]:
        name: ethtool
        link: up
runner:
  active port: eno50332184

[root@server ~]# ip link set eno33554960 up

Bring the interface back up. On the other node (the one running the ping), no packets should have been dropped while the interface was down.

Doing a tail on /var/log/messages, we can see that NetworkManager picks up the interface’s transitions.

[root@server ~]# tail -f /var/log/messages
Jan 10 12:51:27 server NetworkManager: eno33554960: ethtool-link went down.
Jan 10 12:51:27 server NetworkManager[920]: (eno33554960): link disconnected (deferring action for 4 seconds)
Jan 10 12:51:30 server NetworkManager[920]: (eno33554960): link disconnected (calling deferred action)
Jan 10 12:51:45 server kernel: e1000: eno33554960 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Jan 10 12:51:45 server NetworkManager: eno33554960: ethtool-link went up.
Jan 10 12:51:45 server NetworkManager: eno33554960: ethtool-link went down.
Jan 10 12:51:45 server NetworkManager: eno33554960: ethtool-link went up.
Jan 10 12:51:45 server kernel: IPv6: ADDRCONF(NETDEV_UP): eno33554960: link is not ready
Jan 10 12:51:45 server kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eno33554960: link becomes ready
Jan 10 12:51:45 server NetworkManager[920]: (eno33554960): link connected
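
Since RHEL 7 is systemd-based, you can follow the same events in the journal if you prefer that over /var/log/messages (a hedged alternative; the output will look much the same):

[root@server ~]# journalctl -u NetworkManager -f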

Conclusion

And that’s it really. Teamed interfaces done and dusted. Happy teaming…!