patch-2.4.15 linux/Documentation/networking/bonding.txt
- Lines: 525
- Date: Wed Nov 7 14:39:36 2001
- Orig file: v2.4.14/linux/Documentation/networking/bonding.txt
- Orig date: Wed Dec 31 16:00:00 1969
diff -u --recursive --new-file v2.4.14/linux/Documentation/networking/bonding.txt linux/Documentation/networking/bonding.txt
@@ -0,0 +1,524 @@
+
+ Linux Ethernet Bonding Driver mini-howto
+
+Initial release : Thomas Davis <tadavis at lbl.gov>
+Corrections, HA extensions : 2000/10/03-15 :
+ - Willy Tarreau <willy at meta-x.org>
+ - Constantine Gavrilov <const-g at xpert.com>
+ - Chad N. Tindel <ctindel at ieee dot org>
+ - Janice Girouard <girouard at us dot ibm dot com>
+
+Note :
+------
+The bonding driver originally came from Donald Becker's beowulf patches for
+kernel 2.0. It has changed quite a bit since, and the original tools from
+extreme-linux and beowulf sites will not work with this version of the driver.
+
+For new versions of the driver, patches for older kernels and the updated
+userspace tools, please follow the links at the end of this file.
+
+Installation
+============
+
+1) Build kernel with the bonding driver
+---------------------------------------
+For the latest version of the bonding driver, use kernel 2.4.12 or above
+(otherwise you will need to apply a patch).
+
+Configure kernel with `make menuconfig/xconfig/config', and select
+"Bonding driver support" in the "Network device support" section. It is
+recommended to build the driver as a module, since this is currently the only
+way to pass parameters to the driver and configure more than one bonding device.
+
+Build and install the new kernel and modules.
+
+2) Get and install the userspace tools
+--------------------------------------
+This version of the bonding driver requires an updated ifenslave program. The
+original one from extreme-linux and beowulf will not work. Kernels 2.4.12 and
+above include the updated ifenslave.c in the Documentation/networking
+directory. For older kernels, please follow the links at the end of this file.
+
+IMPORTANT!!! If you are running RedHat 7.1 or later, be careful:
+/usr/include/linux is no longer a symbolic link to
+/usr/src/linux/include/linux. If you build ifenslave against the headers in
+/usr/include/linux, the build will appear to succeed but your bond won't work.
+The -I option on the ifenslave compile line makes sure the compiler uses
+/usr/src/linux/include/linux/if_bonding.h instead of the version from
+/usr/include/linux.
+
+To install ifenslave.c, do:
+ # gcc -Wall -Wstrict-prototypes -O -I/usr/src/linux/include ifenslave.c -o ifenslave
+ # cp ifenslave /sbin/ifenslave
+
+3) Configure your system
+------------------------
+Also see the following section on the module parameters. You will need to add
+at least the following line to /etc/conf.modules (or /etc/modules.conf):
+
+ alias bond0 bonding
+
+Use standard distribution techniques to define the bond0 network interface. For
+example, on modern RedHat distributions, create an ifcfg-bond0 file in the
+/etc/sysconfig/network-scripts directory that looks like this:
+
+DEVICE=bond0
+IPADDR=192.168.1.1
+NETMASK=255.255.255.0
+NETWORK=192.168.1.0
+BROADCAST=192.168.1.255
+ONBOOT=yes
+BOOTPROTO=none
+USERCTL=no
+
+(use the appropriate values for your network instead of 192.168.1).
+
+All interfaces that are part of the trunk should have SLAVE and MASTER
+definitions. For example, in the case of RedHat, if you wish to make eth0 and
+eth1 (or other interfaces) a part of the bonding interface bond0, their config
+files (ifcfg-eth0, ifcfg-eth1, etc.) should look like this:
+
+DEVICE=eth0
+USERCTL=no
+ONBOOT=yes
+MASTER=bond0
+SLAVE=yes
+BOOTPROTO=none
+
+(use DEVICE=eth1 for eth1, and MASTER=bond1 for bond1 if you have configured a
+second bonding interface).
+
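The slave files differ only in DEVICE and MASTER, so they can be generated rather than typed. The following is a minimal sketch, not a distribution tool: print_slave_ifcfg is a hypothetical helper that writes the six settings shown above to standard output.

```shell
#!/bin/sh
# Print the body of an ifcfg file for one slave interface.
# print_slave_ifcfg is a hypothetical helper, not part of any distribution.
# $1 = slave device (e.g. eth0), $2 = bonding master (e.g. bond0).
print_slave_ifcfg() {
    printf 'DEVICE=%s\n' "$1"
    printf 'USERCTL=no\n'
    printf 'ONBOOT=yes\n'
    printf 'MASTER=%s\n' "$2"
    printf 'SLAVE=yes\n'
    printf 'BOOTPROTO=none\n'
}

# Example: show what ifcfg-eth1 would contain for bond0.
print_slave_ifcfg eth1 bond0
```

On a RedHat-style system the output could be redirected into /etc/sysconfig/network-scripts/ifcfg-eth1, assuming the layout described above.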
+Restart the networking subsystem or just bring up the bonding device if your
+administration tools allow it. Otherwise, reboot. (For the case of RedHat
+distros, you can do `ifup bond0' or `/etc/rc.d/init.d/network restart'.)
+
+If the administration tools of your distribution do not support master/slave
+notation in configuration of network interfaces, you will need to configure
+the bonding device with the following commands manually:
+
+ # /sbin/ifconfig bond0 192.168.1.1 up
+ # /sbin/ifenslave bond0 eth0
+ # /sbin/ifenslave bond0 eth1
+
+(substitute 192.168.1.1 with your IP address and add custom network and custom
+netmask to the arguments of ifconfig if required).
+
+You can then create a script with these commands and put it into the appropriate
+rc directory.
+
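As a starting point for such a script, here is a small dry-run sketch. It only prints the ifconfig and ifenslave commands listed above and does not execute them; bond_commands is a hypothetical helper name, and the address and interface names are placeholders.

```shell
#!/bin/sh
# Print (do not run) the commands that bring up a bond and enslave interfaces.
# bond_commands is a hypothetical helper.
# Usage: bond_commands BOND IP_ADDRESS SLAVE...
bond_commands() {
    bond=$1
    ip=$2
    shift 2
    echo "/sbin/ifconfig $bond $ip up"
    for slave in "$@"; do
        echo "/sbin/ifenslave $bond $slave"
    done
}

# Example: reproduces the three commands from the manual setup.
bond_commands bond0 192.168.1.1 eth0 eth1
```

Once the printed commands look right, the output can be piped to `sh` as root, or pasted into the rc script.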
+If you specifically need all your network drivers to be loaded before the
+bonding driver, use modutils' probeall directive: in your modules.conf, state
+that when bond0 is requested, modprobe should first load all your interfaces:
+
+probeall bond0 eth0 eth1 bonding
+
+Be careful not to reference bond0 itself at the end of the line, or modprobe will
+die in an endless recursive loop.
+
+4) Module parameters.
+---------------------
+The following module parameters can be passed:
+
+ mode=
+
+Possible values are 0 (round robin policy, default), 1 (active backup
+policy), and 2 (XOR). See question 9 and the HA section for additional info.
+
+ miimon=
+
+Use an integer value for the interval (in ms) between MII link checks. Zero is
+the default and means link monitoring is disabled. A good value is 100
+if you wish to use link monitoring. See HA section for additional info.
+
+ downdelay=
+
+Use an integer value (in ms) for how long to wait before disabling a link after
+a link failure has been detected. Must be a multiple of miimon. Default
+value is zero. See HA section for additional info.
+
+ updelay=
+
+Use an integer value (in ms) for how long to wait before enabling a link after
+the "link up" status has been detected. Must be a multiple of miimon. Default
+value is zero. See HA section for additional info.
+
+ arp_interval=
+
+Use an integer value for the interval (in ms) between arp monitoring checks.
+Zero is the default and means arp monitoring is disabled. See HA section
+for additional info. This field is valid in active_backup mode only.
+
+ arp_ip_target=
+
+An ip address to use when arp_interval is > 0. This is the target of the
+arp request sent to determine the health of the link to the target.
+Specify this value in ddd.ddd.ddd.ddd format.
+
+If you need to configure several bonding devices, the driver must be loaded
+several times. I.e. for two bonding devices, your /etc/conf.modules must look
+like this:
+
+alias bond0 bonding
+alias bond1 bonding
+
+options bond0 miimon=100
+options bond1 -o bonding1 miimon=100
+
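The two-device example follows a pattern that can be generated for more devices. The sketch below assumes that pattern simply continues (bond2 would get `-o bonding2`, and so on), which the text above does not state explicitly; bonding_modules_conf is a hypothetical helper name.

```shell
#!/bin/sh
# Print modules.conf lines for COUNT bonding devices sharing the same options.
# bonding_modules_conf is a hypothetical helper.
# Usage: bonding_modules_conf COUNT OPTIONS
bonding_modules_conf() {
    count=$1
    opts=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "alias bond$i bonding"
        i=$((i + 1))
    done
    echo
    i=0
    while [ "$i" -lt "$count" ]; do
        if [ "$i" -eq 0 ]; then
            echo "options bond$i $opts"
        else
            # Assumed pattern: each further instance gets its own name via -o.
            echo "options bond$i -o bonding$i $opts"
        fi
        i=$((i + 1))
    done
}

# Example: reproduces the two-device configuration shown above.
bonding_modules_conf 2 "miimon=100"
```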
+5) Testing configuration
+------------------------
+You can test the configuration and transmit policy with ifconfig. For example,
+for round robin policy, you should get something like this:
+
+[root]# /sbin/ifconfig
+bond0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
+ inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0
+ UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
+ RX packets:7224794 errors:0 dropped:0 overruns:0 frame:0
+ TX packets:3286647 errors:1 dropped:0 overruns:1 carrier:0
+ collisions:0 txqueuelen:0
+
+eth0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
+ inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0
+ UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
+ RX packets:3573025 errors:0 dropped:0 overruns:0 frame:0
+ TX packets:1643167 errors:1 dropped:0 overruns:1 carrier:0
+ collisions:0 txqueuelen:100
+ Interrupt:10 Base address:0x1080
+
+eth1 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
+ inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0
+ UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
+ RX packets:3651769 errors:0 dropped:0 overruns:0 frame:0
+ TX packets:1643480 errors:0 dropped:0 overruns:0 carrier:0
+ collisions:0 txqueuelen:100
+ Interrupt:9 Base address:0x1400
+
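In the listing above, bond0 and both slaves report the same HWaddr. That check can be scripted. The sketch below reads ifconfig output on standard input and compares each SLAVE address with the MASTER address; check_bond_macs is a hypothetical helper, and it assumes a single bond and the output layout shown above.

```shell
#!/bin/sh
# Read `ifconfig` output on stdin and report whether every SLAVE interface
# carries the same hardware address as the MASTER.
# check_bond_macs is a hypothetical helper; it assumes one bond and the
# layout shown above (name and HWaddr on the first line, flags on a later one).
check_bond_macs() {
    awk '
        /HWaddr/   { name = $1; mac[name] = $NF }
        / MASTER / { master = name }
        / SLAVE /  { slaves[name] = 1 }
        END {
            if (master == "") { print "no MASTER interface found"; exit 1 }
            n = 0; bad = 0
            for (s in slaves) {
                n++
                if (mac[s] != mac[master]) {
                    print s ": HWaddr differs from " master
                    bad = 1
                }
            }
            if (n == 0) { print "no SLAVE interfaces found"; exit 1 }
            if (!bad) print "all slaves share " mac[master]
            exit bad
        }
    '
}
```

Typical use would be `/sbin/ifconfig | check_bond_macs`; the exit status is non-zero when a mismatch is found.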
+Questions :
+===========
+
+1. Is it SMP safe?
+
+ Yes. The old 2.0.xx channel bonding patch was not SMP safe.
+ The new driver was designed to be SMP safe from the start.
+
+2. What type of cards will work with it?
+
+ Any Ethernet type cards (you can even mix cards - an Intel
+ EtherExpress PRO/100 and a 3com 3c905b, for example).
+ You can even bond together Gigabit Ethernet cards!
+
+3. How many bonding devices can I have?
+
+ One for each module you load. See section on module parameters for how
+ to accomplish this.
+
+4. How many slaves can a bonding device have?
+
+ Limited by the number of network interfaces Linux supports and the
+ number of cards you can place in your system.
+
+5. What happens when a slave link dies?
+
+ If your ethernet cards support MII status monitoring and the MII
+ monitoring has been enabled in the driver (see description of module
+ parameters), there will be no adverse consequences. This release
+ of the bonding driver knows how to get the MII information and
+ enables or disables its slaves according to their link status.
+ See section on HA for additional information.
+
+ For ethernet cards not supporting MII status, or if you wish to
+ verify that packets have been both sent and received, you may
+ configure the arp_interval and arp_ip_target. If packets have
+ not been sent or received during this interval, an arp request
+ is sent to the target to generate send and receive traffic.
+ If, after this interval, either the successful send count or the
+ receive count has not incremented, the next slave in the sequence
+ will become the active slave.
+
+ If neither miimon nor arp_interval is configured, the bonding
+ driver will not handle this situation very well. The driver will
+ continue to send packets but some packets will be lost. Retransmits
+ will cause serious degradation of performance (when one of two slave
+ links fails, 50% of packets will be lost, which is a serious
+ problem for both TCP and UDP).
+
+6. Can bonding be used for High Availability?
+
+ Yes, if you use MII monitoring and ALL your cards support MII link
+ status reporting. See section on HA for more information.
+
+7. Which switches/systems does it work with?
+
+ In round-robin mode, it works with systems that support trunking:
+
+ * Cisco 5500 series (look for EtherChannel support).
+ * SunTrunking software.
+ * Alteon AceDirector switches / WebOS (use Trunks).
+ * BayStack Switches (trunks must be explicitly configured). Stackable
+ models (450) can define trunks between ports on different physical
+ units.
+ * Linux bonding, of course !
+
+ In Active-backup mode, it should work with any Layer-II switches.
+
+8. Where does a bonding device get its MAC address from?
+
+ If not explicitly configured with ifconfig, the MAC address of the
+ bonding device is taken from its first slave device. This MAC address
+ is then passed to all following slaves and remains persistent (even if
+ the first slave is removed) until the bonding device is brought
+ down or reconfigured.
+
+ If you wish to change the MAC address, you can set it with ifconfig:
+
+ # ifconfig bond0 hw ether 00:11:22:33:44:55
+
+ The MAC address can also be changed by bringing down/up the device
+ and then changing its slaves (or their order):
+
+ # ifconfig bond0 down ; modprobe -r bonding
+ # ifconfig bond0 .... up
+ # ifenslave bond0 eth...
+
+ This method will automatically take the address from the next slave
+ that will be added.
+
+ To restore your slaves' MAC addresses, you need to detach them
+ from the bond (`ifenslave -d bond0 eth0'), set them down
+ (`ifconfig eth0 down'), unload the drivers (`rmmod 3c59x', for
+ example) and reload them to get the MAC addresses from their
+ eeproms. If the driver is shared by several devices, you need
+ to turn them all down. Another solution is to look for the MAC
+ address at boot time (dmesg or tail /var/log/messages) and to
+ reset it by hand with ifconfig :
+
+ # ifconfig eth0 down
+ # ifconfig eth0 hw ether 00:20:40:60:80:A0
+
+9. Which transmit policies can be used?
+
+ Round robin, based on the order of enslaving: the output device
+ is selected based on the next available slave, regardless of
+ the source and/or destination of the packet.
+
+ XOR, based on (src hw addr XOR dst hw addr) % slave cnt. This
+ selects the same slave for each destination hw address.
+
+ Active-backup, which ensures that one and only one device will
+ transmit at any given moment. Active-backup policy is useful for
+ implementing high availability solutions using two hubs (see
+ section on HA).
+
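The XOR policy from question 9 can be illustrated with a little shell arithmetic. This is a sketch, not the driver's code: it assumes that only the last byte of each hardware address enters the computation, which is a simplification of "(src hw addr XOR dst hw addr) % slave cnt", and xor_slave is a hypothetical helper name.

```shell
#!/bin/sh
# Pick a slave index with the XOR policy.
# xor_slave is a hypothetical helper.
# Assumption: only the last byte of each MAC address is used.
# Usage: xor_slave SRC_MAC DST_MAC SLAVE_COUNT
xor_slave() {
    src=$((0x${1##*:}))   # last byte of the source address, as a number
    dst=$((0x${2##*:}))   # last byte of the destination address
    echo $(( (src ^ dst) % $3 ))
}

# Example: two slaves, destinations differing in the last byte.
xor_slave 00:C0:F0:1F:37:B4 00:11:22:33:44:55 2
xor_slave 00:C0:F0:1F:37:B4 00:11:22:33:44:54 2
```

The two calls print 1 and 0. A given destination address always maps to the same index, which is the property the answer above describes.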
+High availability
+=================
+
+To implement high availability using the bonding driver, you need to
+compile the driver as a module because currently it is the only way to pass
+parameters to the driver. This may change in the future.
+
+High availability is achieved by using MII status reporting. You need to
+verify that all your interfaces support MII link status reporting. On Linux
+kernel 2.2.17, all the 100 Mbps capable drivers and yellowfin gigabit driver
+support it. If your system has an interface that does not support MII status
+reporting, a failure of its link will not be detected!
+
+The bonding driver can regularly check all its slaves' links by checking the
+MII status registers. The check interval is specified by the module argument
+"miimon" (MII monitoring). It takes an integer that represents the
+checking time in milliseconds. It should not come too close to (1000/HZ)
+(10 ms on i386) because it may then reduce system interactivity. 100 ms
+seems to be a good value. It means that a dead link will be detected at most
+100 ms after it goes down.
+
+Example:
+
+ # modprobe bonding miimon=100
+
+Or, put in your /etc/modules.conf :
+
+ alias bond0 bonding
+ options bond0 miimon=100
+
+There are currently two policies for high availability, depending on whether
+a) hosts are connected to a single host or switch that supports trunking
+b) hosts are connected to several different switches or a single switch that
+ does not support trunking.
+
+1) HA on a single switch or host - load balancing
+-------------------------------------------------
+This is the easiest policy to set up and to understand. Simply configure the
+remote equipment (host or switch) to aggregate traffic over several
+ports (Trunk, EtherChannel, etc.) and configure the bonding interfaces.
+If the module has been loaded with the proper MII option, it will work
+automatically. You can then try to remove and restore different links
+and see in your logs what the driver detects. When testing, you may
+encounter problems on some buggy switches that disable the trunk for a
+long time if all ports in a trunk go down. This is caused by the switch,
+not by Linux (reboot the switch to confirm).
+
+Example 1 : host to host at double speed
+
+ +----------+ +----------+
+ | |eth0 eth0| |
+ | Host A +--------------------------+ Host B |
+ | +--------------------------+ |
+ | |eth1 eth1| |
+ +----------+ +----------+
+
+ On each host :
+ # modprobe bonding miimon=100
+ # ifconfig bond0 addr
+ # ifenslave bond0 eth0 eth1
+
+Example 2 : host to switch at double speed
+
+ +----------+ +----------+
+ | |eth0 port1| |
+ | Host A +--------------------------+ switch |
+ | +--------------------------+ |
+ | |eth1 port2| |
+ +----------+ +----------+
+
+ On host A : On the switch :
+ # modprobe bonding miimon=100 # set up a trunk on port1
+ # ifconfig bond0 addr and port2
+ # ifenslave bond0 eth0 eth1
+
+2) HA on two or more switches (or a single switch without trunking support)
+---------------------------------------------------------------------------
+This mode is more problematic because there are multiple ports, and the
+host's MAC address must be visible on only one port at a time to avoid
+confusing the switches.
+
+If you need to know which interface is the active one, and which ones are
+backup, use ifconfig. All backup interfaces have the NOARP flag set.
+
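Since backup interfaces carry the NOARP flag, the active slave can be picked out of ifconfig output with a short filter. This is a sketch under that assumption only; active_slaves is a hypothetical helper, and it expects the ifconfig layout shown in the testing section (interface name on the HWaddr line, flags on a later line).

```shell
#!/bin/sh
# Read `ifconfig` output on stdin and print every SLAVE interface that does
# not have the NOARP flag set, i.e. the active one in active-backup mode.
# active_slaves is a hypothetical helper.
active_slaves() {
    awk '
        /HWaddr/                 { name = $1 }
        / SLAVE / && !/ NOARP /  { print name }
    '
}
```

Typical use would be `/sbin/ifconfig | active_slaves`. In active-backup mode it should print exactly one name.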
+To use this mode, pass "mode=1" to the module at load time :
+
+ # modprobe bonding miimon=100 mode=1
+
+Or, put in your /etc/modules.conf :
+
+ alias bond0 bonding
+ options bond0 miimon=100 mode=1
+
+Example 1: Using multiple hosts and multiple switches to build a "no single
+point of failure" solution.
+
+
+ | |
+ |port3 port3|
+ +-----+----+ +-----+----+
+ | |port7 ISL port7| |
+ | switch A +--------------------------+ switch B |
+ | +--------------------------+ |
+ | |port8 port8| |
+ +----++----+ +-----++---+
+ port2||port1 port1||port2
+ || +-------+ ||
+ |+-------------+ host1 +---------------+|
+ | eth0 +-------+ eth1 |
+ | |
+ | +-------+ |
+ +--------------+ host2 +----------------+
+ eth0 +-------+ eth1
+
+In this configuration, there is an ISL - Inter Switch Link (could be a trunk),
+several servers (host1, host2 ...) each attached to both switches, and one or
+more ports to the outside world (port3...). One and only one slave on each host
+is active at a time, while all links are still monitored (the system can
+detect a failure of active and backup links).
+
+Each time a host changes its active interface, it sticks to the new one until
+it goes down. In this example, the hosts are not much affected by the
+expiration time of the switches' forwarding tables.
+
+If host1 and host2 have the same functionality and are used in load balancing
+by another external mechanism, it is good to have host1's active interface
+connected to one switch and host2's to the other. Such a system will survive
+a failure of a single host, cable, or switch. The worst thing that may happen
+in the case of a switch failure is that half of the hosts will be temporarily
+unreachable until the other switch expires its tables.
+
+Example 2: Using multiple ethernet cards connected to a switch to configure
+ NIC failover (switch is not required to support trunking).
+
+
+ +----------+ +----------+
+ | |eth0 port1| |
+ | Host A +--------------------------+ switch |
+ | +--------------------------+ |
+ | |eth1 port2| |
+ +----------+ +----------+
+
+ On host A : On the switch :
+ # modprobe bonding miimon=100 mode=1 # (optional) minimize the time
+ # ifconfig bond0 addr # for table expiration
+ # ifenslave bond0 eth0 eth1
+
+Each time the host changes its active interface, it sticks to the new one until
+it goes down. In this example, the host is strongly affected by the expiration
+time of the switch forwarding table.
+
+3) Adapting to your switches' timing
+------------------------------------
+If your switches take a long time to go into backup mode, it may be
+desirable not to activate a backup interface immediately after a link goes
+down. It is possible to delay the moment at which a link will be
+completely disabled by passing the module parameter "downdelay" (in
+milliseconds, must be a multiple of miimon).
+
+When a switch reboots, it is possible that its ports report "link up" status
+before they become usable. This could fool a bond device by causing it to
+use some ports that are not ready yet. It is possible to delay the moment at
+which an active link will be reused by passing the module parameter "updelay"
+(in milliseconds, must be a multiple of miimon).
+
+A similar situation can occur when a host re-negotiates a lost link with the
+switch (a case of cable replacement).
+
+A special case is when a bonding interface has lost all slave links. Then the
+driver will immediately reuse the first link that goes up, even if the updelay
+parameter was specified. (If there are slave interfaces in the "updelay" state,
+the interface that first went into that state will be immediately reused.) This
+reduces down-time if the value of updelay has been overestimated.
+
+Examples :
+
+ # modprobe bonding miimon=100 mode=1 downdelay=2000 updelay=5000
+ # modprobe bonding miimon=100 mode=0 downdelay=0 updelay=5000
+
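Both delays must be multiples of miimon, which is easy to get wrong when tuning. The sketch below checks a set of values before they are passed to modprobe; check_delays is a hypothetical helper, and rejecting a zero miimon is this sketch's own guard against division by zero, not a rule taken from the driver.

```shell
#!/bin/sh
# Check that downdelay and updelay are multiples of miimon.
# check_delays is a hypothetical helper.
# Usage: check_delays MIIMON DOWNDELAY UPDELAY   (all in milliseconds)
check_delays() {
    miimon=$1
    downdelay=$2
    updelay=$3
    # Guard added by this sketch: avoids dividing by zero below.
    if [ "$miimon" -le 0 ]; then
        echo "miimon must be a positive integer"
        return 1
    fi
    status=0
    if [ $((downdelay % miimon)) -ne 0 ]; then
        echo "downdelay=$downdelay is not a multiple of miimon=$miimon"
        status=1
    fi
    if [ $((updelay % miimon)) -ne 0 ]; then
        echo "updelay=$updelay is not a multiple of miimon=$miimon"
        status=1
    fi
    [ "$status" -eq 0 ] && echo "ok"
    return "$status"
}

# Example: the values from the first modprobe line above.
check_delays 100 2000 5000
```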
+4) Limitations
+--------------
+The main limitations are :
+ - only the link status is monitored. If the switch on the other side is
+ partially down (e.g. doesn't forward anymore, but the link is OK), the link
+ won't be disabled. Another way to check for a dead link could be to count
+ incoming frames on a heavily loaded host. This is not applicable to small
+ servers, but may be useful when the front switches send multicast
+ information on their links (e.g. VRRP), or even health-check the servers.
+ Use the arp_interval/arp_ip_target parameters to count incoming/outgoing
+ frames.
+
+Resources and links
+===================
+
+Current development on this driver is posted to:
+ - http://www.sourceforge.net/projects/bonding/
+
+Donald Becker's Ethernet Drivers and diag programs may be found at :
+ - http://www.scyld.com/network/
+
+You will also find a lot of information regarding Ethernet, NWay, MII, etc. at
+www.scyld.com.
+
+For new versions of the driver, patches for older kernels and the updated
+userspace tools, take a look at Willy Tarreau's site :
+ - http://wtarreau.free.fr/pub/bonding/
+ - http://www-miaif.lip6.fr/willy/pub/bonding/
+
+To get the latest information about Linux Kernel development, please consult
+the Linux Kernel Mailing List Archives at :
+ http://boudicca.tux.org/hypermail/linux-kernel/latest/
+
+-- END --