Re: [2.4.21-pre5] ethernet bonding caused system lockup

Willy Tarreau (willy@w.ods.org)
Sat, 22 Mar 2003 11:37:52 +0100


On Sat, Mar 22, 2003 at 12:12:23PM +0200, Tomi Hakala wrote:
> Hello,
>
> Last night I experienced total system lockup when I added second NIC to
> bond0 group, no OOPS or anything, system stopped to respond immediatiately.

I had the same problem here, but with an oops. I tracked it down to the bond
driver trying to get the slave's IP address for arp requests. It crashes if
the slave doesn't have an IP address.

This patch works for me, but I haven't submitted it yet. I haven't worked on
bonding for a long time, and saw that a lot of new features came in. I don't
know if they have been widely tested yet.

Cheers,
Willy

--- 20030207/drivers/net/bonding.c Thu Mar 6 21:57:55 2003
+++ fix/drivers/net/bonding.c Mon Mar 17 21:45:04 2003
@@ -276,6 +276,10 @@
* 2003/02/07 - Tony Cureington <tony.cureington * hp_com>
* - Fix bond_mii_monitor() logic error that could result in
* bonding round-robin mode ignoring links after failover/recovery
+ *
+ * 2003/03/17 - Willy Tarreau <wtarreau at meta-x.org>
+ * - fix bond_enslave() which could oops when the slave had no
+ * IP address.
*/

#include <linux/config.h>
@@ -382,7 +386,7 @@
MODULE_PARM(miimon, "i");
MODULE_PARM_DESC(miimon, "Link check interval in milliseconds");
MODULE_PARM(use_carrier, "i");
-MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; 09 for off, 1 for on (default)");
+MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; 0 for off, 1 for on (default)");
MODULE_PARM(mode, "s");
MODULE_PARM_DESC(mode, "Mode of operation : 0 for round robin, 1 for active-backup, 2 for xor");
MODULE_PARM(arp_interval, "i");
@@ -1163,11 +1167,14 @@
#endif
bond_set_slave_inactive_flags(new_slave);
}
- read_lock_irqsave(&(((struct in_device *)slave_dev->ip_ptr)->lock), rflags);
- ifap= &(((struct in_device *)slave_dev->ip_ptr)->ifa_list);
- ifa = *ifap;
- my_ip = ifa->ifa_address;
- read_unlock_irqrestore(&(((struct in_device *)slave_dev->ip_ptr)->lock), rflags);
+ if (((struct in_device *)slave_dev->ip_ptr) != NULL) {
+ read_lock_irqsave(&(((struct in_device *)slave_dev->ip_ptr)->lock), rflags);
+ ifap= &(((struct in_device *)slave_dev->ip_ptr)->ifa_list);
+ ifa = *ifap;
+ if (ifa != NULL)
+ my_ip = ifa->ifa_address;
+ read_unlock_irqrestore(&(((struct in_device *)slave_dev->ip_ptr)->lock), rflags);
+ }

/* if there is a primary slave, remember it */
if (primary != NULL)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/