[PATCH] patches: workaround for br_del_if race

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

[PATCH] patches: workaround for br_del_if race

Ryan Harper
This patch provides a workaround for bugzilla #90 which shows up far too
often when creating and then destroying lots of domUs and dom0 is SMP.
Details are in the [1]bug.  With this patch, I now can create/destroy
domains in a tight loop for hours where previously every 3 to 10 cycles
would blow up.

Please apply.

1. http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=90

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
[hidden email]


diffstat output:
 workaround_double_br_del_if.patch |   11 +++++++++++
 1 files changed, 11 insertions(+)

Signed-off-by: Ryan Harper <[hidden email]>
---
diff -r dfbeb7da829f patches/linux-2.6.12/workaround_double_br_del_if.patch
--- /dev/null Thu Aug 18 19:51:46 2005
+++ b/patches/linux-2.6.12/workaround_double_br_del_if.patch Thu Aug 18 15:53:37 2005
@@ -0,0 +1,11 @@
+--- linux-2.6.12/net/bridge/br_if.c 2005-06-17 14:48:29.000000000 -0500
++++ linux-2.6.12-xen0-smp/net/bridge/br_if.c 2005-08-18 15:17:27.302615846 -0500
+@@ -382,7 +382,7 @@
+ {
+ struct net_bridge_port *p = dev->br_port;
+
+- if (!p || p->br != br)
++ if (!p || p->br != br || p->state == BR_STATE_DISABLED)
+ return -EINVAL;
+
+ br_sysfs_removeif(p);

_______________________________________________
Xen-devel mailing list
[hidden email]
http://lists.xensource.com/xen-devel
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] patches: workaround for br_del_if race

Keir Fraser

On 18 Aug 2005, at 22:01, Ryan Harper wrote:

> This patch provides a workaround for bugzilla #90 which shows up far
> too
> often when creating and then destroying lots of domUs and dom0 is SMP.
> Details are in the [1]bug.  With this patch, I now can create/destroy
> domains in a tight loop for hours where previously every 3 to 10 cycles
> would blow up.

Please also submit this to the Ethernet bridge maintainer. Details are
in the Linux MAINTAINERS file.

  -- Keir


_______________________________________________
Xen-devel mailing list
[hidden email]
http://lists.xensource.com/xen-devel
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] patches: workaround for br_del_if race

Ryan Harper
* Keir Fraser <[hidden email]> [2005-08-19 07:37]:

>
> On 18 Aug 2005, at 22:01, Ryan Harper wrote:
>
> >This patch provides a workaround for bugzilla #90 which shows up far
> >too
> >often when creating and then destroying lots of domUs and dom0 is SMP.
> >Details are in the [1]bug.  With this patch, I now can create/destroy
> >domains in a tight loop for hours where previously every 3 to 10 cycles
> >would blow up.
>
> Please also submit this to the Ethernet bridge maintainer. Details are
> in the Linux MAINTAINERS file.

I've had a little discussion on [1]netdev about this but since this
isn't the proper fix I'm doing some more digging.

The real race is between when a call_rcu() callback runs and when the
netif_destroy() calls unregister_netdev().  

When we get an oops, the brctl delif IOCTL has run, and br_del_if() has
been called setting the port state to DISABLED, and then queues up an
rcu callback, destroy_npb(),  which will set dev->br_port = NULL.  

add_del_if()                       // IOCTL handler from brctl delif
                                   // xen-br0 $VIF
  br_del_if()                      
    del_nbp()
      br_stp_disable_port()        // Set port->state = BR_STATE_DISABLED
      call_rcu(destroy_nbp_rcu)    // setup rcu callback to run
                                   // destory_nbp() which will set
                                   // dev->br_port = NULL
     
After the xen scripts have called the brctl command, xend sends the
disconnect and destroy control messages, which trigger:

netif_destroy()
   unregister_netdev()
      unregister_netdevice()
         notify_call_chain(NETDEV_UNREGISTER)
            br_device_event()    // The first thing done here is to check
                                 // the device's br_port to see if it is
                                 // NULL

If the destory_nbp_rcu() callback isn't fired before br_device_event()
checks dev->br_port, then the NULL check fails and a second call to
br_del_if() is invoked and we blow up in sysfs/kobject BUG_ON() for ref
counts of dentrys. [2]

Before I go back to netdev, I wanted to check if there is anything we
should be doing to be more defensive or does this seem to be something
the bridge code should handle (error out, whatever)?

1. http://oss.sgi.com/archives/netdev/2005-08/msg00097.html
2. Routines are in linux/net/bridge/{br_if.c, br_ioctl.c, br_notify.c}

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
[hidden email]

_______________________________________________
Xen-devel mailing list
[hidden email]
http://lists.xensource.com/xen-devel
Reply | Threaded
Open this post in threaded view
|

[PATCH] scripts, patches: remove workaround, skip brtcl delif

Ryan Harper
Attached is a patch that needs more testing, but I've not been able to
recreate the race-condition with this patch applied.  It does the
following:

1. Remove workaround patch
2. Update scripts/network-bridge and scripts/vif-bridge to not call
   brctl delif

When a domU is shutdown/destroyed and the netif is destroyed, the
notify_call_chain triggered from unregister_netdevice() will trigger the
bridge event handler and which will call the proper code to remove the
device from the bridge.

I can't see any reason why brtcl delif should be called when taking out
a domain if the call chain will delete the interface from the bridge
when the vif is destroyed automatically.

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
[hidden email]

diffstat output:
 a/patches/linux-2.6.12/workaround_double_br_del_if.patch |   11 -----------
 tools/examples/network-bridge                            |    3 ---
 tools/examples/vif-bridge                                |    6 ++++--
 3 files changed, 4 insertions(+), 16 deletions(-)

Signed-off-by: Ryan Harper <[hidden email]>
---
diff -r 188c782fa9bb tools/examples/vif-bridge
--- a/tools/examples/vif-bridge Fri Aug 19 13:05:31 2005
+++ b/tools/examples/vif-bridge Fri Aug 19 13:31:04 2005
@@ -74,8 +74,10 @@
     exit
 fi
 
-# Add/remove vif to/from bridge.
-brctl ${brcmd} ${bridge} ${vif}
+# Add vif to bridge. vifs are auto-removed from bridge
+if [ "${brcmd}" == "addif" ] ; then
+    brctl ${brcmd} ${bridge} ${vif}
+fi
 ifconfig ${vif} $OP
 
 if [ ${ip} ] ; then
diff -r 188c782fa9bb tools/examples/network-bridge
--- a/tools/examples/network-bridge Fri Aug 19 13:05:31 2005
+++ b/tools/examples/network-bridge Fri Aug 19 13:31:04 2005
@@ -222,10 +222,7 @@
         return
     fi
 
-    brctl delif ${bridge} ${netdev}
-
     if ifconfig veth0 2>/dev/null | grep -q veth0 ; then
-        brctl delif ${bridge} vif0.0
         ifconfig vif0.0 down
         mac=`ifconfig veth0 | grep HWadd | sed -e 's/.*\(..:..:..:..:..:..\).*/\1/'`
         ifconfig ${netdev} down
diff -r 188c782fa9bb patches/linux-2.6.12/workaround_double_br_del_if.patch
--- a/patches/linux-2.6.12/workaround_double_br_del_if.patch Fri Aug 19 13:05:31 2005
+++ /dev/null Fri Aug 19 13:31:04 2005
@@ -1,11 +0,0 @@
---- linux-2.6.12/net/bridge/br_if.c 2005-06-17 14:48:29.000000000 -0500
-+++ linux-2.6.12-xen0-smp/net/bridge/br_if.c 2005-08-18 15:17:27.302615846 -0500
-@@ -382,7 +382,7 @@
- {
- struct net_bridge_port *p = dev->br_port;
-
-- if (!p || p->br != br)
-+ if (!p || p->br != br || p->state == BR_STATE_DISABLED)
- return -EINVAL;
-
- br_sysfs_removeif(p);

_______________________________________________
Xen-devel mailing list
[hidden email]
http://lists.xensource.com/xen-devel
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] patches: workaround for br_del_if race

Keir Fraser
In reply to this post by Ryan Harper
> If the destory_nbp_rcu() callback isn't fired before br_device_event()
> checks dev->br_port, then the NULL check fails and a second call to
> br_del_if() is invoked and we blow up in sysfs/kobject BUG_ON() for ref
> counts of dentrys. [2]
>
> Before I go back to netdev, I wanted to check if there is anything we
> should be doing to be more defensive or does this seem to be something
> the bridge code should handle (error out, whatever)?

I think this is an etherbridge bug. They've set up two ways to enter
br_del_if() but haven't implemented proper synchronisation in that
function. The fact that br_del_if has been called once already but has
only 'half deleted' the bridge port is an implementation detail of the
etherbridge --- network interfaces shouldn't have to code around that.

I expect that all other network drivers have this same problem but
it's just really rare to unregister_netdev() a real NIC so noone's hit
it before.

 -- Keir

> 1. http://oss.sgi.com/archives/netdev/2005-08/msg00097.html
> 2. Routines are in linux/net/bridge/{br_if.c, br_ioctl.c, br_notify.c}
>
> --
> Ryan Harper
> Software Engineer; Linux Technology Center
> IBM Corp., Austin, Tx
> (512) 838-9253   T/L: 678-9253
> [hidden email]
>


_______________________________________________
Xen-devel mailing list
[hidden email]
http://lists.xensource.com/xen-devel
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] scripts, patches: remove workaround, skip brtcl delif

Keir Fraser
In reply to this post by Ryan Harper
> I can't see any reason why brtcl delif should be called when taking out
> a domain if the call chain will delete the interface from the bridge
> when the vif is destroyed automatically.

There should be no reason not to either. I think I prefer the
etherbridge bug fix.

What was wrong with that fix that the maintainers wouldn't accept it?
If they understand it enough to see that the fix is not 100% correct,
surely they can suggest a better one?

 -- Keir

_______________________________________________
Xen-devel mailing list
[hidden email]
http://lists.xensource.com/xen-devel