Message-ID: <47123AF3.9010201@voltaire.com>
Date:	Sun, 14 Oct 2007 17:51:15 +0200
From:	Moni Shoua <monis@...taire.com>
To:	Roland Dreier <rdreier@...co.com>, Jay Vosburgh <fubar@...ibm.com>
CC:	jeff@...zik.org, David Miller <davem@...emloft.net>,
	ogerlitz@...taire.com, netdev@...r.kernel.org,
	Moni Levy <monil@...taire.com>
Subject: Re: [PATCH] IB/ipoib: Bound the net device to the ipoib_neigh structue

Roland Dreier wrote:
>  > It happens only when ib interfaces are slaves of a bonding device.
>  > I thought before that the stuck is in napi_disable() but it's almost right.
>  > I put prints before and after call to napi_disable and see that it is called twice.
>  > I'll try to investigate in this direction.
>  > 
>  > ib0: stopping interface
>  > ib0: before napi_disable
>  > ib0: after napi_disable
>  > ib0: downing ib_dev
>  > ib0: All sends and receives done.
>  > ib0: stopping interface
>  > ib0: before napi_disable
> 
> Yes, two napi_disable()s in a row without a matching napi_enable()
> will deadlock.  I guess the question is why the ipoib interface is
> being stopped twice.
> 
> If you just take the net-2.6.24 tree (without bonding patches), does
> bonding for ethernet interfaces work OK, or is there a similar problem
> with double napi_disable()?  How about bonding of ethernet after this
> batch of bonding patches?
> 
>  - R.

Ok, I think I know what happens here.
When bonding gets a NETDEV_GOING_DOWN event it releases the slave and,
in the process, also closes the slave device (this is new code). ifconfig, on the other hand,
closes the device one more time, and this is why we see two napi_disable() calls in a row.
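
To illustrate why the second call never returns, here is a rough userspace
model, not kernel code. It assumes napi_disable() waits until a "scheduled"
state bit is clear and then keeps that bit set until napi_enable() releases
it. The struct and the model_* helpers are made-up names for this sketch.

```c
#include <stdbool.h>

/* Userspace model only; the struct and helpers are hypothetical names. */
struct napi_model {
	bool sched;	/* stands in for the NAPI "scheduled" state bit */
};

/*
 * Model of one napi_disable() call.  Returns true if the call would
 * return, false if it would wait forever because the bit is already
 * held and nothing is going to clear it.
 */
bool model_napi_disable(struct napi_model *n)
{
	if (n->sched)
		return false;	/* the real call would keep sleeping here */
	n->sched = true;	/* disable takes the bit and keeps it */
	return true;
}

/* Model of napi_enable(): releases the bit taken by disable. */
void model_napi_enable(struct napi_model *n)
{
	n->sched = false;
}
```

In this model the first disable succeeds, a second disable with no enable in
between reports that it would block, and a disable after an enable succeeds
again.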

The fix, in my opinion, belongs in bonding: it should react to NETDEV_UNREGISTER and not to NETDEV_GOING_DOWN.
I want to test this, and if it works I'll submit new patches.
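
A small userspace sketch of the idea, not the actual bonding patch. It
assumes the close path checks the "up" flag only on entry, sends the
going-down notification next, and runs the stop routine last, so a handler
that closes the device from inside the notification makes the stop routine
run twice. All model_* and MODEL_* names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two notifier events discussed. */
enum model_event { MODEL_GOING_DOWN, MODEL_UNREGISTER };

struct model_dev;
typedef void (*model_notifier)(struct model_dev *dev, enum model_event ev);

struct model_dev {
	bool up;		/* interface is currently open */
	int stop_calls;		/* how many times the stop path ran */
	model_notifier notifier;	/* bonding's handler in this model */
};

/* Close path: check flag, notify, then run the stop routine. */
void model_dev_close(struct model_dev *dev)
{
	if (!dev->up)
		return;
	if (dev->notifier)
		dev->notifier(dev, MODEL_GOING_DOWN);
	dev->stop_calls++;	/* napi_disable() would run in here */
	dev->up = false;
}

/* Behaviour described above: release and close the slave on GOING_DOWN. */
void model_bond_on_going_down(struct model_dev *dev, enum model_event ev)
{
	if (ev == MODEL_GOING_DOWN) {
		dev->notifier = NULL;	/* slave released */
		model_dev_close(dev);	/* nested close */
	}
}

/* Proposed behaviour: ignore GOING_DOWN, react only to UNREGISTER. */
void model_bond_on_unregister(struct model_dev *dev, enum model_event ev)
{
	if (ev == MODEL_UNREGISTER) {
		dev->notifier = NULL;
		model_dev_close(dev);
	}
}
```

Closing a device that uses the first handler runs the stop path twice in
this model; with the second handler it runs once.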


thanks
  MoniS
