Date:	Mon, 19 Oct 2009 13:07:53 -0500
From:	Michal Ostrowski <mostrows@...il.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Cyrill Gorcunov <gorcunov@...il.com>,
	Denys Fedoryschenko <denys@...p.net.lb>,
	netdev <netdev@...r.kernel.org>, linux-ppp@...r.kernel.org,
	paulus@...ba.org, mostrows@...thlink.net
Subject: Re: kernel panic in latest vanilla stable, while using nameif with 
	"alive" pppoe interfaces

On Mon, Oct 19, 2009 at 12:12 PM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> Michal Ostrowski wrote:
>> Here's a bigger patch that just gets rid of flush_lock altogether.
>>
>> We were seeing oopses due to net namespaces going away while we were using
>> them, which turns out to be simply because pppoe wasn't claiming ref
>> counts properly.
>>
>> Fixing this requires incrementing the ref count when adding an entry to the
>> per-net hash table and decrementing it when removing one.  This also allows
>> us to get rid of the flush_lock since we can now depend on the existence of
>> "pn->hash_lock".
>>
>> We also have to be careful when flushing devices that removal of a hash table
>> entry may bring the net namespace refcount to 0.
>>
>
> Your patch is mangled (tabs converted to spaces),

Patch mangling was due to mailer interactions; I'll attach a clean
version here, no more inlining.

>
> and I don't believe the namespace refcount can reach 0 inside pppoe_flush_dev();
> that would be a bug in the core network code.
>

From the original oops I was able to deduce that the namespace somehow
managed to get destroyed during the interval where we dropped locks.
If that's not due to the release_sock() call in pppoe_flush_dev()
triggering a cleanup, then I'd have to assume that it's due to a
secondary actor closing the socket in parallel, but that in turn would
point to issues with the flush_lock.  Having said that, the thrust of
this patch remains valid; it just means I don't need to inc the ref
count in pppoe_flush_dev().

Do you agree?
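
To make the idea concrete, here is a minimal userspace sketch of tying
a namespace reference to each hash-table entry. It is not the attached
patch and uses no kernel API: ns_sketch, entry_sketch, table_sketch and
all of the helper functions are hypothetical names invented for
illustration, and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS 16

/* Hypothetical stand-in for a net namespace: only a reference count. */
struct ns_sketch {
	int refcount;
};

/* Hypothetical hash-table entry; it remembers the namespace it pins. */
struct entry_sketch {
	struct ns_sketch *ns;
	struct entry_sketch *next;
};

/* Hypothetical per-namespace hash table. */
struct table_sketch {
	struct ns_sketch *ns;
	struct entry_sketch *bucket[NBUCKETS];
};

static void ns_get(struct ns_sketch *ns)
{
	ns->refcount++;
}

/* Returns 1 if this call dropped the last reference, else 0. */
static int ns_put(struct ns_sketch *ns)
{
	assert(ns->refcount > 0);
	return --ns->refcount == 0;
}

/* Adding an entry takes a reference on the owning namespace. */
static void table_add(struct table_sketch *t, struct entry_sketch *e,
		      unsigned int key)
{
	e->ns = t->ns;
	ns_get(t->ns);
	e->next = t->bucket[key % NBUCKETS];
	t->bucket[key % NBUCKETS] = e;
}

/*
 * Removing an entry drops that reference.
 * Returns -1 if the entry was not found, 0 if it was removed, and 1 if
 * removing it also dropped the last namespace reference. In the last
 * case the caller must not touch the table again.
 */
static int table_del(struct table_sketch *t, struct entry_sketch *e,
		     unsigned int key)
{
	struct entry_sketch **pp = &t->bucket[key % NBUCKETS];

	for (; *pp; pp = &(*pp)->next) {
		if (*pp == e) {
			*pp = e->next;
			e->next = NULL;
			return ns_put(e->ns);
		}
	}
	return -1;
}
```

The return value of table_del() is where the concern about flushing
shows up: if removing an entry could drop the last reference, the
caller has to stop using the table at that point.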

--
Michal Ostrowski
mostrows@...il.com

Download attachment "0001-PPPoE-Fix-ref-counts-on-net-namespaces.patch" of type "application/octet-stream" (5483 bytes)
