lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Sat, 4 Oct 2008 09:42:48 +0200 From: Jarek Poplawski <jarkao2@...il.com> To: Benjamin Thery <Benjamin.Thery@...l.net> Cc: Daniel Lezcano <dlezcano@...ibm.com>, "David S. Miller" <davem@...emloft.net>, netdev <netdev@...r.kernel.org> Subject: Re: [PATCH] net: deadlock during net device unregistration - V2 On Thu, Oct 02, 2008 at 10:34:35PM +0200, Jarek Poplawski wrote: > On Thu, Oct 02, 2008 at 09:55:02PM +0200, Benjamin Thery wrote: > > Quoting Jarek Poplawski <jarkao2@...il.com>: > ... > >> As a matter of fact I've thought about something even simpler, which > >> probably should help for all above concerns too: > >> > >> if (event == NETDEV_UNREGISTER) > >> dst_gc_task(&dst_gc_work.work); > >> > >> dst_gc_task() locking allows for this, and running this two times in > >> a row could be even faster than trying to cancel the unnecessary run. > > > > I've thought a bit more about my last proposal and come to the same > > conclusion as you, hmm, almost. I thought we could call > > cancel_delayed_work() unconditionally and then dst_gc_task(). > > > > if (event == NETDEV_UNREGISTER) { > > cancel_delayed_work(&dst_gc_work); > > dst_gc_task(&dst_gc_work.work); > > } > > > > But, you're right, calling only dst_gc_task() seems fine to me. > > I'm OK with any of these versions. > > > > > I'll run some tests tomorrow and send a new patch. > > > > Do we agree that this fix (calling dst_gc_task() in dst_dev_event()) > > is better/more appropriate than the first one (replacing rtnl_unlock() > > by the non-blocking __rtnl_unlock() in linkwatch_event())? Hmm... You can kill me, but after more looking at this I've changed my mind. > I agree it's more appropriate, because: > a) the job is done according to the rules (in the comment) during > the notification and not in some future, But, since the rules are wrong or protocols buggy, the solution isn't effective: there should be no deadlock after this, but it's also hard to predict how many dst_gc_task() runs are needed to free all refs. During this all other works would be unnecessarily blocked by the linkwatch, what looks a bit too hoggish. My current idea is we should move linkwatch or dst_gc_task() (or both) to it's own workqueue. > b) it removes some hidden dependencies between processes/works, which, > even if they currently don't exist except this one in the linkwatch, > could be extremly hard to diagnose, if added accidentally in the > future. > > On the other hand, until we are not sure of the reasons, why something BTW, sorry, of course: "On the other hand, until we are sure of the reasons, why something" > like this (the full destroying) was avoided in the past, and until it's > heavily tested (with lockdep), your first patch looks to me much safer > if applying to any -stable is considered. And this first patch looks better and better each time... Jarek P. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists