lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-Id: <20130103135851.8cc58899.akpm@linux-foundation.org> Date: Thu, 3 Jan 2013 13:58:51 -0800 From: Andrew Morton <akpm@...ux-foundation.org> To: Ben Hutchings <bhutchings@...arflare.com> Cc: David Decotigny <decot@...glers.com>, <linux-kernel@...r.kernel.org>, "David S. Miller" <davem@...emloft.net>, Or Gerlitz <ogerlitz@...lanox.com>, Amir Vadai <amirv@...lanox.com>, "Paul E. McKenney" <paul.mckenney@...aro.org>, Thomas Gleixner <tglx@...utronix.de>, Josh Triplett <josh@...htriplett.org>, David Howells <dhowells@...hat.com>, Paul Gortmaker <paul.gortmaker@...driver.com> Subject: Re: [PATCH v4] lib: cpu_rmap: avoid flushing all workqueues On Wed, 2 Jan 2013 23:46:46 +0000 Ben Hutchings <bhutchings@...arflare.com> wrote: > On Wed, 2013-01-02 at 15:12 -0800, Andrew Morton wrote: > > On Wed, 2 Jan 2013 13:52:25 -0800 > > David Decotigny <decot@...glers.com> wrote: > > > > > In some cases, free_irq_cpu_rmap() is called while holding a lock > > > (eg. rtnl). This can lead to deadlocks, because it invokes > > > flush_scheduled_work() which ends up waiting for whole system > > > workqueue to flush, but some pending works might try to acquire the > > > lock we are already holding. > > > > > > This commit uses reference-counting to replace > > > irq_run_affinity_notifiers(). It also removes > > > irq_run_affinity_notifiers() altogether. > > > > I can't say that I've ever noticed cpu_rmap.c before :( Is is too late > > to review it? > > > > - The naming is chaotic. At least these: > > > > EXPORT_SYMBOL(alloc_cpu_rmap); > > EXPORT_SYMBOL(free_cpu_rmap); > > EXPORT_SYMBOL(cpu_rmap_add); > > EXPORT_SYMBOL(cpu_rmap_update); > > EXPORT_SYMBOL(free_irq_cpu_rmap); > > EXPORT_SYMBOL(irq_cpu_rmap_add); > > > > should be consistently named cpu_rmap_foo() > > There is a common practice of defining alloc_foo() and free_foo() > alongside foo_do_this() and foo_do_that(). I deliberately chose to > follow that. If this is deprecated then it should be documented > somewhere. I don't think anyone has thought about it to that extent. I always recommend that the exported identifiers be named as subsysid_functionname() and cannot think of any reason for special-casing alloc and free. > > - What's the locking model? It appears to be caller-provided, but > > it is undocumented. > > I think caller-provided can be assumed as the default for library code. Nope, a lot of library code does internal locking. And boy, does that cause problems! Experience tells us that caller-provided locking is better. But to avoid nasty problems, the library should clearly document its locking requirements! Bear in mind that spinlocks and mutexes aren't the only form of locks. A caller may wish to use an rwsem or rwlock, either to get parallelism in cpu_rmap_lookup_index() and cpu_rmap_lookup_obj(), or because they were already using such a lock. The cpu_rmap() locking documentation should describe which interface calls are OK with read-side locking. Not only to instruct users, but also to act as a constraint upon future developers of the cpu_rmap code. It becomes a contract saying "if you use read_lock() for this, we won't later break your stuff". > And IRQ setup and teardown need to be properly serialised in the driver > already. > > > drivers/net/ethernet/mellanox/mlx4/ appears to be using > > msix_ctl.pool_lock for exclusion, but I didn't check for coverage. > > > > drivers/net/ethernet/sfc/efx.c seems to not need locking because > > all its cpu_rmap operations are at module_init() time. > > > > The cpu_rmap code would be less of a hand grenade if each of its > > interface functions documented the caller's locking requirements. > > This particular 'hand grenade' *was* documented. So I don't think > documentation is the problem. Dunno what you're referring to here. There is no cpu_rmap() locking documentation. > > As for this patch: there's no cc:stable here but it does appear that > > the problem is sufficiently serious to justify a backport, agree? > [...] > > Not sure. So far as I can see, nothing called free_irq_cpu_rmap() while > holding the RTNL lock before v3.8-rc1. If there can be work items on a > global workqueue that lock a PCI device (perhaps EEH?) then stable > versions may also be affected. OK. The patch is rather non-trivial so I guess we aim for 3.8-only for now. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists