lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAG88wWYuxY7Gi6tuBsv_tY2QqGmQgB-zER9pzpZga7uV21HorA@mail.gmail.com>
Date:	Wed, 2 Jan 2013 18:35:00 -0800
From:	David Decotigny <decot@...glers.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Ben Hutchings <bhutchings@...arflare.com>,
	"David S. Miller" <davem@...emloft.net>,
	Or Gerlitz <ogerlitz@...lanox.com>,
	Amir Vadai <amirv@...lanox.com>,
	"Paul E. McKenney" <paul.mckenney@...aro.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Josh Triplett <josh@...htriplett.org>,
	David Howells <dhowells@...hat.com>,
	Paul Gortmaker <paul.gortmaker@...driver.com>
Subject: Re: [PATCH v4] lib: cpu_rmap: avoid flushing all workqueues

Thanks. It is not too late to review this code. But I'd prefer not to
address refactoring issues with this patch, which is supposed to fix a
deadlock with some drivers. So I'd rather not to add function
renaming, suppressions, etc. that have an effect outside cpu_rmap's
code. Instead, I'd like to propose another patch later, which will be
a little more intrusive in that respect, if that's ok with you.

I believe Ben answered your other concerns, I consider him as the
expert as to whether there should be finer-grained locking implemented
in this subsystem. Let me just add that I second him in saying that
the deadlock risk was clearly identified and mentioned in the doc.
Unfortunately, initial implementation makes this risk hard to
work-around for some drivers, which is what this patch proposes to
address.

So, for now, I'd like to keep v4 as the current version. And some
refactoring will be done in a later patch.

Regards,

--
David Decotigny

On Wed, Jan 2, 2013 at 3:12 PM, Andrew Morton <akpm@...ux-foundation.org> wrote:
> On Wed,  2 Jan 2013 13:52:25 -0800
> David Decotigny <decot@...glers.com> wrote:
>
>> In some cases, free_irq_cpu_rmap() is called while holding a lock
>> (eg. rtnl). This can lead to deadlocks, because it invokes
>> flush_scheduled_work() which ends up waiting for whole system
>> workqueue to flush, but some pending works might try to acquire the
>> lock we are already holding.
>>
>> This commit uses reference-counting to replace
>> irq_run_affinity_notifiers(). It also removes
>> irq_run_affinity_notifiers() altogether.
>
> I can't say that I've ever noticed cpu_rmap.c before :( Is is too late
> to review it?
>
> - The naming is chaotic.  At least these:
>
>         EXPORT_SYMBOL(alloc_cpu_rmap);
>         EXPORT_SYMBOL(free_cpu_rmap);
>         EXPORT_SYMBOL(cpu_rmap_add);
>         EXPORT_SYMBOL(cpu_rmap_update);
>         EXPORT_SYMBOL(free_irq_cpu_rmap);
>         EXPORT_SYMBOL(irq_cpu_rmap_add);
>
>   should be consistently named cpu_rmap_foo()
>
> - What's the locking model?  It appears to be caller-provided, but
>   it is undocumented.
>
>   drivers/net/ethernet/mellanox/mlx4/ appears to be using
>   msix_ctl.pool_lock for exclusion, but I didn't check for coverage.
>
>   drivers/net/ethernet/sfc/efx.c seems to not need locking because
>   all its cpu_rmap operations are at module_init() time.
>
>   The cpu_rmap code would be less of a hand grenade if each of its
>   interface functions documented the caller's locking requirements.
>
>
> As for this patch: there's no cc:stable here but it does appear that
> the problem is sufficiently serious to justify a backport, agree?
>
>> --- a/include/linux/cpu_rmap.h
>> +++ b/include/linux/cpu_rmap.h
>>
>> ...
>>
>> @@ -33,15 +36,7 @@ struct cpu_rmap {
>>  #define CPU_RMAP_DIST_INF 0xffff
>>
>>  extern struct cpu_rmap *alloc_cpu_rmap(unsigned int size, gfp_t flags);
>> -
>> -/**
>> - * free_cpu_rmap - free CPU affinity reverse-map
>> - * @rmap: Reverse-map allocated with alloc_cpu_rmap(), or %NULL
>> - */
>> -static inline void free_cpu_rmap(struct cpu_rmap *rmap)
>> -{
>> -     kfree(rmap);
>> -}
>> +extern void free_cpu_rmap(struct cpu_rmap *rmap);
>
> Can we do away with free_cpu_rmap() altogether?  It is a misleading
> name - it is a put() function, not a free() function.  It would be
> clearer (not to mention faster and smaller) to change all call sites
> to directly call cpu_rmap_put().
>
>
>>  extern int cpu_rmap_add(struct cpu_rmap *rmap, void *obj);
>>  extern int cpu_rmap_update(struct cpu_rmap *rmap, u16 index,
>>
>> ...
>>
>> @@ -63,6 +64,44 @@ struct cpu_rmap *alloc_cpu_rmap(unsigned int size, gfp_t flags)
>>  }
>>  EXPORT_SYMBOL(alloc_cpu_rmap);
>>
>> +/**
>> + * cpu_rmap_reclaim - internal reclaiming helper called from kref_put
>> + * @ref: kref to struct cpu_rmap
>> + */
>> +static void cpu_rmap_reclaim(struct kref *ref)
>> +{
>> +     struct cpu_rmap *rmap = container_of(ref, struct cpu_rmap, refcount);
>> +     kfree(rmap);
>> +}
>
> I suggest this be renamed to cpu_rmap_release().  As "release" is the
> conventional term for a kref release handler.
>
>>
>> ...
>>
>> +/**
>> + * cpu_rmap_put - internal helper to release ref on a cpu_rmap
>> + * @rmap: reverse-map allocated with alloc_cpu_rmap()
>> + */
>> +static inline void cpu_rmap_put(struct cpu_rmap *rmap)
>> +{
>> +     kref_put(&rmap->refcount, cpu_rmap_reclaim);
>> +}
>
> As mentioned, I suggest this become the public interface.  And I
> suppose it should propagate kref_put()'s return value, in case someone
> is interested.
>
>> +/**
>> + * free_cpu_rmap - free CPU affinity reverse-map
>> + * @rmap: Reverse-map allocated with alloc_cpu_rmap()
>> + */
>> +void free_cpu_rmap(struct cpu_rmap *rmap)
>> +{
>> +     cpu_rmap_put(rmap);
>> +}
>> +EXPORT_SYMBOL(free_cpu_rmap);
>
> zap.
>
>>  /* Reevaluate nearest object for given CPU, comparing with the given
>>   * neighbours at the given distance.
>>   */
>>
>> ...
>>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ