linux-kernel - Re: [PATCH 02/11] mm/page_alloc: Convert per-cpu list protection to local

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20210412115612.GX3697@techsingularity.net>
Date:   Mon, 12 Apr 2021 12:56:12 +0100
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Linux-MM <linux-mm@...ck.org>,
        Linux-RT-Users <linux-rt-users@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Chuck Lever <chuck.lever@...cle.com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Matthew Wilcox <willy@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>,
        Michal Hocko <mhocko@...nel.org>,
        Oscar Salvador <osalvador@...e.de>
Subject: Re: [PATCH 02/11] mm/page_alloc: Convert per-cpu list protection to
 local_lock

On Fri, Apr 09, 2021 at 08:55:39PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 09, 2021 at 02:32:56PM +0100, Mel Gorman wrote:
> > That said, there are some curious users already.
> > fs/squashfs/decompressor_multi_percpu.c looks like it always uses the
> > local_lock in CPU 0's per-cpu structure instead of stabilising a per-cpu
> > pointer. 
> 
> I'm not sure how you read that.
> 
> You're talking about this:
> 
>   local_lock(&msblk->stream->lock);
> 
> right? Note that msblk->stream is a per-cpu pointer, so
> &msblk->stream->lock is that same per-cpu pointer with an offset on.
> 
> The whole think relies on:
> 
> 	&per_cpu_ptr(msblk->stream, cpu)->lock == per_cpu_ptr(&msblk->stream->lock, cpu)
> 
> Which is true because the lhs:
> 
> 	(local_lock_t *)((msblk->stream + per_cpu_offset(cpu)) + offsetof(struct squashfs_stream, lock))
> 
> and the rhs:
> 
> 	(local_lock_t *)((msblk->stream + offsetof(struct squashfs_stream, lock)) + per_cpu_offset(cpu))
> 
> are identical, because addition is associative.
> 

Ok, I think I see and understand now, I didn't follow far enough down
into the macro magic and missed this observation so thanks for your
patience. The page allocator still incurs a double lookup of the per
cpu offsets but it should work for both the current local_lock_irq
implementation and the one in preempt-rt because the task will be pinned
to the CPU by either preempt_disable, migrate_disable or IRQ disable
depending on the local_lock implementation and kernel configuration.

I'll update the changelog and comment accordingly. I'll decide later
whether to leave it or move the location of the lock at the end of the
series. If the patch is added, it'll either incur the double lookup (not
that expensive, might be optimised by the compiler) or come up with a
helper that takes the lock and returns the per-cpu structure. The double
lookup probably makes more sense initially because there are multiple
potential users of a helper that says "pin to CPU, lookup, lock and return
a per-cpu structure" for both IRQ-safe and IRQ-unsafe variants with the
associated expansion of the local_lock API. It might be better to introduce
such a helper with multiple users converted at the same time and there are
other local_lock users in preempt-rt that could do with upstreaming first.

> > drivers/block/zram/zcomp.c appears to do the same although for
> > at least one of the zcomp_stream_get() callers, the CPU is pinned for
> > other reasons (bit spin lock held). I think it happens to work anyway
> > but it's weird and I'm not a fan.
> 
> Same thing.

Yep.

-- 
Mel Gorman
SUSE Labs