Message-ID: <20200918090702.GB18920@quack2.suse.cz>
Date:   Fri, 18 Sep 2020 11:07:02 +0200
From:   Jan Kara <jack@...e.cz>
To:     Oleg Nesterov <oleg@...hat.com>
Cc:     Boaz Harrosh <boaz@...xistor.com>, Hou Tao <houtao1@...wei.com>,
        peterz@...radead.org, Ingo Molnar <mingo@...hat.com>,
        Will Deacon <will@...nel.org>, Dennis Zhou <dennis@...nel.org>,
        Tejun Heo <tj@...nel.org>, Christoph Lameter <cl@...ux.com>,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        Jan Kara <jack@...e.cz>
Subject: Re: [RFC PATCH] locking/percpu-rwsem: use this_cpu_{inc|dec}() for
 read_count

On Thu 17-09-20 14:01:33, Oleg Nesterov wrote:
> On 09/17, Boaz Harrosh wrote:
> >
> > On 16/09/2020 15:32, Hou Tao wrote:
> > <>
> > >However the performance degradation is huge under aarch64 (4 sockets, 24 core per sockets): nearly 60% lost.
> > >
> > >v4.19.111
> > >no writer, reader cn                               | 24        | 48        | 72        | 96
> > >the rate of down_read/up_read per second           | 166129572 | 166064100 | 165963448 | 165203565
> > >the rate of down_read/up_read per second (patched) |  63863506 |  63842132 |  63757267 |  63514920
> > >
> >
> > I believe perhaps Peter Z's suggestion of an additional
> > percpu_down_read_irqsafe() API and let only those in IRQ users pay the
> > penalty.
> >
> > Peter Z wrote:
> > >My leading alternative was adding: percpu_down_read_irqsafe() /
> > >percpu_up_read_irqsafe(), which use local_irq_save() instead of
> > >preempt_disable().
> 
> This means that __sb_start/end_write() and probably more users in fs/super.c
> will have to use this API, not good.
> 
> IIUC, file_end_write() was never IRQ safe (at least if !CONFIG_SMP), even
> before 8129ed2964 ("change sb_writers to use percpu_rw_semaphore"), but this
> doesn't matter...
> 
> Perhaps we can change aio.c, io_uring.c and fs/overlayfs/file.c to avoid
> file_end_write() in IRQ context, but I am not sure it's worth the trouble.

Well, that would IMO be rather difficult. We need to call file_end_write()
after the IO has completed, so if we don't want to do it in IRQ context,
we'd have to queue a work item to a workqueue or something like that. And
that's going to be expensive compared to a pure per-cpu inc/dec...

If people really wanted to avoid irq-safe inc/dec for archs where it is
more expensive, one idea I had was that we could add 'read_count_in_irq' to
percpu_rw_semaphore. So callers in normal context would use read_count and
callers in irq context would use read_count_in_irq. And the writer side
would sum over both but we don't care about performance of that one much.
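
To make the 'read_count_in_irq' idea concrete, here is a minimal userspace
sketch. It is only a model under stated assumptions: the struct, NR_CPUS
value, and the model_* helpers are hypothetical names and not the kernel's
percpu_rw_semaphore code, and plain arrays stand in for per-cpu variables.
It shows the split counters and a writer-side check that sums both.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4 /* hypothetical CPU count for this model */

/* Userspace model only; not the kernel's struct percpu_rw_semaphore. */
struct rwsem_model {
	long read_count[NR_CPUS];        /* updated only from normal context */
	long read_count_in_irq[NR_CPUS]; /* updated only from IRQ context */
};

/* Pick the counter for this CPU and context (caller says if it is in IRQ). */
static long *reader_slot(struct rwsem_model *sem, int cpu, bool in_irq)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return in_irq ? &sem->read_count_in_irq[cpu] : &sem->read_count[cpu];
}

/* Reader entry: bump the counter that matches the calling context. */
static void model_down_read(struct rwsem_model *sem, int cpu, bool in_irq)
{
	++*reader_slot(sem, cpu, in_irq);
}

/* Reader exit: may run on a different CPU and in a different context. */
static void model_up_read(struct rwsem_model *sem, int cpu, bool in_irq)
{
	--*reader_slot(sem, cpu, in_irq);
}

/* Writer side: slow path, sums both counter sets over all CPUs. */
static long model_active_readers(const struct rwsem_model *sem)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += sem->read_count[cpu] + sem->read_count_in_irq[cpu];
	return sum;
}
```

In this model a reader that enters in normal context on one CPU and leaves
from IRQ context on another leaves both individual counters non-zero, but
the writer's sum still comes out as zero, which is all the writer needs.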

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR