linux-kernel - Re: [NAK] Re: [PATCH] fs: Optimized fget to improve performance


Message-ID: <20200831032127.GW1236603@ZenIV.linux.org.uk>
Date:   Mon, 31 Aug 2020 04:21:27 +0100
From:   Al Viro <viro@...iv.linux.org.uk>
To:     Shaokun Zhang <zhangshaokun@...ilicon.com>
Cc:     linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        Yuqi Jin <jinyuqi@...wei.com>,
        kernel test robot <rong.a.chen@...el.com>,
        Will Deacon <will@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        Boqun Feng <boqun.feng@...il.com>
Subject: Re: [NAK] Re: [PATCH] fs: Optimized fget to improve performance

On Mon, Aug 31, 2020 at 09:43:31AM +0800, Shaokun Zhang wrote:

> How about this? We try to replace atomic_cmpxchg with atomic_add to improve
> performance. The atomic_add does not check the current f_count value.
> Therefore, the number of online CPUs is reserved to prevent multi-core
> competition.

No.  Really, really - no.  Not unless you can guarantee that a process on another
CPU won't lose its timeslice, ending up with more than one increment happening on
the same CPU - done by different processes scheduled there, one after another.
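
A rough illustration of that objection, as a single-threaded replay in C11.
The patch itself is not quoted in this message, so the scheme below is only
one assumed reading of "reserve the number of online CPUs": an unconditional
add, with old values below the reserve treated as a dead file.  All names
(SKETCH_NR_CPUS, sketch_add_then_check, sketch_replay) are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 2  /* hypothetical reserve: one slot per online CPU */

/*
 * Assumed shape of an unconditional get: bump the counter, then decide
 * from the old value whether the object was still alive.  Old values
 * inside the reserve window [0, SKETCH_NR_CPUS) are taken to mean "dead,
 * only transient increments from other CPUs are present".
 */
static bool sketch_add_then_check(atomic_long *count)
{
    long old = atomic_fetch_add(count, 1);

    return old >= SKETCH_NR_CPUS;
}

/*
 * Replay ntasks tasks hitting a dead object (count == 0).  Each task is
 * assumed to lose its timeslice right after the add, so none of them has
 * undone its increment when the next task runs.  Returns how many tasks
 * wrongly conclude that the object is alive.
 */
static int sketch_replay(int ntasks)
{
    atomic_long count;
    int wrong = 0;

    atomic_init(&count, 0);
    for (int i = 0; i < ntasks; i++)
        if (sketch_add_then_check(&count))
            wrong++;

    return wrong;
}
```

With no more tasks than the reserve, sketch_replay() reports no mistakes; as
soon as there are more preempted tasks than CPUs, the pending increments
outgrow the reserve and later tasks misread a dead object as live.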

If you have some change to atomic_long_add_unless(), do it there.  And get it
past the arm64 folks.  get_file_rcu() is nothing special in that respect *AND*
it has to cope with any architecture out there.
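
For reference, a minimal userspace sketch of the add-unless pattern in C11
atomics.  It is not the kernel's implementation of atomic_long_add_unless();
the names sketch_add_unless and sketch_get_ref are made up for illustration,
and the default sequentially consistent ordering is assumed for simplicity.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Add 'a' to *v unless *v currently equals 'u'.
 * Returns true if the add was performed.
 */
static bool sketch_add_unless(atomic_long *v, long a, long u)
{
    long old = atomic_load(v);

    do {
        if (old == u)
            return false;
        /* On failure the compare-exchange stores the current value in 'old'. */
    } while (!atomic_compare_exchange_weak(v, &old, old + a));

    return true;
}

/* Take a reference only if the count has not already dropped to zero. */
static bool sketch_get_ref(atomic_long *count)
{
    return sketch_add_unless(count, 1, 0);
}
```

The check and the update happen in one compare-exchange, so the result does
not depend on how many other tasks are in flight or on how long any of them
stays preempted.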

BTW, keep in mind that there's such a thing as KVM - race windows are much
wider there, since a thread representing a guest CPU might lose its timeslice
whenever the host feels like it.  At which point you get a single instruction
on a guest CPU taking longer than many thousands of instructions on another
CPU of the same guest.

AFAIK, arm64 does support KVM with SMP guests.