Message-ID: <20200827142848.GZ1236603@ZenIV.linux.org.uk>
Date:   Thu, 27 Aug 2020 15:28:48 +0100
From:   Al Viro <viro@...iv.linux.org.uk>
To:     Shaokun Zhang <zhangshaokun@...ilicon.com>
Cc:     linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        Yuqi Jin <jinyuqi@...wei.com>,
        kernel test robot <rong.a.chen@...el.com>,
        Will Deacon <will@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        Boqun Feng <boqun.feng@...il.com>
Subject: [NAK] Re: [PATCH] fs: Optimized fget to improve performance

On Thu, Aug 27, 2020 at 06:19:44PM +0800, Shaokun Zhang wrote:
> From: Yuqi Jin <jinyuqi@...wei.com>
> 
> It is well known that atomic_add performs better than atomic_cmpxchg.
> The initial value of @f_count is 1.  While @f_count is being incremented
> by 1 in __fget_files, it can be observed in three states: > 0, < 0, and
> = 0.  When the fixed value 0 is used as the condition for refusing the
> increment, only atomic_cmpxchg can be used.  When < 0 is used as the
> stop condition instead, atomic_add can be used to obtain better
> performance.

Suppose another thread has just removed it from the descriptor table.

> +static inline bool get_file_unless_negative(atomic_long_t *v, long a)
> +{
> +	long c = atomic_long_read(v);
> +
> +	if (c <= 0)
> +		return 0;

Still 1.  Now the other thread has gotten to dropping the last reference,
decremented the counter to zero and committed to freeing the struct file.

> +
> +	return atomic_long_add_return(a, v) - 1;

... and you increment that sucker back to 1.  Sure, you return 0, so the
caller does nothing to that struct file.  Which includes undoing the
changes to its refcount.

Meanwhile, a third thread does fget() on the same descriptor, and
there we end up bumping the refcount to 2 and succeeding.  Which
leaves the caller with a reference to an already doomed struct file...
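
Spelled out as a timeline (thread labels are mine; A and C are calls
to __fget_files(), B is the final fput()):

	A: c = atomic_long_read(&f->f_count);	    /* sees 1, check passes */
	B: atomic_long_dec_and_test(&f->f_count);  /* 1 -> 0, commits to
						       freeing the file */
	A: atomic_long_add_return(1, &f->f_count); /* 0 -> 1; returns 1, so
						       the helper reports
						       failure and the caller
						       backs off, but the
						       increment stays */
	C: c = atomic_long_read(&f->f_count);	    /* sees 1, check passes */
	C: atomic_long_add_return(1, &f->f_count); /* 1 -> 2, "success" on a
						       file already being
						       freed */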

	IOW, NAK - this is completely broken.  The whole point of
atomic_long_add_unless() is that the check and conditional increment
are atomic.  Together.  That's what your optimization takes out.
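
For contrast, a minimal userspace sketch of the add-unless pattern with
C11 atomics (function and variable names here are mine, not the
kernel's):

#include <stdatomic.h>
#include <stdbool.h>

/*
 * Retry until either the forbidden value @u is observed (failure) or
 * the increment lands on the exact value that passed the check
 * (success).  The compare-and-swap makes the check and the increment
 * one atomic step, so the window described above cannot open.
 */
static bool add_unless(atomic_long *v, long a, long u)
{
	long c = atomic_load(v);

	do {
		if (c == u)
			return false;	/* e.g. saw 0: file already doomed */
	} while (!atomic_compare_exchange_weak(v, &c, c + a));

	return true;
}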
