Message-ID: <20200916125117.GQ2674@hirez.programming.kicks-ass.net>
Date: Wed, 16 Sep 2020 14:51:17 +0200
From: peterz@...radead.org
To: Hou Tao <houtao1@...wei.com>
Cc: Oleg Nesterov <oleg@...hat.com>, Ingo Molnar <mingo@...hat.com>,
Will Deacon <will@...nel.org>, Dennis Zhou <dennis@...nel.org>,
Tejun Heo <tj@...nel.org>, "Christoph Lameter" <cl@...ux.com>,
<linux-kernel@...r.kernel.org>, <linux-fsdevel@...r.kernel.org>,
Jan Kara <jack@...e.cz>
Subject: Re: [RFC PATCH] locking/percpu-rwsem: use this_cpu_{inc|dec}() for
read_count
On Wed, Sep 16, 2020 at 08:32:20PM +0800, Hou Tao wrote:
> I have run a simple test of the performance impact on both x86 and aarch64.
>
> There is no degradation on x86 (2 sockets, 18 cores per socket, 2 threads per core)
Yeah, x86 is magical here, it's the same single instruction for both ;-)
But it is, afaik, unique in this position, no other arch can pull that
off.
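
For anyone following along without the per-CPU API in their head: this_cpu_inc() has to be safe against interrupts and preemption on the local CPU, while __this_cpu_inc() leaves that to the caller. Below is a rough user-space analogy in C11, not kernel code. counter_inc_plain(), counter_inc_safe() and demo() are made-up names, and the relaxed atomic only stands in for the stronger guarantee. On non-x86 machines the second form typically becomes an atomic read-modify-write sequence instead of a plain load/add/store, which is where the extra cost comes from.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Stand-in for __this_cpu_inc(): a plain read-modify-write.
 * Only correct if the caller guarantees nothing can interleave.
 */
static inline void counter_inc_plain(uint32_t *c)
{
    *c += 1;
}

/*
 * Stand-in for this_cpu_inc(): the increment itself must be
 * indivisible. Relaxed ordering is an assumption made for this
 * sketch; it says nothing about what the kernel implementation uses.
 */
static inline void counter_inc_safe(_Atomic uint32_t *c)
{
    atomic_fetch_add_explicit(c, 1, memory_order_relaxed);
}

/* Single-threaded sanity check: both paths count the same. */
static uint32_t demo(uint32_t n)
{
    uint32_t plain = 0;
    _Atomic uint32_t safe = 0;

    for (uint32_t i = 0; i < n; i++) {
        counter_inc_plain(&plain);
        counter_inc_safe(&safe);
    }
    assert(plain == atomic_load_explicit(&safe, memory_order_relaxed));
    return plain;
}
```

The analogy is loose on x86: in user space the atomic version gets a lock prefix there, which the kernel's per-CPU increment does not need. Compiling both helpers with -O2 -S for aarch64 and comparing the output is a quick way to see the difference in instruction sequences.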
> However, the performance degradation is huge on aarch64 (4 sockets, 24 cores per socket): nearly 60% lost.
>
> v4.19.111
> no writer, reader count                            |        24 |        48 |        72 |        96
> the rate of down_read/up_read per second           | 166129572 | 166064100 | 165963448 | 165203565
> the rate of down_read/up_read per second (patched) |  63863506 |  63842132 |  63757267 |  63514920
Teh hurt :/
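
To put a number on it, the loss can be worked out from the table above as (baseline - patched) / baseline. A small sketch in C, with the rates copied from the quoted aarch64 figures; loss_percent() and the array names are just illustrative:

```c
#include <assert.h>

/* Percentage of throughput lost going from baseline to patched. */
static double loss_percent(double baseline, double patched)
{
    assert(baseline > 0.0);
    return 100.0 * (baseline - patched) / baseline;
}

/* down_read/up_read per second, aarch64, v4.19.111, no writer.
 * Columns are 24, 48, 72 and 96 readers, as in the quoted table. */
static const double baseline_rate[] = {
    166129572.0, 166064100.0, 165963448.0, 165203565.0
};
static const double patched_rate[] = {
    63863506.0, 63842132.0, 63757267.0, 63514920.0
};
```

Calling loss_percent(baseline_rate[i], patched_rate[i]) for each column gives a bit over 61% in every case, so "nearly 60%" slightly understates it, and the loss is flat across reader counts.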