[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230228191705.3bc8bed6@kernel.org>
Date: Tue, 28 Feb 2023 19:17:05 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Eric Dumazet <edumazet@...gle.com>,
LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...uxfoundation.org>, x86@...nel.org,
Wangyang Guo <wangyang.guo@...el.com>,
Arjan van De Ven <arjan@...ux.intel.com>,
"David S. Miller" <davem@...emloft.net>,
Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org,
Will Deacon <will@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Boqun Feng <boqun.feng@...il.com>,
Mark Rutland <mark.rutland@....com>,
Marc Zyngier <maz@...nel.org>
Subject: Re: [patch 0/3] net, refcount: Address dst_entry reference count
scalability issues
FWIW looks good to me, especially the refcount part.
We do see 10% of jitter in microbenchmarks due to random cache
effects, so forgive the questioning. But again, the refcount seems
like an obvious win to my noob eyes.
While I have you it would be remiss of me not to mention my ksoftirq
change which makes a large difference in production workloads:
https://lore.kernel.org/all/20221222221244.1290833-3-kuba@kernel.org/
Is Peter's "rework" of softirq going in for 6.3?
On Wed, 01 Mar 2023 02:00:20 +0100 Thomas Gleixner wrote:
> >> We looked at this because the reference count operations stood out in
> >> perf top and we analyzed it down to the false sharing _and_ the
> >> non-scalability of atomic_inc_not_zero().
> >
> > Please share your recipe and perf results.
>
> Sorry for being not explicit enough about this, but I was under the
> impression that explicitely mentioning memcached and memtier would be
> enough of a hint for people famiiar with this matter.
I think the disconnect may be that we are indeed familiar with
the workloads, but these exact workloads don't hit the issue
in production (I don't work at Google but a similarly large corp).
My initial reaction was also to see if I can find the issue in prod.
Not to question but in hope that I can indeed find a repro, and make
this series an easy sell.
> Run memcached with -t $N and memtier_benchmark with -t $M and
> --ratio=1:100 on the same machine. localhost connections obviously
> amplify the problem,
>
> Start with the defaults for $N and $M and increase them. Depending on
> your machine this will tank at some point. But even in reasonably small
> $N, $M scenarios the refcount operations and the resulting false sharing
> fallout becomes visible in perf top. At some point it becomes the
> dominating issue while the machine still has capacity...
>
> > We must have been very lucky to not see this at Google.
>
> There _is_ a world outside of Google? :)
>
> Seriously. The point is that even if you @google cannot obverse this as
> a major issue and it just gives your usecase a minimal 0.X gain, it
> still is contributing to the overall performance, no?
Powered by blists - more mailing lists