Message-ID: <f360e681-0fa9-be4b-ea78-d7783b39048b@redhat.com>
Date: Mon, 5 Dec 2022 11:30:52 -0500
From: Waiman Long <longman@...hat.com>
To: Shakeel Butt <shakeelb@...gle.com>,
"Luther, Sven" <Sven.Luther@...driver.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"regressions@...ts.linux.dev" <regressions@...ts.linux.dev>,
Roman Gushchin <guro@...com>,
Andrew Morton <akpm@...ux-foundation.org>,
Christoph Lameter <cl@...ux.com>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Vlastimil Babka <vbabka@...e.cz>,
"kernel-team@...com" <kernel-team@...com>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Muchun Song <songmuchun@...edance.com>,
Alexey Gladkov <legion@...nel.org>,
"Bonn, Jonas" <Jonas.Bonn@...driver.com>
Subject: Re: [Regression] mqueue performance degradation after "The new cgroup
slab memory controller" patchset.
On 12/5/22 11:06, Shakeel Butt wrote:
> Hi Sven,
>
> On Mon, Dec 5, 2022 at 6:56 AM Luther, Sven <Sven.Luther@...driver.com> wrote:
>> #regzbot ^introduced 10befea91b61c4e2c2d1df06a2e978d182fcf792
>>
>> We are making heavy use of mqueues, and noticed a performance degradation between the 4.18 and 5.10 Linux kernels.
>>
>> After coarse per-version tracing, we did a kernel bisection between 5.8 and 5.9
>> and traced the issue to 10 patches (of which 9 were skipped as they didn't boot) between:
>>
>>
>> commit 10befea91b61c4e2c2d1df06a2e978d182fcf792 (HEAD, refs/bisect/bad)
>> Author: Roman Gushchin <guro@...com>
>> Date: Thu Aug 6 23:21:27 2020 -0700
>>
>> mm: memcg/slab: use a single set of kmem_caches for all allocations
>>
>> and:
>>
>> commit 286e04b8ed7a04279ae277f0f024430246ea5eec (refs/bisect/good-286e04b8ed7a04279ae277f0f024430246ea5eec)
>> Author: Roman Gushchin <guro@...com>
>> Date: Thu Aug 6 23:20:52 2020 -0700
>>
>> mm: memcg/slab: allocate obj_cgroups for non-root slab pages
>>
>> All of them are part of the "The new cgroup slab memory controller" patchset:
>>
>> https://lore.kernel.org/all/20200623174037.3951353-18-guro@fb.com/T/
>>
>> from Roman Gushchin, which moves the accounting from the page level to the object level.
>>
>> Measurements were done using a test program which measures min/average/max time of mqueue_send/mqueue_rcv,
>> and the average for getppid, each measured over 100,000 runs. Results are shown in the following table:
>>
>> +----------+--------------------------+-------------------------+----------------+
>> | kernel | mqueue_rcv (ns) | mqueue_send (ns) | getppid |
>> | version | min avg max variation | min avg max variation | (ns) variation |
>> +----------+--------------------------+-------------------------+----------------+
>> | 4.18.45 | 351 382 17533 base | 383 410 13178 base | 149 base |
>> | 5.8-good | 380 392 7156 -2,55% | 376 384 6225 6,77% | 169 -11,83% |
>> | 5.8-bad | 524 530 5310 -27,92% | 512 519 8775 -21,00% | 169 -11,83% |
>> | 5.10 | 520 533 4078 -28,33% | 518 534 8108 -23,22% | 167 -10,78% |
>> | 5.15 | 431 444 8440 -13,96% | 425 437 6170 -6,18% | 171 -12,87% |
>> | 6.03 | 474 614 3881 -37,79% | 482 693 931 -40,84% | 171 -12,87% |
>> +----------+--------------------------+-------------------------+-----------------
>>
> Is the last kernel 6.0.3? Also, we know there is a performance impact
> from per-object kmem accounting. Can you try the latest, i.e. 6.1-rc8?
> A couple of memcg charging optimization patches were merged in this
> window.
It is known that per-object kmem accounting regresses performance. I had
submitted a number of optimization patches that were merged into v5.14,
which is why the regression is reduced in the 5.15 line above. It looks
like there are some additional regressions in the latest kernels; we
will need to figure out what causes them.
Cheers,
Longman