Message-ID: <be074809-e1fd-43a2-9396-8f7264532c4d@lucifer.local>
Date: Thu, 7 Aug 2025 18:41:06 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Jann Horn <jannh@...gle.com>
Cc: kernel test robot <oliver.sang@...el.com>, Dev Jain <dev.jain@....com>,
        oe-lkp@...ts.linux.dev, lkp@...el.com, linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Barry Song <baohua@...nel.org>, Pedro Falcato <pfalcato@...e.de>,
        Anshuman Khandual <anshuman.khandual@....com>,
        Bang Li <libang.li@...group.com>,
        Baolin Wang <baolin.wang@...ux.alibaba.com>,
        bibo mao <maobibo@...ngson.cn>, David Hildenbrand <david@...hat.com>,
        Hugh Dickins <hughd@...gle.com>, Ingo Molnar <mingo@...nel.org>,
        Lance Yang <ioworker0@...il.com>,
        Liam Howlett <liam.howlett@...cle.com>,
        Matthew Wilcox <willy@...radead.org>, Peter Xu <peterx@...hat.com>,
        Qi Zheng <zhengqi.arch@...edance.com>,
        Ryan Roberts <ryan.roberts@....com>, Vlastimil Babka <vbabka@...e.cz>,
        Yang Shi <yang@...amperecomputing.com>, Zi Yan <ziy@...dia.com>,
        linux-mm@...ck.org
Subject: Re: [linus:master] [mm] f822a9a81a:
 stress-ng.bigheap.realloc_calls_per_sec 37.3% regression

On Thu, Aug 07, 2025 at 07:37:38PM +0200, Jann Horn wrote:
> On Thu, Aug 7, 2025 at 10:28 AM Lorenzo Stoakes
> <lorenzo.stoakes@...cle.com> wrote:
> > On Thu, Aug 07, 2025 at 04:17:09PM +0800, kernel test robot wrote:
> > > 94dab12d86cf77ff f822a9a81a31311d67f260aea96
> > > ---------------- ---------------------------
> > >          %stddev     %change         %stddev
> > >              \          |                \
> > >      13777 ± 37%     +45.0%      19979 ± 27%  numa-vmstat.node1.nr_slab_reclaimable
> > >     367205            +2.3%     375703        vmstat.system.in
> > >      55106 ± 37%     +45.1%      79971 ± 27%  numa-meminfo.node1.KReclaimable
> > >      55106 ± 37%     +45.1%      79971 ± 27%  numa-meminfo.node1.SReclaimable
> > >     559381           -37.3%     350757        stress-ng.bigheap.realloc_calls_per_sec
> > >      11468            +1.2%      11603        stress-ng.time.system_time
> > >     296.25            +4.5%     309.70        stress-ng.time.user_time
> > >       0.81 ±187%    -100.0%       0.00        perf-sched.sch_delay.avg.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
> > >       9.36 ±165%    -100.0%       0.00        perf-sched.sch_delay.max.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
> > >       0.81 ±187%    -100.0%       0.00        perf-sched.wait_time.avg.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
> > >       9.36 ±165%    -100.0%       0.00        perf-sched.wait_time.max.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
> > >       5.50 ± 17%    +390.9%      27.00 ± 56%  perf-c2c.DRAM.local
> > >     388.50 ± 10%    +114.7%     834.17 ± 33%  perf-c2c.DRAM.remote
> > >       1214 ± 13%    +107.3%       2517 ± 31%  perf-c2c.HITM.local
> > >     135.00 ± 19%    +130.9%     311.67 ± 32%  perf-c2c.HITM.remote
> > >       1349 ± 13%    +109.6%       2829 ± 31%  perf-c2c.HITM.total
> >
> > Yeah, this also looks pretty consistent...
>
> FWIW, HITM has different meanings depending on exactly which
> microarchitecture that test happened on; the message says it is from
> Sapphire Rapids, which is a successor of Ice Lake, so HITM is less
> meaningful than if it came from a pre-IceLake system (see
> https://lore.kernel.org/all/CAG48ez3RmV6SsVw9oyTXxQXHp3rqtKDk2qwJWo9TGvXCq7Xr-w@mail.gmail.com/).
>
> To me those numbers mainly look like you're accessing a lot more
> cache-cold data. (On pre-IceLake they would indicate cacheline
> bouncing, but I guess here they probably don't.) And that makes sense,
> since before the patch, this path was just moving PTEs around without
> looking at the associated pages/folios; basically more or less like a
> memcpy() on x86-64. But after the patch, for every 8 bytes that you
> copy, you have to load a cacheline from the vmemmap to get the page.

Yup, this is representative of what my investigation is showing.

I've narrowed it down, but I want to wait to report until I'm sure...

But yeah, we're doing a _lot_ more work.
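To make that concrete, the pattern Jann describes above is roughly the
difference between the two loops in the sketch below. This is purely an
illustrative userspace analogue, not the kernel code, and every name in it is
made up: copying the 8-byte entries on their own streams through memory, while
also reading a 64-byte per-entry record drags in one cold cacheline per entry,
much like the struct page load from the vmemmap.

/*
 * Purely illustrative userspace sketch (not kernel code): contrast copying
 * 8-byte entries on their own with also reading one 64-byte metadata record
 * per entry, which is roughly the extra struct-page/vmemmap cacheline the
 * batched path has to pull in for every PTE.  All names here (fake_page,
 * NENTRIES, ...) are invented for the example.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define NENTRIES (1UL << 22)		/* 4M fake PTEs, 32 MiB per array */

struct fake_page {			/* stand-in for struct page */
	uint64_t flags;
	char pad[56];			/* pad out to a 64-byte cacheline */
};

static double now_sec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
	uint64_t *src = malloc(NENTRIES * sizeof(*src));
	uint64_t *dst = malloc(NENTRIES * sizeof(*dst));
	struct fake_page *meta = malloc(NENTRIES * sizeof(*meta));
	volatile uint64_t sink = 0;
	double t;
	size_t i;

	if (!src || !dst || !meta)
		return 1;

	/* Fault everything in up front so both loops see real memory. */
	memset(src, 0xaa, NENTRIES * sizeof(*src));
	memset(dst, 0, NENTRIES * sizeof(*dst));
	for (i = 0; i < NENTRIES; i++)
		meta[i].flags = i;

	/* "Before": move the 8-byte entries and nothing else. */
	t = now_sec();
	memcpy(dst, src, NENTRIES * sizeof(*src));
	printf("copy only       : %.3f s\n", now_sec() - t);

	/* "After": per entry, also pull in one metadata cacheline. */
	t = now_sec();
	for (i = 0; i < NENTRIES; i++) {
		sink += meta[i].flags;	/* the cache-cold "vmemmap" load */
		dst[i] = src[i];
	}
	printf("copy + metadata : %.3f s (sink=%llu)\n", now_sec() - t,
	       (unsigned long long)sink);

	free(src);
	free(dst);
	free(meta);
	return 0;
}

The point is only the shape of the access pattern; the absolute numbers will
obviously depend on the machine.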

I'm leaning towards disabling this except for arm64 at the moment, tbh; mremap
seems especially sensitive to it (I found issues with this in my abortive
mremap anon merging work too, though I really expected it there...)

That's on the assumption that arm64 is _definitely_ faster. I wonder if older
arm64 arches might suffer?
