Message-ID: <CAJCW39+85dtfEjqNejB8xT=JCo2gU5XWY_ohb0OxYKs6G929jg@mail.gmail.com>
Date: Tue, 29 Jul 2025 00:08:05 +0200
From: Patryk Kowalczyk <patryk@...alczyk.ws>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Baolin Wang <baolin.wang@...ux.alibaba.com>, hughd@...gle.com, 
	ville.syrjala@...ux.intel.com, david@...hat.com, willy@...radead.org, 
	maarten.lankhorst@...ux.intel.com, mripard@...nel.org, tzimmermann@...e.de, 
	airlied@...il.com, simona@...ll.ch, jani.nikula@...ux.intel.com, 
	joonas.lahtinen@...ux.intel.com, rodrigo.vivi@...el.com, tursulin@...ulin.net, 
	christian.koenig@....com, ray.huang@....com, matthew.auld@...el.com, 
	matthew.brost@...el.com, dri-devel@...ts.freedesktop.org, 
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] mm: shmem: fix the shmem large folio allocation for the
 i915 driver

Hi,
I apologize for the second email; the first one contained HTML content
that was not accepted by the group.

In my tests, the performance drop ranges from a few percent up to 13%
in Unigine Superposition under heavy memory usage on a Core Ultra 155H
CPU with the Xe 128 EU GPU. Other users have reported a performance
impact of up to 30% on certain workloads.
Please find more details in the regression reports:
https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14645
https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13845

I believe the change should be backported to all active kernel
branches after version 6.12.

Best regards,
Patryk

On Mon, 28 Jul 2025 at 23:44, Andrew Morton <akpm@...ux-foundation.org> wrote:
>
> On Mon, 28 Jul 2025 16:03:53 +0800 Baolin Wang <baolin.wang@...ux.alibaba.com> wrote:
>
> > After commit acd7ccb284b8 ("mm: shmem: add large folio support for tmpfs"),
> > we extend the 'huge=' option to allow any sized large folios for tmpfs,
> > which means tmpfs will allow getting a highest order hint based on the size
> > of write() and fallocate() paths, and then will try each allowable large order.
> >
> > However, when the i915 driver allocates shmem memory, it doesn't provide hint
> > information about the size of the large folio to be allocated, resulting in
> > the inability to allocate PMD-sized shmem, which in turn affects GPU performance.
> >
> > To fix this issue, add the 'end' information for shmem_read_folio_gfp() to help
> > allocate PMD-sized large folios. Additionally, use the maximum allocation chunk
> > (via mapping_max_folio_size()) to determine the size of the large folios to
> > allocate in the i915 driver.
>
> What is the magnitude of the performance change?
>
> > Fixes: acd7ccb284b8 ("mm: shmem: add large folio support for tmpfs")
> > Reported-by: Patryk Kowalczyk <patryk@...alczyk.ws>
> > Reported-by: Ville Syrjälä <ville.syrjala@...ux.intel.com>
> > Tested-by: Patryk Kowalczyk <patryk@...alczyk.ws>
>
> This isn't a regression fix, is it?  acd7ccb284b8 adds a new feature
> and we have now found a flaw in it.
>
> Still, we could bend the rules a little bit and backport this, depends
> on how significant the runtime effect is.
