Message-ID: <3713e6d83421fcf64978927a1cb40fae1e3c7a57.camel@linux.intel.com>
Date: Mon, 01 Sep 2025 16:37:51 +0200
From: Thomas Hellström <thomas.hellstrom@...ux.intel.com>
To: Natalie Vock <natalie.vock@....de>, Maarten Lankhorst
<dev@...khorst.se>, Lucas De Marchi <lucas.demarchi@...el.com>, Rodrigo
Vivi <rodrigo.vivi@...el.com>, David Airlie <airlied@...il.com>, Simona
Vetter <simona@...ll.ch>, Maxime Ripard <mripard@...nel.org>, Tejun Heo
<tj@...nel.org>, Johannes Weiner <hannes@...xchg.org>, 'Michal
Koutný' <mkoutny@...e.com>, Michal Hocko
<mhocko@...nel.org>, Roman Gushchin <roman.gushchin@...ux.dev>, Shakeel
Butt <shakeel.butt@...ux.dev>, Muchun Song <muchun.song@...ux.dev>, Andrew
Morton <akpm@...ux-foundation.org>, David Hildenbrand <david@...hat.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, "'Liam R . Howlett'"
<Liam.Howlett@...cle.com>, Vlastimil Babka <vbabka@...e.cz>, Mike
Rapoport <rppt@...nel.org>, Suren Baghdasaryan <surenb@...gle.com>, Thomas
Zimmermann <tzimmermann@...e.de>
Cc: Michal Hocko <mhocko@...e.com>, intel-xe@...ts.freedesktop.org,
dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
cgroups@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [RFC 0/3] cgroups: Add support for pinned device memory
Hi,
On Mon, 2025-09-01 at 14:45 +0200, Natalie Vock wrote:
> Hi,
>
> On 8/19/25 13:49, Maarten Lankhorst wrote:
> > When exporting dma-bufs to other devices, even when it is allowed
> > to use
> > move_notify in some drivers, performance will degrade severely when
> > eviction happens.
> >
> > A particular example where this can happen is in a multi-card setup,
> > where PCI-E peer-to-peer is used to avoid access to system memory.
> >
> > If the buffer is evicted to system memory, not only the evicting GPU
> > where the buffer resided is affected, but it will also stall the GPU
> > that is waiting on the buffer.
> >
> > It also makes sense for long-running jobs not to be preempted by
> > having their buffers evicted, so it will make sense to have the
> > ability to pin from system memory too.
> >
> > This is dependent on patches by Dave Airlie, so it's not part of this
> > series yet. But I'm planning on extending pinning to the memory cgroup
> > controller in the future to handle this case.
> >
> > Implementation details:
> >
> > For each cgroup up until the root cgroup, the 'min' limit is checked
> > against the currently effectively pinned value. If the value would go
> > above 'min', the pinning attempt is rejected.
>
> Why do you want to reject pins in this case? What happens in desktop
> usecases (e.g. PRIME buffer sharing)? AFAIU, you kind of need to be able
> to pin buffers and export them to other devices for that whole thing to
> work, right? If the user doesn't explicitly set a min value, wouldn't
> the value being zero mean any pins will be rejected (and thus PRIME
> would break)?
That's really the point. If an unprivileged malicious process is
allowed to pin arbitrary amounts of memory, that's a DoS vector.
However, drivers that allow unlimited pinning today need to take care
when implementing restrictions to avoid regressions, perhaps by
adding this behind a config option.
That said, IMO dma-buf clients should implement move_notify() whenever
possible to provide an option to avoid pinning unless necessary.
/Thomas
>
> If your objective is to prevent pinned buffers from being evicted,
> perhaps you could instead make TTM try to avoid evicting pinned buffers
> and prefer unpinned buffers as long as there are unpinned buffers to
> evict? As long as the total amount of pinned memory stays below min, no
> pinned buffers should get evicted with that either.
>
> Best,
> Natalie
>
> >
> > Pinned memory is handled slightly differently and affects calculating
> > effective min/low values. Pinned memory is subtracted from both,
> > and needs to be added afterwards when calculating.
> >
> > This is because when the amount of pinned memory increases, the amount
> > of free min/low memory decreases for all cgroups that are part of the
> > hierarchy.
> >
> > Maarten Lankhorst (3):
> > page_counter: Allow for pinning some amount of memory
> > cgroup/dmem: Implement pinning device memory
> > drm/xe: Add DRM_XE_GEM_CREATE_FLAG_PINNED flag and implementation
> >
> > drivers/gpu/drm/xe/xe_bo.c | 66 +++++++++++++++++++++-
> > drivers/gpu/drm/xe/xe_dma_buf.c | 10 +++-
> > include/linux/cgroup_dmem.h | 2 +
> > include/linux/page_counter.h | 8 +++
> > include/uapi/drm/xe_drm.h | 10 +++-
> > kernel/cgroup/dmem.c | 57 ++++++++++++++++++-
> > mm/page_counter.c | 98 ++++++++++++++++++++++++++++++---
> > 7 files changed, 237 insertions(+), 14 deletions(-)
> >
>
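
For readers following the thread, the hierarchical check described under
"Implementation details" in the quoted cover letter can be sketched roughly
as below. This is only an illustration under stated assumptions, not the
code from the series: struct pin_counter, try_pin() and unpin() are
hypothetical names, the "effectively pinned" value is simplified to a plain
per-level sum, and the walk includes the topmost node, which the cover
letter does not spell out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a per-cgroup counter. This is NOT the
 * kernel's struct page_counter; field names are assumptions.
 */
struct pin_counter {
	struct pin_counter *parent; /* NULL at the top of the hierarchy */
	uint64_t min;               /* protected ('min') amount, in bytes */
	uint64_t pinned;            /* amount pinned at or below this level */
};

/*
 * Try to account nr bytes of pinned memory to cg.
 * First pass: walk towards the root and reject if any level would end
 * up with pinned > min. Second pass: commit the charge at every level.
 * Nothing is modified when the attempt is rejected.
 */
static bool try_pin(struct pin_counter *cg, uint64_t nr)
{
	struct pin_counter *c;

	for (c = cg; c; c = c->parent) {
		/* Overflow-safe form of: c->pinned + nr > c->min */
		if (nr > c->min || c->pinned > c->min - nr)
			return false;
	}

	for (c = cg; c; c = c->parent)
		c->pinned += nr;

	return true;
}

/* Undo a successful try_pin() of the same size. */
static void unpin(struct pin_counter *cg, uint64_t nr)
{
	struct pin_counter *c;

	for (c = cg; c; c = c->parent)
		c->pinned -= nr;
}
```

With this model, a cgroup whose 'min' was never set (zero) rejects every
pin, which is the behaviour asked about above for the PRIME case. A real
implementation would also need locking or atomics, which are left out here.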