Message-ID: <CAAofZF7H_sPKh=oG7C=fN-mtevZjZ038DjOL0rwDVuOib9p4Dw@mail.gmail.com>
Date: Tue, 30 Dec 2025 16:42:51 +0100
From: Marco Crivellari <marco.crivellari@...e.com>
To: Matthew Brost <matthew.brost@...el.com>
Cc: linux-kernel@...r.kernel.org, intel-xe@...ts.freedesktop.org,
dri-devel@...ts.freedesktop.org, Tejun Heo <tj@...nel.org>,
Lai Jiangshan <jiangshanlai@...il.com>, Frederic Weisbecker <frederic@...nel.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>, Michal Hocko <mhocko@...e.com>,
Lucas De Marchi <lucas.demarchi@...el.com>,
Thomas Hellstrom <thomas.hellstrom@...ux.intel.com>, Rodrigo Vivi <rodrigo.vivi@...el.com>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>
Subject: Re: [PATCH] drm/xe: Replace use of system_wq with system_percpu_wq
On Wed, Dec 24, 2025 at 7:28 PM Matthew Brost <matthew.brost@...el.com> wrote:
> [...]
> > drivers/gpu/drm/xe/xe_tlb_inval.c | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > index 918a59e686ea..b2cf6e17fbc5 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval.c
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > @@ -94,7 +94,7 @@ static void xe_tlb_inval_fence_timeout(struct work_struct *work)
> > xe_tlb_inval_fence_signal(fence);
> > }
> > if (!list_empty(&tlb_inval->pending_fences))
> > - queue_delayed_work(system_wq, &tlb_inval->fence_tdr,
> > + queue_delayed_work(system_percpu_wq, &tlb_inval->fence_tdr,
>
> Actually, neither system_wq nor system_percpu_wq works here, as this is the
> fence signaling path. We should use one of Xe's ordered work queues, which
> is properly set up to be reclaim safe.
Hi,
So for this specific workqueue we should instead use, for example:
462 /** @ordered_wq: used to serialize compute mode resume */
463 struct workqueue_struct *ordered_wq;
I noticed this has been allocated using:
490 xe->ordered_wq = alloc_ordered_workqueue("xe-ordered-wq", 0);
Using alloc_ordered_workqueue() makes this workqueue unbound:
569 #define alloc_ordered_workqueue(fmt, flags, args...) \
570 alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED | (flags), 1, ##args)
So this patch should be split in two:
- one patch switching to Xe's ordered workqueue, which "implicitly" changes
  the behavior to unbound;
- one patch replacing system_wq with the new per-cpu wq (system_percpu_wq).
To keep the latter workqueue per-cpu we can use xe->unordered_wq, which is
allocated with alloc_workqueue() without specifying flags (e.g. WQ_UNBOUND
or the new WQ_PERCPU), so it stays per-cpu.
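For illustration, a sketch of what the fence-timeout hunk could look like after
the split, queuing on the driver-private workqueue instead of a system one
(the xe pointer and surrounding context are assumptions, not taken from the
actual xe_tlb_inval.c code):

	/* Hypothetical sketch: requeue the fence timeout handler on Xe's
	 * driver-owned workqueue rather than system_wq, since the fence
	 * signaling path must run on a queue set up for memory reclaim.
	 */
	if (!list_empty(&tlb_inval->pending_fences))
		queue_delayed_work(xe->unordered_wq, &tlb_inval->fence_tdr,
				   timeout);

Whether xe->unordered_wq or a dedicated ordered/reclaim-safe queue is the
right target here is exactly the question above.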
Would this sound reasonable to you?
Thanks!
--
Marco Crivellari
L3 Support Engineer