Message-ID: <acb81a4e86f4f683c4f83509afdc5f24ea01e64d.camel@mailbox.org>
Date: Wed, 09 Jul 2025 12:14:54 +0200
From: Philipp Stanner <phasta@...lbox.org>
To: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>, Philipp Stanner
 <phasta@...nel.org>, Min Ma <min.ma@....com>, Lizhi Hou
 <lizhi.hou@....com>,  Oded Gabbay <ogabbay@...nel.org>, Alex Deucher
 <alexander.deucher@....com>, Christian König
 <christian.koenig@....com>, Xinhui Pan <Xinhui.Pan@....com>, David Airlie
 <airlied@...il.com>, Simona Vetter <simona@...ll.ch>, Lucas Stach
 <l.stach@...gutronix.de>, Russell King <linux+etnaviv@...linux.org.uk>, 
 Christian Gmeiner <christian.gmeiner@...il.com>, Frank Binns
 <frank.binns@...tec.com>, Matt Coster <matt.coster@...tec.com>, Maarten
 Lankhorst <maarten.lankhorst@...ux.intel.com>,  Maxime Ripard
 <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>, Qiang Yu
 <yuq825@...il.com>,  Rob Clark <robdclark@...il.com>, Sean Paul
 <sean@...rly.run>, Konrad Dybcio <konradybcio@...nel.org>,  Abhinav Kumar
 <quic_abhinavk@...cinc.com>, Dmitry Baryshkov
 <dmitry.baryshkov@...aro.org>, Marijn Suijten
 <marijn.suijten@...ainline.org>, Karol Herbst <kherbst@...hat.com>, Lyude
 Paul <lyude@...hat.com>, Danilo Krummrich <dakr@...hat.com>, Boris
 Brezillon <boris.brezillon@...labora.com>, Rob Herring <robh@...nel.org>,
 Steven Price <steven.price@....com>, Liviu Dudau <liviu.dudau@....com>,
 Matthew Brost <matthew.brost@...el.com>, Melissa Wen <mwen@...lia.com>, 
 Maíra Canal <mcanal@...lia.com>, Lucas De Marchi
 <lucas.demarchi@...el.com>, Thomas Hellström
 <thomas.hellstrom@...ux.intel.com>, Rodrigo Vivi <rodrigo.vivi@...el.com>,
 Sunil Khatri <sunil.khatri@....com>,  Lijo Lazar <lijo.lazar@....com>,
 Hawking Zhang <Hawking.Zhang@....com>, Mario Limonciello
 <mario.limonciello@....com>, Ma Jun <Jun.Ma2@....com>, Yunxiang Li
 <Yunxiang.Li@....com>
Cc: dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org, 
	amd-gfx@...ts.freedesktop.org, etnaviv@...ts.freedesktop.org, 
	lima@...ts.freedesktop.org, linux-arm-msm@...r.kernel.org, 
	freedreno@...ts.freedesktop.org, nouveau@...ts.freedesktop.org, 
	intel-xe@...ts.freedesktop.org, Christian Gmeiner <cgmeiner@...lia.com>
Subject: Re: [PATCH v4] drm/sched: Use struct for drm_sched_init() params

On Tue, 2025-07-08 at 14:02 +0100, Tvrtko Ursulin wrote:
> 
> 
> On 11/02/2025 11:14, Philipp Stanner wrote:
> > drm_sched_init() has a great many parameters and upcoming new
> > functionality for the scheduler might add even more. Generally, the
> > great number of parameters reduces readability and has already caused
> > one misnaming, addressed in:
> > 
> > commit 6f1cacf4eba7 ("drm/nouveau: Improve variable name in
> > nouveau_sched_init()").
> > 
> > Introduce a new struct for the scheduler init parameters and port all
> > users.
> > 
> > Signed-off-by: Philipp Stanner <phasta@...nel.org>
> > Reviewed-by: Liviu Dudau <liviu.dudau@....com>
> > Acked-by: Matthew Brost <matthew.brost@...el.com> # for Xe
> > Reviewed-by: Boris Brezillon <boris.brezillon@...labora.com> # for
> > Panfrost and Panthor
> > Reviewed-by: Christian Gmeiner <cgmeiner@...lia.com> # for Etnaviv
> > Reviewed-by: Frank Binns <frank.binns@...tec.com> # for Imagination
> > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@...lia.com> # for Sched
> > Reviewed-by: Maíra Canal <mcanal@...lia.com> # for v3d
> > ---
> > Changes in v4:
> >    - Add forgotten driver accel/amdxdna. (Me)
> >    - Rephrase the "init to NULL" comments. (Tvrtko)
> >    - Apply RBs by Tvrtko and Maira.
> >    - Terminate the last struct members with a comma, so that future
> >      fields can be added with a minimal patch diff. (Me)
> > 
> > Changes in v3:
> >    - Various formatting requirements.
> > 
> > Changes in v2:
> >    - Point out that the hang-limit is deprecated. (Christian)
> >    - Initialize the structs to 0 at declaration. (Planet Earth)
> >    - Don't set stuff explicitly to 0 / NULL. (Tvrtko)
> >    - Make the structs const where possible. (Boris)
> >    - v3d: Use just 1, universal, function for sched-init. (Maíra)
> > ---
> 
> 8><
> 
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c
> > b/drivers/gpu/drm/panfrost/panfrost_job.c
> > index 9b8e82fb8bc4..5657106c2f7d 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > @@ -836,8 +836,16 @@ static irqreturn_t
> > panfrost_job_irq_handler(int irq, void *data)
> >   
> >   int panfrost_job_init(struct panfrost_device *pfdev)
> >   {
> > +	struct drm_sched_init_args args = {
> > +		.ops = &panfrost_sched_ops,
> > +		.num_rqs = DRM_SCHED_PRIORITY_COUNT,
> > +		.credit_limit = 2,
> > +		.timeout = msecs_to_jiffies(JOB_TIMEOUT_MS),
> > +		.timeout_wq = pfdev->reset.wq,
> 
> ^^^
> 
> > +		.name = "pan_js",
> > +		.dev = pfdev->dev,
> > +	};
> >   	struct panfrost_job_slot *js;
> > -	unsigned int nentries = 2;
> >   	int ret, j;
> >   
> >   	/* All GPUs have two entries per queue, but without
> > jobchain
> > @@ -845,7 +853,7 @@ int panfrost_job_init(struct panfrost_device
> > *pfdev)
> >   	 * so let's just advertise one entry in that case.
> >   	 */
> >   	if (!panfrost_has_hw_feature(pfdev,
> > HW_FEATURE_JOBCHAIN_DISAMBIGUATION))
> > -		nentries = 1;
> > +		args.credit_limit = 1;
> >   
> >   	pfdev->js = js = devm_kzalloc(pfdev->dev, sizeof(*js),
> > GFP_KERNEL);
> >   	if (!js)
> 
> Stumbled on this while looking at drm_sched_init() workqueue usage.
> 
> I think this patch might need a fixup, because somewhere around here
> in the code there is this:
> 
> 	pfdev->reset.wq = alloc_ordered_workqueue("panfrost-reset", 0);
> 	if (!pfdev->reset.wq)
> 		return -ENOMEM;
> 
> Which means that after the patch panfrost is using system_wq for the
> timeout handler instead of the one it creates.

Ouch, yes, that's definitely a very subtle bug. AFAICS it comes about
because pfdev is zero-initialized, so pfdev->reset.wq is still NULL at
the point where args is initialized at its declaration.
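
To illustrate the ordering problem, here is a minimal user-space sketch.
It is not the driver code: every struct and function name below is a
hypothetical stand-in, and the NULL-to-system_wq fallback is only modeled
on the behaviour Tvrtko describes above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
struct workqueue { const char *name; };

static struct workqueue system_wq = { "system_wq" };
static struct workqueue reset_wq  = { "panfrost-reset" };

struct device_model { struct workqueue *reset_wq; };
struct init_args    { struct workqueue *timeout_wq; };

/* Assumed fallback: a NULL timeout_wq selects the system workqueue. */
static struct workqueue *pick_timeout_wq(const struct init_args *args)
{
	return args->timeout_wq ? args->timeout_wq : &system_wq;
}

/* Buggy order: args copies dev->reset_wq while it is still NULL. */
static struct workqueue *init_buggy(struct device_model *dev)
{
	struct init_args args = { .timeout_wq = dev->reset_wq };

	dev->reset_wq = &reset_wq;	/* allocated too late for args */
	return pick_timeout_wq(&args);
}

/* One possible order: fill in timeout_wq only after the allocation. */
static struct workqueue *init_fixed(struct device_model *dev)
{
	struct init_args args = { 0 };

	dev->reset_wq = &reset_wq;
	args.timeout_wq = dev->reset_wq;
	return pick_timeout_wq(&args);
}
```

With a zero-initialized device_model, init_buggy() ends up with
system_wq, while init_fixed() ends up with the dedicated reset
workqueue. Assigning the field after the allocation is just one way to
restore the old behaviour; the actual fix may look different.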

Let me provide a fix.

P.

> 
> > @@ -875,13 +883,7 @@ int panfrost_job_init(struct panfrost_device
> > *pfdev)
> >   	for (j = 0; j < NUM_JOB_SLOTS; j++) {
> >   		js->queue[j].fence_context =
> > dma_fence_context_alloc(1);
> >   
> > -		ret = drm_sched_init(&js->queue[j].sched,
> > -				     &panfrost_sched_ops, NULL,
> > -				     DRM_SCHED_PRIORITY_COUNT,
> > -				     nentries, 0,
> > -				    
> > msecs_to_jiffies(JOB_TIMEOUT_MS),
> > -				     pfdev->reset.wq,
> > -				     NULL, "pan_js", pfdev->dev);
> > +		ret = drm_sched_init(&js->queue[j].sched, &args);
> 
> ^^^
> 
> >   		if (ret) {
> >   			dev_err(pfdev->dev, "Failed to create
> > scheduler: %d.", ret);
> >   			goto err_sched;
> 
> Regards,
> 
> Tvrtko
> 

