lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5jnpsmjef5ibegbsbelkfmudv4wagpcfb25nptqs5z4ccitq4c@3bdtrbrrmtil>
Date: Wed, 28 May 2025 14:07:34 -0500
From: Lucas De Marchi <lucas.demarchi@...el.com>
To: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@....com>
CC: Min Ma <min.ma@....com>, Lizhi Hou <lizhi.hou@....com>, Oded Gabbay
	<ogabbay@...nel.org>, Felix Kuehling <Felix.Kuehling@....com>, Alex Deucher
	<alexander.deucher@....com>, Christian König
	<christian.koenig@....com>, David Airlie <airlied@...il.com>, Simona Vetter
	<simona@...ll.ch>, Lucas Stach <l.stach@...gutronix.de>, Russell King
	<linux+etnaviv@...linux.org.uk>, Christian Gmeiner
	<christian.gmeiner@...il.com>, Frank Binns <frank.binns@...tec.com>, "Matt
 Coster" <matt.coster@...tec.com>, Maarten Lankhorst
	<maarten.lankhorst@...ux.intel.com>, Maxime Ripard <mripard@...nel.org>,
	Thomas Zimmermann <tzimmermann@...e.de>, Qiang Yu <yuq825@...il.com>, "Rob
 Clark" <robdclark@...il.com>, Abhinav Kumar <quic_abhinavk@...cinc.com>,
	Dmitry Baryshkov <lumag@...nel.org>, Sean Paul <sean@...rly.run>, "Marijn
 Suijten" <marijn.suijten@...ainline.org>, Lyude Paul <lyude@...hat.com>,
	Danilo Krummrich <dakr@...nel.org>, Boris Brezillon
	<boris.brezillon@...labora.com>, Rob Herring <robh@...nel.org>, Steven Price
	<steven.price@....com>, Liviu Dudau <liviu.dudau@....com>, Matthew Brost
	<matthew.brost@...el.com>, Philipp Stanner <phasta@...nel.org>, Melissa Wen
	<mwen@...lia.com>, Maíra Canal <mcanal@...lia.com>, Thomas
 Hellström <thomas.hellstrom@...ux.intel.com>, Rodrigo Vivi
	<rodrigo.vivi@...el.com>, Christian König
	<ckoenig.leichtzumerken@...il.com>, <dri-devel@...ts.freedesktop.org>,
	<linux-kernel@...r.kernel.org>, <amd-gfx@...ts.freedesktop.org>,
	<etnaviv@...ts.freedesktop.org>, <lima@...ts.freedesktop.org>,
	<linux-arm-msm@...r.kernel.org>, <freedreno@...ts.freedesktop.org>,
	<nouveau@...ts.freedesktop.org>, <intel-xe@...ts.freedesktop.org>
Subject: Re: [PATCH v11 02/10] drm/sched: Store the drm client_id in
 drm_sched_fence

On Mon, May 26, 2025 at 02:54:44PM +0200, Pierre-Eric Pelloux-Prayer wrote:
> drivers/gpu/drm/xe/xe_sched_job.c                |  3 ++-
>diff --git a/drivers/gpu/drm/xe/xe_sched_job.c b/drivers/gpu/drm/xe/xe_sched_job.c
>index f0a6ce610948..5921293b25db 100644
>--- a/drivers/gpu/drm/xe/xe_sched_job.c
>+++ b/drivers/gpu/drm/xe/xe_sched_job.c
>@@ -113,7 +113,8 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
> 	kref_init(&job->refcount);
> 	xe_exec_queue_get(job->q);
>
>-	err = drm_sched_job_init(&job->drm, q->entity, 1, NULL);
>+	err = drm_sched_job_init(&job->drm, q->entity, 1, NULL,
>+				 q->xef->drm->client_id);

you can't do this here. xef is only !NULL if it's a job from userspace.
For in-kernel jobs, xef is NULL and this explodes. Right now this
completely breaks xe since one of the very first things we do is
to submit a job to save the default context. Example:
https://intel-gfx-ci.01.org/tree/intel-xe/xe-3151-56d2b14961751a677ff1f7ff8b93a6c814ce2be3/bat-bmg-1/igt@xe_module_load@load.html

	<4> [] RIP: 0010:xe_sched_job_create+0xbd/0x390 [xe]
	<4> [] Code: c1 43 18 85 c0 0f 84 6f 02 00 00 8d 50 01 09 c2 0f 88 3e 02 00 00 48 8b 03 48 8b b3 d8 00 00 00 31 c9 4c 89 ef ba 01 00 00 00 <48> 8b 40 08 4c 8b 40 60 e8 86 64 7c ff 41 89 c4 85 c0 0f 85 9b 01
	<4> [] RSP: 0018:ffffc900031972d8 EFLAGS: 00010246
	<4> [] RAX: 0000000000000000 RBX: ffff88815fc40d00 RCX: 0000000000000000
	<4> [] RDX: 0000000000000001 RSI: ffff88812e6552a8 RDI: ffff88815f939c40
	<4> [] RBP: ffffc90003197318 R08: 0000000000000000 R09: 0000000000000000
	<4> [] R10: 0000000000000000 R11: 0000000000000000 R12: ffffc90003197428
	<4> [] R13: ffff88815f939c40 R14: ffff88811f054000 R15: ffff88815fc40d00
	<4> [] FS:  00007681f2948940(0000) GS:ffff8888daf14000(0000) knlGS:0000000000000000
	<4> [] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	<4> [] CR2: 0000000000000008 CR3: 0000000118315004 CR4: 0000000000f72ef0
	<4> [] PKRU: 55555554
	<4> [] Call Trace:
	<4> []  <TASK>
	<4> []  __xe_bb_create_job+0xa2/0x240 [xe]
	<4> []  ? find_held_lock+0x31/0x90
	<4> []  ? xa_find_after+0x12c/0x250
	<4> []  xe_bb_create_job+0x6e/0x380 [xe]
	<4> []  ? xa_find_after+0x136/0x250
	<4> []  ? __drm_dev_dbg+0x7d/0xb0
	<4> []  xe_gt_record_default_lrcs+0x542/0xb00 [xe]

Can we use 0 for in-kernel client since drm_file starts them from 1?
Like this:

| diff --git a/drivers/gpu/drm/xe/xe_sched_job.c b/drivers/gpu/drm/xe/xe_sched_job.c
| index 5921293b25db3..d21bf8f269640 100644
| --- a/drivers/gpu/drm/xe/xe_sched_job.c
| +++ b/drivers/gpu/drm/xe/xe_sched_job.c
| @@ -114,7 +114,7 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
|         xe_exec_queue_get(job->q);
|  
|         err = drm_sched_job_init(&job->drm, q->entity, 1, NULL,
| -                                q->xef->drm->client_id);
| +                                q->xef ? q->xef->drm->client_id : 0);
|         if (err)
|                 goto err_free;

I tested with the above diff and it at least loads...

Also, I see this in intel-xe mailing list, but I'm not sure why we
didn't have any CI results... I will check that.

Lucas De Marchi

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ