lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACSVV00Yo=cZxUztB5Jf7153bsnnuATrh3L1utw4SrOQmYERug@mail.gmail.com>
Date: Mon, 18 Aug 2025 06:18:17 -0700
From: Rob Clark <rob.clark@....qualcomm.com>
To: Akhil P Oommen <akhilpo@....qualcomm.com>
Cc: Antonino Maniscalco <antomani103@...il.com>, Sean Paul <sean@...rly.run>,
        Konrad Dybcio <konradybcio@...nel.org>,
        Dmitry Baryshkov <lumag@...nel.org>,
        Abhinav Kumar <abhinav.kumar@...ux.dev>,
        Jessica Zhang <jessica.zhang@....qualcomm.com>,
        Marijn Suijten <marijn.suijten@...ainline.org>,
        David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
        linux-arm-msm@...r.kernel.org, dri-devel@...ts.freedesktop.org,
        freedreno@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] drm/msm: skip re-emitting IBs for unusable VMs

On Mon, Aug 18, 2025 at 5:10 AM Akhil P Oommen <akhilpo@....qualcomm.com> wrote:
>
> On 8/13/2025 6:34 PM, Antonino Maniscalco wrote:
> > When a VM is marked as an usuable we disallow new submissions from it,
> > however submissions that where already scheduled on the ring would still
> > be re-sent.
> >
> > Since this can lead to further hangs, avoid emitting the actual IBs.
> >
> > Fixes: 6a4d287a1ae6 ("drm/msm: Mark VM as unusable on GPU hangs")
> > Signed-off-by: Antonino Maniscalco <antomani103@...il.com>
> > ---
> >  drivers/gpu/drm/msm/msm_gpu.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> > index c317b25a8162edba0d594f61427eac4440871b73..e6cd85c810bd2314c8bba53644a622464713b7f2 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu.c
> > +++ b/drivers/gpu/drm/msm/msm_gpu.c
> > @@ -553,8 +553,15 @@ static void recover_worker(struct kthread_work *work)
> >                       unsigned long flags;
> >
> >                       spin_lock_irqsave(&ring->submit_lock, flags);
> > -                     list_for_each_entry(submit, &ring->submits, node)
> > +                     list_for_each_entry(submit, &ring->submits, node) {
> > +                             /*
> > +                              * If the submit uses an unusable vm make sure
> > +                              * we don't actually run it
> > +                              */
> > +                             if (to_msm_vm(submit->vm)->unusable)
> > +                                     submit->nr_cmds = 0;
>
> Just curious, why not just retire this submit here?

Because then you'd end up with submits retiring out of order (ie.
fences on the same timeline signaling out of order)

BR,
-R

> -Akhil
>
> >                               gpu->funcs->submit(gpu, submit);
> > +                     }
> >                       spin_unlock_irqrestore(&ring->submit_lock, flags);
> >               }
> >       }
> >
> > ---
> > base-commit: 8290d37ad2b087bbcfe65fa5bcaf260e184b250a
> > change-id: 20250813-unusable_fix_b4-10bde6f3b756
> >
> > Best regards,
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ