Date:   Mon, 24 Aug 2020 13:04:13 +0200
From:   Lucas Stach <l.stach@...gutronix.de>
To:     Russell King - ARM Linux admin <linux@...linux.org.uk>,
        Christian Gmeiner <christian.gmeiner@...il.com>
Cc:     "Ing. Josua Mayer" <josua.mayer@....eu>,
        LKML <linux-kernel@...r.kernel.org>, stable@...r.kernel.org,
        David Airlie <airlied@...ux.ie>,
        Daniel Vetter <daniel@...ll.ch>,
        The etnaviv authors <etnaviv@...ts.freedesktop.org>,
        DRI mailing list <dri-devel@...ts.freedesktop.org>
Subject: Re: [PATCH] drm/etnaviv: fix external abort seen on GC600 rev 0x19

Hi Russell,

On Sunday, 23.08.2020 at 20:19 +0100, Russell King - ARM Linux admin wrote:
> On Sun, Aug 23, 2020 at 09:10:25PM +0200, Christian Gmeiner wrote:
> > Hi
> > 
> > > I have formally tested the patch with 5.7.10 - and it doesn't resolve
> > > the issue - sadly :(
> > > 
> > > From my testing, the reads on
> > > VIVS_HI_CHIP_PRODUCT_ID
> > > VIVS_HI_CHIP_ECO_ID
> > > need to be conditional - while
> > > VIVS_HI_CHIP_CUSTOMER_ID
> > > seems to be okay.
> > > 
> > 
> > Uhh.. okay.. just send a V2 - thanks for testing :)
> 
> There is also something else going on with the GC600 - 5.4 worked fine,
> 5.8 doesn't - my 2D Xorg driver gets stuck waiting on a BO after just
> a couple of minutes.  Looking in debugfs, there's a whole load of BOs
> that are listed as "active", yet the GPU is idle:
> 
>    00020000: A  0 ( 7) 00000000 00000000 8294400
>    00010000: I  0 ( 1) 00000000 00000000 4096
>    00010000: I  0 ( 1) 00000000 00000000 4096
>    00010000: I  0 ( 1) 00000000 00000000 327680
>    00010000: A  0 ( 7) 00000000 00000000 8388608
>    00010000: I  0 ( 1) 00000000 00000000 8388608
>    00010000: I  0 ( 1) 00000000 00000000 8388608
>    00010000: A  0 ( 7) 00000000 00000000 8388608
>    00010000: A  0 ( 3) 00000000 00000000 8388608
>    00010000: A  0 ( 4) 00000000 00000000 8388608
>    00010000: A  0 ( 3) 00000000 00000000 8388608
>    00010000: A  0 ( 3) 00000000 00000000 8388608
>    00010000: A  0 ( 3) 00000000 00000000 8388608
> ....
>    00010000: A  0 ( 3) 00000000 00000000 8388608
> Total 38 objects, 293842944 bytes
> 
> My guess is there's something up with the way a job completes that's
> causing the BOs not to be marked inactive.  I haven't yet been able
> to debug any further.

The patch I just sent out should fix this issue. The DRM scheduler is
doing some funny business which breaks our job-done signalling if the
GPU timeout has been hit, even if our timeout handler is just extending
the timeout because the GPU is still working normally.
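
To check whether the symptom is gone after applying the patch, it can help
to count how many BOs in the debugfs listing quoted above are still marked
"A" while the GPU is idle. The following is only a rough sketch in Python.
The column layout (flags, A/I state, name, refcount, offset, vaddr, size) is
an assumption read off the quoted output, and summarize_bos is a made-up
helper name, not part of any existing tool.

```python
import re
from collections import Counter

# Assumed layout of one listing line, inferred from the quoted output:
#   <flags>: <A|I> <name> (<refcount>) <offset> <vaddr> <size>
LINE_RE = re.compile(
    r"^\s*(?P<flags>[0-9a-fA-F]{8}):\s+(?P<state>[AI])\s+(?P<name>\d+)\s+"
    r"\(\s*(?P<refs>\d+)\)\s+(?P<offset>\S+)\s+(?P<vaddr>\S+)\s+(?P<size>\d+)\s*$"
)


def summarize_bos(text):
    """Return (object count, byte total) per state, keyed by 'A' or 'I'."""
    counts = Counter()
    sizes = Counter()
    for line in text.splitlines():
        # Tolerate mail quoting ("> ") in front of the listing lines.
        match = LINE_RE.match(line.lstrip("> "))
        if match is None:
            continue  # skips the "Total ..." summary and elided "...." lines
        counts[match["state"]] += 1
        sizes[match["state"]] += int(match["size"])
    return counts, sizes


# Three lines copied from the quoted listing, plus its summary line.
SAMPLE = """\
>    00020000: A  0 ( 7) 00000000 00000000 8294400
>    00010000: I  0 ( 1) 00000000 00000000 4096
>    00010000: A  0 ( 3) 00000000 00000000 8388608
> Total 38 objects, 293842944 bytes
"""

if __name__ == "__main__":
    counts, sizes = summarize_bos(SAMPLE)
    for state in sorted(counts):
        print(state, counts[state], "objects,", sizes[state], "bytes")
```

If the "A" count keeps growing while the GPU sits idle, the BOs are most
likely still not being retired.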

Regards,
Lucas
