Message-ID: <20af7963-1d5a-d274-a46e-ca9a287d745a@collabora.com>
Date:   Mon, 11 May 2020 07:43:47 +0200
From:   Tomeu Vizoso <tomeu.vizoso@...labora.com>
To:     Clément Péron <peron.clem@...il.com>,
        Rob Herring <robh@...nel.org>,
        Steven Price <steven.price@....com>,
        Alyssa Rosenzweig <alyssa.rosenzweig@...labora.com>,
        Viresh Kumar <vireshk@...nel.org>, Nishanth Menon <nm@...com>,
        Stephen Boyd <sboyd@...nel.org>,
        Maxime Ripard <mripard@...nel.org>,
        Chen-Yu Tsai <wens@...e.org>
Cc:     dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 00/15][RFC] Add regulator devfreq support to Panfrost

On 5/10/20 6:55 PM, Clément Péron wrote:
> Hi,
> 
> This series cleans up Panfrost devfreq and adds regulator support to it.
> It is mostly based on the review comments for the freshly introduced
> lima devfreq.
> 
> We need to add regulator support because on Allwinner the GPU OPP
> table defines both frequencies and voltages.
> 
> The first patches [01-08] should not change the actual behavior;
> they introduce a proper panfrost_devfreq struct.
> 
> The patches after that are WIP and add regulator support.
> 
> However, I ran into several issues. First, we need to avoid getting the
> regulator in the driver if devfreq gets the regulator by itself, but as
> of today the OPP framework only gets the regulator and doesn't enable it...
> A HACK for now is to add regulator-always-on in the device tree.
> 
> Then, when I enable devfreq, I get several faults like the ones quoted below.
> I'm a total newbie on GPU sched/faults and couldn't be of much help with this.

Do you know at which frequencies the faults happen? From what I can 
see, it's just the GPU behaving erratically, and the CPU reading random 
values from the GPU registers. Given the subject of this series, I guess 
the GPU isn't getting enough power.

There could be a problem with the OPP table; it might be a good idea to 
see which levels are problematic and try a more conservative table.
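
One way to find the problematic levels would be to match the timestamps of
the fault lines in dmesg against a log of the GPU frequency over time. The
following is only a rough sketch under assumptions: the frequency samples are
assumed to be collected separately as (seconds since boot, Hz) pairs, for
example by polling the devfreq cur_freq sysfs attribute, and the helper names
fault_times() and freq_at() are made up for illustration.

```python
import bisect
import re

# Matches the kernel timestamp and the kind of panfrost message seen in the
# log quoted in this thread (GPU fault, scheduler timeout, job slot fault,
# page fault). Other message kinds are ignored.
FAULT_RE = re.compile(
    r"\[\s*(\d+\.\d+)\] panfrost \S+ "
    r"(GPU Fault|gpu sched timeout|js fault|Unhandled Page fault)"
)


def fault_times(dmesg_text):
    """Return a list of (timestamp_seconds, message_kind) for panfrost faults."""
    return [(float(m.group(1)), m.group(2)) for m in FAULT_RE.finditer(dmesg_text)]


def freq_at(samples, t):
    """Return the frequency of the latest sample taken at or before time t.

    samples must be sorted by time; returns None if t precedes every sample.
    """
    times = [s[0] for s in samples]
    i = bisect.bisect_right(times, t) - 1
    return samples[i][1] if i >= 0 else None


def faults_by_freq(dmesg_text, samples):
    """Count faults per active frequency (hypothetical helper)."""
    counts = {}
    for t, _kind in fault_times(dmesg_text):
        f = freq_at(samples, t)
        counts[f] = counts.get(f, 0) + 1
    return counts
```

If most faults cluster at one or two OPPs, those would be the first entries to
drop or to give more voltage in a more conservative table.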

Besides that, there could be a problem with clock frequency changes or 
voltage changes. It may take some time for the final state to be stable, 
depending on how the regulation happens.

Thanks,

Tomeu




> I got this running glmark2 on T720 (Allwinner H6) with Mesa 20.0.5.
> # glmark2-es2-drm
> =======================================================
>      glmark2 2017.07
> =======================================================
>      OpenGL Information
>      GL_VENDOR:     Panfrost
>      GL_RENDERER:   Mali T720 (Panfrost)
>      GL_VERSION:    OpenGL ES 2.0 Mesa 20.0.5
> =======================================================
> 
> [   93.550063] panfrost 1800000.gpu: GPU Fault 0x00000088 (UNKNOWN) at 0x0000000080117100
> [   94.045401] panfrost 1800000.gpu: gpu sched timeout, js=0, config=0x3700, status=0x8, head=0x21d6c00, tail=0x21d6c00, sched_job=00000000e3c2132f
> 
> [  328.871070] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA 0x0000000000000000
> [  328.871070] Reason: TODO
> [  328.871070] raw fault status: 0xAA0003C2
> [  328.871070] decoded fault status: SLAVE FAULT
> [  328.871070] exception type 0xC2: TRANSLATION_FAULT_LEVEL2
> [  328.871070] access type 0x3: WRITE
> [  328.871070] source id 0xAA00
> [  329.373327] panfrost 1800000.gpu: gpu sched timeout, js=1, config=0x3700, status=0x8, head=0xa1a4900, tail=0xa1a4900, sched_job=000000007ac31097
> [  329.386527] panfrost 1800000.gpu: js fault, js=0, status=DATA_INVALID_FAULT, head=0xa1a4c00, tail=0xa1a4c00
> [  329.396293] panfrost 1800000.gpu: gpu sched timeout, js=0, config=0x3700, status=0x58, head=0xa1a4c00, tail=0xa1a4c00, sched_job=0000000004c90381
> [  329.411521] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA 0x0000000000000000
> [  329.411521] Reason: TODO
> [  329.411521] raw fault status: 0xAA0003C2
> [  329.411521] decoded fault status: SLAVE FAULT
> [  329.411521] exception type 0xC2: TRANSLATION_FAULT_LEVEL2
> [  329.411521] access type 0x3: WRITE
> [  329.411521] source id 0xAA00
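
For reference, the decoded fields in the page fault messages above line up
with the raw fault status as follows. This is a minimal sketch, assuming the
bit layout implied by the log lines themselves (exception type in bits 7:0,
access type in bits 9:8, source id in the upper 16 bits);
decode_fault_status() is an illustrative name, not a driver function.

```python
def decode_fault_status(raw):
    """Split a raw MMU fault status word into the fields printed in the log.

    The bit positions are an assumption inferred from the quoted log lines.
    """
    return {
        "exception_type": raw & 0xFF,    # 0xC2 is printed as TRANSLATION_FAULT_LEVEL2
        "access_type": (raw >> 8) & 0x3,  # 0x3 is printed as WRITE
        "source_id": raw >> 16,
    }


fields = decode_fault_status(0xAA0003C2)
print({name: hex(value) for name, value in fields.items()})
```

Applied to 0xAA0003C2, this gives exception type 0xC2, access type 0x3 and
source id 0xAA00, matching the values in the log.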
> 
> Thanks for your reviews and help on this series,
> Clement
> 
> Clément Péron (15):
>    drm/panfrost: avoid static declaration
>    drm/panfrost: clean headers in devfreq
>    drm/panfrost: don't use pfdevfreq.busy_count to know if hw is idle
>    drm/panfrost: introduce panfrost_devfreq struct
>    drm/panfrost: use spinlock instead of atomic
>    drm/panfrost: properly handle error in probe
>    drm/panfrost: use device_property_present to check for OPP
>    drm/panfrost: move devfreq_init()/fini() in device
>    drm/panfrost: dynamically alloc regulators
>    drm/panfrost: add regulators to devfreq
>    drm/panfrost: set devfreq clock name
>    arm64: defconfig: Enable devfreq cooling device
>    arm64: dts: allwinner: h6: Add cooling map for GPU
>    [DO NOT MERGE] arm64: dts: allwinner: h6: Add GPU OPP table
>    [DO NOT MERGE] arm64: dts: allwinner: force GPU regulator to be always
> 
>   .../dts/allwinner/sun50i-h6-beelink-gs1.dts   |   1 +
>   arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi  | 102 ++++++++++
>   arch/arm64/configs/defconfig                  |   1 +
>   drivers/gpu/drm/panfrost/panfrost_devfreq.c   | 190 ++++++++++++------
>   drivers/gpu/drm/panfrost/panfrost_devfreq.h   |  32 ++-
>   drivers/gpu/drm/panfrost/panfrost_device.c    |  56 ++++--
>   drivers/gpu/drm/panfrost/panfrost_device.h    |  14 +-
>   drivers/gpu/drm/panfrost/panfrost_drv.c       |  15 +-
>   drivers/gpu/drm/panfrost/panfrost_job.c       |  10 +-
>   9 files changed, 310 insertions(+), 111 deletions(-)
> 
