lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <84b7dfcd-05e9-53b2-e0a0-b03105937eb3@arm.com>
Date:   Fri, 4 Sep 2020 11:53:10 +0100
From:   Robin Murphy <robin.murphy@....com>
To:     Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
Cc:     linuxarm@...wei.com, mauro.chehab@...wei.com,
        John Stultz <john.stultz@...aro.org>,
        Manivannan Sadhasivam <mani@...nel.org>,
        Krzysztof Kozlowski <krzk@...nel.org>,
        linux-kernel@...r.kernel.org, Mark Brown <broonie@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Derek Kiernan <derek.kiernan@...inx.com>,
        Arnd Bergmann <arnd@...db.de>, Vinod Koul <vkoul@...nel.org>,
        Kishon Vijay Abraham I <kishon@...com>,
        "David S. Miller" <davem@...emloft.net>,
        Rob Herring <robh@...nel.org>, Yu Chen <chenyu56@...wei.com>,
        Rob Herring <robh+dt@...nel.org>,
        Dragan Cvetic <dragan.cvetic@...inx.com>,
        linux-arm-kernel@...ts.infradead.org, devicetree@...r.kernel.org,
        Liam Girdwood <lgirdwood@...il.com>,
        Wei Xu <xuwei5@...ilicon.com>
Subject: Re: [RFC 00/10] Add USB support for Hikey 970

On 2020-09-04 11:23, Mauro Carvalho Chehab wrote:
> This RFC adds what seems to be needed for USB to work with Hikey 970.
> While this driver works fine on Kernel 4.9 and 4.19, there's a hack there,
> in the form of some special binding logic under dwg3 driver,  that seems to
>   be just adding some delay,  and doing some different initializations at
> PM (basically, disabling autosuspend).
> 
> With upstream Kernel, however, I'm getting a hard to track
> Kernel panic:
> 
> [    1.837458] SError Interrupt on CPU0, code 0xbf000002 -- SError
> [    1.837462] CPU: 0 PID: 74 Comm: kworker/0:1 Not tainted 5.8.0+ #205
> [    1.837463] Hardware name: HiKey970 (DT)
> [    1.837465] Workqueue: events deferred_probe_work_func
> [    1.837467] pstate: 20000005 (nzCv daif -PAN -UAO BTYPE=--)
> [    1.837468] pc : _raw_spin_unlock_irqrestore+0x18/0x50
> [    1.837469] lr : regmap_unlock_spinlock+0x14/0x20
> [    1.837470] sp : ffff8000124dba60
> [    1.837471] x29: ffff8000124dba60 x28: 0000000000000000
> [    1.837474] x27: ffff0001b7e854c8 x26: ffff80001204ea18
> [    1.837476] x25: 0000000000000005 x24: ffff800011f918f8
> [    1.837479] x23: ffff800011fbb588 x22: ffff0001b7e40e00
> [    1.837481] x21: 0000000000000100 x20: 0000000000000000
> [    1.837483] x19: ffff0001b767ec00 x18: 00000000ff10c000
> [    1.837485] x17: 0000000000000002 x16: 0000b0740fdb9950
> [    1.837488] x15: ffff8000116c1198 x14: ffffffffffffffff
> [    1.837490] x13: 0000000000000030 x12: 0101010101010101
> [    1.837493] x11: 0000000000000020 x10: ffff0001bf17d130
> [    1.837495] x9 : 0000000000000000 x8 : ffff0001b6938080
> [    1.837497] x7 : 0000000000000000 x6 : 000000000000003f
> [    1.837500] x5 : 0000000000000000 x4 : 0000000000000000
> [    1.837502] x3 : ffff80001096a880 x2 : 0000000000000000
> [    1.837505] x1 : ffff0001b7e40e00 x0 : 0000000100000001
> [    1.837507] Kernel panic - not syncing: Asynchronous SError Interrupt
> [    1.837509] CPU: 0 PID: 74 Comm: kworker/0:1 Not tainted 5.8.0+ #205
> [    1.837510] Hardware name: HiKey970 (DT)
> [    1.837511] Workqueue: events deferred_probe_work_func
> [    1.837513] Call trace:
> [    1.837514]  dump_backtrace+0x0/0x1e0
> [    1.837515]  show_stack+0x18/0x24
> [    1.837516]  dump_stack+0xc0/0x11c
> [    1.837517]  panic+0x15c/0x324
> [    1.837518]  nmi_panic+0x8c/0x90
> [    1.837519]  arm64_serror_panic+0x78/0x84
> [    1.837520]  do_serror+0x158/0x15c
> [    1.837521]  el1_error+0x84/0x100
> [    1.837522]  _raw_spin_unlock_irqrestore+0x18/0x50
> [    1.837523]  regmap_write+0x58/0x80
> [    1.837524]  hi3660_reset_deassert+0x28/0x34
> [    1.837526]  reset_control_deassert+0x50/0x260
> [    1.837527]  reset_control_deassert+0xf4/0x260
> [    1.837528]  dwc3_probe+0x5dc/0xe6c
> [    1.837529]  platform_drv_probe+0x54/0xb0
> [    1.837530]  really_probe+0xe0/0x490
> [    1.837531]  driver_probe_device+0xf4/0x160
> [    1.837532]  __device_attach_driver+0x8c/0x114
> [    1.837533]  bus_for_each_drv+0x78/0xcc
> [    1.837534]  __device_attach+0x108/0x1a0
> [    1.837535]  device_initial_probe+0x14/0x20
> [    1.837537]  bus_probe_device+0x98/0xa0
> [    1.837538]  deferred_probe_work_func+0x88/0xe0
> [    1.837539]  process_one_work+0x1cc/0x350
> [    1.837540]  worker_thread+0x2c0/0x470
> [    1.837541]  kthread+0x154/0x160
> [    1.837542]  ret_from_fork+0x10/0x30
> [    1.837569] SMP: stopping secondary CPUs
> [    1.837570] Kernel Offset: 0x1d0000 from 0xffff800010000000
> [    1.837571] PHYS_OFFSET: 0x0
> [    1.837572] CPU features: 0x240002,20882004
> [    1.837573] Memory Limit: none
> 
> I suspect that the driver works downstream because of the extra
> delay of probing an additional driver, as porting such driver
> to upstream sometimes makes this driver to work. So, it could
> be due to some race condition.
> 
> The problem could also be due to something wrong at DT,
> as, with the additional driver, the DT bindings are different.
> 
> As I'm not familiar about the Serror error mechanism on ARM,
> I'm wondering if someone could give me some hint about how
> can I identify what's happening here.  Due to the panic(), I
> can't check what wrong happened there. Not sure if it would
> be safe to just hack the error handling part of Serror, in order
> to let the boot to finish.

As one of my colleagues so wonderfully puts it, SError is effectively 
"part of the SoC has fallen off" - it's an asynchronous notification of 
some serious error condition external to the CPU. In this case, it seems 
quite likely from trying to access a hardware block before it's powered 
up or has finished its reset cycle, and so some CPU access is raising an 
error in the interconnect instead of completing as expected.

Robin.

> PS.: the driver/misc hisi_hikey_usb driver on this series came
> from the John Stultz Hikey 960 tree.
> 
> Thanks!
> Mauro
> 
> Mauro Carvalho Chehab (7):
>    phy: hisilicon: phy-hi3670-usb3: use a consistent namespace
>    phy: hisilicon: phy-hi3670-usb3: fix coding style
>    phy: hisilicon: phy-hi3670-usb3: change some DT properties
>    dt-bindings: phy: convert phy-kirin970-usb3.txt to yaml
>    MAINTAINERS: add myself as maintainer for Kirin 970 USB PHY
>    misc: hisi_hikey_usb: add support for Hikey 970
>    dts: hisilicon: add support for USB3 on Hikey 970
> 
> Yu Chen (3):
>    phy: hisilicon: add USB physical layer for Kirin 3670
>    phy: hisilicon: phy-hi3670-usb3: fix some issues at the init code
>    misc: hisi_hikey_usb: Driver to support onboard USB gpio hub on
>      Hikey960
> 
>   .../bindings/phy/hisilicon,hi3670-usb3.yaml   |  72 ++
>   MAINTAINERS                                   |  16 +-
>   .../boot/dts/hisilicon/hi3670-hikey970.dts    | 103 +++
>   arch/arm64/boot/dts/hisilicon/hi3670.dtsi     |  49 ++
>   drivers/misc/Kconfig                          |   9 +
>   drivers/misc/Makefile                         |   1 +
>   drivers/misc/hisi_hikey_usb.c                 | 272 +++++++
>   drivers/phy/hisilicon/Kconfig                 |  10 +
>   drivers/phy/hisilicon/Makefile                |   1 +
>   drivers/phy/hisilicon/phy-hi3670-usb3.c       | 671 ++++++++++++++++++
>   10 files changed, 1203 insertions(+), 1 deletion(-)
>   create mode 100644 Documentation/devicetree/bindings/phy/hisilicon,hi3670-usb3.yaml
>   create mode 100644 drivers/misc/hisi_hikey_usb.c
>   create mode 100644 drivers/phy/hisilicon/phy-hi3670-usb3.c
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ