Message-ID: <CAKR-sGfL5_VU9uxJHGyZ-bj2P_7R6+OOfWs6Yf-ihcCF8bD2MA@mail.gmail.com>
Date:   Sun, 12 Mar 2023 19:50:20 +0100
From:   Álvaro Fernández Rojas <noltari@...il.com>
To:     Florian Fainelli <f.fainelli@...il.com>
Cc:     William Zhang <william.zhang@...adcom.com>,
        Jonas Gorski <jonas.gorski@...il.com>,
        bcm-kernel-feedback-list@...adcom.com, tsbogend@...ha.franken.de,
        linux-mips@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all()

Hi Florian,

I tried what you suggested, but it still panics on EHCI:

[    0.000000] Linux version 5.15.98 (noltari@...antis)
(mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.2.0 r22187+1-19817fa3f5)
12.2.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Sun Mar 12 18:23:28 2023
[    0.000000] bmips_cpu_setup: read_c0_brcm_config_0() = 0xe30e1006
[    0.000000] bmips_cpu_setup: cbr + BMIPS_RAC_CONFIG = 0x3c1b8041
[    0.000000] CPU0 revision is: 0002a010 (Broadcom BMIPS4350)

Bit 29 of read_c0_brcm_config_0() is set, so the RAC should be present.
In BMIPS_RAC_CONFIG, RAC_I (bit 0) is set, but RAC_D (bit 1) is not...

BTW, this is what I added to bmips_cpu_setup:

case CPU_BMIPS4350:
	cfg = read_c0_brcm_config_0();
	pr_info("bmips_cpu_setup: read_c0_brcm_config_0() = 0x%x\n", cfg);

	cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG);
	pr_info("bmips_cpu_setup: cbr + BMIPS_RAC_CONFIG = 0x%x\n", cfg);
	__raw_writel(cfg | BIT(0) | BIT(1), cbr + BMIPS_RAC_CONFIG);
	__raw_readl(cbr + BMIPS_RAC_CONFIG);
	break;
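
As a cross-check of the two values logged above, here is a minimal
standalone sketch in plain C (userspace, not kernel code). The bit
positions are the ones named in this thread (bit 29 = RAC present,
bit 0 = RAC_I, bit 1 = RAC_D); the macro and function names are
made up for illustration and are not kernel identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as described in this thread; names are illustrative only. */
#define CFG0_RAC_PRESENT (UINT32_C(1) << 29)
#define RAC_CFG_I        (UINT32_C(1) << 0)
#define RAC_CFG_D        (UINT32_C(1) << 1)

/* True if the config_0 value reports a RAC. */
static bool rac_present(uint32_t config0)
{
	return (config0 & CFG0_RAC_PRESENT) != 0;
}

/* True if the instruction side of the RAC is enabled. */
static bool rac_i_enabled(uint32_t rac_cfg)
{
	return (rac_cfg & RAC_CFG_I) != 0;
}

/* True if the data side of the RAC is enabled. */
static bool rac_d_enabled(uint32_t rac_cfg)
{
	return (rac_cfg & RAC_CFG_D) != 0;
}

/* Value that the cfg | BIT(0) | BIT(1) write would store. */
static uint32_t rac_enable_both(uint32_t rac_cfg)
{
	return rac_cfg | RAC_CFG_I | RAC_CFG_D;
}
```

With the logged values, rac_present(0xe30e1006) and
rac_i_enabled(0x3c1b8041) are true, rac_d_enabled(0x3c1b8041) is false,
and rac_enable_both(0x3c1b8041) gives 0x3c1b8043.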

Best regards,
Álvaro.


On Sun, 12 Mar 2023 at 17:16, Florian Fainelli
(<f.fainelli@...il.com>) wrote:
>
>
>
> On 3/11/2023 11:31 PM, William Zhang wrote:
> >
> >
> > On 03/11/2023 11:44 AM, Jonas Gorski wrote:
> >> On Sat, 11 Mar 2023 at 18:32, Florian Fainelli <f.fainelli@...il.com>
> >> wrote:
> >>>
> >>>
> >>>
> >>> On 3/10/2023 4:13 AM, Álvaro Fernández Rojas wrote:
> >>>> arch_sync_dma_for_cpu_all() causes kernel panics on BCM6358 with
> >>>> EHCI/OHCI:
> >>>> [    3.881739] usb 1-1: new high-speed USB device number 2 using
> >>>> ehci-platform
> >>>> [    3.895011] Reserved instruction in kernel code[#1]:
> >>>> [    3.900113] CPU: 0 PID: 1 Comm: init Not tainted 5.10.16 #0
> >>>> [    3.905829] $ 0   : 00000000 10008700 00000000 77d94060
> >>>> [    3.911238] $ 4   : 7fd1f088 00000000 81431cac 81431ca0
> >>>> [    3.916641] $ 8   : 00000000 ffffefff 8075cd34 00000000
> >>>> [    3.922043] $12   : 806f8d40 f3e812b7 00000000 000d9aaa
> >>>> [    3.927446] $16   : 7fd1f068 7fd1f080 7ff559b8 81428470
> >>>> [    3.932848] $20   : 00000000 00000000 55590000 77d70000
> >>>> [    3.938251] $24   : 00000018 00000010
> >>>> [    3.943655] $28   : 81430000 81431e60 81431f28 800157fc
> >>>> [    3.949058] Hi    : 00000000
> >>>> [    3.952013] Lo    : 00000000
> >>>> [    3.955019] epc   : 80015808 setup_sigcontext+0x54/0x24c
> >>>> [    3.960464] ra    : 800157fc setup_sigcontext+0x48/0x24c
> >>>> [    3.965913] Status: 10008703       KERNEL EXL IE
> >>>> [    3.970216] Cause : 00800028 (ExcCode 0a)
> >>>> [    3.974340] PrId  : 0002a010 (Broadcom BMIPS4350)
> >>>> [    3.979170] Modules linked in: ohci_platform ohci_hcd
> >>>> fsl_mph_dr_of ehci_platform ehci_fsl ehci_hcd gpio_button_hotplug
> >>>> usbcore nls_base usb_common
> >>>> [    3.992907] Process init (pid: 1, threadinfo=(ptrval),
> >>>> task=(ptrval), tls=77e22ec8)
> >>>> [    4.000776] Stack : 81431ef4 7fd1f080 81431f28 81428470 7fd1f068
> >>>> 81431edc 7ff559b8 81428470
> >>>> [    4.009467]         81431f28 7fd1f080 55590000 77d70000 77d5498c
> >>>> 80015c70 806f0000 8063ae74
> >>>> [    4.018149]         08100002 81431f28 0000000a 08100002 81431f28
> >>>> 0000000a 77d6b418 00000003
> >>>> [    4.026831]         ffffffff 80016414 80080734 81431ecc 81431ecc
> >>>> 00000001 00000000 04000000
> >>>> [    4.035512]         77d54874 00000000 00000000 00000000 00000000
> >>>> 00000012 00000002 00000000
> >>>> [    4.044196]         ...
> >>>> [    4.046706] Call Trace:
> >>>> [    4.049238] [<80015808>] setup_sigcontext+0x54/0x24c
> >>>> [    4.054356] [<80015c70>] setup_frame+0xdc/0x124
> >>>> [    4.059015] [<80016414>] do_notify_resume+0x1dc/0x288
> >>>> [    4.064207] [<80011b50>] work_notifysig+0x10/0x18
> >>>> [    4.069036]
> >>>> [    4.070538] Code: 8fc300b4  00001025  26240008 <ac820000>
> >>>> ac830004  3c048063  0c0228aa  24846a00  26240010
> >>>> [    4.080686]
> >>>> [    4.082517] ---[ end trace 22a8edb41f5f983b ]---
> >>>> [    4.087374] Kernel panic - not syncing: Fatal exception
> >>>> [    4.092753] Rebooting in 1 seconds..
> >>>
> >>> Did you pinpoint which specific instruction within
> >>> arch_sync_dma_for_cpu_all() is causing the reserved instruction
> >>> exception?
> >>
> >> It's setup_sigcontext(), not arch_sync_dma_for_cpu_all() that's
> >> causing the exception ;-)
> >>
> >> Hand decoding the Code gives me
> >>
> >> lw $v1, 0xb4($fp)
> >> or $v0, $zero, $zero
> >> addiu $a0, $s1, 8
> >> sw $v0, 0($a0) <- the code in brackets, so I guess EPC?
> >> sw $v1, 4($a0)
> >>
> >> which I assume is this part:
> >>
> >> err |= __put_user(regs->cp0_epc, &sc->sc_pc);
> >>
> >> (0xb4 is the offset of cp0_epc, 0x8 the offset of sc_pc)
> >>
> >> One thing I see is that we do the RAC flush for BMIPS3300, 4350 and
> >> 4380, but only initialize it for 3300 [1], but leave it at whatever
> >> state the bootloader did for the other ones. Maybe it has some invalid
> >> config in (that particular?) 6358 that triggers issues later on after a
> >> flush? E.g. the flush puts it in an error state, and the next time
> >> something triggers a prefetch(write?) (by trying to access userspace)
> >> it generates an error exception.
> >>
> > That depends on the bootloader, but most likely the bootloader does not
> > use the RAC at all. So I agree that the RAC may not be properly
> > initialized when the flush function is called; the flush could then push
> > stale data out, corrupt memory, and cause problems later in userspace.
>
> Alvaro, could you do the following and let us know the results, at boot
> time in bcm6358_quirks():
>
> - issue a read_c0_brcm_config_0() and look whether bit 29 (RAC present)
> is set, if it is not set, then we should forcibly disable the use of the
> RAC using a flag
>
> - issue a __raw_readl(cbr + BMIPS_RAC_CONFIG) and check whether bits 0
> (RAC_I) or 1 (RAC_D) are set, if not, try to set them and see whether it
> works
>
> Thanks!
> --
> Florian
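
The hand decode of the "Code:" line quoted above can be reproduced
mechanically. The following is a minimal sketch in plain C that splits a
MIPS32 I-type instruction word (the format used by lw, sw and addiu) into
its fields. The struct and function names are hypothetical and exist only
for this illustration; R-type words such as 0x00001025 (or) are not
handled.

```c
#include <stdint.h>

/* Fields of a MIPS32 I-type instruction word. */
struct mips_itype {
	unsigned int opcode; /* bits 31..26 */
	unsigned int rs;     /* bits 25..21: base register for lw/sw */
	unsigned int rt;     /* bits 20..16: register loaded or stored */
	int16_t imm;         /* bits 15..0: signed offset or immediate */
};

/* Split one 32-bit instruction word into its I-type fields. */
static struct mips_itype mips_decode_itype(uint32_t insn)
{
	struct mips_itype d;

	d.opcode = insn >> 26;
	d.rs = (insn >> 21) & 0x1f;
	d.rt = (insn >> 16) & 0x1f;
	d.imm = (int16_t)(insn & 0xffff);
	return d;
}
```

For the bracketed word 0xac820000 this gives opcode 0x2b (sw), rs = 4
($a0), rt = 2 ($v0) and offset 0, i.e. "sw $v0, 0($a0)". The first word,
0x8fc300b4, gives opcode 0x23 (lw), rs = 30 ($fp), rt = 3 ($v1) and offset
0xb4.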
