Message-ID: <ad726bb4-fbd1-389c-4978-03eec77b4322@redhat.com>
Date:   Mon, 4 Sep 2023 16:16:15 -0400
From:   Waiman Long <longman@...hat.com>
To:     "Russell King (Oracle)" <linux@...linux.org.uk>
Cc:     Rafał Miłecki <zajec5@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
        Boqun Feng <boqun.feng@...il.com>,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Florian Fainelli <f.fainelli@...il.com>,
        linux-clk@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
        openwrt-devel@...ts.openwrt.org,
        bcm-kernel-feedback-list@...adcom.com
Subject: Re: ARM BCM53573 SoC hangs/lockups caused by locks/clock/random
 changes

On 9/4/23 11:40, Russell King (Oracle) wrote:
> On Mon, Sep 04, 2023 at 11:25:57AM -0400, Waiman Long wrote:
>> On 9/4/23 04:33, Rafał Miłecki wrote:
>>> As those hangs/lockups are related to so many different changes it's
>>> really hard to debug them.
>>>
>>> This bug seems to be specific to the slow arch clock that affects
>>> stability only when kernel locking code and symbols layout trigger some
>>> very specific timing.
>>>
>>> Enabling CONFIG_PROVE_LOCKING seems to make issue go away but it affects
>>> so much code it's hard to tell why it actually matters.
>>>
>>> Same for disabling CONFIG_SMP. I noticed Broadcom's SDK keeps it
>>> disabled. I tried it and it improves stability (I had 3 devices with 6
>>> days of uptime and counting) indeed. Again it affects a lot of kernel
>>> parts so it's hard to tell why it helps.
>>>
>>> Unless someone comes up with some magic solution I'll probably try
>>> building BCM53573 images without CONFIG_SMP for my personal needs.
>> All the locking operations rely on the fact that the instruction to acquire
>> or release a lock is atomic. Is it possible that it may not be the case
>> under certain circumstances for this ARM BCM53573 SoC? Or maybe some Kconfig
>> options are not set correctly like missing some errata that are needed.
>>
>> I don't know enough about the 32-bit arm architecture to say whether this is
>> the case or not, but that is my best guess.
> So, BCM53573 is Cortex-A7, which is ARMv7, which has the exclusive
> load/store instructions. Whether the SoC has the necessary exclusive
> monitors to support these instructions is another matter, and I
> suspect someone with documentation would need to check that.

To clarify, the locking code does not need a single atomic
read-modify-write instruction as on x86; LL/SC-style synchronization
instructions are also sufficient. Again, those instructions only work
correctly if the hardware properly supports them.
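
As a rough illustration of what an LL/SC-style lock acquisition looks like,
here is a minimal userspace sketch in C11 atomics. It is not the kernel's
spinlock implementation; the names demo_lock, demo_trylock, demo_lock_acquire
and demo_unlock are made up for this example. On ARMv7, compilers typically
lower the weak compare-exchange to an exclusive load/store retry loop, which
is why it is allowed to fail spuriously and has to be retried.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration: 0 = free, 1 = held. */
typedef struct {
    atomic_int owner;
} demo_lock;

/* Try once to take the lock; returns true on success. */
static bool demo_trylock(demo_lock *l)
{
    int expected = 0;

    /*
     * A weak compare-exchange may fail even though the lock is free
     * (on LL/SC hardware the store-exclusive can lose its reservation).
     * On failure 'expected' holds the value that was read, so retry
     * only while the lock still reads as free.
     */
    while (!atomic_compare_exchange_weak_explicit(&l->owner, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed)) {
        if (expected != 0)
            return false;   /* someone else holds the lock */
    }
    return true;
}

/* Spin until the lock is taken. */
static void demo_lock_acquire(demo_lock *l)
{
    while (!demo_trylock(l))
        ;   /* busy-wait */
}

/* Release the lock with release ordering. */
static void demo_unlock(demo_lock *l)
{
    atomic_store_explicit(&l->owner, 0, memory_order_release);
}
```

If the hardware did not support the exclusive accesses correctly, two CPUs
could both see the store succeed and both enter the critical section. That
is the kind of failure I was speculating about, not something I have
confirmed for this SoC.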

Cheers,
Longman
