Date:   Fri, 11 Nov 2022 11:54:22 +0000
From:   Robin Murphy <robin.murphy@....com>
To:     Catalin Marinas <catalin.marinas@....com>,
        Amit Pundir <amit.pundir@...aro.org>
Cc:     Bjorn Andersson <andersson@...nel.org>,
        Sibi Sankar <quic_sibis@...cinc.com>,
        Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>,
        Will Deacon <will@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
Subject: Re: [GIT PULL] arm64 updates for 6.1-rc1

On 2022-11-11 11:15, Catalin Marinas wrote:
> On Tue, Nov 08, 2022 at 10:58:16PM +0530, Amit Pundir wrote:
>> On Tue, 25 Oct 2022 at 18:08, Amit Pundir <amit.pundir@...aro.org> wrote:
>>> On Wed, 12 Oct 2022 at 17:24, Catalin Marinas <catalin.marinas@....com> wrote:
>>>> On Sat, Oct 08, 2022 at 08:28:26PM +0530, Amit Pundir wrote:
>>>>> On Wed, 5 Oct 2022 at 20:11, Catalin Marinas <catalin.marinas@....com> wrote:
>>>>>> Will Deacon (2):
>>>>>>        arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
>>>>>
>>>>> This patch broke AOSP on Dragonboard 845c (SDM845). I don't see any
>>>>> relevant crash in the attached log and the device silently reboots
>>>>> into USB crash dump mode. The crash is fairly reproducible on db845c.
>>>>> I could trigger it twice in 5 reboots and it always crashes at the
>>>>> same point during the boot process. Reverting this patch fixes the
>>>>> crash.
>>>>>
>>>>> I'm happy to test run any debug patch(es) that would help narrow
>>>>> down this breakage.
> [...]
>>> Further narrowed down the breakage to the userspace daemon rmtfs
>>> https://github.com/andersson/rmtfs. Is there anything specific in the
>>> userspace code that I should be paying attention to?

FWIW, this scenario appears to have pretty much everything going on - 
buffers allocated from no-map carveouts, being shared with firmware as 
well as DMA devices, being poked by userspace through /dev/mem, and 
presumably with the funky Qualcomm sort-of-coherent outer cache in the 
mix too (where IIRC the outer non-cacheable attribute behaves 
differently for CPUs vs. DMA). If anything's ever going to go awry with 
mismatched attributes and stale cachelines, it's probably in that setup 
somewhere.

> Since you don't see anything in the logs like a crash and the system
> restarts, I suspect it's some deadlock and that's triggering the
> watchdog. We have an erratum (826319) but that's for Cortex-A53. IIUC
> SDM845 has Kryo 3xx series cores which, based on some random Google
> searches, are derived from A75/A55. Unfortunately the MIDR_EL1 register
> doesn't match the Arm Ltd numbering, so I have no idea what CPUs these
> are by looking at the boot log.
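
For reference, the MIDR_EL1 value printed in the boot log can be split into its architectural fields as in the minimal sketch below. The field layout follows the Arm architecture definition of MIDR_EL1; the helper name `decode_midr`, the `IMPLEMENTERS` table, and the sample value (a Cortex-A53 r0p4 encoding, not taken from the db845c log) are assumptions made for illustration. Parts from other implementers, such as Qualcomm (implementer code 0x51), use their own part numbers, which is why they do not line up with the Arm Ltd numbering.

```python
# Hypothetical helper: split a MIDR_EL1 value into its architectural fields.
# Bit layout: implementer [31:24], variant [23:20], architecture [19:16],
# part number [15:4], revision [3:0].

# Small illustrative subset of implementer codes, not a complete list.
IMPLEMENTERS = {0x41: "Arm Ltd", 0x51: "Qualcomm"}


def decode_midr(midr: int) -> dict:
    """Return the MIDR_EL1 fields of a 32-bit value as a dict of ints."""
    return {
        "implementer": (midr >> 24) & 0xFF,
        "variant": (midr >> 20) & 0xF,
        "architecture": (midr >> 16) & 0xF,
        "partnum": (midr >> 4) & 0xFFF,
        "revision": midr & 0xF,
    }


if __name__ == "__main__":
    # Sample value for illustration only (Cortex-A53 r0p4).
    fields = decode_midr(0x410FD034)
    name = IMPLEMENTERS.get(fields["implementer"], "unknown")
    print(f"implementer={name} partnum={fields['partnum']:#x} "
          f"r{fields['variant']}p{fields['revision']}")
```

The part number is only meaningful together with the implementer code, so any lookup table has to be keyed on both.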

Note that the EL2 firmware on these things tends to happily reset the 
system without warning if you so much as look at it funny, so I'd 
imagine a straightforward timeout or other unexpected condition due to 
coherency getting lost somewhere in the kernel/firmware/device handoff 
process is probably more than enough.

Robin.
