linux-kernel - Re: [PATCH] arm64: dts: qcom: sc8280xp: fix PCIe DMA coherency

Message-ID: <Y4DUr7tVqnFT5HV9@hovoldconsulting.com>
Date:   Fri, 25 Nov 2022 15:43:59 +0100
From:   Johan Hovold <johan@...nel.org>
To:     Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
Cc:     Johan Hovold <johan+linaro@...nel.org>,
        Bjorn Andersson <andersson@...nel.org>,
        Andy Gross <agross@...nel.org>,
        Konrad Dybcio <konrad.dybcio@...ainline.org>,
        Rob Herring <robh+dt@...nel.org>,
        Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
        Will Deacon <will@...nel.org>,
        Robin Murphy <robin.murphy@....com>,
        Christoph Hellwig <hch@....de>,
        Ard Biesheuvel <ardb@...nel.org>,
        Catalin Marinas <catalin.marinas@....com>,
        linux-arm-msm@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org, devicetree@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] arm64: dts: qcom: sc8280xp: fix PCIe DMA coherency

On Fri, Nov 25, 2022 at 07:56:25PM +0530, Manivannan Sadhasivam wrote:
> On Thu, Nov 24, 2022 at 03:25:01PM +0100, Johan Hovold wrote:
> > The devices on the SC8280XP PCIe buses are cache coherent and must be
> > marked as such to avoid data corruption.
> > 
> > A coherent device can, for example, end up snooping stale data from the
> > caches instead of using data written by the CPU through the
> > non-cacheable mapping which is used for consistent DMA buffers for
> > non-coherent devices.
> > 
> 
> Also, the device may write into the L2 cache (or whichever cache is
> accessible) if there is an entry, and the CPU may invalidate it before reading
> from the DMA buffer. This will result in data loss.

I mentioned the above as an example, but clearly it can also affect the
other direction (e.g. as described below).
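
To make the two directions concrete, here is a toy model in Python. It is
only a sketch under heavy assumptions: one memory word, at most one cached
copy, and made-up names (ToySystem, stale_read, lost_write). It is not
kernel code and it ignores cache lines, write-back and real snoop
protocols.

```python
class ToySystem:
    """Hypothetical model: one word of memory plus an optional cached copy."""

    def __init__(self, value):
        self.memory = value   # contents of DRAM
        self.cache = None     # cached copy, or None if no line is present

    def cpu_read_cacheable(self):
        # E.g. an access through the cacheable linear map allocates a line.
        if self.cache is None:
            self.cache = self.memory
        return self.cache

    def cpu_write_noncacheable(self, value):
        # Bypasses the cache; an existing line is left untouched.
        self.memory = value

    def cpu_read_noncacheable(self):
        # Bypasses the cache and reads memory directly.
        return self.memory

    def cpu_invalidate(self):
        # Drops the line without writing it back.
        self.cache = None

    def device_read_coherent(self):
        # A coherent device snoops the cache first.
        return self.cache if self.cache is not None else self.memory

    def device_write_coherent(self, value):
        # On a hit the data lands in the cache only (toy assumption).
        if self.cache is not None:
            self.cache = value
        else:
            self.memory = value


def stale_read():
    """CPU to device: the device snoops a stale line."""
    s = ToySystem("old")
    s.cpu_read_cacheable()                   # line gets cached
    s.cpu_write_noncacheable("new command")  # goes to memory only
    return s.device_read_coherent()          # device sees the cached copy


def lost_write():
    """Device to CPU: the device's write is dropped by an invalidation."""
    s = ToySystem("empty")
    s.cpu_read_cacheable()                   # line gets cached
    s.device_write_coherent("completion")    # lands in the cache only
    s.cpu_invalidate()                       # line dropped, never written back
    return s.cpu_read_noncacheable()         # CPU reads memory


if __name__ == "__main__":
    print(stale_read())
    print(lost_write())
```

In this model the device reads "old" rather than "new command" in the first
case, and the CPU reads "empty" rather than "completion" in the second. If
the line is never cached in the first place, both cases behave correctly,
which matches the failures being intermittent.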

> > Note that this is much more likely to happen since commit c44094eee32f
> > ("arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()")
> > that was added in 6.1 and which removed the cache invalidation when
> > setting up the non-cacheable mapping.
> > 
> > Marking the PCIe devices as coherent specifically fixes the intermittent
> > NVMe probe failures observed on the ThinkPad X13s, which were due to
> > corruption of the submission and completion queues. This was typically
> > observed as corruption of the admin submission queue (with well-formed
> > completion):
> > 
> > 	could not locate request for tag 0x0
> > 	nvme nvme0: invalid id 0 completed on queue 0
> > 
> > or corruption of the admin or I/O completion queues (malformed
> > completion):
> > 
> > 	could not locate request for tag 0x45f
> > 	nvme nvme0: invalid id 25695 completed on queue 25965
> > 
> > presumably as these queues are small enough to not be allocated using
> > CMA which in turn makes them more likely to be cached (e.g. due to
> > accesses to nearby pages through the cacheable linear map). Increasing
> > the buffer sizes to two pages to force CMA allocation also appears to
> > make the problem go away.
> > 
> 
> I don't think the problem will go away if the allocation happens from the CMA
> region. It may just decrease the chance of a cache hit, but one could always
> happen due to the existence of the linear mapping with the cacheable attribute.

I never claimed it would fix the problem; I explicitly wrote that it
made it less likely to occur (to the point where my reproducer no longer
triggers).

Johan