linux-kernel - Re: [PATCH 1/4] spi: spi-fsl-dspi: Clear completion counter before initiating transfer

Message-ID: <9852a22a-1a09-4559-9775-2ccbb44c43c0@linaro.org>
Date: Tue, 10 Jun 2025 16:41:04 +0100
From: James Clark <james.clark@...aro.org>
To: Vladimir Oltean <vladimir.oltean@....com>
Cc: Vladimir Oltean <olteanv@...il.com>, Mark Brown <broonie@...nel.org>,
 linux-spi@...r.kernel.org, imx@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/4] spi: spi-fsl-dspi: Clear completion counter before
 initiating transfer

On 10/06/2025 12:34 pm, Vladimir Oltean wrote:
> On Mon, Jun 09, 2025 at 04:32:38PM +0100, James Clark wrote:
>> In target mode, extra interrupts can be received between the end of a
>> transfer and halting the module if the host continues sending more data.
> 
> Presumably you mean not just any extra interrupts can be received, but
> specifically CMDTCF, since that triggers the complete(&dspi->xfer_done)
> call. Other interrupt sources are masked in XSPI mode and should be
> irrelevant.
> 

Yes, complete(&dspi->xfer_done) is called, so CMDTCF is set. For example,
in one case of underflow I get SPI_SR = 0xca8b0450, which decodes to
these flags:

   TCF, TXRXS, TFUF, TFFF, CMDTCF, RFOF, RFDF, CMDFFF

For comparison, a successful transfer gives 0xc2830330:

   TCF, TXRXS,       TFFF, CMDTCF,       RFDF, CMDFFF
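
A minimal sketch of how the two SPI_SR values map to the flag lists above.
The bit positions are an assumption inferred from those two values and
should be checked against the reference manual; sr_flags and sr_decode()
are hypothetical names, not driver code. The low 16 bits (FIFO counters
and pointers) are ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed SPI_SR flag positions, inferred from the two values quoted above. */
struct sr_flag {
	uint32_t mask;
	const char *name;
};

static const struct sr_flag sr_flags[] = {
	{ UINT32_C(1) << 31, "TCF" },
	{ UINT32_C(1) << 30, "TXRXS" },
	{ UINT32_C(1) << 27, "TFUF" },
	{ UINT32_C(1) << 25, "TFFF" },
	{ UINT32_C(1) << 23, "CMDTCF" },
	{ UINT32_C(1) << 19, "RFOF" },
	{ UINT32_C(1) << 17, "RFDF" },
	{ UINT32_C(1) << 16, "CMDFFF" },
};

/* Write the comma-separated names of the flags set in sr into buf. */
static char *sr_decode(uint32_t sr, char *buf, size_t len)
{
	size_t used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (size_t i = 0; i < sizeof(sr_flags) / sizeof(sr_flags[0]); i++) {
		const char *sep = used ? ", " : "";
		size_t need = strlen(sep) + strlen(sr_flags[i].name);

		if (!(sr & sr_flags[i].mask))
			continue;
		if (used + need >= len)
			break;	/* stop rather than overflow buf */
		strcpy(buf + used, sep);
		strcat(buf + used, sr_flags[i].name);
		used += need;
	}
	return buf;
}
```

Decoding the XOR of the two values with the same helper leaves only TFUF
and RFOF, which are the two flags that separate the underflow case from
the successful one.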

>> If the interrupt from this occurs after the reinit_completion() then the
>> completion counter is left at a non-zero value. The next unrelated
>> transfer initiated by userspace will then complete immediately without
>> waiting for the interrupt or writing to the RX buffer.
>>
>> Fix it by resetting the counter before the transfer so that lingering
>> values are cleared. This is done after clearing the FIFOs and the
>> status register but before the transfer is initiated, so no interrupts
>> should be received at this point resulting in other race conditions.
> 
> Sorry, I don't have a lot of experience with the target mode, and when I
> introduced the XSPI FIFO mode, I didn't take target mode into consideration.
> 
> The question is, does the module support XSPI FIFO writes in target
> mode? In the LS1028A reference manual, I see PUSHR_SLAVE has the upper
> 16 bits (for the command) hidden, specifically there is no CTAS field
> there that would point to one of the CTARE0/CTARE1 registers.
> Cross-checking with the S32G3 RM, I see nothing fundamentally different.
> 
> I am surprised, given this fact, that the CMDTCF interrupt would fire at
> all in target mode.
> 

It's working in my testing where I've forced it to XSPI mode instead of 
DMA mode on S32G3. I assume the command is blank because in target mode 
CTAR0 (aka CTAR0_SLAVE) is always used regardless of the frame.

CTARE0 isn't explicitly relabeled like CTAR0, but this paragraph states 
that CTARE0 is used:

   50.4.3.2 Slave mode

   ... The SPI Slave mode transfer attributes are configured in the CTAR0
   and CTARE0 registers ...

Transfers smaller than the FIFO work in interrupt mode, but larger ones
are problematic because there isn't enough time to reload the FIFOs
while the host is still sending (hence the error I added in patch 4).

Polling mode isn't working at all because it has a timeout which gets
hit and returns -ETIMEDOUT before the host sends anything. I added the
check there anyway, for consistency and for catching host mode errors.
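
As an illustration only, a status check of this kind could look roughly
like the sketch below. This is not the code from patch 4: the helper name
sr_check_fifo_errors(), the choice of -EIO, and the bit positions (inferred
from the SPI_SR values earlier in this message) are all assumptions.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed positions, inferred from the SPI_SR values quoted in this thread. */
#define SR_TFUF (UINT32_C(1) << 27)	/* TX FIFO underflow */
#define SR_RFOF (UINT32_C(1) << 19)	/* RX FIFO overflow */

/*
 * Hypothetical helper: return 0 if the status word shows no FIFO error,
 * or a negative errno if an underflow or overflow was flagged.
 */
static int sr_check_fifo_errors(uint32_t sr)
{
	if (sr & (SR_TFUF | SR_RFOF))
		return -EIO;	/* error code chosen for illustration */
	return 0;
}
```

With the two values above, the underflow case (0xca8b0450) would be
reported as an error and the successful transfer (0xc2830330) would not.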

>>
>> Fixes: 4f5ee75ea171 ("spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion")
> 
> To be clear, if you ran 'git bisect' to track down this issue, it
> wouldn't have pointed you to this commit, would it?

I didn't test it, no, but I did assume that the wake_up_interruptible()
that got replaced wasn't vulnerable to this same issue, because the
spurious wake_up_interruptible() would be "lost", and a fresh one from
the next transfer would have been required to proceed past the
wait_event_interruptible().

wait_for_completion(), by contrast, is backed by a counter, so it has
the memory problem explained in the commit message.
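
A toy, single-threaded model of that memory effect, as a sketch only. The
toy_* names are hypothetical and this is not the kernel's struct
completion; it only keeps the counter behaviour described above. The
replay assumes the stray interrupt lands after the post-wait reinit, as in
the commit message.

```c
#include <stdbool.h>

/* Toy completion: nothing but a counter of pending complete() calls. */
struct toy_completion {
	unsigned int done;
};

static void toy_reinit(struct toy_completion *c)
{
	c->done = 0;
}

static void toy_complete(struct toy_completion *c)
{
	c->done++;
}

/* True if a real waiter would return immediately instead of sleeping. */
static bool toy_try_wait(struct toy_completion *c)
{
	if (!c->done)
		return false;
	c->done--;
	return true;
}

/*
 * Replay: transfer 1 finishes, a stray interrupt arrives, transfer 2
 * starts. Returns true if transfer 2 "completes" before its own interrupt.
 */
static bool second_transfer_completes_early(bool reinit_before_transfer)
{
	struct toy_completion xfer_done = { 0 };

	toy_complete(&xfer_done);		/* interrupt for transfer 1 */
	toy_try_wait(&xfer_done);		/* waiter consumes it */
	if (!reinit_before_transfer)
		toy_reinit(&xfer_done);		/* reinit only after the wait */
	toy_complete(&xfer_done);		/* stray interrupt from the host */

	if (reinit_before_transfer)
		toy_reinit(&xfer_done);		/* clear lingering count first */
	return toy_try_wait(&xfer_done);	/* transfer 2 waits here */
}
```

In this model, second_transfer_completes_early(false) returns true because
the stray count survives into the next wait, while passing true clears it
before the second transfer starts.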