Message-ID: <20250610210147.kwuwwjtcl36hrxjc@skbuf>
Date: Wed, 11 Jun 2025 00:01:47 +0300
From: Vladimir Oltean <vladimir.oltean@....com>
To: James Clark <james.clark@...aro.org>
Cc: Vladimir Oltean <olteanv@...il.com>, Mark Brown <broonie@...nel.org>, linux-spi@...r.kernel.org, imx@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/4] spi: spi-fsl-dspi: Clear completion counter before initiating transfer

On Tue, Jun 10, 2025 at 04:41:04PM +0100, James Clark wrote:
> On 10/06/2025 12:34 pm, Vladimir Oltean wrote:
> > On Mon, Jun 09, 2025 at 04:32:38PM +0100, James Clark wrote:
> > > In target mode, extra interrupts can be received between the end of a
> > > transfer and halting the module if the host continues sending more data.
> >
> > Presumably you mean not just any extra interrupts can be received, but
> > specifically CMDTCF, since that triggers the complete(&dspi->xfer_done)
> > call. Other interrupt sources are masked in XSPI mode and should be
> > irrelevant.
>
> Yes complete(&dspi->xfer_done) is called so CMDTCF is set. For example in
> one case of underflow I get SPI_SR = 0xca8b0450, which is these flags:
>
>   TCF, TXRXS, TFUF, TFFF, CMDTCF, RFOF, RFDF, CMDFFF
>
> Compared to a successful transfer I get 0xc2830330:
>
>   TCF, TXRXS, TFFF, CMDTCF, RFDF, CMDFFF

Ok, so my new question would be: if CMDTCF is set, presumably it means a
command was transferred. What command was transferred, and who put data
in the FIFO for it?

Because the answer to the above is AFAIU "no one", I guess the driver
should ignore CMDTCF when TFUF (TX FIFO underflow) is set; I consider
that to be the logic bug. You are also doing that in patch 4/4, except
you still call complete() for some reason. If you don't call complete(),
there is no reason to fend against spurious completions.

I think I would prefer seeing more deliberate decisions in the driver,
it helps if things don't just work by coincidence.

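The suggestion above (ignore CMDTCF when TFUF is set) can be sketched in plain userspace C. This is only an illustration, not the driver's code: the bit positions are assumptions inferred from how the two SPI_SR values quoted above decode into the listed flags, and cmdtcf_is_genuine() is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed SPI_SR bit positions, inferred from the two decoded values above. */
#define SR_TCF     (UINT32_C(1) << 31)
#define SR_TXRXS   (UINT32_C(1) << 30)
#define SR_TFUF    (UINT32_C(1) << 27) /* TX FIFO underflow */
#define SR_TFFF    (UINT32_C(1) << 25)
#define SR_CMDTCF  (UINT32_C(1) << 23)
#define SR_RFOF    (UINT32_C(1) << 19)
#define SR_RFDF    (UINT32_C(1) << 17)
#define SR_CMDFFF  (UINT32_C(1) << 16)

/*
 * Hypothetical helper: treat CMDTCF as a real "command transferred" event
 * only when no TX FIFO underflow was flagged in the same status word.
 */
static bool cmdtcf_is_genuine(uint32_t sr)
{
	if (!(sr & SR_CMDTCF))
		return false;
	return !(sr & SR_TFUF);
}
```

With the two values from the message, 0xca8b0450 (the underflow case) is rejected and 0xc2830330 (the successful transfer) is accepted.
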
> > > If the interrupt from this occurs after the reinit_completion() then the
> > > completion counter is left at a non-zero value. The next unrelated
> > > transfer initiated by userspace will then complete immediately without
> > > waiting for the interrupt or writing to the RX buffer.
> > >
> > > Fix it by resetting the counter before the transfer so that lingering
> > > values are cleared. This is done after clearing the FIFOs and the
> > > status register but before the transfer is initiated, so no interrupts
> > > should be received at this point resulting in other race conditions.
> >
> > Sorry, I don't have a lot of experience with the target mode, and when I
> > introduced the XSPI FIFO mode, I didn't take target mode into consideration.
> >
> > The question is, does the module support XSPI FIFO writes in target
> > mode? In the LS1028A reference manual, I see PUSHR_SLAVE has the upper
> > 16 bits (for the command) hidden, specifically there is no CTAS field
> > there that would point to one of the CTARE0/CTARE1 registers.
> > Cross-checking with the S32G3 RM, I see nothing fundamentally different.
> >
> > I am surprised, given this fact, that the CMDTCF interrupt would fire at
> > all in target mode.
>
> It's working in my testing where I've forced it to XSPI mode instead of DMA
> mode on S32G3. I assume the command is blank because in target mode CTAR0
> (aka CTAR0_SLAVE) is always used regardless of the frame.
>
> CTARE0 isn't explicitly relabeled like CTAR0, but this paragraph states that
> CTARE0 is used:
>
>   50.4.3.2 Slave mode
>
>   ... The SPI Slave mode transfer attributes are configured in the CTAR0
>   and CTARE0 registers ...

That's an interesting piece of data which I wasn't aware of, thanks.

> Any transfers smaller than the FIFO are working in interrupt mode, although
> larger ones are problematic because there isn't enough time to reload the
> FIFOs while the host is still sending (hence the error I added in patch 4).
>
> Polling mode isn't working at all because it has a timeout which gets hit
> and returns -ETIMEDOUT before the host sends anything. Although I added the
> check there for consistency and for catching host mode errors.
> > >
> > > Fixes: 4f5ee75ea171 ("spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion")
> >
> > To be clear, if you ran 'git bisect' to track down this issue, it
> > wouldn't have pointed you to this commit, would it?
>
> I didn't test it no, but I did assume that the wake_up_interruptible() that
> got replaced wasn't vulnerable to this same issue. Because the spurious
> wake_up_interruptible() would be "lost", and a fresh one from the next
> transfer would have been required to proceed past the
> wait_event_interruptible().
>
> Whereas wait_for_completion() is just a counter so it has the memory problem
> explained in the commit message.

Why would a spurious wake_up_interruptible() be lost? Is it because of
the dspi->waitflags condition not becoming 1? It would also become 1...
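
The "just a counter" behaviour described in the quoted commit message can be illustrated with a toy userspace model. This is a sketch under stated assumptions, not kernel code: struct model_completion and the model_*() helpers are hypothetical stand-ins for the kernel's completion, reinit_completion(), complete() and wait_for_completion(), and the wait is modelled as non-blocking ("would it return right now?"). It does not model the older wait queue path that the question above is about.

```c
#include <stdbool.h>

/* Toy model of a completion: only the counter, nothing else. */
struct model_completion {
	unsigned int done;
};

static void model_reinit(struct model_completion *c)
{
	c->done = 0;
}

static void model_complete(struct model_completion *c)
{
	c->done++;
}

/* Non-blocking stand-in for the wait: true means "would return now". */
static bool model_try_wait(struct model_completion *c)
{
	if (c->done == 0)
		return false;
	c->done--;
	return true;
}

/*
 * Returns true if the next transfer's wait would return immediately
 * because a stray complete() was left behind by the previous transfer.
 */
static bool next_wait_returns_early(bool reinit_before_transfer)
{
	struct model_completion xfer_done = { 0 };

	/* First transfer: one genuine completion, consumed by the waiter. */
	model_reinit(&xfer_done);
	model_complete(&xfer_done);
	(void)model_try_wait(&xfer_done);

	/* Extra interrupt arrives after the transfer has ended. */
	model_complete(&xfer_done);

	/* Second transfer starts; no interrupt has fired for it yet. */
	if (reinit_before_transfer)
		model_reinit(&xfer_done);

	return model_try_wait(&xfer_done);
}
```

In this model the stray increment survives into the second transfer unless the counter is reset right before that transfer starts, which is the situation the patch description is guarding against.
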