linux-kernel - Re: fsl_espi errors on v5.7.15

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1598510348.1g7wt0s02s.astroid@bobo.none>
Date:   Thu, 27 Aug 2020 17:12:30 +1000
From:   Nicholas Piggin <npiggin@...il.com>
To:     "benh@...nel.crashing.org" <benh@...nel.crashing.org>,
        "broonie@...nel.org" <broonie@...nel.org>,
        Chris Packham <Chris.Packham@...iedtelesis.co.nz>,
        Heiner Kallweit <hkallweit1@...il.com>,
        "mpe@...erman.id.au" <mpe@...erman.id.au>,
        "paulus@...ba.org" <paulus@...ba.org>
Cc:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
        "linux-spi@...r.kernel.org" <linux-spi@...r.kernel.org>
Subject: Re: fsl_espi errors on v5.7.15

Excerpts from Heiner Kallweit's message of August 26, 2020 4:38 pm:
> On 26.08.2020 08:07, Chris Packham wrote:
>> 
>> On 26/08/20 1:48 pm, Chris Packham wrote:
>>>
>>> On 26/08/20 10:22 am, Chris Packham wrote:
>>>> On 25/08/20 7:22 pm, Heiner Kallweit wrote:
>>>>
>>>> <snip>
>>>>> I've been staring at spi-fsl-espi.c for while now and I think I've
>>>>>> identified a couple of deficiencies that may or may not be related 
>>>>>> to my
>>>>>> issue.
>>>>>>
>>>>>> First I think the 'Transfer done but SPIE_DON isn't set' message 
>>>>>> can be
>>>>>> generated spuriously. In fsl_espi_irq() we read the ESPI_SPIE 
>>>>>> register.
>>>>>> We also write back to it to clear the current events. We re-read it in
>>>>>> fsl_espi_cpu_irq() and complain when SPIE_DON is not set. But we can
>>>>>> naturally end up in that situation if we're doing a large read. 
>>>>>> Consider
>>>>>> the messages for reading a block of data from a spi-nor chip
>>>>>>
>>>>>>    tx = READ_OP + ADDR
>>>>>>    rx = data
>>>>>>
>>>>>> We setup the transfer and pump out the tx_buf. The first interrupt 
>>>>>> goes
>>>>>> off and ESPI_SPIE has SPIM_DON and SPIM_RXT set. We empty the rx fifo,
>>>>>> clear ESPI_SPIE and wait for the next interrupt. The next interrupt
>>>>>> fires and this time we have ESPI_SPIE with just SPIM_RXT set. This
>>>>>> continues until we've received all the data and we finish with 
>>>>>> ESPI_SPIE
>>>>>> having only SPIM_RXT set. When we re-read it we complain that SPIE_DON
>>>>>> isn't set.
>>>>>>
>>>>>> The other deficiency is that we only get an interrupt when the 
>>>>>> amount of
>>>>>> data in the rx fifo is above FSL_ESPI_RXTHR. If there are fewer than
>>>>>> FSL_ESPI_RXTHR left to be received we will never pull them out of 
>>>>>> the fifo.
>>>>>>
>>>>> SPIM_DON will trigger an interrupt once the last characters have been
>>>>> transferred, and read the remaining characters from the FIFO.
>>>>
>>>> The T2080RM that I have says the following about the DON bit
>>>>
>>>> "Last character was transmitted. The last character was transmitted 
>>>> and a new command can be written for the next frame."
>>>>
>>>> That does at least seem to fit with my assertion that it's all about 
>>>> the TX direction. But the fact that it doesn't happen all the time 
>>>> throws some doubt on it.
>>>>
>>>>> I think the reason I'm seeing some variability is because of how fast
>>>>>> (or slow) the interrupts get processed and how fast the spi-nor 
>>>>>> chip can
>>>>>> fill the CPUs rx fifo.
>>>>>>
>>>>> To rule out timing issues at high bus frequencies I initially asked
>>>>> for re-testing at lower frequencies. If you e.g. limit the bus to 1 MHz
>>>>> or even less, then timing shouldn't be an issue.
>>>> Yes I've currently got spi-max-frequency = <1000000>; in my dts. I 
>>>> would also expect a slower frequency would fit my "DON is for TX" 
>>>> narrative.
>>>>> Last relevant functional changes have been done almost 4 years ago.
>>>>> And yours is the first such report I see. So question is what could 
>>>>> be so
>>>>> special with your setup that it seems you're the only one being 
>>>>> affected.
>>>>> The scenarios you describe are standard, therefore much more people
>>>>> should be affected in case of a driver bug.
>>>> Agreed. But even on my hardware (which may have a latent issue 
>>>> despite being in the field for going on 5 years) the issue only 
>>>> triggers under some fairly specific circumstances.
>>>>> You said that kernel config impacts how frequently the issue happens.
>>>>> Therefore question is what's the diff in kernel config, and how could
>>>>> the differences be related to SPI.
>>>>
>>>> It did seem to be somewhat random. Things like CONFIG_PREEMPT have an 
>>>> impact but every time I found something that seemed to be having an 
>>>> impact I've been able to disprove it. I actually think its about how 
>>>> busy the system is which may or may not affect when we get round to 
>>>> processing the interrupts.
>>>>
>>>> I have managed to get the 'Transfer done but SPIE_DON isn't set!' to 
>>>> occur on the T2080RDB.
>>>>
>>>> I've had to add the following to expose the environment as a mtd 
>>>> partition
>>>>
>>>> diff --git a/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi 
>>>> b/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi
>>>> index ff87e67c70da..fbf95fc1fd68 100644
>>>> --- a/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi
>>>> +++ b/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi
>>>> @@ -116,6 +116,15 @@ flash@0 {
>>>>                                 compatible = "micron,n25q512ax3", 
>>>> "jedec,spi-nor";
>>>>                                 reg = <0>;
>>>>                                 spi-max-frequency = <10000000>; /* 
>>>> input clock */
>>>> +
>>>> +                               partition@...oot {
>>>> +                                        reg = <0x00000000 0x00100000>;
>>>> +                                        label = "u-boot";
>>>> +                                };
>>>> +                                partition@...oot-env {
>>>> +                                        reg = <0x00100000 0x00010000>;
>>>> +                                        label = "u-boot-env";
>>>> +                                };
>>>>                         };
>>>>                 };
>>>>
>>>> And I'm using the following script to poke at the environment 
>>>> (warning if anyone does try this and the bug hits it can render your 
>>>> u-boot environment invalid).
>>>>
>>>> cat flash/fw_env_test.sh
>>>> #!/bin/sh
>>>>
>>>> generate_fw_env_config()
>>>> {
>>>>   cat /proc/mtd | sed 's/[:"]//g' | while read dev size erasesize 
>>>> name ; do
>>>>      echo "$dev $size $erasesize $name"
>>>>      [ "$name" = "u-boot-env" ] && echo "/dev/$dev 0x0000 0x2000 
>>>> $erasesize" >/flash/fw_env.config
>>>>   done
>>>> }
>>>>
>>>> cycles=10
>>>> [ $# -ge 1 ] && cycles=$1
>>>>
>>>> generate_fw_env_config
>>>>
>>>> fw_printenv -c /flash/fw_env.config
>>>>
>>>> dmesg -c >/dev/null
>>>> x=0
>>>> while [ $x -lt $cycles ]; do
>>>>     fw_printenv -c /flash/fw_env.config >/dev/null || break
>>>>     fw_setenv -c /flash/fw_env.config foo $RANDOM || break;
>>>>     dmesg -c | grep -q fsl_espi && break;
>>>>     let x=x+1
>>>> done
>>>>
>>>> echo "Ran $x cycles"
>>>
>>> I've also now seen the RX FIFO not empty error on the T2080RDB
>>>
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but rx/tx fifo's aren't empty!
>>> fsl_espi ffe110000.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>
>>> With my current workaround of emptying the RX FIFO. It seems 
>>> survivable. Interestingly it only ever seems to be 1 extra byte in the 
>>> RX FIFO and it seems to be after either a READ_SR or a READ_FSR.
>>>
>>> fsl_espi ffe110000.spi: tx 70
>>> fsl_espi ffe110000.spi: rx 03
>>> fsl_espi ffe110000.spi: Extra RX 00
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but rx/tx fifo's aren't empty!
>>> fsl_espi ffe110000.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>> fsl_espi ffe110000.spi: tx 05
>>> fsl_espi ffe110000.spi: rx 00
>>> fsl_espi ffe110000.spi: Extra RX 03
>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe110000.spi: Transfer done but rx/tx fifo's aren't empty!
>>> fsl_espi ffe110000.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>> fsl_espi ffe110000.spi: tx 05
>>> fsl_espi ffe110000.spi: rx 00
>>> fsl_espi ffe110000.spi: Extra RX 03
>>>
>>> From all the Micron SPI-NOR datasheets I've got access to it is 
>>> possible to continually read the SR/FSR. But I've no idea why it 
>>> happens some times and not others.
>> 
>> So I think I've got a reproduction and I think I've bisected the problem 
>> to commit 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in 
>> C"). My day is just finishing now so I haven't applied too much scrutiny 
>> to this result. Given the various rabbit holes I've been down on this 
>> issue already I'd take this information with a good degree of skepticism.
>> 
> OK, so an easy test should be to re-test with a 5.4 kernel.
> It doesn't have yet the change you're referring to, and the fsl-espi driver
> is basically the same as in 5.7 (just two small changes in 5.7).

There's 6cc0c16d82f88 and maybe also other interrupt related patches 
around this time that could affect book E, so it's good if that exact
patch is confirmed.

I've been staring at 3282a3da25bd for a while and nothing immediately 
stands out. It doesn't look like the low level handlers do anything
special (well 0x900 does ack the decrementer, but so does the masked
handler).

Can you try this patch and also enable CONFIG_PPC_IRQ_SOFT_MASK_DEBUG?

Thanks,
Nick

---
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index bf21ebd36190..10d339042330 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -214,7 +214,7 @@ void replay_soft_interrupts(void)
 	struct pt_regs regs;
 
 	ppc_save_regs(&regs);
-	regs.softe = IRQS_ALL_DISABLED;
+	regs.softe = IRQS_ENABLED;
 
 again:
 	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
@@ -349,6 +349,7 @@ notrace void arch_local_irq_restore(unsigned long mask)
 		if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
 			WARN_ON_ONCE(!(mfmsr() & MSR_EE));
 		__hard_irq_disable();
+		local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
 	} else {
 		/*
 		 * We should already be hard disabled here. We had bugs
@@ -368,6 +369,7 @@ notrace void arch_local_irq_restore(unsigned long mask)
 		}
 	}
 
+	preempt_disable();
 	irq_soft_mask_set(IRQS_ALL_DISABLED);
 	trace_hardirqs_off();
 
@@ -377,6 +379,7 @@ notrace void arch_local_irq_restore(unsigned long mask)
 	trace_hardirqs_on();
 	irq_soft_mask_set(IRQS_ENABLED);
 	__hard_irq_enable();
+	preempt_enable();
 }
 EXPORT_SYMBOL(arch_local_irq_restore);