linux-kernel - Re: fsl_espi errors on v5.7.15

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0068446e-06f8-6648-2f40-56f324c1ee6e@alliedtelesis.co.nz>
Date:   Thu, 27 Aug 2020 22:07:12 +0000
From:   Chris Packham <Chris.Packham@...iedtelesis.co.nz>
To:     Nicholas Piggin <npiggin@...il.com>,
        "benh@...nel.crashing.org" <benh@...nel.crashing.org>,
        "broonie@...nel.org" <broonie@...nel.org>,
        "Heiner Kallweit" <hkallweit1@...il.com>,
        "mpe@...erman.id.au" <mpe@...erman.id.au>,
        "paulus@...ba.org" <paulus@...ba.org>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
        "linux-spi@...r.kernel.org" <linux-spi@...r.kernel.org>
Subject: Re: fsl_espi errors on v5.7.15

On 27/08/20 7:12 pm, Nicholas Piggin wrote:
> Excerpts from Heiner Kallweit's message of August 26, 2020 4:38 pm:
>> On 26.08.2020 08:07, Chris Packham wrote:
>>> On 26/08/20 1:48 pm, Chris Packham wrote:
>>>> On 26/08/20 10:22 am, Chris Packham wrote:
>>>>> On 25/08/20 7:22 pm, Heiner Kallweit wrote:
>>>>>
>>>>> <snip>
>>>>>> I've been staring at spi-fsl-espi.c for while now and I think I've
>>>>>>> identified a couple of deficiencies that may or may not be related
>>>>>>> to my
>>>>>>> issue.
>>>>>>>
>>>>>>> First I think the 'Transfer done but SPIE_DON isn't set' message
>>>>>>> can be
>>>>>>> generated spuriously. In fsl_espi_irq() we read the ESPI_SPIE
>>>>>>> register.
>>>>>>> We also write back to it to clear the current events. We re-read it in
>>>>>>> fsl_espi_cpu_irq() and complain when SPIE_DON is not set. But we can
>>>>>>> naturally end up in that situation if we're doing a large read.
>>>>>>> Consider
>>>>>>> the messages for reading a block of data from a spi-nor chip
>>>>>>>
>>>>>>>     tx = READ_OP + ADDR
>>>>>>>     rx = data
>>>>>>>
>>>>>>> We setup the transfer and pump out the tx_buf. The first interrupt
>>>>>>> goes
>>>>>>> off and ESPI_SPIE has SPIM_DON and SPIM_RXT set. We empty the rx fifo,
>>>>>>> clear ESPI_SPIE and wait for the next interrupt. The next interrupt
>>>>>>> fires and this time we have ESPI_SPIE with just SPIM_RXT set. This
>>>>>>> continues until we've received all the data and we finish with
>>>>>>> ESPI_SPIE
>>>>>>> having only SPIM_RXT set. When we re-read it we complain that SPIE_DON
>>>>>>> isn't set.
>>>>>>>
>>>>>>> The other deficiency is that we only get an interrupt when the
>>>>>>> amount of
>>>>>>> data in the rx fifo is above FSL_ESPI_RXTHR. If there are fewer than
>>>>>>> FSL_ESPI_RXTHR left to be received we will never pull them out of
>>>>>>> the fifo.
>>>>>>>
>>>>>> SPIM_DON will trigger an interrupt once the last characters have been
>>>>>> transferred, and read the remaining characters from the FIFO.
>>>>> The T2080RM that I have says the following about the DON bit
>>>>>
>>>>> "Last character was transmitted. The last character was transmitted
>>>>> and a new command can be written for the next frame."
>>>>>
>>>>> That does at least seem to fit with my assertion that it's all about
>>>>> the TX direction. But the fact that it doesn't happen all the time
>>>>> throws some doubt on it.
>>>>>
>>>>>> I think the reason I'm seeing some variability is because of how fast
>>>>>>> (or slow) the interrupts get processed and how fast the spi-nor
>>>>>>> chip can
>>>>>>> fill the CPUs rx fifo.
>>>>>>>
>>>>>> To rule out timing issues at high bus frequencies I initially asked
>>>>>> for re-testing at lower frequencies. If you e.g. limit the bus to 1 MHz
>>>>>> or even less, then timing shouldn't be an issue.
>>>>> Yes I've currently got spi-max-frequency = <1000000>; in my dts. I
>>>>> would also expect a slower frequency would fit my "DON is for TX"
>>>>> narrative.
>>>>>> Last relevant functional changes have been done almost 4 years ago.
>>>>>> And yours is the first such report I see. So question is what could
>>>>>> be so
>>>>>> special with your setup that it seems you're the only one being
>>>>>> affected.
>>>>>> The scenarios you describe are standard, therefore much more people
>>>>>> should be affected in case of a driver bug.
>>>>> Agreed. But even on my hardware (which may have a latent issue
>>>>> despite being in the field for going on 5 years) the issue only
>>>>> triggers under some fairly specific circumstances.
>>>>>> You said that kernel config impacts how frequently the issue happens.
>>>>>> Therefore question is what's the diff in kernel config, and how could
>>>>>> the differences be related to SPI.
>>>>> It did seem to be somewhat random. Things like CONFIG_PREEMPT have an
>>>>> impact but every time I found something that seemed to be having an
>>>>> impact I've been able to disprove it. I actually think its about how
>>>>> busy the system is which may or may not affect when we get round to
>>>>> processing the interrupts.
>>>>>
>>>>> I have managed to get the 'Transfer done but SPIE_DON isn't set!' to
>>>>> occur on the T2080RDB.
>>>>>
>>>>> I've had to add the following to expose the environment as a mtd
>>>>> partition
>>>>>
>>>>> diff --git a/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi
>>>>> b/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi
>>>>> index ff87e67c70da..fbf95fc1fd68 100644
>>>>> --- a/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi
>>>>> +++ b/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi
>>>>> @@ -116,6 +116,15 @@ flash@0 {
>>>>>                                  compatible = "micron,n25q512ax3",
>>>>> "jedec,spi-nor";
>>>>>                                  reg = <0>;
>>>>>                                  spi-max-frequency = <10000000>; /*
>>>>> input clock */
>>>>> +
>>>>> +                               partition@...oot {
>>>>> +                                        reg = <0x00000000 0x00100000>;
>>>>> +                                        label = "u-boot";
>>>>> +                                };
>>>>> +                                partition@...oot-env {
>>>>> +                                        reg = <0x00100000 0x00010000>;
>>>>> +                                        label = "u-boot-env";
>>>>> +                                };
>>>>>                          };
>>>>>                  };
>>>>>
>>>>> And I'm using the following script to poke at the environment
>>>>> (warning if anyone does try this and the bug hits it can render your
>>>>> u-boot environment invalid).
>>>>>
>>>>> cat flash/fw_env_test.sh
>>>>> #!/bin/sh
>>>>>
>>>>> generate_fw_env_config()
>>>>> {
>>>>>    cat /proc/mtd | sed 's/[:"]//g' | while read dev size erasesize
>>>>> name ; do
>>>>>       echo "$dev $size $erasesize $name"
>>>>>       [ "$name" = "u-boot-env" ] && echo "/dev/$dev 0x0000 0x2000
>>>>> $erasesize" >/flash/fw_env.config
>>>>>    done
>>>>> }
>>>>>
>>>>> cycles=10
>>>>> [ $# -ge 1 ] && cycles=$1
>>>>>
>>>>> generate_fw_env_config
>>>>>
>>>>> fw_printenv -c /flash/fw_env.config
>>>>>
>>>>> dmesg -c >/dev/null
>>>>> x=0
>>>>> while [ $x -lt $cycles ]; do
>>>>>      fw_printenv -c /flash/fw_env.config >/dev/null || break
>>>>>      fw_setenv -c /flash/fw_env.config foo $RANDOM || break;
>>>>>      dmesg -c | grep -q fsl_espi && break;
>>>>>      let x=x+1
>>>>> done
>>>>>
>>>>> echo "Ran $x cycles"
>>>> I've also now seen the RX FIFO not empty error on the T2080RDB
>>>>
>>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>>> fsl_espi ffe110000.spi: Transfer done but rx/tx fifo's aren't empty!
>>>> fsl_espi ffe110000.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>
>>>> With my current workaround of emptying the RX FIFO. It seems
>>>> survivable. Interestingly it only ever seems to be 1 extra byte in the
>>>> RX FIFO and it seems to be after either a READ_SR or a READ_FSR.
>>>>
>>>> fsl_espi ffe110000.spi: tx 70
>>>> fsl_espi ffe110000.spi: rx 03
>>>> fsl_espi ffe110000.spi: Extra RX 00
>>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>>> fsl_espi ffe110000.spi: Transfer done but rx/tx fifo's aren't empty!
>>>> fsl_espi ffe110000.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>> fsl_espi ffe110000.spi: tx 05
>>>> fsl_espi ffe110000.spi: rx 00
>>>> fsl_espi ffe110000.spi: Extra RX 03
>>>> fsl_espi ffe110000.spi: Transfer done but SPIE_DON isn't set!
>>>> fsl_espi ffe110000.spi: Transfer done but rx/tx fifo's aren't empty!
>>>> fsl_espi ffe110000.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>> fsl_espi ffe110000.spi: tx 05
>>>> fsl_espi ffe110000.spi: rx 00
>>>> fsl_espi ffe110000.spi: Extra RX 03
>>>>
>>>>  From all the Micron SPI-NOR datasheets I've got access to it is
>>>> possible to continually read the SR/FSR. But I've no idea why it
>>>> happens some times and not others.
>>> So I think I've got a reproduction and I think I've bisected the problem
>>> to commit 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in
>>> C"). My day is just finishing now so I haven't applied too much scrutiny
>>> to this result. Given the various rabbit holes I've been down on this
>>> issue already I'd take this information with a good degree of skepticism.
>>>
>> OK, so an easy test should be to re-test with a 5.4 kernel.
>> It doesn't have yet the change you're referring to, and the fsl-espi driver
>> is basically the same as in 5.7 (just two small changes in 5.7).
> There's 6cc0c16d82f88 and maybe also other interrupt related patches
> around this time that could affect book E, so it's good if that exact
> patch is confirmed.

My confirmation is basically that I can induce the issue in a 5.4 kernel 
by cherry-picking 3282a3da25bd. I'm also able to "fix" the issue in 
5.9-rc2 by reverting that one commit.

I both cases it's not exactly a clean cherry-pick/revert so I also 
confirmed the bisection result by building at 3282a3da25bd (which sees 
the issue) and the commit just before (which does not).

> I've been staring at 3282a3da25bd for a while and nothing immediately
> stands out. It doesn't look like the low level handlers do anything
> special (well 0x900 does ack the decrementer, but so does the masked
> handler).
>
> Can you try this patch and also enable CONFIG_PPC_IRQ_SOFT_MASK_DEBUG?
>
> Thanks,
> Nick
>
> ---
> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index bf21ebd36190..10d339042330 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -214,7 +214,7 @@ void replay_soft_interrupts(void)
>   	struct pt_regs regs;
>   
>   	ppc_save_regs(&regs);
> -	regs.softe = IRQS_ALL_DISABLED;
> +	regs.softe = IRQS_ENABLED;
>   
>   again:
>   	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
> @@ -349,6 +349,7 @@ notrace void arch_local_irq_restore(unsigned long mask)
>   		if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
>   			WARN_ON_ONCE(!(mfmsr() & MSR_EE));
>   		__hard_irq_disable();
> +		local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
>   	} else {
>   		/*
>   		 * We should already be hard disabled here. We had bugs
> @@ -368,6 +369,7 @@ notrace void arch_local_irq_restore(unsigned long mask)
>   		}
>   	}
>   
> +	preempt_disable();
>   	irq_soft_mask_set(IRQS_ALL_DISABLED);
>   	trace_hardirqs_off();
>   
> @@ -377,6 +379,7 @@ notrace void arch_local_irq_restore(unsigned long mask)
>   	trace_hardirqs_on();
>   	irq_soft_mask_set(IRQS_ENABLED);
>   	__hard_irq_enable();
> +	preempt_enable();
>   }
>   EXPORT_SYMBOL(arch_local_irq_restore);
>   
I still saw the issue with this change applied. PPC_IRQ_SOFT_MASK_DEBUG 
didn't report anything (either with or without the change above).