lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <c0c19d40-58b3-af51-40e9-0b44578c04f5@alliedtelesis.co.nz>
Date:   Tue, 26 Apr 2022 01:17:19 +0000
From:   Chris Packham <Chris.Packham@...iedtelesis.co.nz>
To:     Pali Rohár <pali@...nel.org>
CC:     "jason@...edaemon.net" <jason@...edaemon.net>,
        Joshua Scott <Joshua.Scott@...iedtelesis.co.nz>,
        "gregory.clement@...tlin.com" <gregory.clement@...tlin.com>,
        "andrew@...n.ch" <andrew@...n.ch>,
        "sebastian.hesselbarth@...il.com" <sebastian.hesselbarth@...il.com>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] ARM: mvebu: Enable MBUS error propagation


On 22/04/22 21:27, Pali Rohár wrote:
> On Thursday 03 June 2021 21:03:55 Chris Packham wrote:
>> On 4/06/21 12:55 am, Pali Rohár wrote:
>>> On Wednesday 08 January 2020 19:42:12 Chris Packham wrote:
>>>> Hi Gregory,
>>>>
>>>> On Wed, 2020-01-08 at 11:22 +0100, Gregory CLEMENT wrote:
>>>>> Hello Chris,
>>>>>
>>>>>> U-boot disables MBUS error propagation for Armada-385. The effect of
>>>>>> this on the kernel is that any access to a mapped but inaccessible
>>>>>> address causes the system to hang.
>>>>>>
>>>>>> By enabling MBUS error propagation the kernel can raise a Bus Error and
>>>>>> panic to restart the system.
>>>>> Unless I miss it, it seems that nobody comment this patch: sorry for the
>>>>> delay.
>>>>>
>>>> Thanks for the response.
>>>>
>>>>>> Signed-off-by: Chris Packham <chris.packham@...iedtelesis.co.nz>
>>>>>> ---
>>>>>>
>>>>>> Notes:
>>>>>>       We've encountered an issue where rogue accesses to PCI-e space cause an
>>>>>>       Armada-385 system to lockup. We've found that enabling MBUS error
>>>>>>       propagation lets us get a bus error which at least gives us a panic to
>>>>>>       help identify what was accessed.
>>>>>>       
>>>>>>       U-boot clears the IO Err Prop Enable Bit[1] but so far no-one seems to
>>>>>>       know why.
>>>>>>       
>>>>>>       I wasn't sure where to put this code. There is similar code for kirwood
>>>>>>       in the equivalent dt_init function. On Armada-XP the register is part of
>>>>>>       the Core Coherency Fabric block (for A385 it's documented as part of the
>>>>>>       CCF block).
>>>>> What about adding a new set of register to the mvebu mbus driver?
>>>>>
>>>> After more testing we found that some previously "good" boards started
>>>> throwing up panics with this change. I think that this might require
>>>> handling some of the PCI-e interrupts (for correctable errors) via the
>>>> EDAC subsystem.
>>>>
>>>> We're still working with Marvell to track down exactly why this is
>>>> happening on our system.
>>> Hello Chris! Have you somehow solved this issue? Or do you have some
>>> contacts in Marvell for A385 PCIe subsystem?
>> The problem seemed to be a specific CPU Erratum for the A385 which I
>> brought into upstream u-boot[1].
> Hello Chris, could you point me to this CPU Erratum? What is its number
> or some other identifier? Because I'm unable to find any such erratum in
> the list for A385 erratums.

I haven't seen the actual errata either. My comment is based on the 
following change from one of Marvell's u-boot forks

https://github.com/MarvellEmbeddedProcessors/u-boot-marvell/commit/20cd2704072512de176e048970f2883db901674b

It just says "ERRATA# TBD". Perhaps they just haven't published a new 
document (latest I can find is Rev D from 2016) or they decided it was a 
SW bug and didn't assign it an official errata number.

>> But even then we had to update our base
>> u-boot version from 2016.11 to 2019.10 so there may have been more going
>> on than just that one change.
>>
>> [1] -
>> https://scanmail.trustwave.com/?c=20988&d=8fTi4h2MjPVKSgkegODmtLtuBcIeiZ2hXGl9DulH7A&u=https%3a%2f%2fsource%2edenx%2ede%2fu-boot%2fu-boot%2f-%2fcommit%2fad91fdfff0bd6ea471afe838e0f6d58ed898694e
>>
>>>>> In this case it will be called even earlier allowing to see bus error
>>>>> earlier.
>>>>>
>>>>> In any case, you should separate the device tree change from the code
>>>>> change and at least provide 2 patches.
>>>> Agreed. If/when something solid eventuates we'll do it as a proper
>>>> series.
>>>>
>>>>> Gregory
>>>>>
>>>>>>       
>>>>>>       --
>>>>>>       [1] - https://scanmail.trustwave.com/?c=20988&d=8fTi4h2MjPVKSgkegODmtLtuBcIeiZ2hXGl-X75J5A&u=https%3a%2f%2fgitlab%2edenx%2ede%2fu-boot%2fu-boot%2fblob%2fmaster%2farch%2farm%2fmach-mvebu%2fcpu%2ec%23L489
>> ^^^ sorry, as a result of a reasonably high profile cyber attack on a
>> local institution URLs in our incoming mail are intercepted.
>>>>>>    arch/arm/boot/dts/armada-38x.dtsi |  5 +++++
>>>>>>    arch/arm/mach-mvebu/board-v7.c    | 27 +++++++++++++++++++++++++++
>>>>>>    2 files changed, 32 insertions(+)
>>>>>>
>>>>>> diff --git a/arch/arm/boot/dts/armada-38x.dtsi b/arch/arm/boot/dts/armada-38x.dtsi
>>>>>> index 3f4bb44d85f0..3214c67433eb 100644
>>>>>> --- a/arch/arm/boot/dts/armada-38x.dtsi
>>>>>> +++ b/arch/arm/boot/dts/armada-38x.dtsi
>>>>>> @@ -386,6 +386,11 @@
>>>>>>    				      <0x20250 0x8>;
>>>>>>    			};
>>>>>>    
>>>>>> +			ioerrc: io-err-control@...00 {
>>>>>> +				compatible = "marvell,io-err-control";
>>>>>> +				reg = <0x20200 0x4>;
>>>>>> +			};
>>>>>> +
>>>>>>    			mpic: interrupt-controller@...00 {
>>>>>>    				compatible = "marvell,mpic";
>>>>>>    				reg = <0x20a00 0x2d0>, <0x21070 0x58>;
>>>>>> diff --git a/arch/arm/mach-mvebu/board-v7.c b/arch/arm/mach-mvebu/board-v7.c
>>>>>> index d2df5ef9382b..fb7718386ef9 100644
>>>>>> --- a/arch/arm/mach-mvebu/board-v7.c
>>>>>> +++ b/arch/arm/mach-mvebu/board-v7.c
>>>>>> @@ -138,10 +138,36 @@ static void __init i2c_quirk(void)
>>>>>>    	}
>>>>>>    }
>>>>>>    
>>>>>> +#define MBUS_ERR_PROP_EN BIT(8)
>>>>>> +
>>>>>> +/*
>>>>>> + * U-boot disables MBUS error propagation. Re-enable it so we
>>>>>> + * can handle them as Bus Errors.
>>>>>> + */
>>>>>> +static void __init enable_mbus_error_propagation(void)
>>>>>> +{
>>>>>> +	struct device_node *np =
>>>>>> +		of_find_compatible_node(NULL, NULL, "marvell,io-err-control");
>>>>>> +
>>>>>> +	if (np) {
>>>>>> +		void __iomem *reg;
>>>>>> +
>>>>>> +		reg = of_iomap(np, 0);
>>>>>> +		if (reg) {
>>>>>> +			u32 val;
>>>>>> +
>>>>>> +			val = readl_relaxed(reg);
>>>>>> +			writel_relaxed(val | MBUS_ERR_PROP_EN, reg);
>>>>>> +		}
>>>>>> +		of_node_put(np);
>>>>>> +	}
>>>>>> +}
>>>>>> +
>>>>>>    static void __init mvebu_dt_init(void)
>>>>>>    {
>>>>>>    	if (of_machine_is_compatible("marvell,armadaxp"))
>>>>>>    		i2c_quirk();
>>>>>> +	enable_mbus_error_propagation();
>>>>>>    }
>>>>>>    
>>>>>>    static void __init armada_370_xp_dt_fixup(void)
>>>>>> @@ -191,6 +217,7 @@ DT_MACHINE_START(ARMADA_38X_DT, "Marvell Armada 380/385 (Device Tree)")
>>>>>>    	.l2c_aux_val	= 0,
>>>>>>    	.l2c_aux_mask	= ~0,
>>>>>>    	.init_irq       = mvebu_init_irq,
>>>>>> +	.init_machine	= mvebu_dt_init,
>>>>>>    	.restart	= mvebu_restart,
>>>>>>    	.dt_compat	= armada_38x_dt_compat,
>>>>>>    MACHINE_END
>>>>>> -- 
>>>>>> 2.24.0
>>>>>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ