lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230320104719.mane5faxvv6ofpiv@bogus>
Date:   Mon, 20 Mar 2023 10:47:19 +0000
From:   Sudeep Holla <sudeep.holla@....com>
To:     Shanker Donthineni <sdonthineni@...dia.com>
Cc:     Marc Zyngier <maz@...nel.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Sudeep Holla <sudeep.holla@....com>,
        Will Deacon <will@...nel.org>,
        Jonathan Corbet <corbet@....net>,
        Mark Rutland <mark.rutland@....com>,
        Lorenzo Pieralisi <lpieralisi@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-arm-kernel@...ts.infradead.org, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, Vikram Sethi <vsethi@...dia.com>,
        Thierry Reding <treding@...dia.com>
Subject: Re: [PATCH v5] irqchip/gicv3: Workaround for NVIDIA erratum
 T241-FABRIC-4

On Sat, Mar 18, 2023 at 09:43:14PM -0500, Shanker Donthineni wrote:
> The T241 platform suffers from the T241-FABRIC-4 erratum which causes
> unexpected behavior in the GIC when multiple transactions are received
> simultaneously from different sources. This hardware issue impacts
> NVIDIA server platforms that use more than two T241 chips
> interconnected. Each chip has support for 320 {E}SPIs.
> 
> This issue occurs when multiple packets from different GICs are
> incorrectly interleaved at the target chip. The erratum text below
> specifies exactly what can cause multiple transfer packets susceptible
> to interleaving and GIC state corruption. GIC state corruption can
> lead to a range of problems, including kernel panics, and unexpected
> behavior.
> 
> From the erratum text:
>   "In some cases, inter-socket AXI4 Stream packets with multiple
>   transfers, may be interleaved by the fabric when presented to ARM
>   Generic Interrupt Controller. GIC expects all transfers of a packet
>   to be delivered without any interleaving.
> 
>   The following GICv3 commands may result in multiple transfer packets
>   over inter-socket AXI4 Stream interface:
>    - Register reads from GICD_I* and GICD_N*
>    - Register writes to 64-bit GICD registers other than GICD_IROUTERn*
>    - ITS command MOVALL
> 
>   Multiple commands in GICv4+ utilize multiple transfer packets,
>   including VMOVP, VMOVI, VMAPP, and 64-bit register accesses."
> 
>   This issue impacts system configurations with more than 2 sockets,
>   that require multi-transfer packets to be sent over inter-socket
>   AXI4 Stream interface between GIC instances on different sockets.
>   GICv4 cannot be supported. GICv3 SW model can only be supported
>   with the workaround. Single and Dual socket configurations are not
>   impacted by this issue and support GICv3 and GICv4."
> 
> Link: https://developer.nvidia.com/docs/t241-fabric-4/nvidia-t241-fabric-4-errata.pdf
> 
> Writing to the chip alias region of the GICD_In{E} registers except
> GICD_ICENABLERn has an equivalent effect as writing to the global
> distributor. The SPI interrupt deactivate path is not impacted by
> the erratum.
> 
> To fix this problem, implement a workaround that ensures read accesses
> to the GICD_In{E} registers are directed to the chip that owns the
> SPI, and disable GICv4.x features. To simplify code changes, the
> gic_configure_irq() function uses the same alias region for both read
> and write operations to GICD_ICFGR.
> 
> Co-developed-by: Vikram Sethi <vsethi@...dia.com>
> Signed-off-by: Vikram Sethi <vsethi@...dia.com>
> Signed-off-by: Shanker Donthineni <sdonthineni@...dia.com>
> ---
> Changes since v4:
>  - Resolve Marc's comments https://lore.kernel.org/all/871qlqif9v.wl-maz@kernel.org/
> Changes since v3:
>  - Fix the build issue for the 32bit arch
> Changes since v2:
>  - Add accessors for the SOC-ID version & revision

SMCCC/SOC ID part looks good to me. In case you spin another version for any
reason, I would prefer you split those changes into separate patch. Otherwise

Acked-by: Sudeep Holla <sudeep.holla@....com> (for SMCCC/SOC ID bits)

-- 
Regards,
Sudeep

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ