[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <cba6da940d7b60ce7362c9588ebb2a33@codeaurora.org>
Date: Fri, 27 Aug 2021 10:29:59 -0700
From: Bhaumik Bhatt <bbhatt@...eaurora.org>
To: Jeffrey Hugo <quic_jhugo@...cinc.com>
Cc: mani@...nel.org, hemantk@...eaurora.org,
linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] bus: mhi: core: Use cached values for calculating the
shared write pointer
On 2021-08-27 07:13 AM, Jeffrey Hugo wrote:
> mhi_recycle_ev_ring() computes the shared write pointer for the ring
> (ctxt_wp) using a read/modify/write pattern where the ctxt_wp value in
> the
> shared memory is read, incremented, and written back. There are no
> checks
> on the read value, it is assumed that it is kept in sync with the
> locally
> cached value. Per the MHI spec, this is correct. The device should
> only
> read ctxt_wp, never write it.
>
> However, there are devices in the wild that violate the spec, and can
> update the ctxt_wp in a specific scenario. This can cause corruption,
> and
> violate the above assumption that the ctxt_wp is in sync with the
> cached
> value.
>
> This can occur when the device has loaded firmware from the host, and
> is
> transitioning from the SBL EE to the AMSS EE. As part of shutting down
> SBL, the SBL flushes it's local MHI context to the shared memory since
> the local context will not persist across an EE change. In the case of
> the event ring, SBL will flush its entire context, not just the parts
> that
> it is allowed to update. This means SBL will write to ctxt_wp, and
> possibly corrupt it.
>
> An example:
>
> Host Device
> ---- ---
> Update ctxt_wp to 0x1f0
> SBL observes 0x1f0
> Update ctxt_wp to 0x0
> Starts transition to AMSS EE
> Context flush, writes 0x1f0 to ctxt_wp
> Update ctxt_wp to 0x200
> Update ctxt_wp to 0x210
> AMSS observes 0x210
> 0x210 exceeds ring size
> AMSS signals syserr
>
> The reason the ctxt_wp goes off the end of the ring is that the
> rollover
> check is only performed on the cached wp, which is out of sync with
> ctxt_wp.
>
> Since the host is the authority of the value of ctxt_wp per the MHI
> spec,
> we can fix this issue by not reading ctxt_wp from the shared memory,
> and
> instead compute it based on the cached value. If SBL corrupts ctxt_wp,
> the host won't observe it, and will correct the value at some point
> later.
>
> Signed-off-by: Jeffrey Hugo <quic_jhugo@...cinc.com>
> Reviewed-by: Hemant Kumar <hemantk@...eaurora.org>
> ---
>
> v2:
> Fix typo on the ring base
>
> drivers/bus/mhi/core/main.c | 9 ++-------
> 1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
> index c01ec2f..dc86fdb3 100644
> --- a/drivers/bus/mhi/core/main.c
> +++ b/drivers/bus/mhi/core/main.c
> @@ -533,18 +533,13 @@ irqreturn_t mhi_intvec_handler(int irq_number,
> void *dev)
> static void mhi_recycle_ev_ring_element(struct mhi_controller
> *mhi_cntrl,
> struct mhi_ring *ring)
> {
> - dma_addr_t ctxt_wp;
> -
> /* Update the WP */
> ring->wp += ring->el_size;
> - ctxt_wp = *ring->ctxt_wp + ring->el_size;
>
> - if (ring->wp >= (ring->base + ring->len)) {
> + if (ring->wp >= (ring->base + ring->len))
> ring->wp = ring->base;
> - ctxt_wp = ring->iommu_base;
> - }
>
> - *ring->ctxt_wp = ctxt_wp;
> + *ring->ctxt_wp = ring->iommu_base + (ring->wp - ring->base);
>
> /* Update the RP */
> ring->rp += ring->el_size;
Reviewed-by: Bhaumik Bhatt <bbhatt@...eaurora.org>
Thanks,
Bhaumik
---
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum,
a Linux Foundation Collaborative Project
Powered by blists - more mailing lists