lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Fri, 6 May 2022 11:01:45 -0700 From: Hemant Kumar <quic_hemantk@...cinc.com> To: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>, Loic Poulain <loic.poulain@...aro.org> CC: <mhi@...ts.linux.dev>, <linux-arm-msm@...r.kernel.org>, <linux-kernel@...r.kernel.org>, <quic_bbhatt@...cinc.com> Subject: Re: [PATCH 5/5] bus: mhi: host: Remove redundant dma_wmb() before ctx wp update Hi Loic, On 5/6/2022 10:41 AM, Hemant Kumar wrote: > Hi Loic, > > On 5/4/2022 8:58 AM, Manivannan Sadhasivam wrote: >> On Wed, May 04, 2022 at 11:25:33AM +0200, Loic Poulain wrote: >>> On Wed, 4 May 2022 at 10:17, Manivannan Sadhasivam >>> <manivannan.sadhasivam@...aro.org> wrote: >>>> Hi Loic, >>>> >>>> On Wed, May 04, 2022 at 09:21:20AM +0200, Loic Poulain wrote: >>>>> Hi Mani, >>>>> >>>>> On Mon, 2 May 2022 at 12:42, Manivannan Sadhasivam >>>>> <manivannan.sadhasivam@...aro.org> wrote: >>>>>> The endpoint device will only read the context wp when the host rings >>>>>> the doorbell. >>>>> Are we sure about this statement? what if we update ctxt_wp while the >>>>> device is still processing the previous ring? is it going to continue >>>>> processing the new ctxt_wp or wait for a new doorbell interrupt? what >>>>> about burst mode in which we don't ring at all (ring_db is no-op)? >>>>> >>>> Good point. I think my statement was misleading. But still this scenario won't >>>> happen as per my undestanding. Please see below. >>>> >>>>>> And moreover the doorbell write is using writel(). This >>>>>> guarantess that the prior writes will be completed before ringing >>>>>> doorbell. >>>>> Yes but the barrier is to ensure that descriptor/ring content is >>>>> updated before we actually pass it to device ownership, it's not about >>>>> ordering with the doorbell write, but the memory coherent ones. >>>>> >>>> I see a clear data dependency between writing the ring element and updating the >>>> context pointer. For instance, >>>> >>>> ``` >>>> struct mhi_ring_element *mhi_tre; >>>> >>>> mhi_tre = ring->wp; >>>> /* Populate mhi_tre */ >>>> ... >>>> >>>> /* Increment wp */ >>>> ring->wp += el_size; >>>> >>>> /* Update ctx wp */ >>>> ring->ctx_wp = ring->iommu_base + (ring->wp - ring->base); >>>> ``` >>>> >>>> This is analogous to: >>>> >>>> ``` >>>> Read PTR A; >>>> Update PTR A; >>>> Increment PTR A; >>>> Write PTR A to PTR B; >>>> ``` >>> Interesting point, but shouldn't it be more correct to translate it as: >>> >>> 1. Write PTR A to PTR B (mhi_tre); >>> 2. Update PTR B DATA; >>> 3. Increment PTR A; >>> 4. Write PTR A to PTR C; >>> >>> In that case, it looks like line 2. has no ordering constraint with 3. >>> & 4? whereas the following guarantee it: >>> >>> 1. Write PTR A to PTR B (mhi_tre); >>> 2. Update PTR B DATA; >>> 3. Increment PTR A; >>> dma_wmb() >>> 4. Write PTR A to PTR C; >>> >>> To be honest, compiler optimization is beyond my knowledge, so I don't >>> know if a specific compiler arch/version could be able to mess up the >>> sequence or not. But this pattern is really close to what is described >>> for dma_wmb() usage in Documentation/memory-barriers.txt. That's why I >>> challenged this change and would be conservative, keeping the explicit >>> barrier. >>> >> Hmm. Since I was reading the memory model and going through the MHI code, I >> _thought_ that this dma_wmb() is redundant. But I missed the fact that the >> updating to memory pointed by "wp" happens implicitly via a pointer. So that >> won't qualify as a direct dependency. >> >>>> Here, because of the data dependency due to "ring->wp", the CPU or compiler >>>> won't be ordering the instructions. I think that's one of the reason we never >>>> hit any issue due to this. >>> You may be right here about the implicit ordering guarantee... So if >>> you're sure, I think it would deserve an inline comment to explain why >>> we don't need a memory barrier as in the 'usual' dma descriptor update >>> sequences. >>> >> I think the barrier makes sense now. Sorry for the confusion and thanks for the >> explanations. >> >> Thanks, >> Mani >> >>> Loic > > You made a good point. After following your conversation, in case of > burst mode is enabled and currently > > we are in polling mode, does it make sense to move dma_wmb after > updating channel WP context ? > > DB ring is going to get skipped when we are in pilling mode. > > instead of dma_wmb(); > *ring->ctxt_wp = cpu_to_le64(db); > > *ring->ctxt_wp = cpu_to_le64(db); dma_wmb(); > > Thanks, > Hemant > i think i spoke too fast. I think we dont need to worry about the polling mode as the context_wp update would happen at some point of time and that does not require dma_wmb after update context wp. Thanks, Hemant
Powered by blists - more mailing lists