Date:   Tue, 12 Sep 2023 10:52:10 +0200
From:   Köry Maincent <kory.maincent@...tlin.com>
To:     Serge Semin <fancer.lancer@...il.com>
Cc:     Cai Huoqing <cai.huoqing@...ux.dev>,
        Manivannan Sadhasivam <mani@...nel.org>,
        Vinod Koul <vkoul@...nel.org>,
        Gustavo Pimentel <Gustavo.Pimentel@...opsys.com>,
        dmaengine@...r.kernel.org, linux-kernel@...r.kernel.org,
        Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
        Herve Codina <herve.codina@...tlin.com>
Subject: Re: [PATCH 4/9] dmaengine: dw-edma: HDMA: Add memory barrier before
 starting the DMA transfer in remote setup

Hello Serge,

I am back with an answer from the hardware designers:
> "Even though the PCIe itself respects the transactions ordering, the 
> AXI bus does not have an end-to-end completion acknowledgement (it
> terminates at the PCIe EP boundary with bus), and does not guarantee
> ordering if accessing different destinations on the Bus. So, an access to LL
> could be declared complete even though the transaction is still being
> pipelined in the AXI Bus. (a dozen or so clocks, I can give an accurate
> number if needed)
> 
> The access to DMA registers is done through BAR0 “rolling”
> so the transaction does not actually go out on the AXI bus and get
> looped back to the PCIe DMA; rather, it stays inside the PCIe EP.
> 
> For the above reasons, hypothetically, there’s a chance that even if the DMA
> LL is accessed before the DM DB from PCIe RC side, the DB could be updated
> before the LL in local memory."

On Thu, 22 Jun 2023 19:22:20 +0300
Serge Semin <fancer.lancer@...il.com> wrote:
 
> If we get assured that hardware with such problem exists (if you'll get
> confirmation about the supposition 3. above) then we'll need to
> activate your trick for that hardware only. Adding dummy reads for all
> the remote eDMA setups doesn't look correct since it adds additional
> delay to the execution path and especially seeing nobody has noticed
> and reported such problem so far (for instance Gustavo didn't see the
> problem on his device otherwise he would have fixed it).
> 
> So if assumption 3. is correct then I'd suggest the next
> implementation: add a new dw_edma_chip_flags flag defined (a.k.a
> DW_EDMA_SLOW_MEM), have it specified via the dw_edma_chip.flags field
> in the Akida device probe() method and activate your trick only if
> that flag is set.

The flag you suggested is about slow memory writes but, as said above, the
issue comes from the AXI bus and not from the memory. I am wondering why you
don't see this issue. If I understand correctly, it should be present on all
IPs, as the DMA registers are internal to the IP while the LL memory is
external and reached through the AXI bus. Did you stress your IP? On my side
it appears under heavy load, with several (at least 3) threads issuing
operations through 2 DMA channels.

Köry
