lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGngYiUvJE+L4-tw91ozPaq7mGUbh0PS0q7MpLnHVwDqGrFwEw@mail.gmail.com>
Date:   Tue, 8 Dec 2020 22:49:16 -0500
From:   Sven Van Asbroeck <thesven73@...il.com>
To:     Florian Fainelli <f.fainelli@...il.com>
Cc:     Andrew Lunn <andrew@...n.ch>, Jakub Kicinski <kuba@...nel.org>,
        Bryan Whitehead <bryan.whitehead@...rochip.com>,
        Microchip Linux Driver Support <UNGLinuxDriver@...rochip.com>,
        David S Miller <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net v1 2/2] lan743x: boost performance: limit PCIe
 bandwidth requirement

On Tue, Dec 8, 2020 at 6:36 PM Florian Fainelli <f.fainelli@...il.com> wrote:
>
> dma_sync_single_for_{cpu,device} is what you would need in order to make
> a partial cache line invalidation. You would still need to unmap the
> same address+length pair that was used for the initial mapping otherwise
> the DMA-API debugging will rightfully complain.

I tried replacing
    dma_unmap_single(9K, DMA_FROM_DEVICE);
with
    dma_sync_single_for_cpu(received_size=1500 bytes, DMA_FROM_DEVICE);
    dma_unmap_single_attrs(9K, DMA_FROM_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);

and that works! But the bandwidth is still pretty bad, because the cpu
now spends most of its time doing
    dma_map_single(9K, DMA_FROM_DEVICE);
which spends a lot of time doing __dma_page_cpu_to_dev.

When I try and replace that with
    dma_map_single_attrs(9K, DMA_FROM_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
Then I get lots of dropped packets, which seems to indicate data corruption.

Interestingly, when I do
    dma_map_single_attrs(9K, DMA_FROM_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
    dma_sync_single_for_{cpu|device}(9K, DMA_FROM_DEVICE);
then the dropped packets disappear, but things are still very slow.

What am I missing?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ