Message-ID: <84115706-6eac-d4c9-a815-6def8675fb56@intel.com>
Date:   Mon, 3 Jul 2023 15:38:01 +0200
From:   Alexander Lobakin <aleksander.lobakin@...el.com>
To:     Alexander Duyck <alexander.duyck@...il.com>
CC:     "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
        Larysa Zaremba <larysa.zaremba@...el.com>,
        Yunsheng Lin <linyunsheng@...wei.com>,
        Alexander Duyck <alexanderduyck@...com>,
        "Jesper Dangaard Brouer" <hawk@...nel.org>,
        Ilias Apalodimas <ilias.apalodimas@...aro.org>,
        <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC net-next 2/4] net: page_pool: avoid calling no-op
 externals when possible

From: Alexander Duyck <alexander.duyck@...il.com>
Date: Fri, 30 Jun 2023 11:28:30 -0700

> On Fri, Jun 30, 2023 at 8:34 AM Alexander Lobakin
> <aleksander.lobakin@...el.com> wrote:

[...]

>> On my setup and with patch #4, I have literally 0 allocations once a
>> ring is filled. This means dma_need_sync() is not called at all during
>> Rx, while sync_for_device() would be called all the time.
>> When pages go through ptr_ring, sometimes new allocations happen, but
>> still the number of times dma_need_sync() is called is thousands of
>> times lower.
> 
> I see, so you are using it as a screener for pages as they are added
> to the pool. However, the first time somebody trips dma_need_sync(),
> everybody in the pool is going to get hit with the sync code.

Right. "Better safe than sorry". If at least one page needs sync, let's
drop the shortcut. It won't make it worse than the mainline anyway.
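
To illustrate, a completely untested sketch (not the actual patch; the
->dma_sync field name is made up): dma_need_sync() is evaluated only
when a page is mapped, i.e. on the slow allocation path, and one
"dirty" page disables the shortcut for the whole pool.

static bool page_pool_dma_map(struct page_pool *pool, struct page *page)
{
	dma_addr_t dma;

	dma = dma_map_page_attrs(pool->p.dev, page, 0,
				 PAGE_SIZE << pool->p.order,
				 pool->p.dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
	if (dma_mapping_error(pool->p.dev, dma))
		return false;

	page_pool_set_dma_addr(page, dma);

	/* Called once per page here, not once per frame on Rx */
	if (dma_need_sync(pool->p.dev, dma))
		pool->dma_sync = true;

	return true;
}

/* ...and then on the recycling/alloc hotpath: */

	if ((pool->p.flags & PP_FLAG_DMA_SYNC_DEV) && pool->dma_sync)
		page_pool_dma_sync_for_device(pool, page, dma_sync_size);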

> 
>>> dma_need_sync for every frame then maybe we should look at folding it
>>> into page_pool_dma_sync_for_device itself since that is the only
>>> consumer of it, or just fold it into the PP_FLAG_DMA_SYNC_DEV if
>>> statement after the flag check rather than adding yet another flag
>>> that will likely always be true for most devices. Otherwise you are
>>
>> What you suggest is either calling dma_need_sync() each time a page is
>> requested, or introducing a flag stored somewhere in struct page to
>> allow some optimization for the really-not-common cases where
>> dma_need_sync() might return different values due to swiotlb etc. Did I
>> get that right?
> 
> Yeah, my thought would be to have a flag in the page to indicate if it
> will need the sync bits or not. Then you could look at exposing that
> to the drivers as well so that they could cut down on their own
> overhead. We could probably look at just embedding a flag in the lower
> bits of the DMA address stored in the page, since I suspect that, at a
> minimum, the resultant DMA address for a page would always be at least
> aligned to a long, if not a full page.

As for drivers, I could add a wrapper like page_pool_need_sync() to test
for DMA_SYNC_DEV; I'm not sure checking on a per-page basis is worth it.
Also, having that bit in struct page would force us to always fetch its
cacheline. Right now, when the sync is skipped, we also avoid touching
struct page, or at least postpone it: page_address() (for &xdp_buff or
build_skb()) doesn't touch it.
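
Something like this (untested; per-pool, so it only reads the page_pool
struct the driver already has in cache, not struct page):

/* Let drivers skip their own sync logic when the pool doesn't sync at
 * all (->dma_sync is the made-up field from the sketch above).
 */
static inline bool page_pool_need_sync(const struct page_pool *pool)
{
	return (pool->p.flags & PP_FLAG_DMA_SYNC_DEV) && pool->dma_sync;
}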

As for a possible implementation, I also thought of the lowest bit of the
DMA address. The alignment can probably be lower than %PAGE_SIZE in some
cases (at least outside of Page Pool), but never down to 1-byte
granularity.
But again, we need some group decision on whether it's worth doing on a
per-page basis :)
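
In case it helps, a sketch of the bit-stuffing (untested;
PP_DMA_NEED_SYNC is a made-up name, and every consumer of the DMA
address would then have to mask the flag out):

#define PP_DMA_NEED_SYNC	BIT(0)

/* Mapping time: tag the stored address when the page needs syncs */
static void page_pool_save_dma_addr(struct page_pool *pool,
				    struct page *page, dma_addr_t dma)
{
	if (dma_need_sync(pool->p.dev, dma))
		dma |= PP_DMA_NEED_SYNC;

	page_pool_set_dma_addr(page, dma);
}

/* Users of the address must strip the flag... */
static inline dma_addr_t page_pool_real_dma_addr(struct page *page)
{
	return page_pool_get_dma_addr(page) & ~PP_DMA_NEED_SYNC;
}

/* ...and the sync path tests it */
static inline bool page_pool_page_need_sync(struct page *page)
{
	return page_pool_get_dma_addr(page) & PP_DMA_NEED_SYNC;
}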

Thanks,
Olek
