Message-ID: <CAKgT0Ud366SsaLftQ6Gd4hg+MW9VixOhG9nA9pa4VKh0maozBg@mail.gmail.com>
Date: Mon, 15 Apr 2024 11:55:37 -0700
From: Alexander Duyck <alexander.duyck@...il.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Yunsheng Lin <linyunsheng@...wei.com>, netdev@...r.kernel.org, 
	Alexander Duyck <alexanderduyck@...com>, davem@...emloft.net, pabeni@...hat.com
Subject: Re: [net-next PATCH 13/15] eth: fbnic: add basic Rx handling

On Mon, Apr 15, 2024 at 11:19 AM Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Mon, 15 Apr 2024 11:03:13 -0700 Alexander Duyck wrote:
> > This wasn't my full argument. You truncated the part where I
> > specifically called out that it is hard to justify us pushing a
> > proprietary API that is only used by our driver.
>
> I see. Please be careful when making such arguments, tho.
>
> > The "we know best" is more of an "I know best" as someone who has
> > worked with page pool and the page fragment API since well before it
> > existed. My push back is based on the fact that we don't want to
> > allocate fragments, we want to allocate pages and fragment them
> > ourselves after the fact. As such it doesn't make much sense to add an
> > API that will have us trying to use the page fragment API which holds
> > onto the page when the expectation is that we will take the whole
> > thing and just fragment it ourselves.
>
> To be clear I'm not arguing for the exact use of the API as suggested.
> Or even that we should support this in the shared API. One would
> probably have to take a stab at coding it up to find out what works
> best. My first try FWIW would be to mask off the low bits of the
> page index, eg. for 64k page making entries 0-15 all use rx_buf
> index 0...
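
If I am reading that suggestion right, it would be something along
these lines (just a sketch with made-up names, assuming a 4K slice per
descriptor, not actual fbnic code):

/* Sketch only: one rx_buf shared by all descriptor entries that land
 * in the same page, selected by masking off the low bits of the index.
 */
#define FBNIC_BUF_SHIFT		12	/* 4K slice per descriptor */
#define FBNIC_BUFS_PER_PAGE	(PAGE_SIZE >> FBNIC_BUF_SHIFT)	/* 16 with 64K pages */

static struct fbnic_rx_buf *fbnic_idx_to_buf(struct fbnic_ring *ring, u32 idx)
{
	/* Entries 0-15 all resolve to rx_buf index 0, 16-31 to 16, etc. */
	return &ring->rx_buf[idx & ~(FBNIC_BUFS_PER_PAGE - 1)];
}

static unsigned int fbnic_idx_to_off(u32 idx)
{
	/* The low bits pick the 4K slice within the shared page */
	return (idx & (FBNIC_BUFS_PER_PAGE - 1)) << FBNIC_BUF_SHIFT;
}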

It would take a few more changes to make it all work. Basically we
would need to map the page into every descriptor entry, since the
worst case is that things get tight enough that the page is only
partially mapped: we are working through it as a set of 4K slices,
with some at the beginning already unmapped from the descriptor ring
while others are still waiting to be assigned to a descriptor and
used. What I would probably have to look at doing is adding some sort
of cache on the ring to hold onto the page while we dole it out 4K at
a time to the descriptors. Either that or enforce a hard 16 descriptor
limit where we have to assign a full page with every allocation, which
puts us at a higher risk of starving the device for memory.
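
Roughly, the cache I am thinking of would look something like this
(again just a sketch with hypothetical names, assuming the ring owns a
page_pool and hands out a 4K slice per descriptor):

/* Sketch only: hold the current page on the ring and dole it out 4K
 * at a time to descriptors; names are hypothetical.
 */
struct fbnic_pg_cache {
	struct page	*page;
	unsigned int	offset;		/* next unused 4K slice */
};

static int fbnic_fill_desc(struct fbnic_ring *ring, struct fbnic_pg_cache *pc,
			   union fbnic_rx_desc *desc)
{
	if (!pc->page) {
		pc->page = page_pool_dev_alloc_pages(ring->page_pool);
		if (!pc->page)
			return -ENOMEM;
		pc->offset = 0;
	}

	/* hypothetical helper writing page + offset into the descriptor */
	fbnic_set_rx_desc(desc, pc->page, pc->offset);
	pc->offset += SZ_4K;

	/* once the whole page is handed out, grab a new one next time */
	if (pc->offset >= PAGE_SIZE)
		pc->page = NULL;

	return 0;
}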

The bigger issue would be how we could test it. This is an OCP NIC,
and as far as I am aware we don't have any systems available that
support a 64K page size. I suppose I could build QEMU for an
architecture that supports 64K pages and test it that way. It would
just be painful to have to set up a virtual system to exercise code
that would never actually be used. I am also not sure QEMU can
generate enough stress to really test the page allocator and make
sure all the corner cases are covered.
