netdev - Re: [PATCH net] net: avoid 32 x truesize under-estimation for tiny skbs

Message-ID: <f3f867cf6814510817b253e6aca997cdd3acc48a.camel@redhat.com>
Date:   Thu, 08 Sep 2022 20:01:42 +0200
From:   Paolo Abeni <pabeni@...hat.com>
To:     Alexander H Duyck <alexander.duyck@...il.com>,
        Eric Dumazet <eric.dumazet@...il.com>,
        "David S . Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>
Cc:     netdev <netdev@...r.kernel.org>,
        Eric Dumazet <edumazet@...gle.com>,
        Alexander Duyck <alexanderduyck@...com>,
        "Michael S . Tsirkin" <mst@...hat.com>,
        Greg Thelen <gthelen@...gle.com>
Subject: Re: [PATCH net] net: avoid 32 x truesize under-estimation for tiny
 skbs

On Thu, 2022-09-08 at 07:53 -0700, Alexander H Duyck wrote:
> On Thu, 2022-09-08 at 13:00 +0200, Paolo Abeni wrote:
> > In most builds, GRO_MAX_HEAD packets are even larger (should be 640)
> 
> Right, which is why I am thinking we may want to default to a 1K slice.

OK, it looks like there is agreement to force a minimum frag size of 1K.
Side note: that should not increase memory usage compared to the slab
allocator, as kmalloc(640) should use the kmalloc-1k slab.
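
As a rough illustration of the size-class point, here is a user-space
sketch. It assumes power-of-two buckets only; pow2_bucket() is a
hypothetical helper, not a kernel function, and the real kmalloc size
classes also include 96- and 192-byte caches that this model ignores.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel API): return the smallest
 * power-of-two bucket, starting at 8 bytes, that can hold 'size'.
 * This is a simplified model of slab size classes.
 */
size_t pow2_bucket(size_t size)
{
        size_t bucket = 8;

        assert(size > 0);
        while (bucket < size)
                bucket <<= 1;
        return bucket;
}
```

Under this model a 640-byte request lands in the 1024-byte bucket, so a
1K minimum frag size costs no more than the slab allocation it replaces.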

[...]

> > > 
> > If the pagecnt optimization should be dropped, it would probably be
> > more straightforward to use/adapt 'page_frag' for the page_order0
> > allocator.
> 
> That would make sense. Basically we could get rid of the pagecnt bias
> and add the fixed number of slices to the count at allocation so we
> would just need to track the offset to decide when we need to allocate
> a new page. In addition, if we are flushing the page when it is depleted
> we don't have to mess with the pfmemalloc logic.

Uhmm... it looks like the existing page_frag allocator does not
always flush the depleted page:

bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t gfp)
{
        if (pfrag->page) {
                if (page_ref_count(pfrag->page) == 1) {
                        pfrag->offset = 0;
                        return true;
                }

so I'll try adding some separate/specialized code and see if the
overall complexity would be reasonable.
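
To make the reuse path concrete, here is a self-contained user-space
model of the check quoted above. All names (model_page, model_frag,
model_frag_refill, MODEL_PAGE_SIZE) are hypothetical stand-ins, the
refcount is a plain int, and the allocation fallback is left to the
caller, so this is only a sketch of the behaviour, not the kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed order-0 page size */

struct model_page {
        int refcount;           /* stand-in for page_ref_count() */
};

struct model_frag {
        struct model_page *page;
        unsigned int offset;    /* next free byte in the page */
};

/*
 * Return true if the current page can serve 'sz' more bytes.
 * '*recycled' is set when the page was rewound rather than flushed,
 * which happens when the allocator holds the only reference.
 */
bool model_frag_refill(struct model_frag *pfrag, unsigned int sz,
                       bool *recycled)
{
        *recycled = false;
        if (!pfrag->page)
                return false;

        if (pfrag->page->refcount == 1) {
                /* No other users: reuse the same page from the start. */
                pfrag->offset = 0;
                *recycled = true;
                return true;
        }

        /* Page still shared: usable only if enough room is left. */
        return pfrag->offset + sz <= MODEL_PAGE_SIZE;
}
```

With refcount == 1 and the offset near the end of the page, the model
rewinds the offset to 0 instead of releasing the page, which is the
"does not always flush" case; with refcount > 1 and too little room
left, it returns false and the caller would have to get a new page.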

> > BTW it's quite strange/confusing having two very similar APIs (page_frag
> > and page_frag_cache) with very similar names and no references between
> > them.
> 
> I'm not sure what you are getting at here. There are plenty of
> references between them, they just aren't direct.

Looking at/grepping the tree, I could not easily tell when 'struct
page_frag' should be preferred over 'struct page_frag_cache' and/or
vice versa; I had to look at the respective implementation details.

Thanks,

Paolo