Message-ID: <YH6MchNQPgFjfuQ+@apalos.home>
Date: Tue, 20 Apr 2021 11:10:26 +0300
From: Ilias Apalodimas <ilias.apalodimas@...aro.org>
To: Matthew Wilcox <willy@...radead.org>
Cc: Jesper Dangaard Brouer <brouer@...hat.com>,
Shakeel Butt <shakeelb@...gle.com>,
Matteo Croce <mcroce@...ux.microsoft.com>,
netdev <netdev@...r.kernel.org>, Linux MM <linux-mm@...ck.org>,
Ayush Sawal <ayush.sawal@...lsio.com>,
Vinay Kumar Yadav <vinay.yadav@...lsio.com>,
Rohit Maheshwari <rohitm@...lsio.com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
Marcin Wojtas <mw@...ihalf.com>,
Russell King <linux@...linux.org.uk>,
Mirko Lindner <mlindner@...vell.com>,
Stephen Hemminger <stephen@...workplumber.org>,
Tariq Toukan <tariqt@...dia.com>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
John Fastabend <john.fastabend@...il.com>,
Boris Pismenny <borisp@...dia.com>,
Arnd Bergmann <arnd@...db.de>,
Andrew Morton <akpm@...ux-foundation.org>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Vlastimil Babka <vbabka@...e.cz>, Yu Zhao <yuzhao@...gle.com>,
Will Deacon <will@...nel.org>,
Michel Lespinasse <walken@...gle.com>,
Fenghua Yu <fenghua.yu@...el.com>,
Roman Gushchin <guro@...com>, Hugh Dickins <hughd@...gle.com>,
Peter Xu <peterx@...hat.com>, Jason Gunthorpe <jgg@...pe.ca>,
Guoqing Jiang <guoqing.jiang@...ud.ionos.com>,
Jonathan Lemon <jonathan.lemon@...il.com>,
Alexander Lobakin <alobakin@...me>,
Cong Wang <cong.wang@...edance.com>, wenxu <wenxu@...oud.cn>,
Kevin Hao <haokexin@...il.com>,
Aleksandr Nogikh <nogikh@...gle.com>,
Jakub Sitnicki <jakub@...udflare.com>,
Marco Elver <elver@...gle.com>,
Willem de Bruijn <willemb@...gle.com>,
Miaohe Lin <linmiaohe@...wei.com>,
Yunsheng Lin <linyunsheng@...wei.com>,
Guillaume Nault <gnault@...hat.com>,
LKML <linux-kernel@...r.kernel.org>, linux-rdma@...r.kernel.org,
bpf <bpf@...r.kernel.org>, Eric Dumazet <edumazet@...gle.com>,
David Ahern <dsahern@...il.com>,
Lorenzo Bianconi <lorenzo@...nel.org>,
Saeed Mahameed <saeedm@...dia.com>,
Andrew Lunn <andrew@...n.ch>, Paolo Abeni <pabeni@...hat.com>
Subject: Re: [PATCH net-next v3 2/5] mm: add a signature in struct page
Hi Matthew,
[...]
>
> And the contents of this page already came from that device ... if it
> wanted to write bad data, it could already have done so.
>
> > > > (3) The page_pool is optimized for refcnt==1 case, and AFAIK TCP-RX
> > > > zerocopy will bump the refcnt, which means the page_pool will not
> > > > recycle the page when it see the elevated refcnt (it will instead
> > > > release its DMA-mapping).
> > >
> > > Yes this is right but the userspace might have already consumed and
> > > unmapped the page before the driver considers to recycle the page.
> >
> > That is a good point. So, there is a race window where it is possible
> > to gain recycling.
> >
> > It seems my page_pool co-maintainer Ilias is interested in taking up the
> > challenge to get this working with TCP RX zerocopy. So, lets see how
> > this is doable.
>
> You could also check page_ref_count() - page_mapcount() instead of
> just checking page_ref_count(). Assuming mapping/unmapping can't
> race with recycling?
>
That's not a bad idea. As I explained in my last reply to Shakeel, I don't
think the current patch will blow up anywhere. If the page is unmapped prior
to kfree_skb(), it will be recycled. If that happens in the reverse order,
we'll just free the page entirely and will have to re-allocate it.
The only thing I need to test is potential races (assuming those can even
happen?).
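To make the two orderings above concrete, here is a minimal userspace sketch
of the counter arithmetic only. It is not kernel code: struct model_page,
model_map(), model_unmap() and both can_recycle_*() helpers are hypothetical
names invented for illustration, and the model assumes every userspace
mapping holds exactly one page reference. It says nothing about whether
mapping/unmapping can race with recycling, which is still the open question.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two counters discussed in this thread.
 * This is NOT the kernel's struct page. */
struct model_page {
	int refcount;	/* models what page_ref_count() would return */
	int mapcount;	/* models what page_mapcount() would return */
};

/* Check as described in point (3): recycle only if the pool holds the
 * sole reference. */
static bool can_recycle_refcount_only(const struct model_page *p)
{
	return p->refcount == 1;
}

/* Suggested variant: discount the references held by mappings.
 * Assumption: each mapping accounts for exactly one reference. */
static bool can_recycle_discount_mappings(const struct model_page *p)
{
	assert(p->mapcount >= 0 && p->mapcount <= p->refcount);
	return p->refcount - p->mapcount == 1;
}

/* Models a zerocopy receive mapping the page into userspace. */
static void model_map(struct model_page *p)
{
	p->refcount++;
	p->mapcount++;
}

/* Models userspace unmapping the page again. */
static void model_unmap(struct model_page *p)
{
	assert(p->mapcount > 0);
	p->refcount--;
	p->mapcount--;
}
```

In this model, a page that was mapped and unmapped before the recycle check
passes the refcount-only test, while a page that is still mapped only passes
the variant that subtracts the mapcount.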
Trying to recycle the page outside of kfree_skb() means we'd have to 'steal'
the page during put_page() (or some other function that's outside the
networking scope). I think this is going to have a measurable performance
penalty, not in networking, but in general.
In any case, that should be orthogonal to the current patchset. So unless
someone feels strongly about it, I'd prefer keeping the current code and
trying to enable recycling in the skb zerocopy case once we have enough
users of the API.
Thanks
/Ilias