Message-ID: <20160412172011.GB93307@ast-mbp.thefacebook.com>
Date: Tue, 12 Apr 2016 10:20:12 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Jesper Dangaard Brouer <brouer@...hat.com>
Cc: "lsf@...ts.linux-foundation.org" <lsf@...ts.linux-foundation.org>,
James Bottomley <James.Bottomley@...senPartnership.com>,
Sagi Grimberg <sagi@...mberg.me>,
Tom Herbert <tom@...bertland.com>,
Brenden Blanco <bblanco@...mgrid.com>,
Christoph Hellwig <hch@...radead.org>,
linux-mm <linux-mm@...ck.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Bart Van Assche <bart.vanassche@...disk.com>,
"lsf-pc@...ts.linux-foundation.org"
<lsf-pc@...ts.linux-foundation.org>
Subject: Re: [Lsf] [Lsf-pc] [LSF/MM TOPIC] Generic page-pool recycle facility?
On Tue, Apr 12, 2016 at 08:16:49AM +0200, Jesper Dangaard Brouer wrote:
>
> On Mon, 11 Apr 2016 15:21:26 -0700
> Alexei Starovoitov <alexei.starovoitov@...il.com> wrote:
>
> > On Mon, Apr 11, 2016 at 11:41:57PM +0200, Jesper Dangaard Brouer wrote:
> > >
> > > On Sun, 10 Apr 2016 21:45:47 +0300 Sagi Grimberg <sagi@...mberg.me> wrote:
> > >
> [...]
> > > >
> > > > If we go down this road how about also attaching some driver opaques
> > > > to the page sets?
> > >
> > > That was the ultimate plan... to leave some opaque bytes in the
> > > page struct that drivers could use.
> > >
> > > In struct page I would need a pointer back to my page_pool struct and a
> > > page flag. Then, I would need room to store the dma_unmap address.
> > > (And then some of the usual fields are still needed, like the refcnt,
> > > and reusing some of the list constructs). And a zero-copy cross-domain
> > > id.
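Spelled out as a struct, the fields listed above would be something like
this (purely illustrative; the names are invented and this is not a
proposal for struct page itself):

#include <linux/types.h>

struct page_pool;				/* the pool object itself */

struct pp_page_meta {
	struct page_pool	*pool;		/* pointer back to the page_pool */
	unsigned long		flags;		/* "owned by a pool" page flag */
	dma_addr_t		dma;		/* stored dma_unmap address */
	atomic_t		refcnt;		/* the usual reference count */
	struct list_head	list;		/* reuse of the list constructs */
	u32			xdomain_id;	/* zero-copy cross-domain id */
};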
> >
> > I don't think we need to add anything to struct page.
> > This is supposed to be a small cache of dma-mapped pages with lockless access.
> > It can be implemented as an array or linked list where every element
> > is a dma_addr and a pointer to a page. If it is full, dma_unmap_page+put_page
> > sends the page back to the page allocator.
>
> It sounds like the Intel drivers' recycle facility, where they split the
> page into two parts and keep the page in the RX ring by swapping to the
> other half of the page if page_count(page) is <= 2. Thus, they use the
> atomic page refcount to synchronize on.
Actually, I'm proposing the opposite: one page = one packet.
I'm perfectly happy to waste half a page, since the number of such pages is small
and performance matters more. A typical performance vs. memory tradeoff.
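For reference, the half-page reuse test described above is roughly this
(a simplified sketch modeled on the description, not the actual Intel
driver code; struct and function names are invented):

#include <linux/mm.h>

struct rx_buffer {
	struct page	*page;
	unsigned int	page_offset;	/* 0 or PAGE_SIZE / 2 */
};

static bool rx_buffer_can_reuse(struct rx_buffer *buf)
{
	/*
	 * Only the RX ring and at most one in-flight skb (which took a
	 * reference when the other half was handed up the stack) may hold
	 * the page; otherwise it is still in use elsewhere.
	 */
	if (page_count(buf->page) > 2)
		return false;

	/* Flip to the other half; the stack keeps the half just handed up. */
	buf->page_offset ^= PAGE_SIZE / 2;
	return true;
}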
> Thus, we end up having two atomic operations per RX packet on the page
> refcnt, whereas DPDK has zero...
The page recycling cache should have zero atomic ops per packet,
otherwise it's a non-starter.
> By fully taking over the page as an allocator, almost like slab, I can
> optimize the common case (of the packet-page getting allocated and
> freed on the same CPU) and remove these atomic operations.
slub does a local cmpxchg; 40G networking cannot afford one per packet.
If it's amortized due to batching, that will be ok.
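To illustrate what I mean by avoiding atomics on the same-CPU
alloc/free path, an untested sketch of a per-CPU recycle cache (names
invented; assumes NAPI/softirq context so preemption is already
disabled, and ignores the slow-path details a real implementation would
need):

#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/percpu.h>

#define PP_PCPU_BATCH 64	/* spill/refill granularity, arbitrary */

struct pp_pcpu_cache {
	unsigned int	count;
	struct page	*pages[PP_PCPU_BATCH];
};

static DEFINE_PER_CPU(struct pp_pcpu_cache, pp_pcpu);

/* Fast path: alloc on the CPU that will also free -- no atomics, no cmpxchg. */
static struct page *pp_alloc(void)
{
	struct pp_pcpu_cache *c = this_cpu_ptr(&pp_pcpu);

	if (c->count)
		return c->pages[--c->count];

	/* Slow path: hit the page allocator (refcounting happens here). */
	return alloc_page(GFP_ATOMIC);
}

static void pp_free(struct page *page)
{
	struct pp_pcpu_cache *c = this_cpu_ptr(&pp_pcpu);

	if (c->count < PP_PCPU_BATCH) {
		c->pages[c->count++] = page;	/* plain store, no refcount op */
		return;
	}

	/* Cache full: give the whole batch back, amortizing the atomic ops. */
	while (c->count)
		put_page(c->pages[--c->count]);
	put_page(page);
}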