Message-ID: <CACGkMEvmaH9NE-5VDBPpZOpAAg4bX39Lf0-iGiYzxdV5JuZWww@mail.gmail.com>
Date: Mon, 15 Apr 2024 14:43:24 +0800
From: Jason Wang <jasowang@...hat.com>
To: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Cc: virtualization@...ts.linux.dev, "Michael S. Tsirkin" <mst@...hat.com>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org
Subject: Re: [PATCH vhost 3/6] virtio_net: replace private by pp struct inside page
On Mon, Apr 15, 2024 at 10:35 AM Xuan Zhuo <xuanzhuo@...ux.alibaba.com> wrote:
>
> On Fri, 12 Apr 2024 13:49:12 +0800, Jason Wang <jasowang@...hat.com> wrote:
> > On Fri, Apr 12, 2024 at 1:39 PM Xuan Zhuo <xuanzhuo@...ux.alibaba.com> wrote:
> > >
> > > On Fri, 12 Apr 2024 12:47:55 +0800, Jason Wang <jasowang@...hat.com> wrote:
> > > > On Thu, Apr 11, 2024 at 10:51 AM Xuan Zhuo <xuanzhuo@...ux.alibaba.com> wrote:
> > > > >
> > > > > Currently, big mode chains its pages through the page's private
> > > > > field. But a subsequent patch aims to make big mode support
> > > > > premapped mode, which requires additional space to store the DMA
> > > > > address.
> > > > >
> > > > > Within the sub-struct that contains 'private', there is no suitable
> > > > > field for storing the DMA address.
> > > > >
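> > > > > For reference, the chaining today looks roughly like this
> > > > > (paraphrased sketch, not the exact driver code):
> > > > >
> > > > > 	/* give a chain of pages back to the receive queue */
> > > > > 	end->private = (unsigned long)rq->pages;
> > > > > 	rq->pages = page;
> > > > >
> > > > > 	/* take one page off the chain */
> > > > > 	p = rq->pages;
> > > > > 	rq->pages = (struct page *)p->private;
> > > > > 	p->private = 0;
> > > > >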
> > > > > struct { /* Page cache and anonymous pages */
> > > > > /**
> > > > > * @lru: Pageout list, eg. active_list protected by
> > > > > * lruvec->lru_lock. Sometimes used as a generic list
> > > > > * by the page owner.
> > > > > */
> > > > > union {
> > > > > struct list_head lru;
> > > > >
> > > > > /* Or, for the Unevictable "LRU list" slot */
> > > > > struct {
> > > > > /* Always even, to negate PageTail */
> > > > > void *__filler;
> > > > > /* Count page's or folio's mlocks */
> > > > > unsigned int mlock_count;
> > > > > };
> > > > >
> > > > > /* Or, free page */
> > > > > struct list_head buddy_list;
> > > > > struct list_head pcp_list;
> > > > > };
> > > > > /* See page-flags.h for PAGE_MAPPING_FLAGS */
> > > > > struct address_space *mapping;
> > > > > union {
> > > > > pgoff_t index; /* Our offset within mapping. */
> > > > > unsigned long share; /* share count for fsdax */
> > > > > };
> > > > > /**
> > > > > * @private: Mapping-private opaque data.
> > > > > * Usually used for buffer_heads if PagePrivate.
> > > > > * Used for swp_entry_t if PageSwapCache.
> > > > > * Indicates order in the buddy system if PageBuddy.
> > > > > */
> > > > > unsigned long private;
> > > > > };
> > > > >
> > > > > But the page_pool sub-struct of struct page has a field called
> > > > > dma_addr that is well suited to storing the DMA address, and that
> > > > > sub-struct is the one used by the netstack. That works to our
> > > > > advantage.
> > > > >
> > > > > struct { /* page_pool used by netstack */
> > > > > /**
> > > > > * @pp_magic: magic value to avoid recycling non
> > > > > * page_pool allocated pages.
> > > > > */
> > > > > unsigned long pp_magic;
> > > > > struct page_pool *pp;
> > > > > unsigned long _pp_mapping_pad;
> > > > > unsigned long dma_addr;
> > > > > atomic_long_t pp_ref_count;
> > > > > };
> > > > >
> > > > > At the same time, we should only use fields that belong to the
> > > > > same sub-struct. So this patch replaces "private" with "pp".
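> > > > >
> > > > > Roughly, the idea is the following (helper names here are only
> > > > > illustrative, this is just a sketch of the layout change):
> > > > >
> > > > > 	/* the chain link now lives in the 'pp' slot instead of 'private' */
> > > > > 	static void page_chain_set_next(struct page *p, struct page *next)
> > > > > 	{
> > > > > 		p->pp = (struct page_pool *)next;
> > > > > 	}
> > > > >
> > > > > 	static struct page *page_chain_next(struct page *p)
> > > > > 	{
> > > > > 		return (struct page *)p->pp;
> > > > > 	}
> > > > >
> > > > > 	/* and 'dma_addr' is free to hold the premapped DMA address */
> > > > > 	static void page_chain_set_dma(struct page *p, dma_addr_t addr)
> > > > > 	{
> > > > > 		/* note: the slot is an unsigned long in struct page */
> > > > > 		p->dma_addr = addr;
> > > > > 	}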
> > > > >
> > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
> > > > > ---
> > > >
> > > > Instead of doing a customized version of the page pool, can we simply
> > > > switch to using the page pool for big mode? Then we don't need to
> > > > bother with the DMA stuff.
> > >
> > >
> > > The page pool does its DMA through the generic DMA APIs, while virtio
> > > has to go through its own DMA handling, so we can not use the page
> > > pool directly.
> >
> > I found this:
> >
> > #define PP_FLAG_DMA_MAP	BIT(0) /* Should page_pool do the DMA
> > 					* map/unmap */
> >
> > It seems to work here?
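> >
> > E.g. something along these lines (just a sketch; the values and the
> > dma_dev pointer are illustrative, error handling omitted):
> >
> > 	struct page_pool_params pp_params = {
> > 		.flags		= PP_FLAG_DMA_MAP,
> > 		.order		= 0,
> > 		.pool_size	= 256,
> > 		.nid		= NUMA_NO_NODE,
> > 		.dev		= dma_dev,	/* device doing the DMA */
> > 		.dma_dir	= DMA_FROM_DEVICE,
> > 	};
> > 	struct page_pool *pool = page_pool_create(&pp_params);
> >
> > 	/* pages come back already mapped, the pool remembers the mapping */
> > 	struct page *page = page_pool_dev_alloc_pages(pool);
> > 	dma_addr_t dma = page_pool_get_dma_addr(page);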
>
>
> I have studied the page pool mechanism and believe that we cannot use it
> directly. We can make the page pool bypass the DMA operations, which
> allows us to handle DMA within virtio-net for pages allocated from the
> page pool. Furthermore, we can use the page pool helpers to associate
> the DMA address with the page.
>
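> Roughly, the flow would be something like this (just a sketch; rq->vq and
> rq->page_pool are illustrative names, error handling omitted):
>
> 	/* pool created without PP_FLAG_DMA_MAP, so it does no DMA itself */
> 	struct page *page = page_pool_dev_alloc_pages(rq->page_pool);
>
> 	/* map inside virtio-net through the virtqueue DMA API */
> 	dma_addr_t dma = virtqueue_dma_map_single_attrs(rq->vq,
> 						page_address(page), PAGE_SIZE,
> 						DMA_FROM_DEVICE, 0);
>
> 	/* remember the mapping in the page via the page pool helper */
> 	page_pool_set_dma_addr(page, dma);
>
> 	/* on refill, page_pool_get_dma_addr(page) returns the address
> 	 * without remapping
> 	 */
>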
> However, the critical issue is unmapping. Ideally, we want to return the
> mapped pages to the page pool and reuse them, so that we can skip the
> unmapping and remapping steps.
>
> The caveat is that when the page pool's cache is full, returned pages
> are released back to the page allocator directly, without giving us a
> chance to unmap them.
Technically, when the ptr_ring is full there could be a fallback path, but
that would require expensive synchronization between producer and
consumer. For virtio-net it might not be a problem, because add/get are
already synchronized. (That might be relaxed in the future; we have
actually already seen such a requirement for virtio-blk.)
> If we were to unmap pages each time before
> returning them to the pool, we would negate the benefits of bypassing the
> mapping and unmapping process altogether.
Yes, but the problem with this approach is that it creates a corner case
where dma_addr is used outside the page pool.

Maybe for big mode it doesn't matter too much if there is no performance
improvement anyway.
Thanks
>
> Thanks.
>
>
>
> >
> > Thanks
> >
> > >
> > > Thanks.
> > >
> > >
> > > >
> > > > Thanks
> > > >
> > >
> >
>