netdev - Re: [PATCH vhost 3/6] virtio_net: replace private by pp struct inside page

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1713146919.8867755-1-xuanzhuo@linux.alibaba.com>
Date: Mon, 15 Apr 2024 10:08:39 +0800
From: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
To: Jason Wang <jasowang@...hat.com>
Cc: virtualization@...ts.linux.dev,
 "Michael S. Tsirkin" <mst@...hat.com>,
 "David S. Miller" <davem@...emloft.net>,
 Eric Dumazet <edumazet@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>,
 Paolo Abeni <pabeni@...hat.com>,
 netdev@...r.kernel.org
Subject: Re: [PATCH vhost 3/6] virtio_net: replace private by pp struct inside page

On Fri, 12 Apr 2024 13:49:12 +0800, Jason Wang <jasowang@...hat.com> wrote:
> On Fri, Apr 12, 2024 at 1:39 PM Xuan Zhuo <xuanzhuo@...ux.alibaba.com> wrote:
> >
> > On Fri, 12 Apr 2024 12:47:55 +0800, Jason Wang <jasowang@...hat.com> wrote:
> > > On Thu, Apr 11, 2024 at 10:51 AM Xuan Zhuo <xuanzhuo@...ux.alibaba.com> wrote:
> > > >
> > > > Now, we chain the pages of big mode by the page's private variable.
> > > > But a subsequent patch aims to make the big mode to support
> > > > premapped mode. This requires additional space to store the dma addr.
> > > >
> > > > Within the sub-struct that contains the 'private', there is no suitable
> > > > variable for storing the DMA addr.
> > > >
> > > >                 struct {        /* Page cache and anonymous pages */
> > > >                         /**
> > > >                          * @lru: Pageout list, eg. active_list protected by
> > > >                          * lruvec->lru_lock.  Sometimes used as a generic list
> > > >                          * by the page owner.
> > > >                          */
> > > >                         union {
> > > >                                 struct list_head lru;
> > > >
> > > >                                 /* Or, for the Unevictable "LRU list" slot */
> > > >                                 struct {
> > > >                                         /* Always even, to negate PageTail */
> > > >                                         void *__filler;
> > > >                                         /* Count page's or folio's mlocks */
> > > >                                         unsigned int mlock_count;
> > > >                                 };
> > > >
> > > >                                 /* Or, free page */
> > > >                                 struct list_head buddy_list;
> > > >                                 struct list_head pcp_list;
> > > >                         };
> > > >                         /* See page-flags.h for PAGE_MAPPING_FLAGS */
> > > >                         struct address_space *mapping;
> > > >                         union {
> > > >                                 pgoff_t index;          /* Our offset within mapping. */
> > > >                                 unsigned long share;    /* share count for fsdax */
> > > >                         };
> > > >                         /**
> > > >                          * @private: Mapping-private opaque data.
> > > >                          * Usually used for buffer_heads if PagePrivate.
> > > >                          * Used for swp_entry_t if PageSwapCache.
> > > >                          * Indicates order in the buddy system if PageBuddy.
> > > >                          */
> > > >                         unsigned long private;
> > > >                 };
> > > >
> > > > But within the page pool struct, we have a variable called
> > > > dma_addr that is appropriate for storing dma addr.
> > > > And that struct is used by netstack. That works to our advantage.
> > > >
> > > >                 struct {        /* page_pool used by netstack */
> > > >                         /**
> > > >                          * @pp_magic: magic value to avoid recycling non
> > > >                          * page_pool allocated pages.
> > > >                          */
> > > >                         unsigned long pp_magic;
> > > >                         struct page_pool *pp;
> > > >                         unsigned long _pp_mapping_pad;
> > > >                         unsigned long dma_addr;
> > > >                         atomic_long_t pp_ref_count;
> > > >                 };
> > > >
> > > > On the other side, we should use variables from the same sub-struct.
> > > > So this patch replaces the "private" with "pp".
> > > >
> > > > Signed-off-by: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
> > > > ---
> > >
> > > Instead of doing a customized version of page pool, can we simply
> > > switch to use page pool for big mode instead? Then we don't need to
> > > bother the dma stuffs.
> >
> >
> > The page pool needs to do the dma by the DMA APIs.
> > So we can not use the page pool directly.
>
> I found this:
>
> define PP_FLAG_DMA_MAP         BIT(0) /* Should page_pool do the DMA
>                                         * map/unmap
>
> It seems to work here?


I have studied the page pool mechanism and believe that we cannot use it
directly. We can make the page pool to bypass the DMA operations.
This allows us to handle DMA within virtio-net for pages allocated from the page
pool. Furthermore, we can utilize page pool helpers to associate the DMA address
to the page.

However, the critical issue pertains to unmapping. Ideally, we want to return
the mapped pages to the page pool and reuse them. In doing so, we can omit the
unmapping and remapping steps.

Currently, there's a caveat: when the page pool cache is full, it disconnects
and releases the pages. When the pool hits its capacity, pages are relinquished
without a chance for unmapping. If we were to unmap pages each time before
returning them to the pool, we would negate the benefits of bypassing the
mapping and unmapping process altogether.

Thanks.



>
> Thanks
>
> >
> > Thanks.
> >
> >
> > >
> > > Thanks
> > >
> >
>