Message-ID: <CA+CK2bC=XyUhoSP9f0XBqEnQ-P5mMT2U=5dfzRSc9C=2b+bstQ@mail.gmail.com>
Date: Fri, 9 Feb 2024 21:21:10 -0500
From: Pasha Tatashin <pasha.tatashin@...een.com>
To: Robin Murphy <robin.murphy@....com>
Cc: akpm@...ux-foundation.org, alim.akhtar@...sung.com, alyssa@...enzweig.io,
asahi@...ts.linux.dev, baolu.lu@...ux.intel.com, bhelgaas@...gle.com,
cgroups@...r.kernel.org, corbet@....net, david@...hat.com,
dwmw2@...radead.org, hannes@...xchg.org, heiko@...ech.de,
iommu@...ts.linux.dev, jernej.skrabec@...il.com, jonathanh@...dia.com,
joro@...tes.org, krzysztof.kozlowski@...aro.org, linux-doc@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linux-rockchip@...ts.infradead.org,
linux-samsung-soc@...r.kernel.org, linux-sunxi@...ts.linux.dev,
linux-tegra@...r.kernel.org, lizefan.x@...edance.com, marcan@...can.st,
mhiramat@...nel.org, m.szyprowski@...sung.com, paulmck@...nel.org,
rdunlap@...radead.org, samuel@...lland.org, suravee.suthikulpanit@....com,
sven@...npeter.dev, thierry.reding@...il.com, tj@...nel.org,
tomas.mudrunka@...il.com, vdumpa@...dia.com, wens@...e.org, will@...nel.org,
yu-cheng.yu@...el.com, rientjes@...gle.com, bagasdotme@...il.com,
mkoutny@...e.com
Subject: Re: [PATCH v4 01/10] iommu/vt-d: add wrapper functions for page allocations
Hi Robin,
Thank you for reviewing this.
> > +#ifndef __IOMMU_PAGES_H
> > +#define __IOMMU_PAGES_H
> > +
> > +#include <linux/vmstat.h>
> > +#include <linux/gfp.h>
> > +#include <linux/mm.h>
> > +
> > +/*
> > + * All page allocations that are performed in the IOMMU subsystem must use one of
>
> "All page allocations" is too broad; As before, this is only about
> pagetable allocations, or I guess for the full nuance, allocations of
> pagetables and other per-iommu_domain configuration structures which are
> reasonable to report as "pagetables" to userspace.
I will update the comment.
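Something along these lines, perhaps (wording is a first draft, based on
your description above):

/*
 * All pagetable allocations performed in the IOMMU subsystem, as well as
 * allocations of other per-iommu_domain configuration structures that are
 * reasonably reported as "pagetables" to userspace, must use one of the
 * functions below. This is necessary for the proper accounting, as this
 * state can be rather large, i.e. multiple gigabytes in size.
 */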
>
> > + * the functions below. This is necessary for the proper accounting, as IOMMU
> > + * state can be rather large, i.e. multiple gigabytes in size.
> > + */
> > +
> > +/**
> > + * __iommu_alloc_pages_node - allocate a zeroed page of a given order from
> > + * a specific NUMA node.
> > + * @nid: memory NUMA node id
> > + * @gfp: buddy allocator flags
> > + * @order: page order
> > + *
> > + * returns the head struct page of the allocated page.
> > + */
> > +static inline struct page *__iommu_alloc_pages_node(int nid, gfp_t gfp,
> > + int order)
> > +{
> > + struct page *page;
> > +
> > + page = alloc_pages_node(nid, gfp | __GFP_ZERO, order);
> > + if (unlikely(!page))
> > + return NULL;
> > +
> > + return page;
> > +}
>
> All 3 invocations of this only use the returned struct page to trivially
> derive page_address(), so we really don't need it; just clean up these
> callsites a bit more.
I will remove this function, and update all invocations to use
iommu_alloc_pages_node() directly.
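Roughly, a callsite that today does (sketch based on the helpers quoted
above):

	struct page *page = __iommu_alloc_pages_node(nid, gfp, order);
	void *addr = page ? page_address(page) : NULL;

will simply become:

	void *addr = iommu_alloc_pages_node(nid, gfp, order);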
> > + * __iommu_alloc_pages - allocate a zeroed page of a given order.
> > + * @gfp: buddy allocator flags
> > + * @order: page order
> > + *
> > + * returns the head struct page of the allocated page.
> > + */
> > +static inline struct page *__iommu_alloc_pages(gfp_t gfp, int order)
> > +{
> > + struct page *page;
> > +
> > + page = alloc_pages(gfp | __GFP_ZERO, order);
> > + if (unlikely(!page))
> > + return NULL;
> > +
> > + return page;
> > +}
>
> Same for the single invocation of this one.
I kept this function, since tegra-smmu needs a "struct page" allocator,
but removed __iommu_alloc_page(), which depends on it.
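For illustration, such a user wants the struct page itself rather than
its virtual address (a hypothetical sketch; pt_pages and pde are made-up
names, not the actual tegra-smmu code):

	struct page *pt = __iommu_alloc_pages(GFP_KERNEL | __GFP_DMA, 0);

	if (!pt)
		return -ENOMEM;
	/* the driver tracks the struct page itself, so no page_address() */
	pt_pages[pde] = pt;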
>
> > +
> > +/**
> > + * __iommu_alloc_page_node - allocate a zeroed page at a specific NUMA node.
> > + * @nid: memory NUMA node id
> > + * @gfp: buddy allocator flags
> > + *
> > + * returns the struct page of the allocated page.
> > + */
> > +static inline struct page *__iommu_alloc_page_node(int nid, gfp_t gfp)
> > +{
> > + return __iommu_alloc_pages_node(nid, gfp, 0);
> > +}
>
> There are no users of this at all.
Yes, I added it just for completeness; I will remove it.
> > + * __iommu_alloc_page - allocate a zeroed page
> > + * @gfp: buddy allocator flags
> > + *
> > + * returns the struct page of the allocated page.
> > + */
> > +static inline struct page *__iommu_alloc_page(gfp_t gfp)
> > +{
> > + return __iommu_alloc_pages(gfp, 0);
> > +}
> > +
> > +/**
> > + * __iommu_free_pages - free a page of a given order
> > + * @page: head struct page of the page
> > + * @order: page order
> > + */
> > +static inline void __iommu_free_pages(struct page *page, int order)
> > +{
> > + if (!page)
> > + return;
> > +
> > + __free_pages(page, order);
> > +}
> > +
> > +/**
> > + * __iommu_free_page - free page
> > + * @page: struct page of the page
> > + */
> > +static inline void __iommu_free_page(struct page *page)
> > +{
> > + __iommu_free_pages(page, 0);
> > +}
>
> Beyond one more trivial Intel cleanup for __iommu_alloc_pages(), these 3
> are then only used by tegra-smmu, so honestly I'd be inclined to just
> open-code the page_address()/virt_to_page() conversions as appropriate
> there (once again I think the whole thing could in fact be refactored to
> not use struct pages at all because all it's ever ultimately doing with
> them is page_address(), but that would be a bigger job so definitely
> out-of-scope for this series).
I removed __iommu_free_page(), but kept the __iommu_free_pages() variant.
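At the former __iommu_free_page() callsites the conversion is trivially
open-coded, e.g. (sketch; assuming the callsite holds a struct page):

	/* before: __iommu_free_page(page); */
	__iommu_free_pages(page, 0);

and where only a virtual address is at hand:

	__iommu_free_pages(virt_to_page(vaddr), order);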
>
> > +
> > +/**
> > + * iommu_alloc_pages_node - allocate a zeroed page of a given order from
> > + * a specific NUMA node.
> > + * @nid: memory NUMA node id
> > + * @gfp: buddy allocator flags
> > + * @order: page order
> > + *
> > + * returns the virtual address of the allocated page
> > + */
> > +static inline void *iommu_alloc_pages_node(int nid, gfp_t gfp, int order)
> > +{
> > + struct page *page = __iommu_alloc_pages_node(nid, gfp, order);
> > +
> > + if (unlikely(!page))
> > + return NULL;
>
> As a general point I'd prefer to fold these checks into the accounting
> function itself rather than repeat them all over.
For the free functions this saves a few cycles by not repeating the
check again inside __free_pages(); to keep things symmetrical, it makes
sense to keep __iommu_free_account() and __iommu_alloc_account() the
same. With the other clean-ups there are not that many of these checks
left.
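For context, a minimal sketch of the accounting pair (assuming the
NR_IOMMU_PAGES vmstat counter this series introduces; the exact body may
differ in the next version):

static inline void __iommu_alloc_account(struct page *page, int order)
{
	const long pgcnt = 1l << order;

	mod_node_page_state(page_pgdat(page), NR_IOMMU_PAGES, pgcnt);
	mod_lruvec_page_state(page, NR_SECONDARY_PAGETABLE, pgcnt);
}

static inline void __iommu_free_account(struct page *page, int order)
{
	const long pgcnt = 1l << order;

	mod_node_page_state(page_pgdat(page), NR_IOMMU_PAGES, -pgcnt);
	mod_lruvec_page_state(page, NR_SECONDARY_PAGETABLE, -pgcnt);
}

With the NULL check kept at the caller, neither helper needs one of its
own.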
> > + */
> > +static inline void *iommu_alloc_page_node(int nid, gfp_t gfp)
> > +{
> > + return iommu_alloc_pages_node(nid, gfp, 0);
> > +}
>
> TBH I'm not entirely convinced that saving 4 characters per invocation
> times 11 invocations makes this wrapper worthwhile :/
Let's keep them. After the clean-up that you suggested there are fewer
functions left in this file, but I think it is cleaner to keep the
remaining ones: it makes it easy to distinguish, via code search, the
places that allocate exactly one page from those that allocate several.
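E.g. at a glance (illustrative only):

	pte = iommu_alloc_page_node(nid, GFP_KERNEL);	  /* exactly one page */
	pgd = iommu_alloc_pages_node(nid, GFP_KERNEL, 2); /* 1 << 2 pages */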
> > + *
> > + * There are no locking requirements for these pages, as they are going to be
> > + * put on a free list as soon as the refcount reaches 0. Pages are put on this
> > + * LRU list once they are removed from the IOMMU page tables. However, they can
> > + * still be accessed through debugfs.
> > + */
> > +static inline void iommu_free_pages_list(struct list_head *page)
>
> Nit: I'd be inclined to call this iommu_put_pages_list for consistency.
I will rename it to iommu_put_pages_list(), indeed a better name.
>
> > +{
> > + while (!list_empty(page)) {
> > + struct page *p = list_entry(page->prev, struct page, lru);
> > +
> > + list_del(&p->lru);
> > + put_page(p);
> > + }
> > +}
>
> I realise now you've also missed the common freelist freeing sites in
> iommu-dma.
Ah yes, thank you for catching that. I will fix it.
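Presumably the fq draining path in dma-iommu.c just switches over, along
the lines of (sketch from memory, the field names may differ):

	/* fq_ring_free() */
	-	put_pages_list(&fq->entries[idx].freelist);
	+	iommu_put_pages_list(&fq->entries[idx].freelist);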
Pasha