lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ad691cb4a744cbdc7da283c5c068331801482b36.camel@gmail.com>
Date: Tue, 30 Jul 2024 08:12:42 -0700
From: Alexander H Duyck <alexander.duyck@...il.com>
To: Yunsheng Lin <linyunsheng@...wei.com>
Cc: davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com, 
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org, Andrew Morton
	 <akpm@...ux-foundation.org>, linux-mm@...ck.org
Subject: Re: [RFC v11 08/14] mm: page_frag: some minor refactoring before
 adding new API

On Tue, 2024-07-30 at 21:20 +0800, Yunsheng Lin wrote:
> On 2024/7/23 21:19, Yunsheng Lin wrote:
> > > 

...

> > > The only piece that is really reused here is the pagecnt_bias
> > > assignment. What is obfuscated away is that the order is gotten
> > > through one of two paths. Really order isn't order here it is size.
> > > Which should have been fetched already. What you end up doing with
> > > this change is duplicating a bunch of code throughout the function.
> > > You end up having to fetch size multiple times multiple ways. here you
> > > are generating it with order. Then you have to turn around and get it
> > > again at the start of the function, and again after calling this
> > > function in order to pull it back out.
> > 
> > I am assuming you would like to reserve old behavior as below?
> > 
> > 	if(!encoded_va) {
> > refill:
> > 		__page_frag_cache_refill()
> > 	}
> > 
> > 
> > 	if(remaining < fragsz) {
> > 		if(!__page_frag_cache_recharge())
> > 			goto refill;
> > 	}
> > 
> > As we are adding new APIs, are we expecting new APIs also duplicate
> > the above pattern?
> > 
> > > 
> 
> How about something like below? __page_frag_cache_refill() and
> __page_frag_cache_reuse() does what their function name suggests
> as much as possible, __page_frag_cache_reload() is added to avoid
> new APIs duplicating similar pattern as much as possible, also
> avoid fetching size multiple times multiple ways as much as possible.

This is better. Still though with the code getting so spread out we
probably need to start adding more comments to explain things.

> static struct page *__page_frag_cache_reuse(unsigned long encoded_va,
>                                             unsigned int pagecnt_bias)
> {
>         struct page *page;
> 
>         page = virt_to_page((void *)encoded_va);
>         if (!page_ref_sub_and_test(page, pagecnt_bias))
>                 return NULL;
> 
>         if (unlikely(encoded_page_pfmemalloc(encoded_va))) {
>                 VM_BUG_ON(compound_order(page) !=
>                           encoded_page_order(encoded_va));

This VM_BUG_ON here makes no sense. If we are going to have this
anywhere it might make more sense in the cache_refill case below to
verify we are setting the order to match when we are generating the
encoded_va.

>                 free_unref_page(page, encoded_page_order(encoded_va));
>                 return NULL;
>         }
> 
>         /* OK, page count is 0, we can safely set it */
>         set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1);
>         return page;

Why are you returning page here? It isn't used by any of the callers.
We are refilling the page here anyway so any caller should already have
access to the page since it wasn't changed.

> }
> 
> static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
>                                              gfp_t gfp_mask)
> {
>         unsigned long order = PAGE_FRAG_CACHE_MAX_ORDER;
>         struct page *page = NULL;
>         gfp_t gfp = gfp_mask;
> 
> #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
>         gfp_mask = (gfp_mask & ~__GFP_DIRECT_RECLAIM) |  __GFP_COMP |
>                    __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC;
>         page = alloc_pages_node(NUMA_NO_NODE, gfp_mask,
>                                 PAGE_FRAG_CACHE_MAX_ORDER);

I suspect the compliler is probably already doing this, but we should
probably not be updating gfp_mask but instead gfp since that is our
local variable.

> #endif
>         if (unlikely(!page)) {
>                 page = alloc_pages_node(NUMA_NO_NODE, gfp, 0);
>                 if (unlikely(!page)) {
>                         memset(nc, 0, sizeof(*nc));
>                         return NULL;
>                 }
> 
>                 order = 0;
>         }
> 
>         nc->encoded_va = encode_aligned_va(page_address(page), order,
>                                            page_is_pfmemalloc(page));
> 
>         /* Even if we own the page, we do not use atomic_set().
>          * This would break get_page_unless_zero() users.
>          */
>         page_ref_add(page, PAGE_FRAG_CACHE_MAX_SIZE);
> 
>         return page;

Again, returning page here doesn't make much sense. You are better off
not exposing internals as you have essentially locked the page down for
use by the frag API so you shouldn't be handing out the page directly
to callers.

> }
> 
> static struct page *__page_frag_cache_reload(struct page_frag_cache *nc,
>                                              gfp_t gfp_mask)
> {
>         struct page *page;
> 
>         if (likely(nc->encoded_va)) {
>                 page = __page_frag_cache_reuse(nc->encoded_va, nc->pagecnt_bias);
>                 if (page)
>                         goto out;
>         }
> 
>         page = __page_frag_cache_refill(nc, gfp_mask);
>         if (unlikely(!page))
>                 return NULL;
> 
> out:
>         nc->remaining = page_frag_cache_page_size(nc->encoded_va);
>         nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
>         return page;

Your one current caller doesn't use the page value provided here. I
would recommend just not bothering with the page variable until you
actually need it.

> }
> 
> void *__page_frag_alloc_va_align(struct page_frag_cache *nc,
>                                  unsigned int fragsz, gfp_t gfp_mask,
>                                  unsigned int align_mask)
> {
>         unsigned long encoded_va = nc->encoded_va;
>         unsigned int remaining;
> 
>         remaining = nc->remaining & align_mask;
>         if (unlikely(remaining < fragsz)) {

You might as well swap the code paths. It would likely be much easier
to read the case where you are handling remaining >= fragsz in here
rather than having more if statements buried within the if statement.
With that you will have more room for the comment and such below.

>                 if (unlikely(fragsz > PAGE_SIZE)) {
>                         /*
>                          * The caller is trying to allocate a fragment
>                          * with fragsz > PAGE_SIZE but the cache isn't big
>                          * enough to satisfy the request, this may
>                          * happen in low memory conditions.
>                          * We don't release the cache page because
>                          * it could make memory pressure worse
>                          * so we simply return NULL here.
>                          */
>                         return NULL;
>                 }
> 
>                 if (unlikely(!__page_frag_cache_reload(nc, gfp_mask)))
>                         return NULL;

This is what I am talking about in the earlier comments. You go to the
trouble of returning page through all the callers just to not use it
here. So there isn't any point in passing it through the functions.

> 
>                 nc->pagecnt_bias--;
>                 nc->remaining -= fragsz;
> 
>                 return encoded_page_address(nc->encoded_va);
>         }
> 
>         nc->pagecnt_bias--;
>         nc->remaining = remaining - fragsz;
> 
>         return encoded_page_address(encoded_va) +
>                 (page_frag_cache_page_size(encoded_va) - remaining);

Parenthesis here shouldn't be needed, addition and subtractions
operations can happen in any order with the result coming out the same.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ