lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKgT0Ud6EnT0ggwmVUEubX9TPkgGb9+5TTWK-_XnTBbqaec8Gw@mail.gmail.com>
Date: Thu, 15 Aug 2024 08:09:07 -0700
From: Alexander Duyck <alexander.duyck@...il.com>
To: Yunsheng Lin <linyunsheng@...wei.com>
Cc: davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com, 
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org
Subject: Re: [PATCH net-next v13 08/14] mm: page_frag: some minor refactoring
 before adding new API

On Wed, Aug 14, 2024 at 8:04 PM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>
> On 2024/8/15 1:54, Alexander H Duyck wrote:
> > On Thu, 2024-08-08 at 20:37 +0800, Yunsheng Lin wrote:
> >> Refactor common codes from __page_frag_alloc_va_align()
> >> to __page_frag_cache_reload(), so that the new API can
> >> make use of them.
> >>
> >> CC: Alexander Duyck <alexander.duyck@...il.com>
> >> Signed-off-by: Yunsheng Lin <linyunsheng@...wei.com>
> >> ---
> >>  include/linux/page_frag_cache.h |   2 +-
> >>  mm/page_frag_cache.c            | 138 ++++++++++++++++++--------------
> >>  2 files changed, 81 insertions(+), 59 deletions(-)
> >>
> >> diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h
> >> index 4ce924eaf1b1..0abffdd10a1c 100644
> >> --- a/include/linux/page_frag_cache.h
> >> +++ b/include/linux/page_frag_cache.h
> >> @@ -52,7 +52,7 @@ static inline void *encoded_page_address(unsigned long encoded_va)
> >>
> >>  static inline void page_frag_cache_init(struct page_frag_cache *nc)
> >>  {
> >> -    nc->encoded_va = 0;
> >> +    memset(nc, 0, sizeof(*nc));
> >>  }
> >>
> >
> > Still not a fan of this. Just setting encoded_va to 0 should be enough
> > as the other fields will automatically be overwritten when the new page
> > is allocated.
> >
> > Relying on memset is problematic at best since you then introduce the
> > potential for issues where remaining somehow gets corrupted but
> > encoded_va/page is 0. I would rather have both of these being checked
> > as a part of allocation than just just assuming it is valid if
> > remaining is set.
>
> Does adding something like VM_BUG_ON(!nc->encoded_va && nc->remaining) to
> catch the above problem address your above concern?

Not really. I would prefer to just retain the existing behavior.

> >
> > I would prefer to keep the check for a non-0 encoded_page value and
> > then check remaining rather than just rely on remaining as it creates a
> > single point of failure. With that we can safely tear away a page and
> > the next caller to try to allocate will populated a new page and the
> > associated fields.
>
> As mentioned before, the memset() is used mainly because of:
> 1. avoid a checking in the fast path.
> 2. avoid duplicating the checking pattern you mentioned above for the
>    new API.

I'm not a fan of the new code flow after getting rid of the checking
in the fast path. The code is becoming a tangled mess of spaghetti
code in my opinion. Arguably the patches don't help as you are taking
huge steps in many of these patches and it is making it hard to read.
In addition the code becomes more obfuscated with each patch which is
one of the reasons why I would have preferred to see this set broken
into a couple sets so we can give it some time for any of the kinks to
get worked out.

> >
> >>  static inline bool page_frag_cache_is_pfmemalloc(struct page_frag_cache *nc)
> >> diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
> >> index 2544b292375a..4e6b1c4684f0 100644
> >> --- a/mm/page_frag_cache.c
> >> +++ b/mm/page_frag_cache.c
> >> @@ -19,8 +19,27 @@
> >>  #include <linux/page_frag_cache.h>
> >>  #include "internal.h"
> >>
>
> ...
>
> >> +
> >> +/* Reload cache by reusing the old cache if it is possible, or
> >> + * refilling from the page allocator.
> >> + */
> >> +static bool __page_frag_cache_reload(struct page_frag_cache *nc,
> >> +                                 gfp_t gfp_mask)
> >> +{
> >> +    if (likely(nc->encoded_va)) {
> >> +            if (__page_frag_cache_reuse(nc->encoded_va, nc->pagecnt_bias))
> >> +                    goto out;
> >> +    }
> >> +
> >> +    if (unlikely(!__page_frag_cache_refill(nc, gfp_mask)))
> >> +            return false;
> >> +
> >> +out:
> >> +    /* reset page count bias and remaining to start of new frag */
> >> +    nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
> >> +    nc->remaining = page_frag_cache_page_size(nc->encoded_va);
> >
> > One thought I am having is that it might be better to have the
> > pagecnt_bias get set at the same time as the page_ref_add or the
> > set_page_count call. In addition setting the remaining value at the
> > same time probably would make sense as in the refill case you can make
> > use of the "order" value directly instead of having to write/read it
> > out of the encoded va/page.
>
> Probably, there is always tradeoff to make regarding avoid code
> duplication and avoid reading the order, I am not sure it matters
> for both for case, I would rather keep the above pattern if there
> is not obvious benefit for the other pattern.

Part of it is more about keeping the functions contained to generating
self contained objects. I am not a fan of us splitting up the page
init into a few sections as it makes it much easier to mess up a page
by changing one spot and overlooking the fact that an additional page
is needed somewhere else.

> >
> > With that we could simplify this function and get something closer to
> > what we had for the original alloc_va_align code.
> >
> >> +    return true;
> >>  }
> >>
> >>  void page_frag_cache_drain(struct page_frag_cache *nc)
> >> @@ -55,7 +100,7 @@ void page_frag_cache_drain(struct page_frag_cache *nc)
> >>
> >>      __page_frag_cache_drain(virt_to_head_page((void *)nc->encoded_va),
> >>                              nc->pagecnt_bias);
> >> -    nc->encoded_va = 0;
> >> +    memset(nc, 0, sizeof(*nc));
> >>  }
> >>  EXPORT_SYMBOL(page_frag_cache_drain);
> >>
> >> @@ -73,67 +118,44 @@ void *__page_frag_alloc_va_align(struct page_frag_cache *nc,
> >>                               unsigned int align_mask)
> >>  {
> >>      unsigned long encoded_va = nc->encoded_va;
> >> -    unsigned int size, remaining;
> >> -    struct page *page;
> >> -
> >> -    if (unlikely(!encoded_va)) {
> >
> > We should still be checking this before we even touch remaining.
> > Otherwise we greatly increase the risk of providing a bad virtual
> > address and have greatly decreased the likelihood of us catching
> > potential errors gracefully.
> >
> >> -refill:
> >> -            page = __page_frag_cache_refill(nc, gfp_mask);
> >> -            if (!page)
> >> -                    return NULL;
> >> -
> >> -            encoded_va = nc->encoded_va;
> >> -            size = page_frag_cache_page_size(encoded_va);
> >> -
> >> -            /* Even if we own the page, we do not use atomic_set().
> >> -             * This would break get_page_unless_zero() users.
> >> -             */
> >> -            page_ref_add(page, PAGE_FRAG_CACHE_MAX_SIZE);
> >> -
> >> -            /* reset page count bias and remaining to start of new frag */
> >> -            nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
> >> -            nc->remaining = size;
> >
> > With my suggested change above you could essentially just drop the
> > block starting from the comment and this function wouldn't need to
> > change as much as it is.
>
> It seems you are still suggesting that new API also duplicates the old
> checking pattern in __page_frag_alloc_va_align()?
>
> I would rather avoid the above if something like VM_BUG_ON() can address
> your above concern.

Yes, that is what I am suggesting. It makes the code much less prone
to any sort of possible races as resetting encoded_va would make it so
that reads for all the other fields would be skipped versus having to
use a memset which differs in implementation depending on the
architecture.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ