Date:   Wed, 22 Sep 2021 00:13:37 +0200
From:   Uladzislau Rezki <urezki@...il.com>
To:     David Hildenbrand <david@...hat.com>
Cc:     Uladzislau Rezki <urezki@...il.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Ping Fang <pifang@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Roman Gushchin <guro@...com>, Michal Hocko <mhocko@...e.com>,
        Oscar Salvador <osalvador@...e.de>,
        Linux Memory Management List <linux-mm@...ck.org>
Subject: Re: [PATCH v1] mm/vmalloc: fix exact allocations with an alignment > 1

On Fri, Sep 17, 2021 at 10:47:54AM +0200, David Hildenbrand wrote:
> > > > This patch looks KASAN-specific to me, so I would like to avoid such
> > > > changes to the vmalloc internals in order to simplify further
> > > > maintenance and keep it generic instead.
> > > 
> > > There is nothing KASAN specific in there :) It's specific to exact
> > > allocations -- and KASAN may be one of the only users for now.
> > > 
> > Well, you can name it either way :) It should not be specific by design;
> > otherwise, as I mentioned, the allocator becomes a piece of code that
> > handles corner cases, which is not acceptable.
> 
> Let's not overstress the situation: we are adding essentially 3 LOC in order
> to fix a sane use case of the vmalloc allocator that stopped being handled
> properly after the internal changes of 68ad4a330433 ("mm/vmalloc.c: keep
> track of free blocks for vmap allocation").
> 
> > 
> > > > 
> > > > Currently find_vmap_lowest_match() adjusts the search size criteria
> > > > for any alignment value, even for PAGE_SIZE alignment. That is not
> > > > optimal, because one page (PAGE_SIZE) is the minimum granularity we
> > > > operate on for vmalloc allocations, thus all free blocks are page
> > > > aligned.
> > > > 
> > > > All that means we should adjust the search length only if an
> > > > alignment request is bigger than one page. For a one-page (PAGE_SIZE)
> > > > alignment there is no reason for such an extra adjustment, because
> > > > every free->va_start sits on a page boundary anyway.
> > > > 
> > > > Could you please test below patch that is a generic improvement?
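> > > > 
> > > > The gist of it, as a rough sketch (the helper name is only
> > > > illustrative, this is not the patch itself):
> > > > 
> > > > <snip>
> > > >     static unsigned long
> > > >     adjust_search_size(unsigned long size, unsigned long align)
> > > >     {
> > > >         /*
> > > >          * Every free->va_start is page aligned, so a PAGE_SIZE
> > > >          * alignment request never needs extra search room.
> > > >          */
> > > >         if (align <= PAGE_SIZE)
> > > >             return size;
> > > > 
> > > >         /*
> > > >          * For a bigger alignment, reserve the worst-case padding
> > > >          * between a page-aligned va_start and the next boundary
> > > >          * that is a multiple of 'align'.
> > > >          */
> > > >         return size + align - PAGE_SIZE;
> > > >     }
> > > > <snip>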
> > > 
> > > I played with the exact approach below (almost exactly the same code in
> > > find_vmap_lowest_match()), and it might make sense as a general improvement
> > > -- if we can guarantee that start+end of ranges are always PAGE-aligned; I
> > > was not able to come to that conclusion...
> > All free blocks are page aligned; that is how it has to be. The vstart
> > should also be aligned, otherwise __vunmap() will complain about freeing
> > a bad address:
> > 
> > <snip>
> >      if (WARN(!PAGE_ALIGNED(addr), "Trying to vfree() bad address (%p)\n",
> >              addr))
> >          return;
> > <snip>
> > 
> > BTW, we should add an extra check to alloc_vmap_area(), something like
> > below:
> > 
> > <snip>
> >      if (!PAGE_ALIGNED(ALIGN(vstart, align))) {
> >          WARN_ONCE(1, "vmalloc: vstart or align are not page aligned (0x%lx, 0x%lx)\n",
> >              vstart, align);
> >          return ERR_PTR(-EBUSY);
> >      }
> > <snip>
> > 
> > to check that the passed pair of vstart and align is correct.
> > 
> 
> Yes, we should.
> 
> > > 
> > > vmap_init_free_space() shows me that our existing alignment code/checks
> > > might be quite fragile.
> > > 
> > It is not important to page-align the first address. As I mentioned
> > before, vstart and align have to be correct, and we should add such a
> > check.
> > 
> > > 
> > > But I mainly decided to go with my patch instead because it also works
> > > for exact allocations with align > PAGE_SIZE: most notably, when we try
> > > to populate huge pages instead of base pages in __vmalloc_node_range()
> > > by increasing the alignment. If the exact range allows for using huge
> > > pages, placing huge pages will work just fine with my modifications,
> > > while it won't with your approach.
> > > 
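> > > A hedged sketch of what I mean (the 'start' and 'size' variables are
> > > made up for illustration):
> > > 
> > > <snip>
> > >     /*
> > >      * Exact range: [start, start + size) offers no slack at all,
> > >      * yet PMD_SIZE alignment is requested so that huge pages can
> > >      * be placed whenever the range happens to allow it.
> > >      */
> > >     void *p = __vmalloc_node_range(size, PMD_SIZE, start, start + size,
> > >                                    GFP_KERNEL, PAGE_KERNEL, 0,
> > >                                    NUMA_NO_NODE,
> > >                                    __builtin_return_address(0));
> > > <snip>
> > > 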
> > > Long story short: my approach also handles exact allocations with
> > > bigger alignment; your optimization makes sense as a general improvement
> > > for any vmalloc allocation.
> > > 
> > > Thanks for having a look!
> > > 
> > The problem is that there are no users (it seems only KASAN) who set the
> > range to an exact size. If you add an alignment overhead on top of
> 
> So there is one user and it was broken by 68ad4a330433 ("mm/vmalloc.c: keep
> track of free blocks for vmap allocation").
> 
> > it, the search size should include it: we may end up with an exactly
> > fitting free block that no longer fits once the alignment is applied.
> > This is how allocators handle alignment.
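> > 
> > A made-up example of that overhead: a free block [0x1000, 0x3000) exactly
> > fits a 0x2000-byte request with PAGE_SIZE alignment, but with align =
> > 0x2000 the start moves to ALIGN(0x1000, 0x2000) = 0x2000, and the 0x2000
> > bytes no longer fit before 0x3000. So the search length has to include
> > the alignment overhead up front.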
> 
> This is an implementation detail of the vmalloc allocator ever since
> 68ad4a330433 ("mm/vmalloc.c: keep track of free blocks for vmap
> allocation").
> 
> If callers pass an exact range and the alignment they specify applies, why
> should we reject such an allocation? Doing so leaks an implementation
> detail, one easily fixed internally, into the callers.
> 
> > 
> > Another approach is (you mentioned it in your commit message):
> > 
> > urezki@...38:~/data/raid0/coding/linux-next.git$ git diff mm/kasan/shadow.c
> > diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
> > index 082ee5b6d9a1..01d3bd76c851 100644
> > --- a/mm/kasan/shadow.c
> > +++ b/mm/kasan/shadow.c
> > @@ -200,7 +200,7 @@ static int __meminit kasan_mem_notifier(struct notifier_block *nb,
> >                  if (shadow_mapped(shadow_start))
> >                          return NOTIFY_OK;
> > -               ret = __vmalloc_node_range(shadow_size, PAGE_SIZE, shadow_start,
> > +               ret = __vmalloc_node_range(shadow_size, 1, shadow_start,
> >                                          shadow_end, GFP_KERNEL,
> >                                          PAGE_KERNEL, VM_NO_GUARD,
> >                                          pfn_to_nid(mem_data->start_pfn),
> > 
> > Why not? Also, you can increase the range in the KASAN code.
> 
> No, that's leaking implementation details to the caller. And no, increasing
> the range and eventually allocating something bigger (e.g., placing a huge
> page where it might not have been possible) is not acceptable for KASAN.
> 
> If you're terribly unhappy with this patch,
Sorry to say, but it simply does not make sense.

>
> please suggest something reasonable to fix exact allocations:
> a) Fixes the KASAN use case.
> b) Allows for automatic placement of huge pages for exact allocations.
> c) Doesn't leak implementation details into the caller.
>
I am looking at it.

--
Vlad Rezki
