lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230324135550.q2xfuj4hjs7odbu5@box>
Date:   Fri, 24 Mar 2023 16:55:50 +0300
From:   "Kirill A. Shutemov" <kirill@...temov.name>
To:     Joel Fernandes <joel@...lfernandes.org>
Cc:     Michal Hocko <mhocko@...e.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Naresh Kamboju <naresh.kamboju@...aro.org>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: WARN_ON in move_normal_pmd

On Fri, Mar 24, 2023 at 09:43:10AM -0400, Joel Fernandes wrote:
> On Fri, Mar 24, 2023 at 9:05 AM Kirill A. Shutemov <kirill@...temov.name> wrote:
> >
> > On Fri, Mar 24, 2023 at 12:15:24PM +0100, Michal Hocko wrote:
> > > Hi,
> > > our QA is regularly hitting
> > > [  544.198822][T20518] WARNING: CPU: 1 PID: 20518 at ../mm/mremap.c:255 move_pgt_entry+0x4c6/0x510
> > > triggered by thp01 LTP test. This has been brought up in the past and
> > > resulted in f81fdd0c4ab7 ("mm: document warning in move_normal_pmd() and
> > > make it warn only once"). While it is good that the underlying problem
> > > is understood, it doesn't seem there is enough interest to fix it
> > > properly. Which is fair but I am wondering whether the WARN_ON gives
> > > us anything here.
> > >
> > > Our QA process collects all unexpected side effects of tests and a WARN*
> > > in the log is certainly one of those. This trigger bugs which are mostly
> > > ignored as there is no upstream fix for them. This alone is nothing
> > > really critical but there are workloads which do run with panic on warn
> > > configured and this issue would put the system down which is unnecessary
> > > IMHO. Would it be sufficient to replace the WARN_ON_ONCE by
> > > pr_warn_once?
> >
> > What about relaxing the check to exclude temporary stack from the WARN
> > condition:
> >
> > diff --git a/mm/mremap.c b/mm/mremap.c
> > index 411a85682b58..eb0778b9d645 100644
> > --- a/mm/mremap.c
> > +++ b/mm/mremap.c
> > @@ -247,15 +247,12 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
> >          * of any 4kB pages, but still there) PMD in the page table
> >          * tree.
> >          *
> > -        * Warn on it once - because we really should try to figure
> > -        * out how to do this better - but then say "I won't move
> > -        * this pmd".
> > -        *
> > -        * One alternative might be to just unmap the target pmd at
> > -        * this point, and verify that it really is empty. We'll see.
> > +        * Warn on it once unless it is initial (temporary) stack.
> >          */
> > -       if (WARN_ON_ONCE(!pmd_none(*new_pmd)))
> > +       if (!pmd_none(*new_pmd)) {
> > +               WARN_ON_ONCE(!vma_is_temporary_stack(vma));
> >                 return false;
> > +       }
> 
> Wouldn't it be better to instead fix it from the caller side? Like
> making it non-overlapping.
> 
> Reading some old threads, I had tried to fix it [1] along these lines
> but Linus was rightfully concerned about that fix [2]. Maybe we can
> revisit and fix it properly this time.
> 
> Personally I feel the safest thing to do is to not do a
> non-overlapping mremap and get rid of the warning. Or is there a
> better way like unmapping the target from the caller side first,
> before the move?

Making it non-overlapping limits randomization effectiveness. We need to
quantify it at least.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ