Message-ID: <ch2lsauvt5zflpbsfpop5wnoxqfbx5p6yvf3wervzdkbrtmj7b@p35knagkslw5>
Date: Tue, 19 Nov 2024 10:16:48 -0500
From: "Liam R. Howlett" <Liam.Howlett@...cle.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Jann Horn <jannh@...gle.com>,
        syzbot+bc6bfc25a68b7a020ee1@...kaller.appspotmail.com,
        Vlastimil Babka <vbabka@...e.cz>, stable@...r.kernel.org
Subject: Re: [PATCH 6.12.y] mm/mmap: fix __mmap_region() error handling in
 rare merge failure case

* Lorenzo Stoakes <lorenzo.stoakes@...cle.com> [241119 09:59]:
> On Mon, Nov 18, 2024 at 02:40:48PM -0500, Liam R. Howlett wrote:
> > From: "Liam R. Howlett" <Liam.Howlett@...cle.com>
> >
> > The mmap_region() function tries to install a new vma, which requires a
> > pre-allocation for the maple tree write due to the complex locking
> > scenarios involved.
> >
> > Recent efforts to simplify the error recovery required the relocation of
> > the preallocation of the maple tree nodes (via vma_iter_prealloc()
> > calling mas_preallocate()) higher in the function.
> >
> > The relocation of the preallocation meant that, if there was a file
> > associated with the vma and the driver call (mmap_file()) modified the
> > vma flags, then a new merge of the new vma with existing vmas is
> > attempted.
> >
> > During the attempt to merge the existing vma with the new vma, the vma
> > iterator is used - the same iterator that would be used for the next
> > write attempt to the tree.  If a further allocation is needed and that
> > allocation fails, the vma iterator (and the contained maple state) will
> > be cleaned up, including freeing all previous allocations, and will be
> > reset internally.
> >
> > Upon returning to the __mmap_region() function, the error reason is lost
> > and the function sets the vma iterator limits, and then tries to
> > continue to store the new vma using vma_iter_store() - which expects
> > preallocated nodes.
> >
> > A preallocation should be performed in case the allocations were lost
> > during the failure scenario - there is no risk of over allocating.  The
> > range is already set in the vma_iter_store() call below, so the
> > vma_iter_config() call is not necessary.
> >
> > Reported-by: syzbot+bc6bfc25a68b7a020ee1@...kaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/x/log.txt?x=17b0ace8580000
> > Fixes: 5de195060b2e2 ("mm: resolve faulty mmap_region() error path behaviour")
> > Signed-off-by: Liam R. Howlett <Liam.Howlett@...cle.com>
> > Cc: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> > Cc: Vlastimil Babka <vbabka@...e.cz>
> > Cc: Jann Horn <jannh@...gle.com>
> > Cc: <stable@...r.kernel.org>
> > ---
> >  mm/mmap.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 79d541f1502b2..5cef9a1981f1b 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -1491,7 +1491,10 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr,
> >  				vm_flags = vma->vm_flags;
> >  				goto file_expanded;
> >  			}
> > -			vma_iter_config(&vmi, addr, end);
> > +			if (vma_iter_prealloc(&vmi, vma)) {
> > +				error = -ENOMEM;
> > +				goto unmap_and_free_file_vma;
> > +			}
> 
> This breaks the whole thing of these changes which is to avoid having to
> abort after invoking mmap_file(), so we can't do it this way.
> 
> As discussed via IRC, it's probably best to do this with __GFP_NOFAIL or
> similar to avoid the abort altogether.
> 
> But reaching this point is very common, since drivers will often change the
> flags, and failing to merge is also very common.  So we need to be sure this
> is a no-op unless a pre-allocation is actually required, which happens only
> in the _very rare_ (if even possible in reality) case of a merge failure due
> to OOM.

The tree will do nothing if enough nodes (or more than needed) are
already preallocated for the operation.  If reaching this point is
common, though, we should avoid the call into the tree by testing
vmg_nomem() first.
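
As a rough illustration of the two points above, here is a userspace
sketch, not kernel code.  Every name in it (toy_iter, toy_merge_state,
toy_prealloc(), toy_nomem(), toy_merge_fail_oom(),
toy_after_merge_attempt()) is hypothetical and only models the behaviour
described in this thread: a preallocation that is a no-op when enough
nodes are already held, and a guard that skips the call entirely unless
the merge failed due to OOM.  It does not model how an allocation
failure at this point should be handled (for example the __GFP_NOFAIL
suggestion); it only reports it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the maple state held by a vma iterator. */
struct toy_iter {
	unsigned int allocated;   /* nodes currently preallocated */
	unsigned int alloc_calls; /* times the allocator was really used */
	bool allocator_fails;     /* simulate memory pressure */
};

/* Hypothetical stand-in for the merge state (the "vmg"). */
struct toy_merge_state {
	bool nomem;               /* set when a merge failed due to OOM */
};

/* Models the vmg_nomem() test: did the merge fail for lack of memory? */
static bool toy_nomem(const struct toy_merge_state *vmg)
{
	return vmg->nomem;
}

/*
 * Models a merge that hits OOM: the iterator is cleaned up, so all
 * previous preallocations are lost, and the reason is recorded.
 */
static void toy_merge_fail_oom(struct toy_iter *it,
			       struct toy_merge_state *vmg)
{
	it->allocated = 0;
	vmg->nomem = true;
}

/*
 * Models a preallocation request for 'needed' nodes.  Does nothing if
 * enough (or more) are already held.  Returns 0 on success, -1 on
 * simulated allocation failure.
 */
static int toy_prealloc(struct toy_iter *it, unsigned int needed)
{
	if (it->allocated >= needed)
		return 0;
	if (it->allocator_fails)
		return -1;
	it->alloc_calls++;
	it->allocated = needed;
	return 0;
}

/*
 * Common path after a merge attempt: skip the call into the "tree"
 * unless the merge state says the failure was due to OOM.
 */
static int toy_after_merge_attempt(struct toy_iter *it,
				   const struct toy_merge_state *vmg,
				   unsigned int needed)
{
	if (!toy_nomem(vmg))
		return 0;
	return toy_prealloc(it, needed);
}
```

In this model the common case (the merge simply did not happen) never
reaches toy_prealloc(), and only the rare OOM case pays for a second
allocation.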

> 
> We store in the vmg state whether a merge failed due to oom, which you can
> test for explicitly via vmg_nomem().

Okay, I will respin.

Thanks,
Liam
