Message-ID: <989cddf0-a3af-4f84-8b68-876e811a8795@lucifer.local>
Date: Tue, 19 Nov 2024 14:59:36 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: "Liam R. Howlett" <Liam.Howlett@...cle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Jann Horn <jannh@...gle.com>,
syzbot+bc6bfc25a68b7a020ee1@...kaller.appspotmail.com,
Vlastimil Babka <vbabka@...e.cz>, stable@...r.kernel.org
Subject: Re: [PATCH 6.12.y] mm/mmap: fix __mmap_region() error handling in
rare merge failure case
On Mon, Nov 18, 2024 at 02:40:48PM -0500, Liam R. Howlett wrote:
> From: "Liam R. Howlett" <Liam.Howlett@...cle.com>
>
> The mmap_region() function tries to install a new vma, which requires a
> pre-allocation for the maple tree write due to the complex locking
> scenarios involved.
>
> Recent efforts to simplify the error recovery required the relocation of
> the preallocation of the maple tree nodes (via vma_iter_prealloc()
> calling mas_preallocate()) higher in the function.
>
> The relocation of the preallocation meant that, if there was a file
> associated with the vma and the driver call (mmap_file()) modified the
> vma flags, then a new merge of the new vma with existing vmas is
> attempted.
>
> During the attempt to merge the existing vma with the new vma, the vma
> iterator is used - the same iterator that would be used for the next
> write attempt to the tree. In the event of needing a further allocation
> and if the new allocation fails, the vma iterator (and contained maple
> state) will be cleaned up, including freeing all previous allocations, and
> will be reset internally.
>
> Upon returning to the __mmap_region() function, the error reason is lost
> and the function sets the vma iterator limits, and then tries to
> continue to store the new vma using vma_iter_store() - which expects
> preallocated nodes.
>
> A preallocation should be performed in case the allocations were lost
> during the failure scenario - there is no risk of over allocating. The
> range is already set in the vma_iter_store() call below, so setting it
> here via vma_iter_config() is not necessary.
>
> Reported-by: syzbot+bc6bfc25a68b7a020ee1@...kaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/x/log.txt?x=17b0ace8580000
> Fixes: 5de195060b2e2 ("mm: resolve faulty mmap_region() error path behaviour")
> Signed-off-by: Liam R. Howlett <Liam.Howlett@...cle.com>
> Cc: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> Cc: Vlastimil Babka <vbabka@...e.cz>
> Cc: Jann Horn <jannh@...gle.com>
> Cc: <stable@...r.kernel.org>
> ---
> mm/mmap.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 79d541f1502b2..5cef9a1981f1b 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1491,7 +1491,10 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr,
> vm_flags = vma->vm_flags;
> goto file_expanded;
> }
> - vma_iter_config(&vmi, addr, end);
> + if (vma_iter_prealloc(&vmi, vma)) {
> + error = -ENOMEM;
> + goto unmap_and_free_file_vma;
> + }

This defeats the whole point of these changes, which is to avoid having to
abort after invoking mmap_file(), so we can't do it this way.

As discussed via IRC, it's probably best to do this with __GFP_NOFAIL or
similar to avoid the abort altogether.

But reaching this point is very common, since drivers will often change the
flags, and failing to merge is also very common, so we need to be sure this
is a no-op unless a pre-allocation is actually required, which is only in
the _very rare_ (if even possible in reality) case of a merge failure due
to OOM.

We store in the vmg state whether a merge failed due to OOM, which you can
test for explicitly via vmg_nomem().
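
For illustration only, here is a userspace toy model of that control flow.
It is not kernel code and not the actual fix: every toy_* name is
hypothetical, the nomem flag merely stands in for what vmg_nomem() reports,
and toy_prealloc_nofail() stands in for a preallocation done with
__GFP_NOFAIL or similar. The point is that the re-reservation only happens
when the merge failed due to OOM, and is a no-op otherwise.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the vma iterator: tracks only the reservation. */
struct toy_iter {
	int reserved;	/* nodes reserved for the next tree write */
};

/* Hypothetical stand-in for the merge state (what vmg_nomem() would test). */
struct toy_merge_state {
	bool nomem;
};

enum toy_merge_result { TOY_MERGED, TOY_NO_MERGE };

/* Models an allocation that is not allowed to fail. */
static void toy_prealloc_nofail(struct toy_iter *it)
{
	it->reserved = 1;
}

/*
 * Models the merge attempt. If the merge needs more memory and cannot get
 * it, the iterator is cleaned up (the reservation is lost) and the reason
 * is recorded in the merge state.
 */
static enum toy_merge_result toy_try_merge(struct toy_iter *it,
					   struct toy_merge_state *st,
					   bool mergeable, bool oom)
{
	st->nomem = false;
	if (!mergeable)
		return TOY_NO_MERGE;	/* common case, reservation untouched */
	if (oom) {
		it->reserved = 0;
		st->nomem = true;
		return TOY_NO_MERGE;
	}
	return TOY_MERGED;
}

/*
 * After a failed merge: re-reserve only if the failure was due to OOM.
 * Returns the number of allocation calls made (0 in the common case).
 */
static int toy_recover_after_merge(struct toy_iter *it,
				   const struct toy_merge_state *st)
{
	if (!st->nomem)
		return 0;
	toy_prealloc_nofail(it);
	return 1;
}

/* Models the store, which expects a reservation to exist. */
static bool toy_store(struct toy_iter *it)
{
	if (it->reserved == 0)
		return false;
	it->reserved = 0;
	return true;
}
```

In this model the ordinary "flags changed but no merge" path makes no extra
allocation at all, while the OOM path restores the reservation without
introducing a new way to abort after the driver has been called.
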
> }
>
> vm_flags = vma->vm_flags;
> --
> 2.43.0
>