Message-ID: <e3c17dad-a93f-4885-8f14-69874be76482@redhat.com>
Date: Wed, 17 Jul 2024 13:00:58 +0200
From: David Hildenbrand <david@...hat.com>
To: "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@...el.com>,
"peili.dev@...il.com" <peili.dev@...il.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>
Cc: "Nikula, Jani" <jani.nikula@...el.com>,
"Saarinen, Jani" <jani.saarinen@...el.com>,
"Kurmi, Suresh Kumar" <suresh.kumar.kurmi@...el.com>,
"intel-gfx@...ts.freedesktop.org" <intel-gfx@...ts.freedesktop.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: Regression on linux-next (next-20240712)
On 16.07.24 07:37, Borah, Chaitanya Kumar wrote:
> Hello Pei,
>
> Hope you are doing well. I am Chaitanya from the Linux graphics team at Intel.
>
> This mail is regarding a regression we are seeing in our CI runs[1] on the linux-next repository.
>
> In version next-20240712[2], we saw the following regression (currently being masked by another regression):
>
> `````````````````````````````````````````````````````````````````````````````````
> <4>[ 14.530533] ============================================
> <4>[ 14.530533] WARNING: possible recursive locking detected
> <4>[ 14.530534] 6.10.0-rc7-next-20240712-next-20240712-g3fe121b62282+ #1 Not tainted
> <4>[ 14.530535] --------------------------------------------
> <4>[ 14.530535] (direxec)/171 is trying to acquire lock:
> <4>[ 14.530536] ffff8881010725d8 (&mm->mmap_lock){++++}-{3:3}, at: unmap_single_vma+0xea/0x170
> <4>[ 14.530541]
> but task is already holding lock:
> <4>[ 14.530542] ffff8881010725d8 (&mm->mmap_lock){++++}-{3:3}, at: exit_mmap+0x6a/0x450
> <4>[ 14.530545]
> other info that might help us debug this:
> <4>[ 14.530545] Possible unsafe locking scenario:
> `````````````````````````````````````````````````````````````````````````````````
> A detailed log can be found in [3].
>
> After bisecting the tree, the following patch [4] seems to be the first "bad"
> commit
>
> `````````````````````````````````````````````````````````````````````````````````````````````````````````
> commit a13252049629a8225f38a9be7d8d4fc4ff5350e8
> Author: Pei Li <peili.dev@...il.com>
> Date: Wed Jul 10 22:13:17 2024 -0700
>
> mm: fix mmap_assert_locked() in follow_pte()
>
> `````````````````````````````````````````````````````````````````````````````````````````````````````````
>
> We also verified that if we revert the patch the issue is not seen.
>
> Could you please check why the patch causes this regression and provide a fix if necessary?
This is known.
There is a discussion alongside the original patch [1] on how to do it
differently, but we'll likely tackle it another way [2]. So this patch
should be dropped, which I think has already happened because I can no
longer spot that patch in mm-unstable.
[1] https://lore.kernel.org/all/20240710-bug12-v1-1-0e5440f9b8d3@gmail.com/
[2] https://lkml.kernel.org/r/20240712144244.3090089-1-peterx@redhat.com
--
Cheers,
David / dhildenb