Message-ID: <20250514043326.GA4318@system.software.com>
Date: Wed, 14 May 2025 13:33:27 +0900
From: Byungchul Park <byungchul@...com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Gavin Guo <gavinguo@...lia.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, muchun.song@...ux.dev,
osalvador@...e.de, kernel-dev@...lia.com, stable@...r.kernel.org,
Hugh Dickins <hughd@...gle.com>, Florent Revest <revest@...gle.com>,
Gavin Shan <gshan@...hat.com>, kernel_team@...ynix.com
Subject: Re: [PATCH] mm/hugetlb: fix a deadlock with pagecache_folio and
hugetlb_fault_mutex_table
On Tue, May 13, 2025 at 05:56:33PM -0700, Andrew Morton wrote:
> On Tue, 13 May 2025 17:34:48 +0800 Gavin Guo <gavinguo@...lia.com> wrote:
>
> > The patch fixes a deadlock which can be triggered by an internal
> > syzkaller [1] reproducer and captured by bpftrace script [2] and its log
> > [3] in this scenario:
> >
> > Process 1                              Process 2
> > ---                                    ---
> > hugetlb_fault
> >   mutex_lock(B) // take B
> >   filemap_lock_hugetlb_folio
> >     filemap_lock_folio
> >       __filemap_get_folio
> >         folio_lock(A) // take A
> >   hugetlb_wp
> >     mutex_unlock(B) // release B
> >     ...                                hugetlb_fault
> >     ...                                  mutex_lock(B) // take B
> >                                          filemap_lock_hugetlb_folio
> >                                            filemap_lock_folio
> >                                              __filemap_get_folio
> >                                                folio_lock(A) // blocked
> >     unmap_ref_private
> >     ...
> >     mutex_lock(B) // retake and blocked
> > This is an ABBA deadlock involving two locks:
> > - Lock A: pagecache_folio lock
> > - Lock B: hugetlb_fault_mutex_table lock
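
To make the ordering problem concrete, here is a minimal userspace sketch
of a lock-order tracker replaying Process 1 from the trace above. It is
only an illustration and is not how lockdep or DEPT is implemented. All
names in it (order_tracker, tracker_acquire, replay_process1, LOCK_A,
LOCK_B) are hypothetical, it models a single task, and it only detects a
direct two-lock inversion rather than longer cycles.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes: A = pagecache folio lock, B = fault mutex. */
enum lock_class { LOCK_A, LOCK_B, NR_CLASSES };

struct order_tracker {
	bool held[NR_CLASSES];             /* locks the task holds right now */
	bool edge[NR_CLASSES][NR_CLASSES]; /* edge[x][y]: y taken while x held */
};

static void tracker_init(struct order_tracker *t)
{
	memset(t, 0, sizeof(*t));
}

/*
 * Record that lock class c is being acquired. Returns true if the
 * opposite ordering was already recorded, i.e. an ABBA inversion.
 */
static bool tracker_acquire(struct order_tracker *t, enum lock_class c)
{
	bool inversion = false;

	assert(c < NR_CLASSES);
	for (int h = 0; h < NR_CLASSES; h++) {
		if (!t->held[h] || h == (int)c)
			continue;
		t->edge[h][c] = true;      /* remember order h -> c */
		if (t->edge[c][h])         /* order c -> h seen before? */
			inversion = true;
	}
	t->held[c] = true;
	return inversion;
}

static void tracker_release(struct order_tracker *t, enum lock_class c)
{
	t->held[c] = false;
}

/* Replay the lock operations of Process 1 in the trace above. */
static bool replay_process1(struct order_tracker *t)
{
	bool bad = false;

	bad |= tracker_acquire(t, LOCK_B); /* mutex_lock(B) */
	bad |= tracker_acquire(t, LOCK_A); /* folio_lock(A): records B -> A */
	tracker_release(t, LOCK_B);        /* mutex_unlock(B) in hugetlb_wp */
	bad |= tracker_acquire(t, LOCK_B); /* retake B with A held: A -> B */
	return bad;
}
```

In this sketch, Process 1 alone records both orderings (B -> A, then
A -> B), so replay_process1() reports the inversion without Process 2
having to run at all.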
>
> Nostalgia. A decade or three ago many of us spent much of our lives
> staring at ABBA deadlocks. Then came lockdep and after a few more
> years, it all stopped. I've long hoped that lockdep would gain a
> solution to custom locks such as folio_wait_bit_common(), but not yet.
>
> Byungchul, please take a look. Would DEPT
> (https://lkml.kernel.org/r/20250513100730.12664-1-byungchul@sk.com)
> have warned us about this?

Sure, I will check it. I think this type of deadlock is exactly what DEPT
is best at detecting.

Byungchul

> >
> > ...
> >
> > The deadlock occurs between two processes as follows:
> >
> > ...
> >
> > Fixes: 40549ba8f8e0 ("hugetlb: use new vma_lock for pmd sharing synchronization")
> > Cc: <stable@...r.kernel.org>
>
> It's been there for three years so I assume we aren't in a hurry.
>
> The fix looks a bit nasty, sorry. Perhaps designed for a minimal patch
> footprint? That's good for a backportable fixup, but a more broadly
> architected solution may be needed going forward.
>
> I'll queue it for 6.16-rc1 with a cc:stable, so this should be
> presented to the -stable trees 3-4 weeks from now.