Message-ID: <ZPEjqA46OO6Rr8RN@debian.me>
Date:   Fri, 1 Sep 2023 06:35:04 +0700
From:   Bagas Sanjaya <bagasdotme@...il.com>
To:     Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>,
        Hugh Dickins <hughd@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        Linux Regressions <regressions@...ts.linux.dev>
Subject: Re: 6.6/regression/bisected - after commit
 a349d72fd9efc87c8fd1d16d3164752d84a7275b system stopped booting
On Fri, Sep 01, 2023 at 03:45:28AM +0500, Mikhail Gavrilov wrote:
> Hi,
> A new release cycle, and another regression.
> Yesterday, after another kernel update in Fedora Rawhide, my system stopped booting.
> Today, thanks to git bisect, I found the offending commit:
> 
> ❯ git bisect bad
> a349d72fd9efc87c8fd1d16d3164752d84a7275b is the first bad commit
> commit a349d72fd9efc87c8fd1d16d3164752d84a7275b
> Author: Hugh Dickins <hughd@...gle.com>
> Date:   Tue Jul 11 21:30:40 2023 -0700
> 
>     mm/pgtable: add rcu_read_lock() and rcu_read_unlock()s
> 
>     Patch series "mm: free retracted page table by RCU", v3.
> 
>     Some mmap_lock avoidance i.e.  latency reduction.  Initially just for the
>     case of collapsing shmem or file pages to THPs: the usefulness of
>     MADV_COLLAPSE on shmem is being limited by that mmap_write_lock it
>     currently requires.
> 
>     Likely to be relied upon later in other contexts e.g.  freeing of empty
>     page tables (but that's not work I'm doing).  mmap_write_lock avoidance
>     when collapsing to anon THPs?  Perhaps, but again that's not work I've
>     done: a quick attempt was not as easy as the shmem/file case.
> 
>     These changes (though of course not these exact patches) have been in
>     Google's data centre kernel for three years now: we do rely upon them.
> 
> 
>     This patch (of 13):
> 
>     Before putting them to use (several commits later), add rcu_read_lock() to
>     pte_offset_map(), and rcu_read_unlock() to pte_unmap().  Make this a
>     separate commit, since it risks exposing imbalances: prior commits have
>     fixed all the known imbalances, but we may find some have been missed.
> 
>     Link: https://lkml.kernel.org/r/7cd843a9-aa80-14f-5eb2-33427363c20@google.com
>     Link: https://lkml.kernel.org/r/d3b01da5-2a6-833c-6681-67a3e024a16f@google.com
>     Signed-off-by: Hugh Dickins <hughd@...gle.com>
> <long cc list omitted>...
>     Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
> 
>  include/linux/pgtable.h | 4 ++--
>  mm/pgtable-generic.c    | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> It looks like the hang happens so early that, when I boot into a
> working kernel and run "journalctl -b -1", the console shows the log
> of the kernel that was booted before the problematic one.
> Therefore, I apologize that I can't provide the kernel logs.
> I can provide only photos of the backtrace as it appears on my monitor:
> Here the boot is waiting: https://ibb.co/5xmm0BH
> And then I see the backtrace: https://ibb.co/TLLGFNP
> 
> Unfortunately, I can't revert commit
> a349d72fd9efc87c8fd1d16d3164752d84a7275b to test more recent builds
> because the revert conflicts.
> 
> My hardware: https://linux-hardware.org/?probe=dd5735f315
> I have also attached the kernel build config and the full bisect log.
> 
Thanks for the regression report. I'm adding it to regzbot:
#regzbot ^introduced: a349d72fd9efc8
#regzbot title: rcu_read_{lock,unlock}() causes unbootable system with backtrace
-- 
An old man doll... just what I always wanted! - Clara