lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMuHMdW+wNma-Nr+595kRgZfG_fS2zgbZxJNbtcNRxnmpCJ=5w@mail.gmail.com>
Date:   Thu, 31 Aug 2023 11:43:25 +0200
From:   Geert Uytterhoeven <geert@...ux-m68k.org>
To:     Peng Zhang <zhangpeng.00@...edance.com>
Cc:     Michael Ellerman <mpe@...erman.id.au>,
        "Liam R. Howlett" <Liam.Howlett@...cle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        maple-tree@...ts.infradead.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, stable@...r.kernel.org,
        linux-renesas-soc@...r.kernel.org
Subject: Re: [PATCH v2 1/2] maple_tree: Disable mas_wr_append() when other
 readers are possible

Hi Peng,

On Thu, Aug 31, 2023 at 10:45 AM Peng Zhang <zhangpeng.00@...edance.com> wrote:
> 在 2023/8/31 16:25, Geert Uytterhoeven 写道:
> > On Thu, Aug 31, 2023 at 7:39 AM Michael Ellerman <mpe@...erman.id.au> wrote:
> >> Geert Uytterhoeven <geert@...ux-m68k.org> writes:
> >>> On Fri, 18 Aug 2023, Liam R. Howlett wrote:
> >>>> The current implementation of append may cause duplicate data and/or
> >>>> incorrect ranges to be returned to a reader during an update.  Although
> >>>> this has not been reported or seen, disable the append write operation
> >>>> while the tree is in rcu mode out of an abundance of caution.
> >>>>
> >>>> During the analysis of the mas_next_slot() the following was
> >>>> artificially created by separating the writer and reader code:
> >>>>
> >>>> Writer:                                 reader:
> >>>> mas_wr_append
> >>>>     set end pivot
> >>>>     updates end metata
> >>>>     Detects write to last slot
> >>>>     last slot write is to start of slot
> >>>>     store current contents in slot
> >>>>     overwrite old end pivot
> >>>>                                         mas_next_slot():
> >>>>                                                 read end metadata
> >>>>                                                 read old end pivot
> >>>>                                                 return with incorrect range
> >>>>     store new value
> >>>>
> >>>> Alternatively:
> >>>>
> >>>> Writer:                                 reader:
> >>>> mas_wr_append
> >>>>     set end pivot
> >>>>     updates end metata
> >>>>     Detects write to last slot
> >>>>     last lost write to end of slot
> >>>>     store value
> >>>>                                         mas_next_slot():
> >>>>                                                 read end metadata
> >>>>                                                 read old end pivot
> >>>>                                                 read new end pivot
> >>>>                                                 return with incorrect range
> >>>>     set old end pivot
> >>>>
> >>>> There may be other accesses that are not safe since we are now updating
> >>>> both metadata and pointers, so disabling append if there could be rcu
> >>>> readers is the safest action.
> >>>>
> >>>> Fixes: 54a611b60590 ("Maple Tree: add new data structure")
> >>>> Cc: stable@...r.kernel.org
> >>>> Signed-off-by: Liam R. Howlett <Liam.Howlett@...cle.com>
> >>>
> >>> Thanks for your patch, which is now commit cfeb6ae8bcb96ccf
> >>> ("maple_tree: disable mas_wr_append() when other readers are
> >>> possible") in v6.5, and is being backported to stable.
> >>>
> >>> On Renesas RZ/A1 and RZ/A2 (single-core Cortex-A9), this causes the
> >>> following warning:
> >>>
> >>>        clocksource: timer@...3b000: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 28958491609 ns
> >>>        sched_clock: 32 bits at 66MHz, resolution 15ns, wraps every 32537631224ns
> >>>        /soc/timer@...3b000: used for clocksource
> >>>        /soc/timer@...3c000: used for clock events
> >>>       +------------[ cut here ]------------
> >>>       +WARNING: CPU: 0 PID: 0 at init/main.c:992 start_kernel+0x2f0/0x480
> >>>       +Interrupts were enabled early
> >> ...
> >>>
> >>> I do not see this issue on any other platform
> >>> (arm/arm64/risc-v/mips/sh/m68k), several of them use the same
> >>> RCU configuration.
> >>
> >> There's something similar on pmac32 / mac99.
> >>
> >>> Do you have a clue?
> >>
> >> It seems something in the maple tree code is setting TIF_NEED_RESCHED,
> >> and that causes a subsequent call to cond_resched() to call schedule()
> >> and enable interrupts.
> >>
> >> On pmac32 enabling CONFIG_DEBUG_ATOMIC_SLEEP fixes/hides the problem.
> >> But I don't see why.
> >
> > Enabling CONFIG_DEBUG_ATOMIC_SLEEP on RZ/A1 and RZ/A2 does
> > fix the problem.
> > But there must be more to it, as some of my test configs had it enabled,
> > and others hadn't, while only RZ/A showed the issue.
> > I tried disabling it on R-Car M2-W (arm32) and R-Car H3 (arm64), and
> > that did not cause the problem to happen...
>
> I guess this patch triggers a potential problem with the maple tree.
> I don't have the environment to trigger the problem. Can you apply the
> following patch to see if the problem still exists? This can help locate
> the root cause. At least narrow down the scope of the code that has bug.

Thanks for your suggestion!

> diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> index f723024e1426..1b4b6f6e3095 100644
> --- a/lib/maple_tree.c
> +++ b/lib/maple_tree.c
> @@ -4351,9 +4351,6 @@ static inline void mas_wr_modify(struct
> ma_wr_state *wr_mas)
>         if (new_end == wr_mas->node_end && mas_wr_slot_store(wr_mas))
>                 return;
>
> -       if (mas_wr_node_store(wr_mas, new_end))
> -               return;
> -
>         if (mas_is_err(mas))
>                 return;

Unfortunately removing these lines does not help.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ