[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <68350BC2-074F-4776-B5CD-1335311CC1D8@vmware.com>
Date: Wed, 21 Sep 2022 18:03:11 +0000
From: Nadav Amit <namit@...are.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Borislav Petkov <bp@...en8.de>, Ingo Molnar <mingo@...hat.com>,
"# 5 . 10+" <stable@...r.kernel.org>,
"Steven Rostedt (Google)" <rostedt@...dmis.org>
Subject: Re: [PATCH] x86/alternative: fix race in try_get_desc()
On Sep 21, 2022, at 12:52 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> ⚠ External Email
>
> On Tue, Sep 20, 2022 at 10:47:43PM +0000, Nadav Amit wrote:
>
>> Fix this issue with small backportable patch. Instead of trying to make
>> RCU-like behavior for bp_desc, just eliminate the unnecessary level of
>> indirection of bp_desc, and hold the whole descriptor on the stack.
>> Anyhow, there is only a single descriptor at any given moment.
>
> Because of text_mutex; indeed. No idea why I put that thing on the
> stack.
>
> I've done a few minor edits to your patch, but it otherwise looks good
> to me.
>
> ---
> Subject: x86/alternative: Fix race in try_get_desc()
> From: Nadav Amit <namit@...are.com>
> Date: Tue, 20 Sep 2022 22:47:43 +0000
>
> From: Nadav Amit <namit@...are.com>
>
> The text poke mechanism claims to have an RCU-like behavior, but it does
> not appear that there is any quiescent state to ensure that nobody holds
> reference to desc. As a result, the following race appears to be
> possible, which can lead to memory corruption.
>
> CPU0 CPU1
> ---- ----
> text_poke_bp_batch()
> -> smp_store_release(&bp_desc, &desc)
>
> [ notice that desc is on
> the stack ]
>
> poke_int3_handler()
>
> [ int3 might be kprobe's
> so sync events are do not
> help ]
>
> -> try_get_desc(descp=&bp_desc)
> desc = __READ_ONCE(bp_desc)
>
> if (!desc) [false, success]
> WRITE_ONCE(bp_desc, NULL);
> atomic_dec_and_test(&desc.refs)
>
> [ success, desc space on the stack
> is being reused and might have
> non-zero value. ]
> arch_atomic_inc_not_zero(&desc->refs)
>
> [ might succeed since desc points to
> stack memory that was freed and might
> be reused. ]
>
> I encountered some occasional crashes of poke_int3_handler() when
> kprobes are set, while accessing desc->vec. The analysis has been done
> offline and I did not corroborate the cause of the crashes. Yet, it
> seems that this race might be the root cause.
>
> Fix this issue with small backportable patch. Instead of trying to make
> RCU-like behavior for bp_desc, just eliminate the unnecessary level of
> indirection of bp_desc, and hold the whole descriptor on the stack.
Looks good to me. Thanks for improving my patch.
I just made one small mistake in commit message. It should say “hold
the whole descriptor as a global” in the line above.
I will send v2 with your changes and the updated commit message.
Thanks again.
Powered by blists - more mailing lists