linux-kernel - Re: KASAN: global-out-of-bounds Read in srcu_gp_start_if

Message-ID: <20250304035732.GA128190@joelnvbox>
Date: Mon, 3 Mar 2025 22:57:32 -0500
From: Joel Fernandes <joelagnelf@...dia.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Strforexc yn <strforexc@...il.com>,
	Lai Jiangshan <jiangshanlai@...il.com>,
	"Paul E. McKenney" <paulmck@...nel.org>,
	Josh Triplett <josh@...htriplett.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	rcu@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: KASAN: global-out-of-bounds Read in srcu_gp_start_if_needed

On Mon, Mar 03, 2025 at 11:47:11AM -0500, Steven Rostedt wrote:
[...]
> > [   92.322347][   T28]  register_lock_class+0xb2/0xfc0
> > [   92.322366][   T28]  ? __lock_acquire+0xb97/0x16a0
> > [   92.322386][   T28]  ? __pfx_register_lock_class+0x10/0x10
> > [   92.322407][   T28]  ? do_perf_trace_lock.isra.0+0x10b/0x570
> > [   92.322427][   T28]  __lock_acquire+0xc3/0x16a0
> > [   92.322446][   T28]  ? __pfx___lock_release+0x10/0x10
> > [   92.322466][   T28]  ? rcu_is_watching+0x12/0xd0
> > [   92.322486][   T28]  lock_acquire+0x181/0x3a0
> > [   92.322505][   T28]  ? srcu_gp_start_if_needed+0x1a9/0x5f0
> > [   92.322522][   T28]  ? __pfx_lock_acquire+0x10/0x10
> > [   92.322541][   T28]  ? debug_object_active_state+0x2f1/0x3f0
> > [   92.322557][   T28]  ? do_raw_spin_trylock+0xb4/0x190
> > [   92.322570][   T28]  ? __pfx_do_raw_spin_trylock+0x10/0x10
> > [   92.322583][   T28]  ? __kmalloc_cache_noprof+0x1b9/0x450
> > [   92.322604][   T28]  _raw_spin_trylock+0x76/0xa0
> > [   92.322619][   T28]  ? srcu_gp_start_if_needed+0x1a9/0x5f0
> > [   92.322636][   T28]  srcu_gp_start_if_needed+0x1a9/0x5f0
> 
> The lock taken is from the passed in rcu_pending pointer.
> 
> > [   92.322655][   T28]  rcu_pending_enqueue+0x686/0xd30
> > [   92.322676][   T28]  ? __pfx_rcu_pending_enqueue+0x10/0x10
> > [   92.322693][   T28]  ? trace_lock_release+0x11a/0x180
> > [   92.322708][   T28]  ? bkey_cached_free+0xa3/0x170
> > [   92.322725][   T28]  ? lock_release+0x13/0x180
> > [   92.322744][   T28]  ? bkey_cached_free+0xa3/0x170
> > [   92.322760][   T28]  bkey_cached_free+0xfd/0x170
> 
> Which has:
> 
> static void bkey_cached_free(struct btree_key_cache *bc,
>                              struct bkey_cached *ck)
> {
>         kfree(ck->k);
>         ck->k           = NULL;
>         ck->u64s        = 0;
>                 
>         six_unlock_write(&ck->c.lock);
>         six_unlock_intent(&ck->c.lock);
> 
>         bool pcpu_readers = ck->c.lock.readers != NULL;
>         rcu_pending_enqueue(&bc->pending[pcpu_readers], &ck->rcu);
>         this_cpu_inc(*bc->nr_pending);
> }
> 
> So if that bc->pending[pcpu_readers] gets corrupted in any way, that could trigger this.

True. Another way to hit this is corruption of the per-cpu global data
section, because the crash happens in this trylock, per the above stack:

 srcu_gp_start_if_needed ->
	spin_lock_irqsave_sdp_contention(sdp) ->
		spin_trylock(sdp->lock)

	where sdp is ssp->sda and is allocated from per-cpu storage.

So corruption of the per-cpu global data section can also trigger this, even
if the rcu_pending pointer is intact.
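
To make the indirection concrete, here is a rough userspace model of the chain
above. It is only a sketch: every model_* name, the canary field, and the
integer "lock" are made up for illustration and are not the kernel's actual
definitions. The point it tries to show is that the lock lives in the per-cpu
area reached through ssp->sda, so damage there is seen at the trylock even
when the rcu_pending pointer and the srcu_struct it refers to are both fine.

```c
/* Userspace model only; none of these types are the kernel's real definitions. */
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4
#define MODEL_MAGIC   0x5ca1ab1eUL	/* hypothetical canary value */

enum model_result { MODEL_LOCKED, MODEL_BUSY, MODEL_CORRUPT };

/* Stand-in for struct srcu_data: one instance per CPU, each with its own lock. */
struct model_srcu_data {
	int lock;		/* 0 = free, 1 = held; stand-in for a spinlock */
	unsigned long magic;	/* lets the model notice corruption */
};

/* Stand-in for struct srcu_struct: ->sda refers to the per-CPU instances. */
struct model_srcu_struct {
	struct model_srcu_data *sda;
};

/* Stand-in for the rcu_pending object passed in by the caller. */
struct model_rcu_pending {
	struct model_srcu_struct *srcu;
};

/* Stand-in for the per-cpu global data section. */
static struct model_srcu_data model_percpu_area[MODEL_NR_CPUS];

static void model_init(struct model_srcu_struct *ssp)
{
	assert(ssp != NULL);
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		model_percpu_area[cpu].lock = 0;
		model_percpu_area[cpu].magic = MODEL_MAGIC;
	}
	ssp->sda = model_percpu_area;
}

/* Models the trylock at the bottom of the call chain for a given CPU. */
static enum model_result model_trylock_sdp(struct model_rcu_pending *p, int cpu)
{
	struct model_srcu_data *sdp = &p->srcu->sda[cpu];

	if (sdp->magic != MODEL_MAGIC)
		return MODEL_CORRUPT;	/* per-CPU data damaged; p itself is intact */
	if (sdp->lock)
		return MODEL_BUSY;	/* lock already held */
	sdp->lock = 1;
	return MODEL_LOCKED;
}
```

In this model, overwriting model_percpu_area[cpu] makes model_trylock_sdp()
report MODEL_CORRUPT while the model_rcu_pending pointer is unchanged, which
mirrors the second scenario described above.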

thanks,

 - Joel