lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230905114841.GB3881391@google.com>
Date:   Tue, 5 Sep 2023 11:48:41 +0000
From:   Joel Fernandes <joel@...lfernandes.org>
To:     Lorenzo Stoakes <lstoakes@...il.com>
Cc:     linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Zqiang <qiang.zhang1211@...il.com>,
        Zhen Lei <thunder.leizhen@...weicloud.com>,
        rcu@...r.kernel.org, Matthew Wilcox <willy@...radead.org>,
        stable@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v3 2/2] rcu: Dump vmalloc memory info safely

On Tue, Sep 05, 2023 at 08:00:44AM +0100, Lorenzo Stoakes wrote:
> On Mon, Sep 04, 2023 at 06:08:05PM +0000, Joel Fernandes (Google) wrote:
> > From: Zqiang <qiang.zhang1211@...il.com>
> >
> > Currently, for double invoke call_rcu(), will dump rcu_head objects
> > memory info, if the objects is not allocated from the slab allocator,
> > the vmalloc_dump_obj() will be invoke and the vmap_area_lock spinlock
> > need to be held, since the call_rcu() can be invoked in interrupt context,
> > therefore, there is a possibility of spinlock deadlock scenarios.
> >
> > And in Preempt-RT kernel, the rcutorture test also trigger the following
> > lockdep warning:
> >
> > BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
> > in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
> > preempt_count: 1, expected: 0
> > RCU nest depth: 1, expected: 1
> > 3 locks held by swapper/0/1:
> >  #0: ffffffffb534ee80 (fullstop_mutex){+.+.}-{4:4}, at: torture_init_begin+0x24/0xa0
> >  #1: ffffffffb5307940 (rcu_read_lock){....}-{1:3}, at: rcu_torture_init+0x1ec7/0x2370
> >  #2: ffffffffb536af40 (vmap_area_lock){+.+.}-{3:3}, at: find_vmap_area+0x1f/0x70
> > irq event stamp: 565512
> > hardirqs last  enabled at (565511): [<ffffffffb379b138>] __call_rcu_common+0x218/0x940
> > hardirqs last disabled at (565512): [<ffffffffb5804262>] rcu_torture_init+0x20b2/0x2370
> > softirqs last  enabled at (399112): [<ffffffffb36b2586>] __local_bh_enable_ip+0x126/0x170
> > softirqs last disabled at (399106): [<ffffffffb43fef59>] inet_register_protosw+0x9/0x1d0
> > Preemption disabled at:
> > [<ffffffffb58040c3>] rcu_torture_init+0x1f13/0x2370
> > CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W          6.5.0-rc4-rt2-yocto-preempt-rt+ #15
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
> > Call Trace:
> >  <TASK>
> >  dump_stack_lvl+0x68/0xb0
> >  dump_stack+0x14/0x20
> >  __might_resched+0x1aa/0x280
> >  ? __pfx_rcu_torture_err_cb+0x10/0x10
> >  rt_spin_lock+0x53/0x130
> >  ? find_vmap_area+0x1f/0x70
> >  find_vmap_area+0x1f/0x70
> >  vmalloc_dump_obj+0x20/0x60
> >  mem_dump_obj+0x22/0x90
> >  __call_rcu_common+0x5bf/0x940
> >  ? debug_smp_processor_id+0x1b/0x30
> >  call_rcu_hurry+0x14/0x20
> >  rcu_torture_init+0x1f82/0x2370
> >  ? __pfx_rcu_torture_leak_cb+0x10/0x10
> >  ? __pfx_rcu_torture_leak_cb+0x10/0x10
> >  ? __pfx_rcu_torture_init+0x10/0x10
> >  do_one_initcall+0x6c/0x300
> >  ? debug_smp_processor_id+0x1b/0x30
> >  kernel_init_freeable+0x2b9/0x540
> >  ? __pfx_kernel_init+0x10/0x10
> >  kernel_init+0x1f/0x150
> >  ret_from_fork+0x40/0x50
> >  ? __pfx_kernel_init+0x10/0x10
> >  ret_from_fork_asm+0x1b/0x30
> >  </TASK>
> >
> > The previous patch fixes this by using the deadlock-safe best-effort
> > version of find_vm_area. However, in case of failure print the fact that
> > the pointer was a vmalloc pointer so that we print at least something.
> >
> > Reported-by: Zhen Lei <thunder.leizhen@...weicloud.com>
> > Cc: Paul E. McKenney <paulmck@...nel.org>
> > Cc: rcu@...r.kernel.org
> > Reviewed-by: Matthew Wilcox (Oracle) <willy@...radead.org>
> > Fixes: 98f180837a89 ("mm: Make mem_dump_obj() handle vmalloc() memory")
> > Cc: stable@...r.kernel.org
> > Signed-off-by: Zqiang <qiang.zhang1211@...il.com>
> > Signed-off-by: Joel Fernandes (Google) <joel@...lfernandes.org>
> > ---
> >  mm/util.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/util.c b/mm/util.c
> > index dd12b9531ac4..406634f26918 100644
> > --- a/mm/util.c
> > +++ b/mm/util.c
> > @@ -1071,7 +1071,9 @@ void mem_dump_obj(void *object)
> >  	if (vmalloc_dump_obj(object))
> >  		return;
> >
> > -	if (virt_addr_valid(object))
> > +	if (is_vmalloc_addr(object))
> > +		type = "vmalloc memory";
> > +	else if (virt_addr_valid(object))
> >  		type = "non-slab/vmalloc memory";
> 
> I think you should update this to say non-slab/non-vmalloc memory (as much
> as that description sucks!) as this phrasing in the past meant to say
> 'non-slab or vmalloc memory' (already confusing phrasing) so better to be
> clear.

True, though the issue you mentioned it is in existing code, a respin of this
patch could update it to say non-vmalloc. Good point, thanks for reviewing!

 - Joel

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ