Message-ID: <431e81699f2310eabfe5af0a3de400ab99d9323b.camel@gmx.de>
Date: Mon, 26 Oct 2020 20:15:23 +0100
From: Mike Galbraith <efault@....de>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Hillf Danton <hdanton@...a.com>
Cc: Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>,
linux-rt-users <linux-rt-users@...r.kernel.org>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Skeggs <bskeggs@...hat.com>
Subject: Re: kvm+nouveau induced lockdep gripe
On Mon, 2020-10-26 at 18:31 +0100, Sebastian Andrzej Siewior wrote:
> On 2020-10-24 13:00:00 [+0800], Hillf Danton wrote:
> >
> > Hmm...curious how that word went into your mind. And when?
> > > [ 30.457363]
> > > other info that might help us debug this:
> > > [ 30.457369] Possible unsafe locking scenario:
> > >
> > > [ 30.457375] CPU0
> > > [ 30.457378] ----
> > > [ 30.457381] lock(&mgr->vm_lock);
> > > [ 30.457386] <Interrupt>
> > > [ 30.457389] lock(&mgr->vm_lock);
> > > [ 30.457394]
> > > *** DEADLOCK ***
> > >
> > > <snips 999 lockdep lines and zillion ATOMIC_SLEEP gripes>
>
> The backtrace contained the "normal" vm_lock. What should follow is the
> backtrace of the in-softirq usage.
>
> >
> > Dunno if blocking softirqs is the right cure.
> >
> > --- a/drivers/gpu/drm/drm_vma_manager.c
> > +++ b/drivers/gpu/drm/drm_vma_manager.c
> > @@ -229,6 +229,7 @@ EXPORT_SYMBOL(drm_vma_offset_add);
> > void drm_vma_offset_remove(struct drm_vma_offset_manager *mgr,
> > struct drm_vma_offset_node *node)
> > {
> > + local_bh_disable();
>
> There is write_lock_bh(). However, changing only one user will produce
> the same backtrace somewhere else unless all other users of the lock
> already run in a BH-disabled region.
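
For the archives, the conversion Sebastian describes would look roughly
like the below for the remove path (untested sketch, context lines
elided; as he notes, every vm_lock user, not just this one, would need
the same _bh treatment for it to be consistent):

--- a/drivers/gpu/drm/drm_vma_manager.c
+++ b/drivers/gpu/drm/drm_vma_manager.c
@@ void drm_vma_offset_remove(struct drm_vma_offset_manager *mgr,
 			   struct drm_vma_offset_node *node)
 {
-	write_lock(&mgr->vm_lock);
+	/* _bh variant disables softirqs for the critical section */
+	write_lock_bh(&mgr->vm_lock);
 	...
-	write_unlock(&mgr->vm_lock);
+	write_unlock_bh(&mgr->vm_lock);
 }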
Since there doesn't _seem_ to be a genuine deadlock lurking, I just
asked lockdep to please not log the annoying initialization-time
chain.
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
@@ -116,7 +116,17 @@ nvkm_pci_oneinit(struct nvkm_subdev *sub
return ret;
}
+ /*
+ * Scheduler code taking cpuset_rwsem during irq thread initialization sets
+ * up a cpuset_rwsem vs mm->mmap_lock circular dependency gripe upon later
+ * cpuset usage. It's harmless, tell lockdep there's nothing to see here.
+ */
+ if (force_irqthreads)
+ lockdep_off();
ret = request_irq(pdev->irq, nvkm_pci_intr, IRQF_SHARED, "nvkm", pci);
+ if (force_irqthreads)
+ lockdep_on();
+
if (ret)
return ret;