Message-ID: <alpine.LRH.2.02.1911120240020.25757@file01.intranet.prod.int.rdu2.redhat.com>
Date: Tue, 12 Nov 2019 02:50:42 -0500 (EST)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Mike Snitzer <snitzer@...hat.com>
cc: Nikos Tsironis <ntsironis@...ikto.com>,
Scott Wood <swood@...hat.com>,
Ilias Tsitsimpis <iliastsi@...ikto.com>, dm-devel@...hat.com,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 1/2] dm-snapshot: fix crash with the realtime kernel
On Mon, 11 Nov 2019, Mike Snitzer wrote:
> On Mon, Nov 11 2019 at 11:37am -0500,
> Nikos Tsironis <ntsironis@...ikto.com> wrote:
>
> > On 11/11/19 3:59 PM, Mikulas Patocka wrote:
> > > Snapshot doesn't work with realtime kernels since the commit f79ae415b64c.
> > > hlist_bl is implemented as a raw spinlock and the code takes two non-raw
> > > spinlocks while holding hlist_bl (non-raw spinlocks are blocking mutexes
> > > in the realtime kernel, so they couldn't be taken inside a raw spinlock).
> > >
> > > This patch fixes the problem by using non-raw spinlock
> > > exception_table_lock instead of the hlist_bl lock.
> > >
> > > Signed-off-by: Mikulas Patocka <mpatocka@...hat.com>
> > > Fixes: f79ae415b64c ("dm snapshot: Make exception tables scalable")
> > >
> >
> > Hi Mikulas,
> >
> > I wasn't aware that hlist_bl is implemented as a raw spinlock in the
> > real time kernel. I would expect it to be a standard non-raw spinlock,
> > so everything works as expected. But, after digging further in the real
> > time tree, I found commit ad7675b15fd87f1 ("list_bl: Make list head
> > locking RT safe") which suggests that such a conversion would break
> > other parts of the kernel.
>
> Right, the proper fix is to update list_bl to work on realtime (which I
> assume the referenced commit does). I do not want to take this
> dm-snapshot specific workaround that open-codes what should be done
> within hlist_{bl_lock,unlock}, etc.

If we change list_bl to use a non-raw spinlock, it fails in the dentry
lookup code. The dentry code takes a seqlock (which is implemented as
preempt disable in the realtime kernel) and then takes a list_bl lock.

This is wrong from the real-time perspective (the chain in the hash could
be arbitrarily long, so using non-raw spinlock could cause an unbounded
wait), but we can't do anything about it.

I think that fixing dm-snapshot is way easier than fixing the dentry code.
If you have an idea how to fix the dentry code, tell us.
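
To make the nesting constraint concrete, here is a minimal userspace sketch
(not kernel code). It assumes a simplified model in which a raw spinlock or
a preempt-disabled section puts the caller in atomic context, and taking a
sleeping (non-raw) lock in atomic context counts as a violation. All names
(model_lock, dm_snapshot_pattern, dentry_pattern, ...) are hypothetical and
exist only for this illustration.

```c
#include <assert.h>

/* Simplified model of realtime-kernel lock kinds (illustration only). */
enum lock_kind {
	LOCK_RAW_SPIN,	/* raw spinlock: never sleeps, enters atomic context */
	LOCK_SLEEPING	/* non-raw spinlock: a blocking mutex on realtime */
};

struct ctx {
	int atomic_depth;	/* raw locks / preempt-disable sections held */
	int violations;		/* sleeping locks taken in atomic context */
};

static void model_lock(struct ctx *c, enum lock_kind kind)
{
	if (kind == LOCK_RAW_SPIN)
		c->atomic_depth++;
	else if (c->atomic_depth > 0)
		c->violations++;	/* would sleep while atomic */
}

static void model_unlock(struct ctx *c, enum lock_kind kind)
{
	if (kind == LOCK_RAW_SPIN) {
		assert(c->atomic_depth > 0);
		c->atomic_depth--;
	}
}

/* dm-snapshot: bucket lock held, then two non-raw spinlocks taken. */
static int dm_snapshot_pattern(enum lock_kind bucket_lock)
{
	struct ctx c = { 0, 0 };

	model_lock(&c, bucket_lock);
	model_lock(&c, LOCK_SLEEPING);
	model_lock(&c, LOCK_SLEEPING);
	model_unlock(&c, LOCK_SLEEPING);
	model_unlock(&c, LOCK_SLEEPING);
	model_unlock(&c, bucket_lock);
	return c.violations;
}

/* dentry lookup: seqlock (modeled as preempt disable), then list_bl lock. */
static int dentry_pattern(enum lock_kind bucket_lock)
{
	struct ctx c = { 0, 0 };

	c.atomic_depth++;		/* seqlock section begins */
	model_lock(&c, bucket_lock);
	model_unlock(&c, bucket_lock);
	c.atomic_depth--;		/* seqlock section ends */
	return c.violations;
}
```

In this model, dm_snapshot_pattern(LOCK_RAW_SPIN) reports two violations and
dm_snapshot_pattern(LOCK_SLEEPING) reports none, which corresponds to
replacing the hlist_bl lock with a non-raw lock in dm-snapshot. Conversely,
dentry_pattern(LOCK_RAW_SPIN) reports none while dentry_pattern(LOCK_SLEEPING)
reports one, which is why converting list_bl itself to a non-raw spinlock
breaks the dentry code.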
> I'm not yet sure which realtime mailing list and/or maintainers should
> be cc'd to further the inclusion of commit ad7675b15fd87f1 -- Nikos do
> you?
>
> Thanks,
> Mike
Mikulas