Message-ID: <20210216111820.GA534175@bfoster>
Date: Tue, 16 Feb 2021 06:18:20 -0500
From: Brian Foster <bfoster@...hat.com>
To: Donald Buczek <buczek@...gen.mpg.de>
Cc: Dave Chinner <david@...morbit.com>, linux-xfs@...r.kernel.org,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
it+linux-xfs@...gen.mpg.de
Subject: Re: [PATCH] xfs: Wake CIL push waiters more reliably

On Mon, Feb 15, 2021 at 02:36:38PM +0100, Donald Buczek wrote:
> On 13.01.21 22:53, Dave Chinner wrote:
> > [...]
> > I agree that a throttling fix is needed, but I'm trying to
> > understand the scope and breadth of the problem first instead of
> > jumping the gun and making the wrong fix for the wrong reasons that
> > just papers over the underlying problems that the throttling bug has
> > made us aware of...
>
> Are you still working on this?
>
> If it takes more time to understand the potential underlying problem, the fix for the problem at hand should be applied.
>
> This is a real-world problem, accidentally found in the wild. It appears very rarely, but it freezes a filesystem or the whole system. It exists in 5.7, 5.8, 5.9, 5.10 and 5.11 and is caused by c7f87f3984cf ("xfs: fix use-after-free on CIL context on shutdown") which silently added a condition to the wakeup. The condition is based on a wrong assumption.
>
> Why is this "papering over"? If a reminder was needed, there were better ways than randomly hanging the system.
>
> Why is
>
>     if (ctx->space_used >= XLOG_CIL_BLOCKING_SPACE_LIMIT(log))
>         wake_up_all(&cil->xc_push_wait);
>
> , which doesn't work reliably, preferable to
>
>     if (waitqueue_active(&cil->xc_push_wait))
>         wake_up_all(&cil->xc_push_wait);
>
> which does?
>
JFYI, Dave followed up with a patch a couple of weeks or so ago:

https://lore.kernel.org/linux-xfs/20210128044154.806715-5-david@fromorbit.com/

Brian
> Best
> Donald
>
> > Cheers,
> >
> > Dave
>