Message-ID: <b62dbd9a-9c80-6383-46f1-dc78ca9bca41@molgen.mpg.de>
Date:   Tue, 16 Feb 2021 13:40:35 +0100
From:   Donald Buczek <buczek@...gen.mpg.de>
To:     Brian Foster <bfoster@...hat.com>
Cc:     Dave Chinner <david@...morbit.com>, linux-xfs@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        it+linux-xfs@...gen.mpg.de
Subject: Re: [PATCH] xfs: Wake CIL push waiters more reliably

On 16.02.21 12:18, Brian Foster wrote:
> On Mon, Feb 15, 2021 at 02:36:38PM +0100, Donald Buczek wrote:
>> On 13.01.21 22:53, Dave Chinner wrote:
>>> [...]
>>> I agree that a throttling fix is needed, but I'm trying to
>>> understand the scope and breadth of the problem first instead of
>>> jumping the gun and making the wrong fix for the wrong reasons that
>>> just papers over the underlying problems that the throttling bug has
>>> made us aware of...
>>
>> Are you still working on this?
>>
>> If it takes more time to understand the potential underlying problem, the fix for the problem at hand should be applied.
>>
>> This is a real-world problem, accidentally found in the wild. It appears very rarely, but it freezes a filesystem or the whole system. It exists in 5.7, 5.8, 5.9, 5.10 and 5.11 and is caused by c7f87f3984cf ("xfs: fix use-after-free on CIL context on shutdown"), which silently added a condition to the wakeup. The condition is based on a wrong assumption.
>>
>> Why is this "papering over"? If a reminder was needed, there were better ways than randomly hanging the system.
>>
>> Why is
>>
>>      if (ctx->space_used >= XLOG_CIL_BLOCKING_SPACE_LIMIT(log))
>>          wake_up_all(&cil->xc_push_wait);
>>
>> , which doesn't work reliably, preferable to
>>
>>      if (waitqueue_active(&cil->xc_push_wait))
>>          wake_up_all(&cil->xc_push_wait);
>>
>> which does?
>>
> 
> JFYI, Dave followed up with a patch a couple weeks or so ago:
> 
> https://lore.kernel.org/linux-xfs/20210128044154.806715-5-david@fromorbit.com/

Oh, great. I apologize for the unneeded reminder.

Best

   Donald

> 
> Brian
> 
>> Best
>>    Donald
>>
>>> Cheers,
>>>
>>> Dave
>>
>
