Message-ID: <eaee4ea1-8e5a-dde8-472d-44241d992037@kernel.dk>
Date:   Mon, 16 May 2022 12:22:07 -0600
From:   Jens Axboe <axboe@...nel.dk>
To:     Thorsten Leemhuis <regressions@...mhuis.info>,
        Daniel Harding <dharding@...ing180.net>,
        Pavel Begunkov <asml.silence@...il.com>
Cc:     regressions@...ts.linux.dev, io-uring@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Christian Brauner <christian.brauner@...ntu.com>
Subject: Re: [REGRESSION] lxc-stop hang on 5.17.x kernels

On 5/16/22 12:17 PM, Thorsten Leemhuis wrote:
>>> Pavel, I had actually just started a draft email with the same theory
>>> (although you stated it much more clearly than I could have).  I'm
>>> working on debugging the LXC side, but I'm pretty sure the issue is
>>> due to LXC using blocking reads and getting stuck exactly as you
>>> describe.  If I can confirm this, I'll go ahead and mark this
>>> regression as invalid and file an issue with LXC. Thanks for your help
>>> and patience.
>>
>> Yes, it does appear that was the problem.  The attached POC patch against
>> LXC fixes the hang.  The kernel is working as intended.
>>
>> #regzbot invalid:  userspace programming error
> 
> Hmmm, not sure if I like this. So yes, this might be a bug in LXC, but
> afaics it's a bug that was exposed by kernel change in 5.17 (correct me
> if I'm wrong!). The problem thus still qualifies as a kernel regression
> that normally needs to be fixed, as can be seen by some of the quotes
> from Linus in this file:
> https://www.kernel.org/doc/html/latest/process/handling-regressions.html

Sorry, but that's really BS in this particular case. This could always
have triggered, it's the way multishot works. Will we count e.g. timing
changes as potential regressions, because an application relied on
something there? That does not make it ABI.

In general I agree with Linus on this, a change in behavior breaking
something should be investigated and figured out (and reverted, if need
be). This is not that.

-- 
Jens Axboe