Message-ID: <990192aa-f554-4aba-b935-f522c62188ab@leemhuis.info>
Date: Wed, 3 Apr 2024 06:49:20 +0200
From: "Linux regression tracking (Thorsten Leemhuis)"
 <regressions@...mhuis.info>
To: Martin Steigerwald <martin@...htvoll.de>, linux-pm@...r.kernel.org,
 regressions@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [regression] 6.8.1: fails to hibernate with
 pm_runtime_force_suspend+0x0/0x120 returns -16

On 02.04.24 21:42, Martin Steigerwald wrote:
> Linux regression tracking (Thorsten Leemhuis) - 19.03.24, 09:40:06 CEST:
>> On 16.03.24 17:12, Martin Steigerwald wrote:
>>> Martin Steigerwald - 16.03.24, 17:02:44 CET:
>>>> ThinkPad T14 AMD Gen 1 fails to hibernate with self-compiled 6.8.1.
>>>> Hibernation works correctly with self-compiled 6.7.9.
>>>
>>> Apparently 6.8.1 does not even reboot correctly anymore. runit on
>>> Devuan. It says it is doing the system reboot but then nothing
>>> happens.
>>>
>>> As for hibernation the kernel cancels the attempt and returns back to
>>> user space desktop session.
>>>
>>>> Trying to use "no_console_suspend" to debug next. Will not do bisect
>>>> between major kernel releases on a production machine.
>>
>> FWIW, without a bisection I guess no developer will take a closer look
>> (but I might be wrong and you lucky here!), as any change in those
>> hundreds of drivers used on that machine can possibly lead to problems
>> like yours. So without a bisection we are likely stuck here, unless
>> someone else runs into the same problem and bisects or fixes it. Sorry,
>> but that's just how it is.
> 
> I have been asked this repeatedly with previous bug reports. My issue
> with bisecting between major kernel versions is this:
>  
> When I look around I see no second ThinkPad T14 AMD Gen 1 here that I
> could use for testing. Also doing a kernel bisect using a GRML live iso… 
> not really.
> 
> The one I reported this from is a production machine with a 4 TB NVMe
> SSD which contains a lot of data. I am not willing to risk data loss or
> (silent) file system corruption by bisecting between major kernel
> releases. Bisecting between major kernel releases in my understanding
> would require to test various releases between in this example 6.7 and
> 6.8 and even between 6.7 and 6.8-rc1. At least in my understanding, anything
> between 6.7 and 6.8-rc1 is not guaranteed to be even somewhat stable.

It's hard to qualify and always a matter of personal viewpoint/opinion,
but I'd say: kernels from the merge window are pretty stable and
reliable. But sure, accidents that eat data happen and they happen
slightly more often during merge windows because the rate of change is
higher. But in the end they do not happen often, which is why Fedora
rawhide for example ships merge window kernels all the time.

> I 
> am not usually installing an rc1 kernel on a production machine, but 
> rather wait for at least rc2/3 nowadays. It's a balanced risk calculation.
> And rc2/3 or later appears to be a risk I am willing to take. But 
> something between stable and rc1? Nope.

Well, that's up to you -- but the reality is also that developers are
not obliged to look into regression reports closely, unless someone
has bisected them.

> It is not even that rare. Some 6.7 rc failed with hibernation as well.

Maybe too few people (or too few of those that run the latest kernels)
use hibernate these days (I haven't for more than 15 years), which is
why it's not tested much.

> With exactly the same machine. I refused to do a bisect as well in that 
> case. At some later time the issue was fixed without me doing anything 
> more.

Maybe you were lucky, maybe someone else bisected and reported the problem.

> Now my question is this: Without me willing to bisect in that case, is
> a bug report even useful? Otherwise I may just switch this last machine
> to distribution kernels. It would save a lot of time for me. This private 
> and freelancer production machine is the last left-over machine with self-
> compiled kernels.
> 
> So far I still thought I would somehow be contributing to Linux kernel
> quality with detailed bug reports that take time to write, but apparently 
> I am not. Can you clarify?

Not really, as it always depends on the situation. There are bugs (like
https://lore.kernel.org/all/08275279-7462-4f4a-a0ee-8aa015f829bc@leemhuis.info/
) where a report without a bisection is enough. But there are others
where it's unlikely that anyone will take a closer look; a lot of those
regarding suspend/hibernate fall into this category, as problems in that area
can be caused by any subsystem and its drivers -- which is why the power
management people can't look into most of those, as then they quickly
wouldn't get anything else done while spending time on bugs most of the
time other people caused.

Ciao, Thorsten
