Message-ID: <d63a4a41-6289-e3ba-cccb-b0dcbed60c79@siemens.com>
Date:   Tue, 8 Feb 2022 07:35:41 +0100
From:   Jan Kiszka <jan.kiszka@...mens.com>
To:     Bjorn Helgaas <helgaas@...nel.org>, <linux-pci@...r.kernel.org>
CC:     <joey.corleone@...l.ru>, <linux-kernel@...r.kernel.org>
Subject: Re: [Bug 215533] [BISECTED][REGRESSION] UI becomes unresponsive every
 couple of seconds

On 07.02.22 23:45, Bjorn Helgaas wrote:
> [+cc linux-kernel for visibility]
> 
> On Wed, Jan 26, 2022 at 06:12:50AM -0600, Bjorn Helgaas wrote:
>> On Wed, Jan 26, 2022 at 08:18:12AM +0000, bugzilla-daemon@...zilla.kernel.org wrote:
>>> https://bugzilla.kernel.org/show_bug.cgi?id=215533
>>>
>>> --- Comment #1 from joey.corleone@...l.ru ---
>>> I accidentally sent the report prematurely. So here come my findings:
>>>
>>> Since 5.16
>>> (1) my system becomes unresponsive every couple of seconds (micro lags), which
>>> makes it more or less unusable.
>>> (2) wrong(?) CPU frequencies are reported. 
>>>
>>> - 5.15 works fine.
>>> - Starting from some commit in 5.17, it seems (1) is fixed (unsure), but
>>> definitely not (2).
>>>
>>> I have bisected the kernel between 5.15 and 5.16, and found that the offending
>>> commit is 0e8ae5a6ff5952253cd7cc0260df838ab4c21009 ("PCI/portdrv: Do not setup
>>> up IRQs if there are no users"). Bisection log attached.
>>>
>>> Reverting this commit on linux-git[1] fixes both (1) and (2).
>>>
>>> Important notes:
>>> - This regression was reported on a DELL XPS 9550 laptop by two users [2], so
>>> it might be related strictly to that model. 
>>> - According to user mallocman, the issue can also be fixed by reverting the
>>> BIOS version of the laptop to v1.12.
>>> - The issue ONLY occurs when AC is plugged in (and stays there even when I
>>> unplug it).
>>> - When booting on battery power, there is no issue at all.
>>>
>>> You can easily observe the regression via: 
>>>
>>> watch cat /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_fre
>>>
>>> As soon as I plug in AC, all frequencies go up to values around 3248338 and
>>> stay there even if I unplug AC. This does not happen at all when booted on
>>> battery power. 
>>>
>>> Also note: 
>>> - the laptop's fans are not really affected by the high frequencies.
>>> - setting the governor to "powersave" has no effect on the frequencies (as
>>> compared to when on battery power).
>>> - lowering the maximum frequency manually works, but does not fix (1).
>>>
>>> [1] https://aur.archlinux.org/pkgbase/linux-git/ (pulled commits up to
>>> 0280e3c58f92b2fe0e8fbbdf8d386449168de4a8).
>>> [2] https://bbs.archlinux.org/viewtopic.php?id=273330
> 
> I hope we can find a better solution, but since the responsiveness
> issue is a significant regression, I queued up a revert of
> 0e8ae5a6ff59 ("PCI/portdrv: Do not setup up IRQs if there are no
> users") in case we don't find one.

Likely best for now.

> 
> If/when we get to the bottom of this, I'll replace the revert with the
> solution.  0e8ae5a6ff59 appeared in v5.16, so we'll have to make sure
> we fix that as well.

If you could give some feedback/hints on the questions I posted last
week on the original patch, that might accelerate understanding the real
issue.

Thanks,
Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux
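
The frequency check quoted above (watching the per-CPU cpufreq files
while plugging and unplugging AC) can also be scripted. The following is
only a rough sketch, not something taken from the thread. It assumes the
file meant in the report is the standard cpufreq attribute
scaling_cur_freq (the quoted command is written as "scaling_cur_fre"),
that its values are in kHz, and that 3000000 kHz is a reasonable cutoff
for "stuck high" given the reported values around 3248338. The helper
names read_cpu_freqs and stuck_high are made up for illustration.

```python
import glob
import os
import re


def read_cpu_freqs(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu_index: current frequency in kHz} for every CPU that
    exposes cpufreq/scaling_cur_freq under sysfs_root."""
    freqs = {}
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cpufreq", "scaling_cur_freq")
    for path in glob.glob(pattern):
        # .../cpuN/cpufreq/scaling_cur_freq -> "cpuN"
        cpu_dir = os.path.basename(os.path.dirname(os.path.dirname(path)))
        match = re.fullmatch(r"cpu(\d+)", cpu_dir)
        if not match:
            continue
        try:
            with open(path) as f:
                freqs[int(match.group(1))] = int(f.read().strip())
        except (OSError, ValueError):
            # CPU went offline or the file was unreadable; skip it.
            continue
    return freqs


def stuck_high(freqs, threshold_khz=3_000_000):
    """True if every CPU reports at least threshold_khz.
    The threshold is an assumption, not a value from the report."""
    return bool(freqs) and all(f >= threshold_khz for f in freqs.values())


if __name__ == "__main__":
    readings = read_cpu_freqs()
    for cpu in sorted(readings):
        print(f"cpu{cpu}: {readings[cpu]} kHz")
    print("all CPUs at or above threshold:", stuck_high(readings))
```

Running it once on battery power and again after plugging in AC would
show whether all CPUs jump to the high value and stay there, which is
the symptom described in the report.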
