Message-ID: <10de7289-653f-43b1-ad46-2e8a0cd42724@molgen.mpg.de>
Date: Tue, 11 Feb 2025 12:57:33 +0100
From: Paul Menzel <pmenzel@...gen.mpg.de>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: Michał Pecio <michal.pecio@...il.com>,
 anna-maria@...utronix.de, linux-kernel@...r.kernel.org,
 linux-trace-kernel@...r.kernel.org, linux-usb@...r.kernel.org,
 mingo@...nel.org, tglx@...utronix.de
Subject: Re: NOHZ tick-stop error: local softirq work is pending, handler
 #08!!! on Dell XPS 13 9360

Dear Frederic,


Thank you for your reply.


On 10.02.25 at 14:26, Frederic Weisbecker wrote:
> On Mon, Feb 10, 2025 at 12:59:42PM +0100, Paul Menzel wrote:

>> On 10.02.25 at 12:45, Michał Pecio wrote:
>>
>>>>>>>>>> On Dell XPS 13 9360/0596KF, BIOS 2.21.0 06/02/2022, with Linux
>>>>>>>>>> 6.9-rc2+
>>>
>>>> Just for the record, I am still seeing this with 6.14.0-rc1
>>>
>>> Is this a regression? If so, which versions were not affected?
>>
>> Unfortunately, I do not know. Right now, my logs only go back to September
>> 2024.
>>
>>      Sep 22 13:08:04 abreu kernel: Linux version 6.11.0-07273-g1e7530883cd2 (build@...emianrhapsody.molgen.mpg.de) (gcc (Debian 14.2.0-5) 14.2.0, GNU ld (GNU Binutils for Debian) 2.43.1) #12 SMP PREEMPT_DYNAMIC Sun Sep 22 09:57:36 CEST 2024
>>
>>> How hard to reproduce? Wasn't it during resume from hibernation?
>>
>> It’s not easy to reproduce, and I believe it’s not related to resuming
>> from hibernation (which I do not use) or ACPI S3 suspend. I think I can
>> trigger it more often by plugging in the USB-C adapter with only the network
>> cable attached, and then running `sudo powertop --auto-tune`. But sometimes
>> it seems unrelated.
>>
>>> IRQ issues may be a red herring; this code here is a busy wait under
>>> spinlock. There are a few of those, and they cause various problems.
>>>
>>>         if (xhci_handshake(&xhci->op_regs->status,
>>>                            STS_RESTORE, 0, 100 * 1000)) {
>>>                 xhci_warn(xhci, "WARN: xHC restore state timeout\n");
>>>                 spin_unlock_irq(&xhci->lock);
>>>                 return -ETIMEDOUT;
>>>         }
>>>
>>> This thing timing out may be close to the root cause of everything.
>>
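To make the "busy wait under spinlock" concern concrete, here is a rough userspace model of what a handshake-style poll does: it keeps re-reading a status register until the masked bits reach the expected value or the timeout (100 * 1000 microseconds, i.e. 100 ms, in the snippet above) runs out. This is only a sketch under assumptions: `handshake_model`, `fake_reg`, `fake_read` and the `step_us` parameter are hypothetical names for illustration, not the xhci driver's code, and the delay is simulated by a counter rather than a real `udelay()`.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical register accessor; stands in for an MMIO read. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/*
 * Poll until (reg & mask) == done or timeout_us has elapsed.
 * The caller spins for the whole duration; if it also holds a
 * spinlock with IRQs disabled, nothing else runs on that CPU.
 */
static int handshake_model(read_reg_fn read_reg, void *ctx,
                           uint32_t mask, uint32_t done,
                           unsigned int timeout_us, unsigned int step_us)
{
    unsigned int waited = 0;

    for (;;) {
        uint32_t val = read_reg(ctx);

        if (val == UINT32_MAX)
            return -ENODEV;         /* all ones: device likely gone */
        if ((val & mask) == done)
            return 0;
        if (waited >= timeout_us)
            return -ETIMEDOUT;
        waited += step_us;          /* simulated delay, no real sleep */
    }
}

/* Fake register for experiments: reports busy_bit for reads_left reads. */
struct fake_reg {
    uint32_t busy_bit;
    unsigned int reads_left;
};

static uint32_t fake_read(void *ctx)
{
    struct fake_reg *r = ctx;

    if (r->reads_left == 0)
        return 0;
    r->reads_left--;
    return r->busy_bit;
}
```

With a fake register that never clears its busy bit, the model returns -ETIMEDOUT only after the full timeout has been spent spinning, which is the worst case described above.
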
>> Interesting. Hopefully the USB folks have an idea.
> 
> Handler #08 is NET_RX. So perhaps something raised NET_RX from an
> inappropriate place...
> 
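For reference, the handler number in the warning is printed as a hexadecimal bitmask of pending softirqs, so #08 has only bit 3 set, and bit 3 is NET_RX in the kernel's softirq enum. A minimal userspace sketch of that decoding follows; the helper name `decode_softirq_pending` and the standalone name table are assumptions made for illustration, not kernel code.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Softirq names in bit order, mirroring the kernel's softirq enum. */
static const char *const softirq_names[] = {
    "HI", "TIMER", "NET_TX", "NET_RX", "BLOCK",
    "IRQ_POLL", "TASKLET", "SCHED", "HRTIMER", "RCU",
};

/*
 * Hypothetical helper: write the names of all bits set in `pending`
 * into `buf`, separated by '|', and return `buf`.
 */
static char *decode_softirq_pending(unsigned int pending, char *buf, size_t len)
{
    size_t n = sizeof(softirq_names) / sizeof(softirq_names[0]);
    size_t used = 0;

    if (len == 0)
        return buf;
    buf[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        int w;

        if (!(pending & (1u << i)))
            continue;
        w = snprintf(buf + used, len - used, "%s%s",
                     used ? "|" : "", softirq_names[i]);
        if (w < 0 || (size_t)w >= len - used)
            break;                  /* buffer too small; stop here */
        used += (size_t)w;
    }
    return buf;
}
```

Calling `decode_softirq_pending(0x08, buf, sizeof(buf))` yields "NET_RX"; a mask with several bits set, such as 0x0a, yields "TIMER|NET_RX".
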
> Can I ask you one more trace dump?
> 
> I need:
> 
> echo 1 > /sys/kernel/tracing/events/irq/softirq_raise/enable
> echo 1 > /sys/kernel/tracing/options/stacktrace
> 
> Unfortunately this will also involve a small patch:
> 
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index fa058510af9c..accd2eb8c927 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -1159,6 +1159,9 @@ static bool report_idle_softirq(void)
>   	if (local_bh_blocked())
>   		return false;
>   
> +	trace_printk("STOP\n");
> +	trace_dump_stack(0);
> +	tracing_off();
>   	pr_warn("NOHZ tick-stop error: local softirq work is pending, handler #%02x!!!\n",
>   		pending);
>   	ratelimit++;

Thank you for your help. I applied the patch on top of 6.14-rc2, and was 
able to reproduce the issue. Please find the Linux messages attached, 
and the trace can be downloaded [1].


Kind regards,

Paul


[1]: https://owww.molgen.mpg.de/~pmenzel/20250210--dell-xps-13-9360--linux-6.14-rc2--NOHZ-tick-stop-error-local-softirq-work-is-pending--trace.txt.7z
View attachment "20250210--dell-xps-13-9360--linux-6.14-rc2--NOHZ-tick-stop-error-local-softirq-work-is-pending.txt" of type "text/plain" (349361 bytes)
