netdev - Re: WARNING: CPU: 0 PID: 0 at net/ipv4/af_inet.c:155 inet_sock

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <577BC94A.4060204@free.fr>
Date:	Tue, 5 Jul 2016 16:50:50 +0200
From:	Mason <slash.tmp@...e.fr>
To:	linux-pm <linux-pm@...r.kernel.org>,
	netdev <netdev@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>
Cc:	Sebastian Frias <sf84@...oste.net>
Subject: Re: WARNING: CPU: 0 PID: 0 at net/ipv4/af_inet.c:155
 inet_sock_destruct+0x1c4/0x1dc

On 05/07/2016 15:33, Mason wrote:

> I was testing suspend/resume sequences where the suspend operation
> fails and returns without having suspended the platform.
> 
> # echo mem > /sys/power/state
> [   90.322264] PM: Syncing filesystems ... done.
> [   90.328758] Freezing user space processes ... (elapsed 0.001 seconds) done.
> [   90.337092] Double checking all user space processes after OOM killer disable... (elapsed 0.000 seconds)
> [   90.346765] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> [   90.355357] Suspending console(s) (use no_console_suspend to debug)
> [   90.364590] PM: suspend of devices complete after 2.068 msecs
> [   90.365554] PM: late suspend of devices complete after 0.954 msecs
> [   90.366223] PM: noirq suspend of devices complete after 0.662 msecs
> [   90.366227] Disabling non-boot CPUs ...
> [   90.379004] CPU1: shutdown
> [   90.412661] Enabling non-boot CPUs ...
> [   90.450385] CPU1 is up
> [   90.450979] PM: noirq resume of devices complete after 0.584 msecs
> [   90.451672] PM: early resume of devices complete after 0.667 msecs
> [   90.453149] nb8800 26000.ethernet eth0: Link is Down
> [   90.453264] PM: resume of devices complete after 1.583 msecs
> [   90.508180] Restarting tasks ... done.
> -sh: echo: write error: Input/output error
> [   93.860411] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
> 
> (The error message is expected, as my suspend routine returns -EIO
> on failure.)
> 
> I left the system to idle at the prompt; then 5 minutes later,
> the system printed the following trace.
> 
> [  400.718491] ------------[ cut here ]------------
> [  400.723175] WARNING: CPU: 0 PID: 0 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x1c4/0x1dc
> [  400.731582] Modules linked in:
> [  400.734689] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.7.0-rc6-00010-gd07031bdc433 #1
> [  400.742646] Hardware name: Sigma Tango DT
> [  400.746671] Backtrace: 
> [  400.749141] [<c010b974>] (dump_backtrace) from [<c010bb70>] (show_stack+0x18/0x1c)
> [  400.756747]  r7:60000113 r6:c080ea84 r5:00000000 r4:c080ea84
> [  400.762454] [<c010bb58>] (show_stack) from [<c02e9fe4>] (dump_stack+0x80/0x94)
> [  400.769722] [<c02e9f64>] (dump_stack) from [<c011bfc8>] (__warn+0xec/0x104)
> [  400.776717]  r7:00000009 r6:c05e3fbc r5:00000000 r4:00000000
> [  400.782417] [<c011bedc>] (__warn) from [<c011c098>] (warn_slowpath_null+0x28/0x30)
> [  400.790022]  r9:dfbdd4e0 r8:0000000a r7:c0801de8 r6:df6f9514 r5:df5df144 r4:df5df040
> [  400.797825] [<c011c070>] (warn_slowpath_null) from [<c0463c04>] (inet_sock_destruct+0x1c4/0x1dc)
> [  400.806661] [<c0463a40>] (inet_sock_destruct) from [<c03e9c60>] (__sk_destruct+0x28/0xe0)
> [  400.814878]  r7:c0801de8 r6:df6f9514 r5:df5df040 r4:df5df1ec
> [  400.820584] [<c03e9c38>] (__sk_destruct) from [<c016f230>] (rcu_process_callbacks+0x488/0x59c)
> [  400.829237]  r5:00000000 r4:00000000
> [  400.832836] [<c016eda8>] (rcu_process_callbacks) from [<c01207e4>] (__do_softirq+0x138/0x264)
> [  400.841402]  r10:c08020a0 r9:40000001 r8:00000101 r7:c0800000 r6:c08020a4 r5:00000009
> [  400.849285]  r4:00000000
> [  400.851829] [<c01206ac>] (__do_softirq) from [<c0120c04>] (irq_exit+0xc8/0x104)
> [  400.859172]  r10:c0801f10 r9:df402400 r8:00000001 r7:00000000 r6:00000013 r5:00000000
> [  400.867053]  r4:c0735428
> [  400.869601] [<c0120b3c>] (irq_exit) from [<c0162610>] (__handle_domain_irq+0x88/0xf4)
> [  400.877473] [<c0162588>] (__handle_domain_irq) from [<c01014ac>] (gic_handle_irq+0x50/0x94)
> [  400.885865]  r10:dfffcdc0 r9:e0803100 r8:e0802100 r7:c0801f10 r6:e080210c r5:c080277c
> [  400.893747]  r4:c080eca0 r3:c0801f10
> [  400.897342] [<c010145c>] (gic_handle_irq) from [<c010c694>] (__irq_svc+0x54/0x90)
> [  400.904861] Exception stack(0xc0801f10 to 0xc0801f58)
> [  400.909936] 1f00:                                     00000000 00000000 0000826a c0117c80
> [  400.918156] 1f20: c0800000 c08024f8 c0802494 c081e2d6 c05b954c c07268c0 dfffcdc0 c0801f6c
> [  400.926376] 1f40: c0801f70 c0801f60 c01086b0 c01086b4 60000013 ffffffff
> [  400.933020]  r9:c07268c0 r8:c05b954c r7:c0801f44 r6:ffffffff r5:60000013 r4:c01086b4
> [  400.940826] [<c0108674>] (arch_cpu_idle) from [<c0155f54>] (default_idle_call+0x28/0x34)
> [  400.948960] [<c0155f2c>] (default_idle_call) from [<c0156088>] (cpu_startup_entry+0x128/0x17c)
> [  400.957620] [<c0155f60>] (cpu_startup_entry) from [<c04a3f54>] (rest_init+0x8c/0x90)
> [  400.965400]  r7:ffffffff r4:00000002
> [  400.969005] [<c04a3ec8>] (rest_init) from [<c0700cb4>] (start_kernel+0x310/0x31c)
> [  400.976522]  r5:c081e4c0 r4:00000001
> [  400.980121] [<c07009a4>] (start_kernel) from [<8000807c>] (0x8000807c)
> [  400.986716] ---[ end trace f8deb50d1b3d3c7a ]---


NB: The warning shows up 310 seconds after the suspend attempt.

I rebooted, tried the same operation, and hit the same warning
still 310 seconds later:

# echo mem > /sys/power/state
[   25.665905] PM: Syncing filesystems ... done.
[   25.672102] Freezing user space processes ... (elapsed 0.001 seconds) done.
[   25.680807] Double checking all user space processes after OOM killer disable... (elapsed 0.000 seconds) 
[   25.690494] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[   25.699118] Suspending console(s) (use no_console_suspend to debug)
[   25.707899] PM: suspend of devices complete after 1.639 msecs
[   25.708796] PM: late suspend of devices complete after 0.887 msecs
[   25.709460] PM: noirq suspend of devices complete after 0.657 msecs
[   25.709465] Disabling non-boot CPUs ...
[   25.729045] CPU1: shutdown
[   25.762704] Enabling non-boot CPUs ...
[   25.800416] CPU1 is up
[   25.801024] PM: noirq resume of devices complete after 0.595 msecs
[   25.801730] PM: early resume of devices complete after 0.678 msecs
[   25.803194] nb8800 26000.ethernet eth0: Link is Down
[   25.803311] PM: resume of devices complete after 1.571 msecs
[   25.858245] Restarting tasks ... done.
-sh: echo: write error: Input/output error
[   29.186902] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx

[  335.865192] ------------[ cut here ]------------
[  335.869875] WARNING: CPU: 0 PID: 0 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x1c4/0x1dc
[  335.878284] Modules linked in:
[  335.881366] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.7.0-rc6-00010-gd07031bdc433 #1
[  335.889321] Hardware name: Sigma Tango DT
[  335.893346] Backtrace: 
[  335.895815] [<c010b974>] (dump_backtrace) from [<c010bb70>] (show_stack+0x18/0x1c)
[  335.903420]  r7:60000113 r6:c080ea84 r5:00000000 r4:c080ea84
[  335.909125] [<c010bb58>] (show_stack) from [<c02e9fe4>] (dump_stack+0x80/0x94)
[  335.916391] [<c02e9f64>] (dump_stack) from [<c011bfc8>] (__warn+0xec/0x104)
[  335.923386]  r7:00000009 r6:c05e3fbc r5:00000000 r4:00000000
[  335.929086] [<c011bedc>] (__warn) from [<c011c098>] (warn_slowpath_null+0x28/0x30)
[  335.936691]  r9:dfbdd4e0 r8:0000000a r7:c0801de8 r6:df75ec54 r5:df5af3c4 r4:df5af2c0
[  335.944491] [<c011c070>] (warn_slowpath_null) from [<c0463c04>] (inet_sock_destruct+0x1c4/0x1dc)
[  335.953326] [<c0463a40>] (inet_sock_destruct) from [<c03e9c60>] (__sk_destruct+0x28/0xe0)
[  335.961542]  r7:c0801de8 r6:df75ec54 r5:df5af2c0 r4:df5af46c
[  335.967248] [<c03e9c38>] (__sk_destruct) from [<c016f230>] (rcu_process_callbacks+0x488/0x59c)
[  335.975901]  r5:00000000 r4:00000000
[  335.979499] [<c016eda8>] (rcu_process_callbacks) from [<c01207e4>] (__do_softirq+0x138/0x264)
[  335.988065]  r10:c08020a0 r9:40000001 r8:00000101 r7:c0800000 r6:c08020a4 r5:00000009
[  335.995947]  r4:00000000
[  335.998492] [<c01206ac>] (__do_softirq) from [<c0120c04>] (irq_exit+0xc8/0x104)
[  336.005835]  r10:c0801f10 r9:df402400 r8:00000001 r7:00000000 r6:00000013 r5:00000000
[  336.013716]  r4:c0735428
[  336.016264] [<c0120b3c>] (irq_exit) from [<c0162610>] (__handle_domain_irq+0x88/0xf4)
[  336.024136] [<c0162588>] (__handle_domain_irq) from [<c01014ac>] (gic_handle_irq+0x50/0x94)
[  336.032526]  r10:dfffcdc0 r9:e0803100 r8:e0802100 r7:c0801f10 r6:e080210c r5:c080277c
[  336.040406]  r4:c080eca0 r3:c0801f10
[  336.044001] [<c010145c>] (gic_handle_irq) from [<c010c694>] (__irq_svc+0x54/0x90)
[  336.051520] Exception stack(0xc0801f10 to 0xc0801f58)
[  336.056594] 1f00:                                     00000000 00000000 000079b2 c0117c80
[  336.064815] 1f20: c0800000 c08024f8 c0802494 c081e2d6 c05b954c c07268c0 dfffcdc0 c0801f6c
[  336.073034] 1f40: c0801f70 c0801f60 c01086b0 c01086b4 60000013 ffffffff
[  336.079678]  r9:c07268c0 r8:c05b954c r7:c0801f44 r6:ffffffff r5:60000013 r4:c01086b4
[  336.087483] [<c0108674>] (arch_cpu_idle) from [<c0155f54>] (default_idle_call+0x28/0x34)
[  336.095616] [<c0155f2c>] (default_idle_call) from [<c0156088>] (cpu_startup_entry+0x128/0x17c)
[  336.104277] [<c0155f60>] (cpu_startup_entry) from [<c04a3f54>] (rest_init+0x8c/0x90)
[  336.112057]  r7:ffffffff r4:00000002
[  336.115662] [<c04a3ec8>] (rest_init) from [<c0700cb4>] (start_kernel+0x310/0x31c)
[  336.123180]  r5:c081e4c0 r4:00000001
[  336.126777] [<c07009a4>] (start_kernel) from [<8000807c>] (0x8000807c)
[  336.133349] ---[ end trace d6b09977089e89b4 ]---


Does this 310 second lag ring a bell for anyone?

Regards.