Message-ID: <20191104213628.1e9c38cb@hermes.lan>
Date:   Mon, 4 Nov 2019 21:36:28 -0800
From:   Stephen Hemminger <stephen@...workplumber.org>
To:     netdev@...r.kernel.org
Subject: Re: [Bug 205427] New: Qdisc running seqcount may cause performance
 issues



On Tue, 05 Nov 2019 05:27:57 +0000
bugzilla-daemon@...zilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=205427
> 
>             Bug ID: 205427
>            Summary: Qdisc running seqcount may cause performance issues
>            Product: Networking
>            Version: 2.5
>     Kernel Version: 4.9
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: IPV4
>           Assignee: stephen@...workplumber.org
>           Reporter: yellowriver2010@...mail.com
>         Regression: No
> 
> [ 1261.949409] NMI watchdog: BUG: soft lockup - CPU#80 stuck for 23s!
> [tc:12076]
> [ 1261.956565] Modules linked in: kpatch_D871570(O) kpatch(OE) intel_rapl
> iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass
> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw glue_helper
> ablk_helper cryptd iTCO_wdt iTCO_vendor_support pcspkr iohub_sriov(O) mei_me
> ioatdma i2c_i801 lpc_ich ipmi_si mei dca shpchp mfd_core i2c_smbus wmi
> ipmi_msghandler acpi_power_meter acpi_pad ip_tables ext4 jbd2 mbcache
> virtio_net virtio_blk i2c_algo_bit drm_kms_helper crc32c_intel syscopyarea
> sysfillrect sysimgblt fb_sys_fops ttm ahci virtio_pci drm virtio_ring(E)
> libahci virtio libata nvme nvme_core i2c_core
> [ 1262.014116] CPU: 80 PID: 12076 Comm: tc Tainted: G           OE K
> 4.9.151-015.ali3000.alios7.x86_64 #1
> [ 1262.023437] Hardware name: Alibaba Alibaba Cloud ECS/Alibaba Cloud ECS, BIOS
> 3.23.34 02/14/2019
> [ 1262.032154] task: ffff887f48dc8000 task.stack: ffffc90037128000
> [ 1262.038091] RIP: 0010:[<ffffffff8160fbee>]  [<ffffffff8160fbee>]
> __gnet_stats_copy_basic+0x8e/0x90
> [ 1262.047093] RSP: 0018:ffffc9003712b930  EFLAGS: 00000202
> [ 1262.052418] RAX: 000000000011845b RBX: ffffc9003712b9a0 RCX:
> ffff887f4e39b2b0
> [ 1262.059569] RDX: 0000000000000000 RSI: ffffc9003712b93c RDI:
> ffff887f4e3950bc
> [ 1262.066871] RBP: ffffc9003712b968 R08: 0000000000000004 R09:
> ffff887f5252bf7c
> [ 1262.074166] R10: ffff887f58718100 R11: 0000000000000008 R12:
> 0000000000000000
> [ 1262.081465] R13: ffff887f5252bf4c R14: ffff887f4e39b200 R15:
> ffff887f4e39b2b0
> [ 1262.088765] FS:  00007f0a974cd740(0000) GS:ffff887f7f000000(0000)
> knlGS:0000000000000000
> [ 1262.097157] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1262.103067] CR2: 00000000006474c0 CR3: 0000007f50006000 CR4:
> 00000000007606f0
> [ 1262.110368] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 1262.117665] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [ 1262.124962] PKRU: 55555554
> [ 1262.127822] Stack:
> [ 1262.129985]  ffffffff8160fce8 0000000000000009 0000000000000000
> ffffffff8160ff24
> [ 1262.137779]  ffff887f58718100 d0e0e5e734dba31b ffff887f58718100
> ffffc9003712ba30
> [ 1262.145574]  ffffffff8164d383 0000000000000009 0000000000000000
> 0000000000000000
> [ 1262.153363] Call Trace:
> [ 1262.155968]  [<ffffffff8160fce8>] ? gnet_stats_copy_basic+0x38/0xe0
> [ 1262.162397]  [<ffffffff8160ff24>] ? gnet_stats_start_copy_compat+0x94/0x100
> [ 1262.169527]  [<ffffffff8164d383>] tc_fill_qdisc+0x283/0x400
> [ 1262.175258]  [<ffffffff8164d635>] tc_dump_qdisc_root+0x135/0x1a0
> [ 1262.181430]  [<ffffffff8164d744>] tc_dump_qdisc+0xa4/0x150
> [ 1262.187077]  [<ffffffff81654eb1>] netlink_dump+0x231/0x2c0
> [ 1262.192721]  [<ffffffff81655d68>] __netlink_dump_start+0x168/0x1a0
> [ 1262.199067]  [<ffffffff8164d6a0>] ? tc_dump_qdisc_root+0x1a0/0x1a0
> [ 1262.205413]  [<ffffffff81634093>] rtnetlink_rcv_msg+0x1c3/0x230
> [ 1262.211494]  [<ffffffff8164d6a0>] ? tc_dump_qdisc_root+0x1a0/0x1a0
> [ 1262.217835]  [<ffffffff81633ed0>] ? rtnl_newlink+0x860/0x860
> [ 1262.223651]  [<ffffffff81657b64>] netlink_rcv_skb+0xa4/0xc0
> [ 1262.229387]  [<ffffffff8162e785>] rtnetlink_rcv+0x15/0x20
> [ 1262.234942]  [<ffffffff8165754c>] netlink_unicast+0x18c/0x220
> [ 1262.240844]  [<ffffffff8165793b>] netlink_sendmsg+0x35b/0x3b0
> [ 1262.246758]  [<ffffffff815fca58>] sock_sendmsg+0x38/0x50
> [ 1262.252233]  [<ffffffff815fd4ad>] ___sys_sendmsg+0x29d/0x2b0
> [ 1262.258060]  [<ffffffff811ed21f>] ? do_wp_page+0x39f/0x850
> [ 1262.263708]  [<ffffffff811f04ea>] ? handle_mm_fault+0x6da/0xd50
> [ 1262.269791]  [<ffffffff815fde04>] __sys_sendmsg+0x54/0x90
> [ 1262.275359]  [<ffffffff815fde52>] SyS_sendmsg+0x12/0x20
> [ 1262.280744]  [<ffffffff81003c04>] do_syscall_64+0x74/0x180
> [ 1262.286387]  [<ffffffff81741c8e>] entry_SYSCALL_64_after_swapgs+0x58/0xc6
> [ 1262.293337] Code: 85 ff 74 18 8b 07 a8 01 75 1f 48 8b 11 48 89 16 8b 51 08
> 89 56 08 39 07 75 ea f3 c3 48 8b 01 48 89 06 8b 41 08 89 46 08 c3 f3 90 <eb> d7
> 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 8b 47 1c 48 89 
> [ 1262.314295] Kernel panic - not syncing: softlockup: hung tasks
> [ 1262.320291] CPU: 80 PID: 12076 Comm: tc Tainted: G           OELK
> 4.9.151-015.ali3000.alios7.x86_64 #1
> [ 1262.329904] Hardware name: Alibaba Alibaba Cloud ECS/Alibaba Cloud ECS, BIOS
> 3.23.34 02/14/2019
> [ 1262.338904]  ffff887f7f003e50 ffffffff8139f342 0000000000000000
> ffffffff81a4f7f1
> [ 1262.346674]  ffff887f7f003ed8 ffffffff811b2245 ffffc90000000008
> ffff887f7f003ee8
> [ 1262.354456]  ffff887f7f003e80 d0e0e5e734dba31b ffff887f7f003ea7
> 0000000000000000
> [ 1262.362249] Call Trace:
> [ 1262.364846]  <IRQ> 
> [ 1262.366785]  [<ffffffff8139f342>] dump_stack+0x63/0x81
> [ 1262.372095]  [<ffffffff811b2245>] panic+0xf8/0x244
> [ 1262.377049]  [<ffffffff81158296>] watchdog_timer_fn+0x226/0x230
> [ 1262.383126]  [<ffffffff81158070>] ? watchdog_park_threads+0x70/0x70
> [ 1262.389555]  [<ffffffff811085e3>] __hrtimer_run_queues+0xf3/0x270
> [ 1262.395807]  [<ffffffff81108d8a>] hrtimer_interrupt+0x9a/0x180
> [ 1262.401796]  [<ffffffff810580f8>] local_apic_timer_interrupt+0x38/0x60
> [ 1262.408491]  [<ffffffff81745405>] smp_apic_timer_interrupt+0x45/0x60
> [ 1262.415006]  [<ffffffff81743b90>] apic_timer_interrupt+0xa0/0xb0
> [ 1262.421177]  <EOI> 
> [ 1262.423118]  [<ffffffff8160fbee>] ? __gnet_stats_copy_basic+0x8e/0x90
> [ 1262.429871]  [<ffffffff8160fce8>] ? gnet_stats_copy_basic+0x38/0xe0
> [ 1262.436301]  [<ffffffff8160ff24>] ? gnet_stats_start_copy_compat+0x94/0x100
> [ 1262.443428]  [<ffffffff8164d383>] tc_fill_qdisc+0x283/0x400
> [ 1262.449164]  [<ffffffff8164d635>] tc_dump_qdisc_root+0x135/0x1a0
> [ 1262.455329]  [<ffffffff8164d744>] tc_dump_qdisc+0xa4/0x150
> [ 1262.460978]  [<ffffffff81654eb1>] netlink_dump+0x231/0x2c0
> [ 1262.466620]  [<ffffffff81655d68>] __netlink_dump_start+0x168/0x1a0
> [ 1262.472965]  [<ffffffff8164d6a0>] ? tc_dump_qdisc_root+0x1a0/0x1a0
> [ 1262.479311]  [<ffffffff81634093>] rtnetlink_rcv_msg+0x1c3/0x230
> [ 1262.485394]  [<ffffffff8164d6a0>] ? tc_dump_qdisc_root+0x1a0/0x1a0
> [ 1262.491737]  [<ffffffff81633ed0>] ? rtnl_newlink+0x860/0x860
> [ 1262.497560]  [<ffffffff81657b64>] netlink_rcv_skb+0xa4/0xc0
> [ 1262.503295]  [<ffffffff8162e785>] rtnetlink_rcv+0x15/0x20
> [ 1262.508854]  [<ffffffff8165754c>] netlink_unicast+0x18c/0x220
> [ 1262.514760]  [<ffffffff8165793b>] netlink_sendmsg+0x35b/0x3b0
> [ 1262.520666]  [<ffffffff815fca58>] sock_sendmsg+0x38/0x50
> [ 1262.526140]  [<ffffffff815fd4ad>] ___sys_sendmsg+0x29d/0x2b0
> [ 1262.531960]  [<ffffffff811ed21f>] ? do_wp_page+0x39f/0x850
> [ 1262.537607]  [<ffffffff811f04ea>] ? handle_mm_fault+0x6da/0xd50
> [ 1262.543691]  [<ffffffff815fde04>] __sys_sendmsg+0x54/0x90
> [ 1262.549360]  [<ffffffff815fde52>] SyS_sendmsg+0x12/0x20
> [ 1262.554750]  [<ffffffff81003c04>] do_syscall_64+0x74/0x180
> [ 1262.560402]  [<ffffffff81741c8e>] entry_SYSCALL_64_after_swapgs+0x58/0xc6
> 
> 
> CPU 80 (tc qd show dev bond0)             CPU 1 (netperf)
> 
> rtnetlink_rcv (acquired rtnl_mutex)
> ...
> ...                                       __dev_xmit_skb
> __gnet_stats_copy_basic
> 
> If netperf on CPU 1 runs for a long time, the tc program on CPU 80 may keep
> retrying its seqcount read for just as long, resulting in a soft lockup. During
> this time the tc program holds rtnl_mutex, a system-wide lock, so other programs
> in the system that need it are affected as well.
> 
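
The reader/writer interaction described in the report can be sketched in
user-space C. This is only an illustration, not the kernel code: struct
seqcount, write_begin, write_end and stats_copy_bounded are hypothetical
names, and the retry bound exists only so that the example terminates. The
reader in the report evidently keeps retrying without giving up, which is
what the soft lockup watchdog catches.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for a sequence counter.
 * Odd value: a writer is active. Even value: no writer. */
struct seqcount {
	atomic_uint sequence;
};

struct basic_stats {
	uint64_t bytes;
	uint32_t packets;
};

/* Writer side: make the counter odd for the duration of the update. */
static void write_begin(struct seqcount *s)
{
	atomic_fetch_add_explicit(&s->sequence, 1, memory_order_acq_rel);
}

/* Writer side: make the counter even again. */
static void write_end(struct seqcount *s)
{
	unsigned int old =
		atomic_fetch_add_explicit(&s->sequence, 1, memory_order_release);

	assert(old & 1u);	/* must pair with a preceding write_begin() */
}

/*
 * Reader side: copy the stats only while no writer is active, and retry if
 * the counter changed during the copy. Returns false after max_tries
 * attempts. The bound is an assumption added for this sketch.
 */
static bool stats_copy_bounded(struct seqcount *s,
			       const struct basic_stats *src,
			       struct basic_stats *dst,
			       unsigned int max_tries)
{
	for (unsigned int i = 0; i < max_tries; i++) {
		unsigned int seq =
			atomic_load_explicit(&s->sequence, memory_order_acquire);

		if (seq & 1u)
			continue;	/* writer active: an unbounded reader spins here */

		*dst = *src;
		atomic_thread_fence(memory_order_acquire);

		if (atomic_load_explicit(&s->sequence,
					 memory_order_relaxed) == seq)
			return true;	/* consistent snapshot */
	}
	return false;
}
```

As long as the writer side stays between write_begin() and write_end(), the
reader makes no progress however many times it retries. If the reader also
holds a lock such as rtnl_mutex while it waits, everyone else who needs that
lock is stalled for the same period.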
