Date: Fri, 24 May 2024 18:40:43 +0800
From: Yue Haibing <yuehaibing@...wei.com>
To: Eric Dumazet <edumazet@...gle.com>
CC: <davem@...emloft.net>, <kuba@...nel.org>, <pabeni@...hat.com>,
	<jhs@...atatu.com>, <xiyou.wangcong@...il.com>, <jiri@...nulli.us>,
	<hannes@...essinduktion.org>, <netdev@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net] net/sched: Add xmit_recursion level in
 sch_direct_xmit()

On 2024/5/24 17:24, Eric Dumazet wrote:
> On Fri, May 24, 2024 at 10:49 AM Yue Haibing <yuehaibing@...wei.com> wrote:
>>
>> A packet sent from a PF_PACKET socket on top of an IPv6-backed ipvlan device will
>> hit WARN_ON_ONCE() in sk_mc_loop() through the sch_direct_xmit() path when the
>> ipvlan device has a qdisc queue.
>>
>> WARNING: CPU: 2 PID: 0 at net/core/sock.c:775 sk_mc_loop+0x2d/0x70
>> Modules linked in: sch_netem ipvlan rfkill cirrus drm_shmem_helper sg drm_kms_helper
>> CPU: 2 PID: 0 Comm: swapper/2 Kdump: loaded Not tainted 6.9.0+ #279
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
>> RIP: 0010:sk_mc_loop+0x2d/0x70
>> Code: fa 0f 1f 44 00 00 65 0f b7 15 f7 96 a3 4f 31 c0 66 85 d2 75 26 48 85 ff 74 1c
>> RSP: 0018:ffffa9584015cd78 EFLAGS: 00010212
>> RAX: 0000000000000011 RBX: ffff91e585793e00 RCX: 0000000002c6a001
>> RDX: 0000000000000000 RSI: 0000000000000040 RDI: ffff91e589c0f000
>> RBP: ffff91e5855bd100 R08: 0000000000000000 R09: 3d00545216f43d00
>> R10: ffff91e584fdcc50 R11: 00000060dd8616f4 R12: ffff91e58132d000
>> R13: ffff91e584fdcc68 R14: ffff91e5869ce800 R15: ffff91e589c0f000
>> FS:  0000000000000000(0000) GS:ffff91e898100000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f788f7c44c0 CR3: 0000000008e1a000 CR4: 00000000000006f0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Call Trace:
>>  <IRQ>
>>  ? __warn+0x83/0x130
>>  ? sk_mc_loop+0x2d/0x70
>>  ? report_bug+0x18e/0x1a0
>>  ? handle_bug+0x3c/0x70
>>  ? exc_invalid_op+0x18/0x70
>>  ? asm_exc_invalid_op+0x1a/0x20
>>  ? sk_mc_loop+0x2d/0x70
>>  ip6_finish_output2+0x31e/0x590
>>  ? nf_hook_slow+0x43/0xf0
>>  ip6_finish_output+0x1f8/0x320
>>  ? __pfx_ip6_finish_output+0x10/0x10
>>  ipvlan_xmit_mode_l3+0x22a/0x2a0 [ipvlan]
>>  ipvlan_start_xmit+0x17/0x50 [ipvlan]
>>  dev_hard_start_xmit+0x8c/0x1d0
>>  sch_direct_xmit+0xa2/0x390
>>  __qdisc_run+0x66/0xd0
>>  net_tx_action+0x1ca/0x270
>>  handle_softirqs+0xd6/0x2b0
>>  __irq_exit_rcu+0x9b/0xc0
>>  sysvec_apic_timer_interrupt+0x75/0x90
> 
> Please provide full symbols in stack traces.

Call Trace:
<IRQ>
? __warn (kernel/panic.c:693)
? sk_mc_loop (net/core/sock.c:775 net/core/sock.c:760)
? report_bug (lib/bug.c:201 lib/bug.c:219)
? handle_bug (arch/x86/kernel/traps.c:239)
? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1))
? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621)
? sk_mc_loop (net/core/sock.c:775 net/core/sock.c:760)
ip6_finish_output2 (net/ipv6/ip6_output.c:83 (discriminator 1))
? nf_hook_slow (./include/linux/netfilter.h:154 net/netfilter/core.c:626)
ip6_finish_output (net/ipv6/ip6_output.c:211 net/ipv6/ip6_output.c:222)
? __pfx_ip6_finish_output (net/ipv6/ip6_output.c:215)
ipvlan_xmit_mode_l3 (drivers/net/ipvlan/ipvlan_core.c:498 drivers/net/ipvlan/ipvlan_core.c:538 drivers/net/ipvlan/ipvlan_core.c:602) ipvlan
ipvlan_start_xmit (drivers/net/ipvlan/ipvlan_main.c:226) ipvlan
dev_hard_start_xmit (./include/linux/netdevice.h:4882 ./include/linux/netdevice.h:4896 net/core/dev.c:3578 net/core/dev.c:3594)
sch_direct_xmit (net/sched/sch_generic.c:343)
__qdisc_run (net/sched/sch_generic.c:416)
net_tx_action (./include/net/sch_generic.h:219 ./include/net/pkt_sched.h:128 ./include/net/pkt_sched.h:124 net/core/dev.c:5286)
handle_softirqs (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:555)
__irq_exit_rcu (kernel/softirq.c:589 kernel/softirq.c:428 kernel/softirq.c:637)
sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1043 arch/x86/kernel/apic/apic.c:1043)

> 
>>  </IRQ>
>>
>> Fixes: f60e5990d9c1 ("ipv6: protect skb->sk accesses from recursive dereference inside the stack")
>> Signed-off-by: Yue Haibing <yuehaibing@...wei.com>
>> ---
>>  include/linux/netdevice.h | 17 +++++++++++++++++
>>  net/core/dev.h            | 17 -----------------
>>  net/sched/sch_generic.c   |  8 +++++---
>>  3 files changed, 22 insertions(+), 20 deletions(-)
> 
> This patch seems unrelated to the WARN_ON_ONCE(1) met in sk_mc_loop()
> 
> If sk_mc_loop() is called with a socket which is not inet, we are in trouble.
> 
> Please fix the root cause instead of trying to shortcut sk_mc_loop() as you did.

First, set up the environment like this:
ip netns add ns0
ip netns add ns1
ip link add ip0 link eth0 type ipvlan mode l3 vepa
ip link add ip1 link eth0 type ipvlan mode l3 vepa
ip link set ip0 netns ns0
ip netns exec ns0 ip link set ip0 up
ip link set ip1 netns ns1
ip netns exec ns1 ip link set ip1 up
ip netns exec ns1 tc qdisc add dev ip1 root netem delay 10ms

Second, build the attached repro and run it in ns1 to send a raw IPv6 multicast packet over a PF_PACKET socket.
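
For readers without the attachment, the following is a minimal sketch of what sending such a packet could look like. It is not the attached repro.c. The helper names build_mcast_frame() and send_frame(), the empty payload, and the "No Next Header" value are assumptions made for illustration only, and the sketch is not claimed to trigger the warning by itself.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <net/ethernet.h>     /* ETH_ALEN, ETH_HLEN, ETH_P_IPV6 */
#include <net/if.h>           /* if_nametoindex() */
#include <netpacket/packet.h> /* struct sockaddr_ll */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define IP6_HDR_LEN 40

/* Hypothetical helper: write an Ethernet header plus a bare IPv6 header
 * addressed to a multicast group into buf. Returns the frame length,
 * or 0 if the buffer is too small or an address is unusable. */
size_t build_mcast_frame(uint8_t *buf, size_t buflen,
                         const uint8_t src_mac[ETH_ALEN],
                         const char *src_ip6, const char *dst_ip6)
{
    struct in6_addr src, dst;
    uint8_t *ip6;

    assert(buf != NULL);
    if (buflen < ETH_HLEN + IP6_HDR_LEN)
        return 0;
    if (inet_pton(AF_INET6, src_ip6, &src) != 1 ||
        inet_pton(AF_INET6, dst_ip6, &dst) != 1)
        return 0;
    if (dst.s6_addr[0] != 0xff)          /* destination must be multicast */
        return 0;

    /* Ethernet: 33:33 + low 32 bits of the group address (RFC 2464) */
    buf[0] = 0x33;
    buf[1] = 0x33;
    memcpy(buf + 2, &dst.s6_addr[12], 4);
    memcpy(buf + ETH_ALEN, src_mac, ETH_ALEN);
    buf[12] = ETH_P_IPV6 >> 8;
    buf[13] = ETH_P_IPV6 & 0xff;

    /* IPv6 header: version 6, no payload, hop limit 1 */
    ip6 = buf + ETH_HLEN;
    memset(ip6, 0, IP6_HDR_LEN);
    ip6[0] = 0x60;
    ip6[6] = 59;                         /* next header: No Next Header */
    ip6[7] = 1;
    memcpy(ip6 + 8, &src, 16);
    memcpy(ip6 + 24, &dst, 16);
    return ETH_HLEN + IP6_HDR_LEN;
}

/* Hypothetical helper: send the frame on ifname through a PF_PACKET raw
 * socket, so skb->sk is a packet socket. Needs CAP_NET_RAW. */
int send_frame(const char *ifname, const uint8_t *frame, size_t len)
{
    struct sockaddr_ll sll;
    ssize_t n = -1;
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_IPV6));

    if (fd < 0)
        return -1;
    memset(&sll, 0, sizeof(sll));
    sll.sll_family = AF_PACKET;
    sll.sll_protocol = htons(ETH_P_IPV6);
    sll.sll_ifindex = (int)if_nametoindex(ifname);
    sll.sll_halen = ETH_ALEN;
    memcpy(sll.sll_addr, frame, ETH_ALEN);
    if (sll.sll_ifindex != 0)
        n = sendto(fd, frame, len, 0, (struct sockaddr *)&sll, sizeof(sll));
    close(fd);
    return n == (ssize_t)len ? 0 : -1;
}
```

In this setup the frame would be built for a group such as ff02::1 and passed to send_frame("ip1", ...) from inside ns1. The transmit path it then takes is shown below.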

packet_sendmsg
   packet_snd //skb->sk is packet sk
      __dev_queue_xmit
         __dev_xmit_skb //q->enqueue is not NULL
             __qdisc_run
                 qdisc_restart
                    sch_direct_xmit
                       dev_hard_start_xmit
                          netdev_start_xmit
                            ipvlan_start_xmit
                              ipvlan_xmit_mode_l3 //l3 mode
                                 ipvlan_process_outbound //vepa flag
                                   ipvlan_process_v6_outbound //skb->protocol is ETH_P_IPV6
                                      ip6_local_out
                                       ...
                                         __ip6_finish_output
                                            ip6_finish_output2 //multicast packet
                                               sk_mc_loop //dev_recursion_level is 0
                                                  WARN_ON_ONCE //sk->sk_family is AF_PACKET
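
The last two steps of the chain can be illustrated with a simplified user-space model. This is a sketch based on the behaviour described in this thread, not the code in net/core/sock.c. The names model_sock and model_sk_mc_loop() and the explicit recursion_level and warned parameters are hypothetical stand-ins for kernel state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>   /* AF_INET, AF_INET6, AF_PACKET */

/* Hypothetical stand-in for the fields of struct sock that matter here. */
struct model_sock {
    int  family;    /* models sk->sk_family */
    bool mc_loop;   /* models the per-socket multicast loop setting */
};

/* Simplified model of the decision made in sk_mc_loop().
 * *warned is set where the kernel would hit WARN_ON_ONCE(). */
bool model_sk_mc_loop(int recursion_level, const struct model_sock *sk,
                      bool *warned)
{
    *warned = false;

    if (recursion_level)        /* nested transmit: never loop back */
        return false;
    if (sk == NULL)             /* no socket attached to the skb */
        return true;

    switch (sk->family) {
    case AF_INET:
    case AF_INET6:
        return sk->mc_loop;     /* inet sockets carry the setting */
    }

    /* Any other family, such as AF_PACKET, is unexpected here. */
    *warned = true;
    return true;
}
```

With a recursion level of 0 and an AF_PACKET socket, as in the qdisc path above, the model reaches the warning branch. With a non-zero recursion level it returns early, which is the effect the posted patch obtains by raising the level in sch_direct_xmit().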

> .
> 

View attachment "repro.c" of type "text/plain" (12818 bytes)
