lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <20190225080147.30128f73@shemminger-XPS-13-9360> Date: Mon, 25 Feb 2019 08:01:47 -0800 From: Stephen Hemminger <stephen@...workplumber.org> To: Sheng Lan <lansheng@...wei.com> Cc: <davem@...emloft.net>, <netdev@...r.kernel.org>, <netem@...ts.linux-foundation.org>, <xuhanbing@...wei.com>, <zhengshaoyu@...wei.com>, <jiqin.ji@...wei.com>, <liuzhiqiang26@...wei.com> Subject: Re: [PATCH] net: netem: fix skb length BUG_ON in __skb_to_sgvec On Mon, 25 Feb 2019 22:49:39 +0800 Sheng Lan <lansheng@...wei.com> wrote: > From: Sheng Lan <lansheng@...wei.com> > Subject: [PATCH] net: netem: fix skb length BUG_ON in __skb_to_sgvec > > It can be reproduced by following steps: > 1. virtio_net NIC is configured with gso/tso on > 2. configure nginx as http server with an index file bigger than 1M bytes > 3. use tc netem to produce duplicate packets and delay: > tc qdisc add dev eth0 root netem delay 100ms 10ms 30% duplicate 90% > 4. continually curl the nginx http server to get index file on client > 5. BUG_ON is seen quickly > > [10258690.371129] kernel BUG at net/core/skbuff.c:4028! > [10258690.371748] invalid opcode: 0000 [#1] SMP PTI > [10258690.372094] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W 5.0.0-rc6 #2 > [10258690.372094] RSP: 0018:ffffa05797b43da0 EFLAGS: 00010202 > [10258690.372094] RBP: 00000000000005ea R08: 0000000000000000 R09: 00000000000005ea > [10258690.372094] R10: ffffa0579334d800 R11: 00000000000002c0 R12: 0000000000000002 > [10258690.372094] R13: 0000000000000000 R14: ffffa05793122900 R15: ffffa0578f7cb028 > [10258690.372094] FS: 0000000000000000(0000) GS:ffffa05797b40000(0000) knlGS:0000000000000000 > [10258690.372094] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [10258690.372094] CR2: 00007f1a6dc00868 CR3: 000000001000e000 CR4: 00000000000006e0 > [10258690.372094] Call Trace: > [10258690.372094] <IRQ> > [10258690.372094] skb_to_sgvec+0x11/0x40 > [10258690.372094] start_xmit+0x38c/0x520 [virtio_net] > [10258690.372094] dev_hard_start_xmit+0x9b/0x200 > [10258690.372094] sch_direct_xmit+0xff/0x260 > [10258690.372094] __qdisc_run+0x15e/0x4e0 > [10258690.372094] net_tx_action+0x137/0x210 > [10258690.372094] __do_softirq+0xd6/0x2a9 > [10258690.372094] irq_exit+0xde/0xf0 > [10258690.372094] smp_apic_timer_interrupt+0x74/0x140 > [10258690.372094] apic_timer_interrupt+0xf/0x20 > [10258690.372094] </IRQ> > > In __skb_to_sgvec, the skb->len is not equal to the sum of the skb's > linear data size and nonlinear data size, thus BUG_ON triggered. The > bad skb's nonlinear data size is less than skb->data_len, because the > skb is cloned and a part of related cloned skb's nonlinear data is > split off. > > Duplicate packet is cloned by skb_clone in netem_enqueue and may be delayed > some time in qdisc. Due to the delay time, the original skb will be pushed > again later in __tcp_push_pending_frames when tcp receives new packets. > In tcp_write_xmit, when the tcp_mss_split_point returns a smaller limit, > the original skb will be fragmented and the skb's nonlinear data will be > split off. The length of the skb cloned by netem will not be updated. > When we use virtio_net NIC, the duplicated cloned skb will be filled into > a scatter-gather list in __skb_to_sgvec and trigger the BUG_ON. > > Here I replace the skb_clone with skb_copy in netem_enqueue to ensure > the duplicated skb's nonlinear data is independent. > > Signed-off-by: Sheng Lan <lansheng@...wei.com> > Reported-by: Qin Ji <jiqin.ji@...wei.com> > > Fixes: 0afb51e7 ("netem: reinsert for duplication") This sounds like a bug in the other layers (either TCP or Virtio net) not handling a cloned skb properly.
Powered by blists - more mailing lists