Date: Fri, 07 Sep 2012 08:08:45 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Shawn Bohrer <sbohrer@...advisors.com>
Cc: netdev@...r.kernel.org
Subject: Re: Increased multicast packet drops in 3.4

On Thu, 2012-09-06 at 23:00 -0500, Shawn Bohrer wrote:
> On Thu, Sep 06, 2012 at 03:21:07PM +0200, Eric Dumazet wrote:
> > kfree_skb() can free a list of skb, and we use a generic function to do
> > so, without forwarding the drop/notdrop status. So its unfortunate, but
> > adding extra parameters just for the sake of drop_monitor is not worth
> > it. skb_drop_fraglist() doesnt know if the parent skb is dropped or
> > only freed, so it calls kfree_skb(), not consume_skb() or kfree_skb()
>
> I understand that this means that dropwatch or the skb:kfree_skb
> tracepoint won't know if the packet was really dropped, but do we
> know in this case from the context of the stack trace? I'm assuming
> since we didn't receive an error that the packet was delivered and
> these aren't real drops.

I am starting to believe this is an application error.

This application uses recvmmsg() to fetch a lot of messages in one
syscall, and it might well be it throws out a batch of 50+ messages
because of an application bug. Yes, this starts with 3.4, but it can be
triggered by a timing difference or something that is not a proper
kernel bug...

> > > Are you receiving fragmented UDP frames ?
>
> I looked at the sending application and it yes it is possible it is
> sending fragmented frames.
>
> > I ask this because with latest kernels (linux-3.5), we should no longer
> > build a list of skb, but a single skb with page fragments.
> >
> > commit 3cc4949269e01f39443d0fcfffb5bc6b47878d45
> > Author: Eric Dumazet <edumazet@...gle.com>
> > Date:   Sat May 19 03:02:20 2012 +0000
> >
> >     ipv4: use skb coalescing in defragmentation
> >
> >     ip_frag_reasm() can use skb_try_coalesce() to build optimized skb,
> >     reducing memory used by them (truesize), and reducing number of cache
> >     line misses and overhead for the consumer.
> >
> >     Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> >     Cc: Alexander Duyck <alexander.h.duyck@...el.com>
> >     Signed-off-by: David S. Miller <davem@...emloft.net>
>
> I'll have to give 3.5 a try tomorrow and see if it has the same
> problem. After backporting all of your patches to convert kfree_skb()
> to consume_skb() to 3.4 I actually don't have that many different
> places hitting the skb:kfree_skb tracepoint at the time of the drop.
> Here are some of the ones I have left that might be relevant.
>
> <idle>-0 [001] 11933.738797: kfree_skb: skbaddr=0xffff8805ebcf9500 protocol=2048 location=0xffffffff81404e33
> <idle>-0 [001] 11933.738801: kernel_stack: <stack trace>
> => ip_rcv (ffffffff81404e33)
> => __netif_receive_skb (ffffffff813ce123)
> => netif_receive_skb (ffffffff813d0da1)
> => process_responses (ffffffffa018486c)
> => napi_rx_handler (ffffffffa0185606)
> => net_rx_action (ffffffff813d2449)
> => __do_softirq (ffffffff8103bfd0)
> => call_softirq (ffffffff8148a14c)
> => do_softirq (ffffffff81003e85)
> => irq_exit (ffffffff8103c3a5)
> => do_IRQ (ffffffff8148a693)
> => ret_from_intr (ffffffff814814a7)
> => cpu_idle (ffffffff8100ac16)
> => start_secondary (ffffffff81af5e66)
>
> My IPSTATS_MIB_INHDRERRORS, IPSTATS_MIB_INDISCARDS, and
> IPSTATS_MIB_INTRUNCATEDPKTS counters are all 0 so maybe this is from
> skb->pkt_type == PACKET_OTHERHOST?
>
> <idle>-0 [001] 11933.937378: kfree_skb: skbaddr=0xffff8805ebcf8c00 protocol=2048 location=0xffffffff81404660
> <idle>-0 [001] 11933.937385: kernel_stack: <stack trace>
> => ip_rcv_finish (ffffffff81404660)
> => ip_rcv (ffffffff81404f61)
> => __netif_receive_skb (ffffffff813ce123)
> => netif_receive_skb (ffffffff813d0da1)
> => process_responses (ffffffffa018486c)
> => napi_rx_handler (ffffffffa0185606)
> => net_rx_action (ffffffff813d2449)
> => __do_softirq (ffffffff8103bfd0)
> => call_softirq (ffffffff8148a14c)
> => do_softirq (ffffffff81003e85)
> => irq_exit (ffffffff8103c3a5)
> => do_IRQ (ffffffff8148a693)
> => ret_from_intr (ffffffff814814a7)
> => cpu_idle (ffffffff8100ac16)
> => start_secondary (ffffffff81af5e66)
>
> I see two places here that I might be hitting that don't increment any
> counters. I can try instrumenting these to see which one I hit.
>
> <idle>-0 [003] 11932.454375: kfree_skb: skbaddr=0xffff880584843700 protocol=4 location=0xffffffffa00492d4
> <idle>-0 [003] 11932.454382: kernel_stack: <stack trace>
> => llc_rcv (ffffffffa00492d4)
> => __netif_receive_skb (ffffffff813ce123)
> => netif_receive_skb (ffffffff813d0da1)
> => process_responses (ffffffffa018486c)
> => napi_rx_handler (ffffffffa0185606)
> => net_rx_action (ffffffff813d2449)
> => __do_softirq (ffffffff8103bfd0)
> => call_softirq (ffffffff8148a14c)
> => do_softirq (ffffffff81003e85)
> => irq_exit (ffffffff8103c3a5)
> => do_IRQ (ffffffff8148a693)
> => ret_from_intr (ffffffff814814a7)
> => cpu_idle (ffffffff8100ac16)
> => start_secondary (ffffffff81af5e66)
>
> This is protocol=4 so I don't know if it is really relevant but then
> again I don't know what this is.

You can ignore this

>
> <idle>-0 [003] 11914.266635: kfree_skb: skbaddr=0xffff880584843b00 protocol=2048 location=0xffffffff8143bfa8
> <idle>-0 [003] 11914.266641: kernel_stack: <stack trace>
> => igmp_rcv (ffffffff8143bfa8)
> => ip_local_deliver_finish (ffffffff814049ed)
> => ip_local_deliver (ffffffff81404d1a)
> => ip_rcv_finish (ffffffff814046ad)
> => ip_rcv (ffffffff81404f61)
> => __netif_receive_skb (ffffffff813ce123)
> => netif_receive_skb (ffffffff813d0da1)
> => mlx4_en_process_rx_cq (ffffffffa010a4fe)
> => mlx4_en_poll_rx_cq (ffffffffa010a9ef)
> => net_rx_action (ffffffff813d2449)
> => __do_softirq (ffffffff8103bfd0)
> => call_softirq (ffffffff8148a14c)
> => do_softirq (ffffffff81003e85)
> => irq_exit (ffffffff8103c3a5)
> => do_IRQ (ffffffff8148a693)
> => ret_from_intr (ffffffff814814a7)
> => cpu_idle (ffffffff8100ac16)
> => start_secondary (ffffffff81af5e66)
>
> Also don't know if this one is relevant. This looks like an igmp
> packet so probably not my drop, but I am receiving multicast packets
> in this case so maybe it is somehow related.

Yes, we need to change igmp to call consume_skb() for all frames that
were properly handled. So you can ignore this as well.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
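
For reference, the application-side failure mode suspected above (a
recvmmsg() caller discarding part of a batch) can be checked against a
minimal userspace sketch like the one below. This is not the code of the
application under discussion, which is not shown in this thread. The
names drain_batch(), msg_handler, count_bytes(), BATCH and MAXLEN are
hypothetical and chosen only for illustration. The sketch handles exactly
the number of messages recvmmsg() returns and takes each length from
msg_len, so no received datagram is skipped.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#define BATCH  64    /* hypothetical batch size */
#define MAXLEN 2048  /* hypothetical maximum datagram size */

/* Hypothetical per-message callback type. */
typedef void (*msg_handler)(const char *buf, unsigned int len, void *ctx);

/* Example handler: adds each message length to the counter in ctx. */
static void count_bytes(const char *buf, unsigned int len, void *ctx)
{
    (void)buf;
    *(unsigned int *)ctx += len;
}

/*
 * Receive up to BATCH datagrams with a single non-blocking recvmmsg()
 * call and pass every received message to the handler.
 * Returns the number of messages handled, or -1 on error
 * (including EAGAIN when nothing is queued).
 */
static int drain_batch(int fd, msg_handler handle, void *ctx)
{
    static char bufs[BATCH][MAXLEN];
    struct mmsghdr msgs[BATCH];
    struct iovec iov[BATCH];
    int i, n;

    assert(handle != NULL);

    memset(msgs, 0, sizeof(msgs));
    for (i = 0; i < BATCH; i++) {
        iov[i].iov_base = bufs[i];
        iov[i].iov_len = MAXLEN;
        msgs[i].msg_hdr.msg_iov = &iov[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    n = recvmmsg(fd, msgs, BATCH, MSG_DONTWAIT, NULL);
    if (n < 0)
        return -1;

    /* Walk all n entries; the per-message length is msgs[i].msg_len. */
    for (i = 0; i < n; i++) {
        if (msgs[i].msg_hdr.msg_flags & MSG_TRUNC)
            fprintf(stderr, "message %d was truncated\n", i);
        handle(bufs[i], msgs[i].msg_len, ctx);
    }
    return n;
}
```

A caller would invoke drain_batch(fd, count_bytes, &total) each time the
socket becomes readable. Comparing the sum of the return values with the
sender's message count is one way to tell application-side loss apart
from kernel drops.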