Date:	Wed, 5 Sep 2012 19:11:08 -0500
From:	Shawn Bohrer <sbohrer@...advisors.com>
To:	netdev@...r.kernel.org
Cc:	eric.dumazet@...il.com
Subject: Increased multicast packet drops in 3.4

I've been testing the 3.4 kernel against the 3.1 kernel and noticed
that my application sees a clear increase in packet drops on 3.4.
In this case I have 8 processes all listening
on the same multicast group and occasionally 1 or more of the
processes will report drops based on gaps in the sequence numbers on
the packets.  One thing I find interesting is that some of the time 2
or 3 of the 8 processes will report that they missed the exact same
50+ packets.  Since the other processes receive the packets I know
that they are making it to the machine and past the driver.
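
For reference, the gap check described above can be sketched roughly as
follows. This is only an illustration in Python, not the actual
application code: the names find_gaps and shared_gaps are hypothetical,
and it assumes sequence numbers increase by one per packet.

```python
def find_gaps(seqs):
    """Return (first_missing, last_missing) ranges seen in one receiver's stream.

    Assumption: sequence numbers normally increase by exactly one per packet.
    Late or duplicate packets (seq below the expected value) are ignored.
    """
    gaps = []
    expected = None
    for seq in seqs:
        if expected is not None and seq > expected:
            gaps.append((expected, seq - 1))
        if expected is None or seq >= expected:
            expected = seq + 1
    return gaps


def shared_gaps(per_process):
    """Return the gaps reported by more than one process.

    per_process maps a (hypothetical) process name to the list of sequence
    numbers that process received. The result maps each shared gap to the
    sorted names of the processes that missed it.
    """
    reporters = {}
    for name, seqs in per_process.items():
        for gap in find_gaps(seqs):
            reporters.setdefault(gap, []).append(name)
    return {gap: sorted(names)
            for gap, names in reporters.items() if len(names) > 1}
```

A gap that several processes report identically while others receive
the same packets suggests the loss happens after the driver, somewhere
on the per-socket delivery path.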

So far I have not been able to _see_ any OS counters increase when the
drops occur, but perhaps there is a location where I have not yet
looked.  I've been looking for drops in /proc/net/udp, /proc/net/snmp,
and /proc/net/dev.
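
One way to watch the UDP counters in /proc/net/snmp around a drop is to
snapshot the file before and after and diff the Udp line. The sketch
below is an illustration only: parse_snmp and counter_delta are
hypothetical helper names, and it assumes the usual layout in which
each protocol has a header line of counter names followed by a line of
values.

```python
def parse_snmp(text):
    """Parse the contents of /proc/net/snmp into {protocol: {counter: value}}.

    Assumption: lines come in pairs, e.g.
        Udp: InDatagrams NoPorts InErrors ...
        Udp: 100 0 2 ...
    """
    counters = {}
    lines = text.strip().splitlines()
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, names = header.split(":", 1)
        value_proto, numbers = values.split(":", 1)
        if proto != value_proto:
            continue  # unexpected layout, skip this pair
        counters[proto] = dict(
            zip(names.split(), (int(n) for n in numbers.split()))
        )
    return counters


def counter_delta(before, after, proto="Udp"):
    """Return only the counters of one protocol that changed between snapshots."""
    old = before.get(proto, {})
    new = after.get(proto, {})
    return {name: new[name] - old.get(name, 0)
            for name in new if new[name] != old.get(name, 0)}


def read_snmp(path="/proc/net/snmp"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_snmp(f.read())
```

Taking one read_snmp() snapshot when the test starts and another when
the application reports a gap, then passing both to counter_delta(),
shows whether counters such as InErrors or RcvbufErrors moved at all.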

I've tried using dropwatch/drop_monitor, but it is awfully noisy even
after backporting many of the patches Eric Dumazet has contributed to
silence the false positives.  Similarly, I set up trace-cmd/ftrace to
record skb:kfree_skb calls with a stacktrace and had my application
stop the trace when a drop was reported.  From these traces I see a
number of the following:

    md_connector-12791 [014]  7952.982818: kfree_skb:            skbaddr=0xffff880583bd7500 protocol=2048 location=0xffffffff813c930b
    md_connector-12791 [014]  7952.982821: kernel_stack:         <stack trace>
=> skb_release_data (ffffffff813c930b)
=> __kfree_skb (ffffffff813c934e)
=> skb_free_datagram_locked (ffffffff813ccca8)
=> udp_recvmsg (ffffffff8143335c)
=> inet_recvmsg (ffffffff8143cbfb)
=> sock_recvmsg_nosec (ffffffff813be80f)
=> __sys_recvmsg (ffffffff813bfe70)
=> __sys_recvmmsg (ffffffff813c2392)
=> sys_recvmmsg (ffffffff813c25b0)
=> system_call_fastpath (ffffffff8148cfd2)

Looking at the code, it does look like these could be the drops, since
I do not see any counters incremented in this code path.  However, I'm
not very familiar with this code, so it could also be a false positive.
It does look like the above stack only gets called if
skb_has_frag_list(skb) is true.  Does this imply the packet was larger
than one MTU (1500)?

I'd appreciate any input on possible causes of or solutions for these
drops, or on ways that I can further debug this issue to find the root
cause of the increase in drops on 3.4.

Thanks,
Shawn
