Message-ID: <46528973.50809@cosmosbay.com>
Date: Tue, 22 May 2007 08:10:59 +0200
From: Eric Dumazet <dada1@...mosbay.com>
To: John Miller <forall@...l15.com>
CC: netdev@...r.kernel.org
Subject: Re: UDP packet loss when running lsof
John Miller wrote:
> Hi Eric,
>
>> I CCed netdev since this stuff is about networking and not
>> lkml.
>
> Ok, dropped the CC...
>
>> What kind of machine do you have? SMP or not?
>
> It's a HP system with two dual core CPUs at 3GHz, the
> storage system is connected through QLogic FC-HBA. It should
> really be fast enough to handle a data stream of 50 MB/s...
Then you might try to bind the network IRQ to one CPU
(echo 1 >/proc/irq/XX/smp_affinity)
XX being your NIC interrupt (cat /proc/interrupts to find it)
and bind your user program(s) to the other CPU(s).
You might be hitting a cond_resched_softirq() bug that Ingo and others are
sorting out right now. Using a separate CPU for softirq handling and for
your programs should help a lot here.
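From inside the program, something like this does the binding (a
minimal sketch; CPU 1 is only an example, pick any CPU other than the
one the NIC IRQ is bound to):

#define _GNU_SOURCE
#include <sched.h>

/* Pin the calling process to one CPU, away from the CPU handling
 * the NIC interrupt (CPU 0 in the smp_affinity example above). */
static int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(0, sizeof(set), &set);
}

From the shell, "taskset -c 1 yourprog" does the same thing.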
>
>> If you have many sockets on this machine, lsof can be
>> very slow reading /proc/net/tcp and/or /proc/net/udp,
>> locking some tables long enough to drop packets.
>
> First I tried with one UDP socket and during tests I switched
> to 16 sockets with no effect. As I removed nearly all daemons
> there aren't many open sockets.
>
> /proc/net/tcp seems to be one cause of the problem: a simple
> "cat /proc/net/tcp" nearly always leads to immediate UDP packet
> loss. So it seems that reading TCP statistics blocks UDP
> packet processing.
>
> As it isn't my goal to collect statistics all the time, I could
> live with disabling access to /proc/net/tcp, but I wouldn't call
> this a good solution...
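That would match the locking theory: each read of /proc/net/tcp walks
the whole tcp hash table, holding its locks long enough to drop
packets. A tight reader loop makes this easy to reproduce, the C
equivalent of "while true; do cat /proc/net/tcp >/dev/null; done"
(a minimal sketch):

#include <stdio.h>

/* Re-read /proc/net/tcp forever; every pass makes the kernel walk
 * the whole tcp hash table. */
int main(void)
{
	char buf[4096];

	for (;;) {
		FILE *f = fopen("/proc/net/tcp", "r");

		if (!f) {
			perror("fopen");
			return 1;
		}
		while (fread(buf, 1, sizeof(buf), f) > 0)
			;
		fclose(f);
	}
}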
>
>> If you have a low count of tcp sockets, you might want to
>> boot with thash_entries=2048 or so, to reduce tcp hash
>> table size.
>
> This did help a lot. I tried thash_entries=10, and now only a
> while loop around the "cat ...tcp" triggers packet loss. Tests
> are now running and I can say more tomorrow.
I don't understand here: does using a small thash_entries make the bug always appear?
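For reference, thash_entries is a boot parameter, so it has to go on
the kernel command line in your boot loader, e.g. in grub.conf (the
image name and root device below are just placeholders):

kernel /vmlinuz-2.6.18 ro root=/dev/sda1 thash_entries=2048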
>
> Getting information about thash_entries is really hard, even
> finding out the default value: for a system with 2GB RAM
> it could be around 100000.
>
>> no RcvbufErrors errors either?
>
> The kernel is a bit too old (2.6.18). Looking at the patch
> from 2.6.18 to 2.6.19 I found that RcvbufErrors is only
> increased when InErrors is increased. So my answer would be
> yes.
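Right. For the record, on 2.6.19+ the new counter shows up in the
Udp: line of /proc/net/snmp (grep Udp /proc/net/snmp), so on your
2.6.18 kernel InErrors is the only counter you get.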
>
>> > - Network card is handled by bnx2 kernel module
>
>> I don't know this NIC, does it support ethtool?
>
> It is a "Broadcom Corporation NetXtreme II BCM5708S
> Gigabit Ethernet (rev 12)", and it seems ethtool is supported.
>
> The output below was captured after packet loss (I don't see
> any hints in it, but maybe you do):
>
>> ethtool -S eth0
>
> NIC statistics:
> rx_bytes: 155481467364
> rx_error_bytes: 0
> tx_bytes: 5492161
> tx_error_bytes: 0
> rx_ucast_packets: 18341
> rx_mcast_packets: 137321933
> rx_bcast_packets: 2380
> tx_ucast_packets: 14416
> tx_mcast_packets: 190
> tx_bcast_packets: 8
> tx_mac_errors: 0
> tx_carrier_errors: 0
> rx_crc_errors: 0
> rx_align_errors: 0
> tx_single_collisions: 0
> tx_multi_collisions: 0
> tx_deferred: 0
> tx_excess_collisions: 0
> tx_late_collisions: 0
> tx_total_collisions: 0
> rx_fragments: 0
> rx_jabbers: 0
> rx_undersize_packets: 0
> rx_oversize_packets: 0
> rx_64_byte_packets: 244575
> rx_65_to_127_byte_packets: 6828
> rx_128_to_255_byte_packets: 167
> rx_256_to_511_byte_packets: 94
> rx_512_to_1023_byte_packets: 393
> rx_1024_to_1522_byte_packets: 137090597
> rx_1523_to_9022_byte_packets: 0
> tx_64_byte_packets: 52
> tx_65_to_127_byte_packets: 7547
> tx_128_to_255_byte_packets: 3304
> tx_256_to_511_byte_packets: 399
> tx_512_to_1023_byte_packets: 897
> tx_1024_to_1522_byte_packets: 2415
> tx_1523_to_9022_byte_packets: 0
> rx_xon_frames: 0
> rx_xoff_frames: 0
> tx_xon_frames: 0
> tx_xoff_frames: 0
> rx_mac_ctrl_frames: 0
> rx_filtered_packets: 158816
> rx_discards: 0
> rx_fw_discards: 0
>
>> ethtool -c eth0
>
> Coalesce parameters for eth1:
> Adaptive RX: off TX: off
> stats-block-usecs: 999936
> sample-interval: 0
> pkt-rate-low: 0
> pkt-rate-high: 0
>
> rx-usecs: 18
> rx-frames: 6
> rx-usecs-irq: 18
> rx-frames-irq: 6
>
> tx-usecs: 80
> tx-frames: 20
> tx-usecs-irq: 80
> tx-frames-irq: 20
>
> rx-usecs-low: 0
> rx-frame-low: 0
> tx-usecs-low: 0
> tx-frame-low: 0
>
> rx-usecs-high: 0
> rx-frame-high: 0
> tx-usecs-high: 0
> tx-frame-high: 0
>
>> ethtool -g eth0
>
> Ring parameters for eth1:
> Pre-set maximums:
> RX: 1020
> RX Mini: 0
> RX Jumbo: 0
> TX: 255
> Current hardware settings:
> RX: 100
> RX Mini: 0
> RX Jumbo: 0
> TX: 255
>
>> Just to make sure, does your application set up a large
>> enough SO_RCVBUF value?
>
> Yes, my first try with one socket was 5MB, but I also tested
> with 10 and even 25MB. With 16 sockets I also set it to 5MB.
> When pausing the application, netstat shows the filled buffers.
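One thing to double-check: setsockopt(SO_RCVBUF) is silently capped
by /proc/sys/net/core/rmem_max (the kernel also doubles the value to
account for its own overhead), so to really get 5MB per socket,
rmem_max must be large enough. Reading the value back shows what was
actually granted (a minimal sketch):

#include <stdio.h>
#include <sys/socket.h>

/* Request a receive buffer of 'bytes' and print what the kernel
 * actually granted (doubled, and capped by net.core.rmem_max). */
static int set_rcvbuf(int fd, int bytes)
{
	int val = bytes;
	socklen_t len = sizeof(val);

	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &val, sizeof(val)) < 0)
		return -1;
	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &val, &len) < 0)
		return -1;
	printf("SO_RCVBUF is now %d bytes\n", val);
	return 0;
}

"sysctl -w net.core.rmem_max=26214400" raises the cap if needed.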
>
>> What values do you have in /proc/sys/net/ipv4/tcp_rmem ?
>
> I kept the default values there:
> 4096 43689 87378
>
>> cat /proc/meminfo
>
> MemTotal: 2060664 kB
> MemFree: 146536 kB
> Buffers: 10984 kB
> Cached: 1667740 kB
> SwapCached: 0 kB
> Active: 255228 kB
> Inactive: 1536352 kB
> HighTotal: 0 kB
> HighFree: 0 kB
> LowTotal: 2060664 kB
> LowFree: 146536 kB
> SwapTotal: 0 kB
> SwapFree: 0 kB
> Dirty: 820740 kB
> Writeback: 112 kB
> Mapped: 127612 kB
> Slab: 104184 kB
> CommitLimit: 1030332 kB
> Committed_AS: 774944 kB
> PageTables: 1928 kB
> VmallocTotal: 34359738367 kB
> VmallocUsed: 6924 kB
> VmallocChunk: 34359731259 kB
> HugePages_Total: 0
> HugePages_Free: 0
> HugePages_Rsvd: 0
> Hugepagesize: 2048 kB
>
> Thanks for your help!
> Regards,
> John
>
>