Message-ID: <6576c307ed554adb443e62a60f099266c95b55a7.camel@redhat.com>
Date:   Wed, 18 May 2022 19:27:02 +0200
From:   Paolo Abeni <pabeni@...hat.com>
To:     David Laight <David.Laight@...LAB.COM>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Cc:     "'mchan@...adcom.com'" <mchan@...adcom.com>,
        David Miller <davem@...emloft.net>
Subject: Re: tg3 dropping packets at high packet rates

On Wed, 2022-05-18 at 16:08 +0000, David Laight wrote:
> I'm trying to see why the tg3 driver is dropping a lot of
> receive packets.
> 
> (This driver is making my head hurt...)
> 
> I think that the rx_packets count (sum of rx_[umb]cast_packets)
> covers all the packets, but a smaller number are actually processed
> by tg3_rx().
> But none of the error counts get incremented.
> 
> It is almost as if it has lost almost all the receive buffers.
> 
> If I read /sys/class/net/em2/statistics/rx_packets every second
> delaying with:
>   syscall(SYS_clock_nanosleep, CLOCK_MONOTONIC, TIMER_ABSTIME, &ts, NULL);
> about every 43 seconds I get a zero increment.
> This really doesn't help!

It looks like the tg3 driver fetches the H/W stats once per second. I
guess that if you fetch them with the same period and you are unlucky,
you can read the same sample two consecutive times.
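
If it helps, below is a minimal sketch (untested; "em2" and the 1.5s
period are just assumptions for illustration) that samples the counter
on an absolute deadline deliberately offset from the driver's ~1s
stats refresh, so consecutive reads cannot keep landing on the same
cached sample:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    /* Assumed values for illustration: interface name and 1.5s period. */
    const char *path = "/sys/class/net/em2/statistics/rx_packets";
    unsigned long long prev = 0, cur;
    struct timespec ts;
    char buf[64];

    clock_gettime(CLOCK_MONOTONIC, &ts);
    for (;;) {
        int fd = open(path, O_RDONLY);
        ssize_t n;

        if (fd < 0)
            return 1;
        n = read(fd, buf, sizeof(buf) - 1);
        close(fd);
        if (n <= 0)
            return 1;
        buf[n] = '\0';
        cur = strtoull(buf, NULL, 10);
        printf("rx_packets=%llu delta=%llu\n", cur, cur - prev);
        prev = cur;

        /* Advance the absolute deadline by 1.5s, off the 1Hz stats cycle. */
        ts.tv_nsec += 500000000L;
        ts.tv_sec += 1 + ts.tv_nsec / 1000000000L;
        ts.tv_nsec %= 1000000000L;
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &ts, NULL);
    }
}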

> I've put a count into tg3_rx() that seems to match what IP/UDP
> and the application see.
> 
> The traffic flow is pretty horrid (but could be worse).
> There are 8000 small UDP packets every 20ms.
> These are reasonably spread through the 20ms (not back to back).
> All the destination ports are different (8000 receiving sockets).
> (The receiving application handles this fine (now).)
> The packets come from two different systems.
> 
> Firstly RSS doesn't seem to work very well.
> With the current driver I think everything hits 2 rings.
> With the 3.10 RHEL driver it all ends up in one.
> 
> Anyway after a hint from Eric I enabled RPS.
> This offloads the IP and UDP processing enough to stop
> any of the cpus (only 40 of them) from reporting even 50% busy.
> 
> I've also increased the rx ring size to 2047.
> Changing the coalescing parameters seems to have no effect.
> 
> I think there should be 2047 receive buffers.
> So 4 interrupts every 20ms or 200/sec might be enough
> to receive all the frames.
> The actual interrupt rate (deltas on /proc/interrupts)
> is actually over 80000/sec.
> So it doesn't look as though the driver is ever processing
> many packets/interrupt.
> If the driver were getting behind I'd expect a smaller number
> of interrupts.

With RPS enabled, packet processing for most packets (the ones steered
to remote CPUs) is very cheap, as the skbs are just moved out of the
NIC ring into a per-CPU backlog queue and that's it.

In theory packets could be dropped before they are inserted into the
RPS queue, if the latter grows too big, but that looks unlikely. You
can try raising netdev_max_backlog, just in case.
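
One way to check for that specific kind of drop: the per-CPU "dropped"
counter (second column) in /proc/net/softnet_stat is bumped when a
CPU's backlog queue overflows netdev_max_backlog
(net.core.netdev_max_backlog). A rough sketch to dump it, just as an
illustration:

#include <stdio.h>

int main(void)
{
    /* Column 1 is packets processed, column 2 is packets dropped
     * because the per-CPU backlog queue exceeded netdev_max_backlog.
     * Values in the file are hex. */
    FILE *f = fopen("/proc/net/softnet_stat", "r");
    unsigned int processed, dropped;
    char line[512];
    int cpu = 0;

    if (!f)
        return 1;
    while (fgets(line, sizeof(line), f)) {
        if (sscanf(line, "%x %x", &processed, &dropped) == 2)
            printf("cpu%-3d processed=%u dropped=%u\n",
                   cpu, processed, dropped);
        cpu++;
    }
    fclose(f);
    return 0;
}

If that counter is not moving, the drops are happening somewhere else
(most likely in the NIC ring itself).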


dropwatch (or perf record -ga -e skb:kfree_skb) should show you exactly
where the packets are dropped.
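For example, record for a few seconds and then run perf script: the
call chains on each kfree_skb event show which function frees the
packet, which usually identifies the drop point right away.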

Cheers,

Paolo
