Message-ID: <f3d1d5bf11144b31b1b3959e95b04490@AcuMS.aculab.com>
Date:   Thu, 19 May 2022 13:14:53 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Pavan Chebbi' <pavan.chebbi@...adcom.com>
CC:     Michael Chan <michael.chan@...adcom.com>,
        Paolo Abeni <pabeni@...hat.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "mchan@...adcom.com" <mchan@...adcom.com>,
        David Miller <davem@...emloft.net>
Subject: RE: tg3 dropping packets at high packet rates

From: Pavan Chebbi
> Sent: 19 May 2022 11:21
...
> >
> > > Please show a snapshot of all the counters.  In particular,
> > > rxbds_empty, rx_discards, etc will show whether the driver is keeping
> > > up with incoming RX packets or not.
> >
> > After running the test for a short time.
> > The application stats indicate that around 40000 packets are missing.
> >
...

Some numbers taken at the same time:

Application trace - each 'gap' is one or more lost packets.
T+000004:  all gaps so far 1104
T+000005:  all gaps so far 21664
T+000006:  all gaps so far 54644
T+000007:  all gaps so far 84641
T+000008:  all gaps so far 110232
T+000009:  all gaps so far 131191
T+000010:  all gaps so far 150286
T+000011:  all gaps so far 171588
T+000012:  all gaps so far 190777
T+000013:  all gaps so far 210771

rx_packets counted by tg3_rx() and read every second.
63 344426
64 341734
65 338740
66 337995
67 339770
68 336314
69 340087
70 345084

Cumulative error counts since the driver was last loaded.
     rxbds_empty: 30983
     rx_discards: 3123
     mbuf_lwm_thresh_hit: 3123

The number of interrupts is high - about 40000/sec.
(I've not computed deltas for these, just removed all the
zero columns and prefixed the cpu number before each non-zero value.)
86: IR-PCI-MSI 1050625-edge em2-rx-1 8:13 14:234754517
86: IR-PCI-MSI 1050625-edge em2-rx-1 8:13 14:234767945
86: IR-PCI-MSI 1050625-edge em2-rx-1 8:13 14:234802555
86: IR-PCI-MSI 1050625-edge em2-rx-1 8:13 14:234843542
86: IR-PCI-MSI 1050625-edge em2-rx-1 8:13 14:234887963
86: IR-PCI-MSI 1050625-edge em2-rx-1 8:13 14:234928204
86: IR-PCI-MSI 1050625-edge em2-rx-1 8:13 14:234966428
86: IR-PCI-MSI 1050625-edge em2-rx-1 8:13 14:235009505
86: IR-PCI-MSI 1050625-edge em2-rx-1 8:13 14:235052740
86: IR-PCI-MSI 1050625-edge em2-rx-1 8:13 14:235093254
86: IR-PCI-MSI 1050625-edge em2-rx-1 8:13 14:235133299
86: IR-PCI-MSI 1050625-edge em2-rx-1 8:13 14:235173151
86: IR-PCI-MSI 1050625-edge em2-rx-1 8:13 14:235212387
86: IR-PCI-MSI 1050625-edge em2-rx-1 8:13 14:235252403
86: IR-PCI-MSI 1050625-edge em2-rx-1 8:13 14:235317928
86: IR-PCI-MSI 1050625-edge em2-rx-1 8:13 14:235371301

RSS is enabled, but I've used ethtool -X equal 1 to
put everything through ring 0.
Cpu 14 is still 25% idle - that is the busiest cpu.

I've discovered that the 'lost packet' rate does depend on
the number of rx buffers configured with 'ethtool -G em2 rx nnnn'.
The traces above are with 1000 rx buffers.

I'm also slightly confused about the receive buffers.
As I read the code the following happens:

Ignoring jumbo buffers - which I don't have configured.
AFAICT all the rings have 2048 entries.
With RSS there are 4 pairs of rings: one of each pair contains
(free) buffers, the other receive-data status.
The receive code processes an entry from the status ring
and puts a buffer back onto the corresponding buffer ring.
Since the hardware only takes buffers from one ring, the
driver moves all the free buffers from rings 1-3 onto ring 0.

When the rings are allocated I think that buffers (default 200)
are added to all 4 rings.
As soon as the 'napi' code for ring 0 runs it collects the
other 600 buffers and puts them on its own (free) buffer ring.
This seems to make all 800 buffers available for any of the RSS
channels.

Now if I configure 'ethtool -G em2 rx 2000' a total of 8000
receive buffers are allocated.
Only 2047 will fit into ring[0] so the other 'buffer' rings
still contain buffers.
Now if I receive traffic that goes to ring[3] the free buffer
ring[3] will wrap - discarding 2048 buffers.

I'm assuming I've missed something?

This bit of code in tg3_rx() also looks buggy:

                if (unlikely(rx_std_posted >= tp->rx_std_max_post)) {
                        tpr->rx_std_prod_idx = std_prod_idx &
                                               tp->rx_std_ring_mask;
                        tw32_rx_mbox(TG3_RX_STD_PROD_IDX_REG,
                                     tpr->rx_std_prod_idx);
                        work_mask &= ~RXD_OPAQUE_RING_STD;
                        rx_std_posted = 0;
                }

Clearing work_mask stops napi[0] being run to move
the freed buffers across.
(I don't think I have the hardware that goes through that bit.)

	David

