netdev - RE: [net-next 6/9] e1000e: fix buffer overrun while the I219 is processing DMA transactions

Date:   Mon, 16 Oct 2017 13:27:49 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     "'Neftin, Sasha'" <sasha.neftin@...el.com>,
        'Jeff Kirsher' <jeffrey.t.kirsher@...el.com>,
        "davem@...emloft.net" <davem@...emloft.net>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "nhorman@...hat.com" <nhorman@...hat.com>,
        "sassmann@...hat.com" <sassmann@...hat.com>,
        "jogreene@...hat.com" <jogreene@...hat.com>
Subject: RE: [net-next 6/9] e1000e: fix buffer overrun while the I219 is
 processing DMA transactions

From: Neftin, Sasha
> Sent: 16 October 2017 11:40
> On 10/11/2017 12:07, David Laight wrote:
> > From: Jeff Kirsher
> >> Sent: 10 October 2017 18:22
> >> Intel 100/200 Series Chipset platforms reduced the round-trip
> >> latency for the LAN Controller DMA accesses, causing in some high
> >> performance cases a buffer overrun while the I219 LAN Connected
> >> Device is processing the DMA transactions. I219LM and I219V devices
> >> can fall into unrecovered Tx hang under very stressfully UDP traffic
> >> and multiple reconnection of Ethernet cable. This Tx hang of the LAN
> >> Controller is only recovered if the system is rebooted. Slightly slow
> >> down DMA access by reducing the number of outstanding requests.
> >> This workaround could have an impact on TCP traffic performance
> >> on the platform. Disabling TSO eliminates performance loss for TCP
> >> traffic without a noticeable impact on CPU performance.
> >>
> >> Please, refer to I218/I219 specification update:
> >> https://www.intel.com/content/www/us/en/embedded/products/networking/ethernet-connection-i218-family-documentation.html
> >>
> >> Signed-off-by: Sasha Neftin <sasha.neftin@...el.com>
> >> Reviewed-by: Dima Ruinskiy <dima.ruinskiy@...el.com>
> >> Reviewed-by: Raanan Avargil <raanan.avargil@...el.com>
> >> Tested-by: Aaron Brown <aaron.f.brown@...el.com>
> >> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@...el.com>
> >> ---
> >>   drivers/net/ethernet/intel/e1000e/netdev.c | 8 +++++---
> >>   1 file changed, 5 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> >> index ee9de3500331..14b096f3d1da 100644
> >> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> >> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> >> @@ -3021,8 +3021,8 @@ static void e1000_configure_tx(struct e1000_adapter *adapter)
> >>
> >>   	hw->mac.ops.config_collision_dist(hw);
> >>
> >> -	/* SPT and CNP Si errata workaround to avoid data corruption */
> >> -	if (hw->mac.type >= e1000_pch_spt) {
> >> +	/* SPT and KBL Si errata workaround to avoid data corruption */
> >> +	if (hw->mac.type == e1000_pch_spt) {
> >>   		u32 reg_val;
> >>
> >>   		reg_val = er32(IOSFPC);
> >> @@ -3030,7 +3030,9 @@ static void e1000_configure_tx(struct e1000_adapter *adapter)
> >>   		ew32(IOSFPC, reg_val);
> >>
> >>   		reg_val = er32(TARC(0));
> >> -		reg_val |= E1000_TARC0_CB_MULTIQ_3_REQ;
> >> +		/* SPT and KBL Si errata workaround to avoid Tx hang */
> >> +		reg_val &= ~BIT(28);
> >> +		reg_val |= BIT(29);

> > Shouldn't some more of the commit message about what this is doing
> > be in the comment?

> A link to the specification update is provided:
> https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/i218-i219-ethernet-connection-spec-update.pdf?asset=9561.
> This is Intel's public release.

And sometime next week the marketing people will decide to reorganise the
web site and the link will become invalid.

> > And shouldn't the 28 and 28 be named constants?

> (28 and 29) - you can easily see from the code that the same value has
> been changed from 3 to 2. I thought there was no point in adding a flag here.

Oh, there is. The 'workaround' is:
  Slightly slow down DMA access by reducing the number of outstanding requests.
  This workaround could have an impact on TCP traffic performance and could
  reduce performance up to 5 to 15% (depending) on the platform.
  Disabling TSO eliminates performance loss for TCP traffic without a 
  noticeable impact on CPU performance.

I wonder what tests they did to show that TSO doesn't save CPU cycles!

So my guess is that you are changing the number of outstanding PCIe reads
(or reads for tx buffers, or ???) from 3 to 2.
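To make that guess concrete, here is a minimal sketch in plain C (not driver
code) of how the two bare bit numbers could be expressed as a named 2-bit
field. The macro and function names are hypothetical and do not come from the
e1000e headers; the assumption that bits 29:28 of TARC(0) hold the number of
outstanding requests is inferred from the patch and the reply above, not from
the datasheet.

```c
#include <stdint.h>

/* Hypothetical names for illustration only. Assumption: bits 29:28 of
 * TARC(0) form a 2-bit "outstanding requests" field.
 */
#define TARC0_REQ_SHIFT	28
#define TARC0_REQ_MASK	(UINT32_C(3) << TARC0_REQ_SHIFT)
#define TARC0_REQ_3	(UINT32_C(3) << TARC0_REQ_SHIFT)
#define TARC0_REQ_2	(UINT32_C(2) << TARC0_REQ_SHIFT)

/* The update as posted in the patch: clear bit 28, then set bit 29. */
uint32_t tarc0_patch_as_posted(uint32_t reg_val)
{
	reg_val &= ~(UINT32_C(1) << 28);
	reg_val |= UINT32_C(1) << 29;
	return reg_val;
}

/* The same update written against the named field: clear the whole
 * field, then write the value 2 into it.
 */
uint32_t tarc0_limit_to_2_requests(uint32_t reg_val)
{
	reg_val &= ~TARC0_REQ_MASK;
	reg_val |= TARC0_REQ_2;
	return reg_val;
}

/* Read the field back as a small integer. */
unsigned int tarc0_outstanding_requests(uint32_t reg_val)
{
	return (reg_val & TARC0_REQ_MASK) >> TARC0_REQ_SHIFT;
}
```

Both functions leave the field at 2 whatever it held before and do not touch
the other bits, so the named form changes nothing except that the code then
says what the field is for.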

Let's read between the lines a little further
(since you are at Intel you can probably check this):
Assuming that TSO is 'TCP Segmentation Offload' and that TSO packets
might be 64k, then reading 3 TSO packets might issue PCIe reads for 192k
bytes of data (under 4k for non-TSO).
If the internal buffer that this data is stored in isn't that big then
that internal buffer would overflow.
It might be that data is removed from this buffer as soon as the last
completion TLP arrives - but they can be interleaved with other
outstanding PCIe reads.
It all rather depends on the negotiated maximum TLP size and number
of tags.

Perhaps reducing the maximum TSO packet to 32k stops the overflow
as well...
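
The arithmetic behind this speculation can be written down as a small sketch.
It assumes, as the reasoning above does, that each outstanding request covers
one whole packet; the real size of the internal buffer is not known here, so
it is left as a parameter, and the function names are made up for
illustration.

```c
#include <stddef.h>

/* Bytes of packet data that could be requested at once, assuming each
 * outstanding request covers one whole packet (an assumption from the
 * discussion, not a documented property of the hardware).
 */
size_t bytes_in_flight(unsigned int outstanding_requests, size_t packet_bytes)
{
	return (size_t)outstanding_requests * packet_bytes;
}

/* Returns 1 if that worst case fits in a buffer of buffer_bytes, else 0.
 * buffer_bytes is a caller-supplied guess; the real value is unknown.
 */
int fits_in_buffer(unsigned int outstanding_requests, size_t packet_bytes,
		   size_t buffer_bytes)
{
	return bytes_in_flight(outstanding_requests, packet_bytes) <= buffer_bytes;
}
```

With a purely hypothetical 128k buffer, 3 requests of 64k (192k in total) do
not fit, while either 2 requests of 64k (128k) or 3 requests of 32k (96k) do,
which is why both changes would be expected to have a similar effect.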

	David