Message-ID: <CAKgT0UcetjfR3bWx5cg=C9MG_WKpC+jbnpTqj2grrnZmOdXGLA@mail.gmail.com>
Date:   Mon, 20 Nov 2017 16:04:32 -0800
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     Sarah Newman <sarah.newman@...puter.org>
Cc:     e1000-devel@...ts.sf.net, Netdev <netdev@...r.kernel.org>
Subject: Re: [E1000-devel] Questions about crashes and GRO

On Mon, Nov 20, 2017 at 3:35 PM, Sarah Newman <sarah.newman@...puter.org> wrote:
> On 11/20/2017 02:56 PM, Alexander Duyck wrote:
>> On Mon, Nov 20, 2017 at 2:38 PM, Sarah Newman <sarah.newman@...puter.org> wrote:
>>> On 11/20/2017 08:36 AM, Alexander Duyck wrote:
>>>> Hi Sarah,
>>>>
>>>> I am adding the netdev mailing list as I am not certain this is an
>>>> i350 specific issue. The traces themselves aren't anything I recognize
>>>> as an existing issue. From what I can tell it looks like you are
>>>> running Xen, so would I be correct in assuming you are bridging
>>>> between VMs? If so, are you using any sort of tunnels on your network,
>>>> and if so, what type? This information would be useful as we may be looking
>>>> at a bug in a tunnel offload for GRO.
>>>
>>> Yes, there's bridging. The traffic on the physical device is tagged with VLANs and the bridges use untagged traffic. There are no tunnels. I do not
>>> control the VMs' traffic.
>>>
>>> Because I have only seen this on a single server with unique hardware, I think it's most likely related to the hardware or to a particular VM on that
>>> server.
>>
>> So I would suspect traffic coming from the VM if anything. The i350 is
>> a pretty common device. If we were seeing issues specific to it I would
>> expect we would have more reports than just the one so far.
>
> My confusion was primarily related to the release notes for an older version of a different intel driver.
>
> But regarding traffic coming from a VM, the backtraces both include igb_poll. Doesn't that mean the problem is related to inbound traffic on the igb
> device and not traffic coming directly from a local VM?
>
> --Sarah

All the igb driver is doing is taking the data off the network,
populating sk_buff structures, and then handing them off to the stack.
The format of the sk_buff has been pretty consistent for the last
several years, so I am not really suspecting a driver issue.
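
For what it's worth, here is a rough sketch of the pattern a NAPI
driver's poll routine follows (names like example_poll and
example_fetch_rx_skb are made up for illustration; this is not the
actual igb code). The point is that the driver just builds an skb per
received frame and hands it to napi_gro_receive(), which is where GRO
gets a chance to coalesce:

#include <linux/netdevice.h>
#include <linux/etherdevice.h>
#include <linux/skbuff.h>

/* Hypothetical per-queue state -- illustrative, not the real igb structs. */
struct example_ring {
	struct napi_struct napi;
	struct net_device *netdev;
};

/* Hypothetical helper that pulls one completed descriptor off the ring
 * and wraps the data in an skb; the real driver does this work in its
 * rx cleanup routine. */
struct sk_buff *example_fetch_rx_skb(struct example_ring *ring);

static int example_poll(struct napi_struct *napi, int budget)
{
	struct example_ring *ring = container_of(napi, struct example_ring, napi);
	int done = 0;

	while (done < budget) {
		struct sk_buff *skb = example_fetch_rx_skb(ring);

		if (!skb)
			break;

		/* Fill in skb->protocol from the Ethernet header. */
		skb->protocol = eth_type_trans(skb, ring->netdev);

		/* Hand the skb up to the stack; with GRO enabled this is
		 * the point where packets may be coalesced before delivery. */
		napi_gro_receive(napi, skb);
		done++;
	}

	if (done < budget)
		napi_complete_done(napi, done);

	return done;
}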

The issue with network traffic is that it is usually symmetric, meaning
that if the VM sends something it will get some sort of reply.  The
actual traffic itself and how the kernel handles it have changed quite
a bit over the years, and a VM could be setting up a tunnel, a stack of
VLANs, or some other type of traffic that the kernel might have
recognized and tried to do GRO for but didn't fully support. If
turning off GRO solves the problem then the issue is likely in the GRO
code, not in the igb driver.
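
In case it helps with testing, GRO can be toggled at runtime with
ethtool (the device name here is just a placeholder for the actual igb
interface):

    ethtool -K eth0 gro off

and re-enabled with "ethtool -K eth0 gro on" once you have a result.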

- Alex
