Message-ID: <CAKgT0UcetjfR3bWx5cg=C9MG_WKpC+jbnpTqj2grrnZmOdXGLA@mail.gmail.com>
Date: Mon, 20 Nov 2017 16:04:32 -0800
From: Alexander Duyck <alexander.duyck@...il.com>
To: Sarah Newman <sarah.newman@...puter.org>
Cc: e1000-devel@...ts.sf.net, Netdev <netdev@...r.kernel.org>
Subject: Re: [E1000-devel] Questions about crashes and GRO
On Mon, Nov 20, 2017 at 3:35 PM, Sarah Newman <sarah.newman@...puter.org> wrote:
> On 11/20/2017 02:56 PM, Alexander Duyck wrote:
>> On Mon, Nov 20, 2017 at 2:38 PM, Sarah Newman <sarah.newman@...puter.org> wrote:
>>> On 11/20/2017 08:36 AM, Alexander Duyck wrote:
>>>> Hi Sarah,
>>>>
>>>> I am adding the netdev mailing list as I am not certain this is an
>>>> i350 specific issue. The traces themselves aren't anything I recognize
>>>> as an existing issue. From what I can tell it looks like you are
>>>> running Xen, so would I be correct in assuming you are bridging
>>>> between VMs? If so are you using any sort of tunnels on your network,
>>>> if so what type? This information would be useful as we may be looking
>>>> at a bug in a tunnel offload for GRO.
>>>
>>> Yes, there's bridging. The traffic on the physical device is tagged with VLANs and the bridges use untagged traffic. There are no tunnels. I do not
>>> own the VMs' traffic.
>>>
>>> Because I have only seen this on a single server with unique hardware, I think it's most likely related to the hardware or to a particular VM on that
>>> server.
>>
>> So I would suspect traffic coming from the VM if anything. The i350 is
>> a pretty common device. If we were seeing issues specific to it I would
>> expect we would have more reports than just the one so far.
>
> My confusion was primarily related to the release notes for an older version of a different Intel driver.
>
> But regarding traffic coming from a VM, the backtraces both include igb_poll. Doesn't that mean the problem is related to inbound traffic on the igb
> device and not traffic direct from a local VM?
>
> --Sarah

All the igb driver is doing is taking the data off of the network,
populating sk_buff structures, and then handing them off to the stack.
The format of the sk_buffs has been pretty consistent for the last
several years, so I am not really suspecting a driver issue.

The issue with network traffic is that it is usually symmetric, meaning
that if the VM sends something it will get some sort of reply. The actual
traffic itself, and how the kernel handles it, have changed quite a bit
over the years; a VM could be setting up a tunnel, a stack of VLANs, or
some other type of traffic that the kernel recognized and tried to do
GRO for but didn't fully support. If turning off GRO solves the problem,
then the issue is likely in the GRO code, not in the igb driver.
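
For reference, GRO can be toggled at runtime with ethtool, which makes this an easy thing to test. A minimal sketch (the interface name eth0 is a placeholder; substitute the actual igb port, e.g. from `ip link`):

```shell
# Placeholder interface name; substitute the real igb port.
IFACE=eth0

# Show the current GRO setting for the device
ethtool -k "$IFACE" | grep generic-receive-offload

# Disable GRO so the stack stops coalescing received packets
ethtool -K "$IFACE" gro off

# Re-enable it once testing is done
ethtool -K "$IFACE" gro on
```

The setting takes effect immediately but does not persist across reboot, so it is safe to flip on a production box while watching for the crash to recur.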
- Alex