Message-ID: <537220B7.5080202@citrix.com>
Date: Tue, 13 May 2014 14:40:07 +0100
From: Zoltan Kiss <zoltan.kiss@...rix.com>
To: Sander Eikelenboom <linux@...elenboom.it>
CC: Ian Campbell <Ian.Campbell@...rix.com>,
"David S. Miller" <davem@...emloft.net>, <netdev@...r.kernel.org>,
<xen-devel@...ts.xen.org>
Subject: Re: [3.15-rc3] Bisected: xen-netback mangles packets between two
guests on a bridge since merge of "TX grant mapping with SKBTX_DEV_ZEROCOPY
instead of copy" series.
Hi,
It seems I've fixed this: the receive side couldn't handle the case where
the frags had been changed. I'll post a patch shortly.
Zoli
On 09/05/14 22:02, Zoltan Kiss wrote:
> Hi,
>
> Sorry for the long silence on this issue, I was busy trying to figure
> out what went wrong. Fun facts:
>
> - commenting out the __pskb_pull_tail call in tx_submit, which
> unconditionally pulls up the linear area to 128 bytes, seems to solve the
> problem
> - I could reproduce the problem only when the sending guest had a 64-bit
> kernel, but then even with 3.2. With a 32-bit sending guest, on the other
> hand, it works fine. More precisely, I think it boils down to the actual
> config. I used the XenServer Dom0 config files, see them here:
> https://github.com/xenserver/linux-3.x.pg/blob/master/master/kernel-configuration
>
> - with a 64-bit Debian 7 kernel as the sender it also works, so I guess
> it's not about 32/64 bit, but something in the config
> - the receiving guest, where wget ran, doesn't matter.
> - the "more than MAX_SKB_FRAGS slots" thing was a red herring. A typical
> skb layout (in the sender's xenvif_start_xmit) which gets corrupted:
>     linear area: 66 bytes
>     frag 0: 52 bytes
>     frag 1: 1200 bytes
> - so I guess the problem is when that pull_tail pulls the whole first
> frag into the linear area
> - a corrupt packet on the receiver side looks like the following:
>    - linear buffer: 128 bytes, content is OK
>    - the content of the frag area is shifted back 4096 bytes in the
>      TCP stream. So instead of the Nth byte it starts with the
>      (N-4096)th byte
>    - the length is the same as on the sender side, I've checked by
>      looking at the IP id fields
>    - otherwise the stream content looks ok (I used a continuously
>      incrementing pattern)
>    - the next packet starts at the right place
> - the pulling itself doesn't cause the corruption, I've printed out the
> first frag after that, and it still looks OK
> - ftrace_printk("%*ph") seems to have problems when the pointer points
> to a grant-mapped page. I have the impression that it tries to
> dereference it when I read the trace buffer, at which point the mapping
> and the content are long gone.
>
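For illustration, the arithmetic behind the "whole first frag gets pulled" guess can be sketched as follows. This is a simplified Python model, not the kernel code: the function name `pull_tail` and its arguments are hypothetical, and it only tracks byte counts, assuming that a pull to 128 bytes consumes frags front to back and drops any frag that becomes empty.

```python
def pull_tail(linear, frags, target):
    """Toy model of pulling frag bytes into the linear area.

    linear: bytes currently in the linear area
    frags:  list of frag sizes in bytes
    target: desired linear size (128 in the case described above)
    Returns the new linear size and the remaining frag sizes.
    """
    need = max(0, target - linear)   # bytes still missing from the linear area
    remaining = []
    for size in frags:
        take = min(size, need)       # consume from the front of this frag
        need -= take
        linear += take
        if size - take > 0:          # a fully consumed frag disappears
            remaining.append(size - take)
    return linear, remaining


# The layout from the list above: 66 bytes linear, frags of 52 and 1200 bytes.
linear, frags = pull_tail(66, [52, 1200], 128)
print(linear, frags)  # 128 [1190]
```

Under these assumptions the pull needs 128 - 66 = 62 bytes, which is more than the 52 bytes in frag 0, so frag 0 vanishes entirely and 10 bytes come out of frag 1. The packet goes from two frags to one, which would match the frag list changing underneath whoever looks at it afterwards.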
> I'll continue to look into this next week
>
> Zoli
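As a rough sketch of how an incrementing pattern makes a shift like the 4096-byte one measurable: the exact pattern used above is not stated, so the code below assumes consecutive 32-bit big-endian counters, and the helper names `make_stream` and `shift_of` are hypothetical.

```python
import struct


def make_stream(n_words):
    # Assumed test pattern: consecutive 32-bit big-endian counters,
    # so the word at byte offset 4*i has the value i.
    return b"".join(struct.pack(">I", i) for i in range(n_words))


def shift_of(received, expected_offset):
    """Return how many bytes the received data lags behind its expected
    position in the stream (0 means it starts where it should).
    Assumes the data starts on a 4-byte boundary of the pattern."""
    first_word = struct.unpack(">I", received[:4])[0]
    actual_offset = first_word * 4
    return expected_offset - actual_offset


stream = make_stream(4096)            # 16384 bytes of pattern
expected = 8192                       # where the frag data should start
# Simulate the corruption: same length, but taken 4096 bytes earlier.
frag = stream[expected - 4096:expected - 4096 + 1190]
print(shift_of(frag, expected))       # 4096
```

Because every word encodes its own position, the first word of the frag data is enough to tell how far back in the stream the content came from.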
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html