Message-ID: <06F8DA5DD1D9E277F2AF6F1E@Ximines.local>
Date: Thu, 04 Jul 2013 13:57:16 +0100
From: Alex Bligh <alex@...x.org.uk>
To: Eric Dumazet <eric.dumazet@...il.com>,
Ian Campbell <Ian.Campbell@...rix.com>
cc: Joe Jin <joe.jin@...cle.com>,
Frank Blaschka <frank.blaschka@...ibm.com>,
"David S. Miller" <davem@...emloft.net>,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
zheng.x.li@...cle.com, Xen Devel <xen-devel@...ts.xen.org>,
Jan Beulich <JBeulich@...e.com>,
Stefano Stabellini <stefano.stabellini@...citrix.com>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Alex Bligh <alex@...x.org.uk>
Subject: Re: kernel panic in skb_copy_bits
--On 4 July 2013 03:12:10 -0700 Eric Dumazet <eric.dumazet@...il.com> wrote:
> It looks like a typical COW issue to me.
>
> If the page content is written while there is still a reference on this
> page, we should allocate a new page and copy the previous content.
>
> And this has little to do with networking.
I suspect this would get more attention if we could make Ian's case
below trigger (a) outside Xen, (b) outside networking.
> memset(buf, 0xaa, 4096);
> write(fd, buf, 4096);
> memset(buf, 0x55, 4096);
> (where fd is O_DIRECT on NFS) Can result in 0x55 being seen on the wire
> in the TCP retransmit.
We know this should fail using O_DIRECT+NFS. We've had reports suggesting
it fails in O_DIRECT+iSCSI. However, that's been with a kernel panic
(under Xen) rather than data corruption as per the above.
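For anyone wanting to try this outside Xen, the sequence Ian describes can be sketched roughly as below. This is only an illustration, not a confirmed reproducer: the helper names (alloc_aligned, open_direct, write_then_scribble), the 4096-byte alignment and the fallback to buffered I/O are my assumptions. To have any chance of seeing the problem, the path would need to sit on an NFS or iSCSI-backed mount, and you would have to watch the wire for retransmits.

```c
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLK 4096                /* assumed block size and alignment */

/* Hypothetical helper: O_DIRECT generally wants an aligned buffer. */
static unsigned char *alloc_aligned(size_t len)
{
    void *p = NULL;

    if (posix_memalign(&p, BLK, len) != 0)
        return NULL;
    return p;
}

/*
 * Hypothetical helper: open for O_DIRECT writes. Falls back to a
 * buffered open if the filesystem rejects O_DIRECT; in that case the
 * sketch still runs but no longer exercises the path under discussion.
 */
static int open_direct(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0600);

    if (fd < 0)
        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    return fd;
}

/*
 * The sequence from the quoted example: fill with 0xaa, write, then
 * overwrite the same buffer with 0x55 as soon as write() returns.
 * Returns whatever write() returned.
 */
static ssize_t write_then_scribble(int fd, unsigned char *buf, size_t len)
{
    ssize_t n;

    memset(buf, 0xaa, len);
    n = write(fd, buf, len);
    memset(buf, 0x55, len);     /* the page may still be referenced by an skb */
    return n;
}
```

Calling write_then_scribble(fd, buf, BLK) in a loop against such a mount, with some packet loss injected to force retransmits, would be the obvious way to drive it. On a local disk I would expect it simply to succeed.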
Historical trawling suggests this is also an issue with DRBD (see Ian's
original thread from the mists of time).
I don't quite understand why we aren't seeing corruption with standard
ATA devices + O_DIRECT and no Xen involved at all.
My memory is a bit hazy on this, but I had thought the reason this
would NOT be solved simply by O_DIRECT taking a reference to the page
was that the O_DIRECT I/O completes (and thus the reference is
dropped) before the networking stack has actually finished with the
page. If the O_DIRECT I/O did not complete until the networking stack
had finished with the page, we wouldn't see the problem in the first
place. I may be completely off base here.
--
Alex Bligh