[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1445706124.1823412.419177649.420E8888@webmail.messagingengine.com>
Date: Sat, 24 Oct 2015 19:02:04 +0200
From: Hannes Frederic Sowa <hannes@...essinduktion.org>
To: Tom Herbert <tom@...bertland.com>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>,
Vlad Yasevich <vyasevich@...il.com>,
Benjamin Coddington <bcodding@...hat.com>
Subject: Re: [PATCH net] ipv6: no CHECKSUM_PARTIAL on skbs with extension headers
and recalc checksum during fragmentation
Hi Tom,
On Sat, Oct 24, 2015, at 18:46, Tom Herbert wrote:
> On Sat, Oct 24, 2015 at 12:28 PM, Hannes Frederic Sowa
> <hannes@...essinduktion.org> wrote:
> > Hi Tom,
> >
> > On Sat, Oct 24, 2015, at 18:21, Tom Herbert wrote:
> >> On Fri, Oct 23, 2015 at 9:13 AM, Hannes Frederic Sowa
> >> <hannes@...essinduktion.org> wrote:
> >> > CHECKSUM_PARTIAL should only be used on plain vanilla IPv6 + UDP packets
> >> > in ip6_append_data. Some drivers don't correctly handle extension headers,
> >> > especially not ipv6 fragmentation which could result in broken checksums.
> >> >
> >> Yes, we've seen this in some drivers, but the conclusion is that those
> >> drivers are *broken* and need to be fixed! CHECKSUM_PARTIAL works
> >> perfectly well in the presence of extension headers if the
> >> driver/device is correctly implemented (simple algorithm with
> >> csum_start and csum_offset).
> >
> > I will do some more research on those drivers and I agree in general
> > with you. But if you look at the code it was clearly the intention to
> > not use CHECKSUM_PARTIAL on fragmented IPv6 packets, just this intention
> > has not been formulated correctly in code. I still would like to see
> > something like that showing up in stable and we can follow more CSUM
> > bits maybe for drivers who support or don't support that.
> >
> > Also as we are talking about fragments here, I would not put too much
> > effort into that, as fragments will be slow path anyway.
> >
> Checksum will need to be done before fragmentation, but I see little
> value in trying to push that back to the userspace copy. Besides, I
> would expect that UFO should work properly with extension headers.
Sure, checksum needs to be calculated before fragmentation, but touching
the data once during userspace copy seems to be valuable to me to
instead doing it later in the kernel, especially if the data is cold as
it has been on a send queue for quite some while and some context
switches happened (MSG_MORE for example).
For UFO CHECKSUM_PARTIAL is a must and we require proper checksum
offload capabilities. But nearly no networking card does provide UFO
offloading, there are very few of them (but heavily used due to virtual
ones supporting that). If you use UFO during UDP output you end up doing
CHECKSUM_PARTIAL.
> The rule is simple: if a driver states that ids capable of support an
> IPv6 offload, that means that it must properly handle packets with
> extension headers which are a core part of IPv6 protocol. We do not
> want to add any more feature bits for stuff like this.
I would agree, but we don't want to disable CHECKSUM_PARTIAL for most of
the traffic just if some drivers fail to do correct checksum offloading
for fragmented packets.
> It is possible that we will be deploying IPv6 extension headers within
> our data center at some point. I would prefer that we retain the value
> of HW checksum offload in that, but we simply cannot have devices
> silently sending bad checksums...
Yes, sure. Btw. this patch only affects UDP and RAW sockets.
> Please see my patch on the driver helper function to rectify device
> capabilities with packets as they are transmitted.
I have seen that already but will have a second look.
Anyway, currently it is easy to generate broken checksums on the wire
and would like to solve that for net, we certainly can improve that in
net-next.
Bye,
Hannes
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists