Message-ID: <20150518213329.GA2335@breakpoint.cc>
Date: Mon, 18 May 2015 23:33:29 +0200
From: Florian Westphal <fw@...len.de>
To: David Miller <davem@...emloft.net>
Cc: fw@...len.de, netdev@...r.kernel.org, hannes@...essinduktion.org,
edumazet@...gle.com, herbert@...dor.apana.org.au
Subject: Re: [PATCH -next] net: preserve geometry of fragment sizes when
forwarding
David Miller <davem@...emloft.net> wrote:
> From: Florian Westphal <fw@...len.de>
> Date: Mon, 18 May 2015 22:40:49 +0200
>
> > But, to the best of my understanding, what you ask will push a lot of
> > non-trivial code into the kernel for no functional gain over
> > what has been proposed.
>
> The functional gain is that we stop linearizing the packet, which
> involves memory allocation and copying the entire packet.
AFAICS ipv4 and ipv6 defragmentations do not perform linearizations or
reallocations?
> I am very confident that the performance gains would be non-trivial
> and quite measurable.
Are fragmented packets that common?
I don't have any real data on this; the box sending this email has
55965898 incoming packets delivered
62 reassemblies required
... but it is just an end host.
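For reference, these two figures are what netstat -s prints for the InDelivers and ReasmReqds fields of the "Ip:" lines in /proc/net/snmp. Below is a minimal sketch of turning them into a ratio, assuming that header-line/value-line layout. The sample text is cut down to four columns for illustration (the real file has more), and the helper names are made up here.

```python
def ip_counters(snmp_text):
    """Parse the 'Ip:' header line and value line into a dict of ints."""
    ip_lines = [line.split() for line in snmp_text.splitlines()
                if line.startswith("Ip:")]
    names, values = ip_lines[0][1:], ip_lines[1][1:]
    return dict(zip(names, map(int, values)))


def reassembly_share(counters):
    """Fragments needing reassembly, relative to delivered packets."""
    return counters["ReasmReqds"] / counters["InDelivers"]


# Abbreviated sample carrying the numbers quoted in this mail;
# on a live box one would read /proc/net/snmp instead.
sample = (
    "Ip: Forwarding DefaultTTL InDelivers ReasmReqds\n"
    "Ip: 2 64 55965898 62\n"
)

share = reassembly_share(ip_counters(sample))
print(f"{share * 1e6:.2f} reassemblies required per million delivered packets")
```

On a forwarding box the interesting comparison would probably be against InReceives rather than InDelivers, since forwarded packets are not delivered locally.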
TCP shouldn't be a problem thanks to pmtud, and for high-volume
fragmented ipv4 flows I'd expect poor performance due to the 16-bit ID
space limitation long before processing becomes the bottleneck.
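To put a rough number on the ID space argument: the IPv4 Identification field has 65536 values per source/destination/protocol tuple, so an ID can be reused while older fragments with the same ID are still sitting in the reassembly queue. This is a back-of-the-envelope sketch, not a model of any particular stack. The 30 second timeout is an assumption (it matches the usual Linux ipfrag_time default), and the function names are made up for illustration.

```python
ID_SPACE = 1 << 16  # the IPv4 Identification field is 16 bits wide


def id_wrap_seconds(datagrams_per_second):
    """Time until a sender of fragmented datagrams cycles through all IDs."""
    return ID_SPACE / datagrams_per_second


def max_rate_without_reuse(reassembly_timeout_s):
    """Highest datagram rate at which no ID repeats within the timeout."""
    return ID_SPACE / reassembly_timeout_s


# Assumed reassembly timeout of 30 seconds.
print(f"{max_rate_without_reuse(30):.0f} fragmented datagrams/s per flow")
print(f"{id_wrap_seconds(100_000):.2f} s to wrap at 100k datagrams/s")
```

Above that rate, a lost fragment can be "completed" by a fragment of a later datagram that carries the same ID, which corrupts data well before CPU cost matters.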
> You'd also be able to trivially respect the geometry of the original
> incoming packet stream.
True. OTOH, the patch proposed in this thread would have done the same
with a lot less code (I admit that removing Eric's optimization
once nf_defrag is loaded is not desirable, but I did not find a solution
to this problem aside from doing a route lookup or a tentative 'forward is
off' check, which I did not like).
Another alternative might be to delay Eric's 'coalescing' step and move
it into the ip stack, after the 'local delivery' decision has been taken.
I can investigate this if you think it's worth it.
> Every objection has been of the form "this special case" (this time
> SIP) is not easy.
Yes, but these objections are not some random hand-waving gesture.
They present us with certain dilemmas, e.g. a single udp packet:
1280 1280 1280 542
The sip nat helper has to do nat/pat and replace 10.2.3.4 with 192.168.2.3
(let's assume we'd have helpers that deal with addresses split over 2
fragmented skbs so we can deal with 10.2 appearing in fragment #2
and .3.4 in fragment #3).
We can then end up with something like
1283 1281 1282 542
... and what should we do then?
Shuffle payload via memcpy/memmove() to only grow the last frag?
This will not be hot path or common by any means.
But nevertheless, this can happen, and we need to deal with it.
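To make the "only grow the last frag" option concrete, here is a userspace sketch of the bookkeeping, not kernel code. It treats fragments as opaque byte strings and ignores headers, skb layout and the 8-byte fragment offset alignment. The function name and the filler payloads are hypothetical.

```python
def regrow_last_only(fragments, orig_sizes):
    """Re-split mangled fragment payloads so that every fragment except the
    last gets its original size back and the last one absorbs the change."""
    data = b"".join(fragments)  # stands in for the memcpy/memmove shuffling
    out, pos = [], 0
    for size in orig_sizes[:-1]:
        out.append(data[pos:pos + size])
        pos += size
    out.append(data[pos:])      # whatever is left goes into the last fragment
    return out


orig_sizes = [1280, 1280, 1280, 542]
# Sizes after the helper rewrote addresses inside the first three fragments.
mangled = [b"a" * 1283, b"b" * 1281, b"c" * 1282, b"d" * 542]

fixed = regrow_last_only(mangled, orig_sizes)
print([len(f) for f in fixed])  # [1280, 1280, 1280, 548]
```

The excess cascades: 3 bytes cross the first boundary, 4 the second and 6 the third, so most of the payload gets touched. If the last fragment would grow beyond the largest original fragment, an additional fragment would be needed, and the original geometry cannot be kept exactly.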
> If I were doing this, I would implement something that handles the
> normal cases properly. And then take it from there.
What is a 'normal case'?
And how do you propose we deal with the 'non-normal' cases?
I assume you mean to e.g. linearize for edge cases + then refragment?
If that's true, then we'd still need one of the proposed solutions to handle
this, i.e. to get packets we can send out without breaking geometry/growing
fragments to a larger mtu.
> If you try to imagine the totality of it and all the edge cases
> and details from the beginning, yes it will look impossible.
Hmm... correct, but I still believe we're talking about immense pain
for very little gain.
Thanks for spending time on this.