netdev - Re: RFC: MTU for serving NFS on Infiniband


Message-ID: <1282715698.2467.681.camel@edumazet-laptop>
Date:	Wed, 25 Aug 2010 07:54:58 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Stephen Hemminger <shemminger@...tta.com>
Cc:	Ben Hutchings <bhutchings@...arflare.com>,
	Marc Aurele La France <tsi@...berta.ca>,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	"David S. Miller" <davem@...emloft.net>,
	Alexey Kuznetsov <kuznet@....inr.ac.ru>,
	"Pekka Savola (ipv6)" <pekkas@...core.fi>,
	James Morris <jmorris@...ei.org>,
	Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
	Patrick McHardy <kaber@...sh.net>
Subject: Re: RFC: MTU for serving NFS on Infiniband

On Tuesday, 24 August 2010 at 15:39 -0700, Stephen Hemminger wrote:

> If the NFS server is smart enough to generate:
>    Header (skb) + one or more pages in fragment list
> then IP fragmentation could do fragmentation by allocating
> new (small) header skbs and assigning the same pages to
> multiple skbs using the page ref count.
> 
> It obviously isn't working that way.
> 

It is, but ip_append_data() allocates a huge head if the MTU is huge.

NFS is trying to build a paged skb, to avoid order-X allocations (X > 0).

> The whole problem is moot because NFS over UDP has known data corruption
> issues in the face of packet loss.  The sequence number of the IP fragment
> can easily wrap around causing old data to be grouped with new data and
> the UDP checksum is so weak that the resulting UDP packet will be consumed by the NFS
> client and passed to the user application as a corrupted disk block.
> 
> DON'T USE NFS OVER UDP!

But Marc's point is about using a big MTU, so that no IP fragmentation is
needed.

All UDP applications using MSG_MORE will hit order-2 allocations if
MTU=9000, for example...
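
To make the order-2 figure concrete, here is a rough sketch of the
arithmetic, not kernel code. It assumes 4 KiB pages and a hypothetical
320-byte allowance for headroom and skb bookkeeping; alloc_order() and
head_order() are illustrative names, and the real overhead depends on
kernel version and architecture.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86


def alloc_order(size, page_size=PAGE_SIZE):
    """Return the smallest order such that (page_size << order) >= size."""
    order = 0
    while (page_size << order) < size:
        order += 1
    return order


def head_order(mtu, overhead=320):
    """Order of a linear head sized for a full MTU.

    overhead is a hypothetical allowance for headroom and skb
    bookkeeping; the exact figure varies between kernels.
    """
    return alloc_order(mtu + overhead)


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print("MTU", mtu, "-> order", head_order(mtu))
```

With these assumptions a 1500-byte MTU fits in a single page (order 0),
while anything above 8192 bytes, such as MTU=9000, needs four contiguous
pages (order 2).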


