Message-ID: <20100825121058.GA28498@ms2.inr.ac.ru>
Date: Wed, 25 Aug 2010 16:10:58 +0400
From: Alexey Kuznetsov <kuznet@....inr.ac.ru>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Stephen Hemminger <shemminger@...tta.com>,
Ben Hutchings <bhutchings@...arflare.com>,
Marc Aurele La France <tsi@...berta.ca>,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
"David S. Miller" <davem@...emloft.net>,
"Pekka Savola (ipv6)" <pekkas@...core.fi>,
James Morris <jmorris@...ei.org>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
Patrick McHardy <kaber@...sh.net>
Subject: Re: RFC: MTU for serving NFS on Infiniband
Hello!
> It is, but ip_append_data() is allocating a huge head if MTU is huge.
Hmm, strange, as I remember, it was supposed to work right.
If the device supports SG (which is required to accept non-linear skbs anyway),
then ip_append_* should allocate skbs that are not rounded up to the MTU, and we
should allocate a small skb with only the NFS header. Does it not work?
I can guess only one possible trap: people could do _one_ huge ip_append_data()
(instead of the "planned" scenario, where the header is sent with ip_append_data()
and the following payload is appended with ip_append_page()). A huge
ip_append_data() will indeed generate a huge skb. Is this the problem?
BTW, this issue could be revisited and this "will generate huge" behavior
reconsidered. Automatic generation of fragmented skbs was deliberately
suppressed, because it was found that all the devices existing at the moment
this code was written were strongly biased against SG. The current code tries
to _avoid_ generating non-linear skbs unless they are intended for zero-copy,
which compensated for the bias against SG. Modern hardware should work better.
Alexey