Open Source and information security mailing list archives
 
Date:	Thu, 14 Aug 2014 21:54:55 -0700
From:	Guy Harris <guy@...m.mit.edu>
To:	Hannes Frederic Sowa <hannes@...essinduktion.org>
Cc:	Eric Dumazet <eric.dumazet@...il.com>,
	Daniel Borkmann <dborkman@...hat.com>,
	Neil Horman <nhorman@...driver.com>,
	Jesper Dangaard Brouer <brouer@...hat.com>,
	David Miller <davem@...emloft.net>,
	netdev <netdev@...r.kernel.org>
Subject: Re: [RFC] packet: handle too big packets for PACKET_V3


On Aug 14, 2014, at 6:04 PM, Hannes Frederic Sowa <hannes@...essinduktion.org> wrote:

> On Fri, Aug 15, 2014, at 02:54, Eric Dumazet wrote:
>> On Fri, 2014-08-15 at 02:43 +0200, Hannes Frederic Sowa wrote:
>> 
>>> Someone could use GRO to create packet trains to hide from intrusion
>>> detection systems, which maybe are the main user of TPACKET_V3. I don't
>>> think this is a good idea.
>> 
>> Presumably these tools already use a large enough block_size, and not a
>> 4KB one ;)
>> 
>> Even without GRO, a jumbo frame (9K) can trigger the bug.
> 
> Sure, but if I would have written such a tool without knowledge of GRO I
> would have queried at least the MTU. ;)

...and then queried the maximum size of the headers that precede the link-layer payload.

Except that you *can't* do that, and it can be variable-length without an obvious maximum (think 802.11 in monitor mode, where you have radiotap headers).  This causes much pain for libpcap when using TPACKET_V1 and TPACKET_V2, forcing it to allocate huge blocks when smaller ones might be sufficient.

> If an IDS allocates block_sizes below the MTU there is not much we can
> do.

If an IDS uses libpcap, it will get libpcap's behavior, which, for TPACKET_V3, is roughly:

	req.tp_frame_size = MAXIMUM_SNAPLEN;
	req.tp_frame_nr = handle->opt.buffer_size/req.tp_frame_size;

	/* compute the minimum block size that will handle this frame.
	 * The block has to be page size aligned. 
	 * The max block size allowed by the kernel is arch-dependent and 
	 * it's not explicitly checked here. */
	req.tp_block_size = getpagesize();
	while (req.tp_block_size < req.tp_frame_size) 
		req.tp_block_size <<= 1;

	frames_per_block = req.tp_block_size/req.tp_frame_size;

	req.tp_block_nr = req.tp_frame_nr / frames_per_block;

	/* req.tp_frame_nr is requested to match frames_per_block*req.tp_block_nr */
	req.tp_frame_nr = req.tp_block_nr * frames_per_block;

(the last two C statements are actually part of a loop, where it'll reduce req.tp_frame_nr if it gets told "I don't have enough room for that big a ring").
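
A minimal sketch of that retry loop follows. It is not libpcap's code: fit_ring(), ring_try_fn and accept_up_to() are hypothetical names, the callback stands in for the setsockopt() call that the kernel may refuse for lack of memory, and the back-off step of roughly 5% is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel accepting or rejecting a ring request;
 * real code would issue a setsockopt() call here and check for failure. */
typedef bool (*ring_try_fn)(unsigned int block_nr, unsigned int frame_nr,
    void *ctx);

/* Shrink frame_nr until try_ring() accepts the request.
 * Returns the accepted frame count, or 0 if no request was accepted. */
unsigned int
fit_ring(unsigned int frame_nr, unsigned int frames_per_block,
    ring_try_fn try_ring, void *ctx)
{
	assert(frames_per_block > 0);
	while (frame_nr >= frames_per_block) {
		unsigned int block_nr = frame_nr / frames_per_block;

		/* keep frame_nr equal to frames_per_block * block_nr */
		frame_nr = block_nr * frames_per_block;
		if (try_ring(block_nr, frame_nr, ctx))
			return frame_nr;
		/* assumed back-off policy: drop about 5%, at least one frame */
		frame_nr -= (frame_nr < 20) ? 1 : frame_nr / 20;
	}
	return 0;
}

/* Illustrative acceptance rule: allow at most *ctx blocks. */
bool
accept_up_to(unsigned int block_nr, unsigned int frame_nr, void *ctx)
{
	(void)frame_nr;
	return block_nr <= *(unsigned int *)ctx;
}
```

With a limit of 4 blocks and one frame per block, fit_ring(8, 1, accept_up_to, &limit) steps down 8, 7, 6, 5 and settles on 4 frames.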

MAXIMUM_SNAPLEN is 65535 in older versions of libpcap and 262144 in newer versions; it's the maximum frame size.

handle->opt.buffer_size is the buffer size requested by the application; it defaults to 2 MiB.

That calculation, with the default values for the latest version of libpcap, ends up with:

	req.tp_frame_size = 262144;
	req.tp_frame_nr = /* 2097152/262144 */ 8;

	/* compute the minimum block size that will handle this frame.
	 * The block has to be page size aligned. 
	 * The max block size allowed by the kernel is arch-dependent and 
	 * it's not explicitly checked here. */
	req.tp_block_size = 4096;	/* IA-32 and x86-64, and probably many others */
	while (req.tp_block_size < req.tp_frame_size) 
		req.tp_block_size <<= 1;
	/* ends up with req.tp_block_size = 262144 */

	frames_per_block = /* 262144/262144 */ 1;

	req.tp_block_nr = /* 8 / 1 */ 8;

	/* req.tp_frame_nr is requested to match frames_per_block*req.tp_block_nr */
	req.tp_frame_nr = /* 8 * 1 */ 8;

which I think means "eight 256 KiB blocks, with one frame per block".
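
For anyone who wants to check the arithmetic, here is the same calculation as a small self-contained function. It is a sketch only: struct ring_req and compute_ring() are names made up for this example (the real code fills in the kernel's request structure directly), the page size is passed in rather than taken from getpagesize(), and overflow of the shift for absurdly large frame sizes is not handled.

```c
#include <assert.h>

/* Hypothetical container for the four values discussed above. */
struct ring_req {
	unsigned int block_size;
	unsigned int block_nr;
	unsigned int frame_size;
	unsigned int frame_nr;
};

struct ring_req
compute_ring(unsigned int frame_size, unsigned int buffer_size,
    unsigned int page_size)
{
	struct ring_req req;
	unsigned int frames_per_block;

	assert(frame_size > 0 && page_size > 0);
	req.frame_size = frame_size;
	req.frame_nr = buffer_size / frame_size;

	/* smallest page_size * 2^n that holds one frame */
	req.block_size = page_size;
	while (req.block_size < req.frame_size)
		req.block_size <<= 1;

	frames_per_block = req.block_size / req.frame_size;
	req.block_nr = req.frame_nr / frames_per_block;

	/* keep frame_nr equal to frames_per_block * block_nr */
	req.frame_nr = req.block_nr * frames_per_block;
	return req;
}
```

compute_ring(262144, 2097152, 4096) gives 8 blocks of 262144 bytes, as above; with the older MAXIMUM_SNAPLEN of 65535 the same call gives 32 blocks of 65536 bytes, again with one frame per block.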

> But up to the MTU we should let GRO behave transparently and here we
> violate this. There are also interfaces with extremely large MTUs but
> at least they report the MTU size correctly to user space.

Reporting the MTU to user space is insufficient for libpcap; it needs to know the *maximum packet size*. That includes link-layer headers, which I guess it could compute based on the ARPHRD_ value for the interface, although for 802.11 that may be subject to change (e.g. the maximum link-layer header length grew when the QoS stuff was added). It *also* includes metadata headers such as radiotap, for which there really is no generic maximum length other than the 65535-byte limit imposed by the header length field being 16 bits, although a given driver can presumably report its own maximum.

It also doesn't help with segmentation/reassembly offloading, where the MTU is decoupled from the maximum packet size.
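
To make the radiotap point concrete: the length lives in a 16-bit little-endian field at byte offset 2 of the 8-byte fixed header, so 65535 is the only bound the format itself gives you. A minimal sketch of reading it (radiotap_len() is a name invented for this example, not a libpcap or kernel function):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the total radiotap header length claimed by the packet, or 0 if
 * the captured data is too short to contain the 8-byte fixed header
 * (version, pad, 16-bit length, 32-bit present bitmap). */
uint16_t
radiotap_len(const uint8_t *buf, size_t caplen)
{
	assert(buf != NULL);
	if (caplen < 8)
		return 0;
	/* the length field is little-endian, at offset 2 */
	return (uint16_t)(buf[2] | (buf[3] << 8));
}
```

A capture buffer sized from the MTU alone has no way to account for whatever value a driver puts in that field.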

