netdev - Re: [net-next PATCH V2 1/9] net: frag evictor, avoid killing warm frag queues

Message-Id: <20121129.124427.1093031685966728935.davem@davemloft.net>
Date:	Thu, 29 Nov 2012 12:44:27 -0500 (EST)
From:	David Miller <davem@...emloft.net>
To:	brouer@...hat.com
Cc:	eric.dumazet@...il.com, fw@...len.de, netdev@...r.kernel.org,
	pablo@...filter.org, tgraf@...g.ch, amwang@...hat.com,
	kaber@...sh.net, paulmck@...ux.vnet.ibm.com,
	herbert@...dor.hengli.com.au
Subject: Re: [net-next PATCH V2 1/9] net: frag evictor, avoid killing warm
 frag queues

From: Jesper Dangaard Brouer <brouer@...hat.com>
Date: Thu, 29 Nov 2012 17:11:09 +0100

> The fragmentation evictor has a very unfortunate strategy for
> killing fragments when the system is put under pressure.
> If packets are coming in too fast, the evictor code kills "warm"
> fragments too quickly.  This results in a massive performance drop,
> because we drop frag lists where we have already queued up a lot of
> fragments/work, which get killed before they have a chance to
> complete.

I think this is a trade-off where the decision is somewhat
arbitrary.

If you kill warm entries, the sending of all of the fragments is
wasted.  If you retain warm entries and drop incoming new fragments,
well then the sending of all of those newer fragments is wasted too.

The only way I could see this making sense is if some "probability
of fulfillment" was taken into account.  For example, if you have
more than half of the fragments already, then yes it may be
advisable to retain the warm entry.
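
A minimal sketch of such a "more than half" test, in Python for
brevity rather than kernel C.  The names `should_retain`, `meat`,
`total_len` and `min_fraction` are hypothetical and only illustrate
the idea.  The sketch also assumes that the total datagram length is
unknown until the final fragment has arrived, and it treats such
queues as evictable, which is a policy choice of this sketch and not
something stated above.

```python
def should_retain(meat, total_len, min_fraction=0.5):
    """Decide whether a warm fragment queue is worth keeping.

    meat:         payload bytes received so far for this datagram
    total_len:    full payload length of the datagram, or None when the
                  last fragment has not been seen yet (length unknown)
    min_fraction: retain only if strictly more than this share has arrived
    """
    if total_len is None or total_len <= 0:
        # Assumption of this sketch: without a known total we cannot
        # estimate completion, so the queue stays a candidate for eviction.
        return False
    return meat / total_len > min_fraction


# A queue holding 40000 of 65515 payload bytes would be retained,
# while one holding a single 1480-byte fragment would not.
print(should_retain(40000, 65515))
print(should_retain(1480, 65515))
```

With this rule an evictor would prefer to drop queues that are far
from completion and keep the ones that are most likely to finish.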

Otherwise, as I said, the decision seems arbitrary.

Let's take a step back and think about why this is happening at all.

I wonder how reasonable the high and low thresholds really are.  Even
once you move them to per-cpu, I think the limits are far too small.

I'm under the impression that skb->truesize for a 1500 MTU frame is
commonly rounded up to the next power of 2, so about 2048 bytes.
Then add in the sk_buff control overhead, as well as the inet_frag
head.

So a 64K fragmented frame probably consumes close to 100K.

So once we have three 64K frames in flight, we're already over the
high threshold and will start dropping things.
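
The estimate above can be reproduced with a rough back-of-the-envelope
calculation, written here in Python.  All constants are assumptions
made for illustration: a 20 byte IPv4 header without options, 256
bytes of sk_buff overhead, 200 bytes for the per-datagram frag queue
head, and a high threshold of 256 KiB.  Real values depend on the
kernel version, architecture and sysctl settings.

```python
import math

IP_HEADER = 20            # bytes, IPv4 header without options
SKB_OVERHEAD = 256        # assumed sk_buff control overhead per fragment
FRAG_QUEUE_HEAD = 200     # assumed size of the per-datagram frag queue head
HIGH_THRESH = 256 * 1024  # assumed high threshold in bytes


def fragments_needed(datagram_len, mtu=1500):
    """Number of fragments needed to carry one IPv4 datagram."""
    payload = datagram_len - IP_HEADER
    # Every fragment except the last carries a multiple of 8 payload bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(payload / per_fragment)


def next_pow2(n):
    """Smallest power of two that is >= n."""
    return 1 << (n - 1).bit_length()


def queue_cost(datagram_len, mtu=1500):
    """Approximate memory charged for one fully queued datagram."""
    per_skb = next_pow2(mtu) + SKB_OVERHEAD
    return fragments_needed(datagram_len, mtu) * per_skb + FRAG_QUEUE_HEAD


def datagrams_to_exceed(threshold, datagram_len=65535):
    """Smallest number of in-flight datagrams whose cost exceeds threshold."""
    return threshold // queue_cost(datagram_len) + 1


print(fragments_needed(65535))           # fragments per 64K datagram
print(queue_cost(65535))                 # bytes accounted per 64K datagram
print(datagrams_to_exceed(HIGH_THRESH))  # datagrams needed to exceed the limit
```

Under these assumptions a 64K datagram splits into 45 fragments and
costs roughly 100K of accounted memory, so the third datagram in
flight is the one that crosses the assumed 256 KiB limit.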

That's beyond stupid.