netdev - Re: [RFC PATCH] ip: re-introduce fragments cache worker

Message-ID: <f1c2a8246410d6336454aaa8b26d5a670bc4d993.camel@redhat.com>
Date:   Fri, 06 Jul 2018 15:56:46 +0200
From:   Paolo Abeni <pabeni@...hat.com>
To:     Eric Dumazet <eric.dumazet@...il.com>, netdev@...r.kernel.org
Cc:     "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Florian Westphal <fw@...len.de>, NeilBrown <neilb@...e.com>
Subject: Re: [RFC PATCH] ip: re-introduce fragments cache worker

On Fri, 2018-07-06 at 05:09 -0700, Eric Dumazet wrote:
> On 07/06/2018 04:56 AM, Paolo Abeni wrote:
> > With your setting, you need a bit more concurrent connections (400 ?)
> > to saturate the ipfrag cache. Above that number, performances will
> > still sink.
> 
> Maybe, but IP defrag can not be 'perfect'.
> 
> For this particular use case I could still bump high_thresh to 6 GB and all would be good :)

Understood.

I'd like to be sure I stated the problem I see clearly. With the
current code the "goodput" drops to almost 0 as soon as the ipfrag cache
load goes above its capacity. Before the worker removal, after
reaching high_thresh, the "goodput" degraded slowly, and even with a
load more than an order of magnitude higher the performance was
still quite good. I think we can't ask customers to add more memory for
a kernel upgrade; even changing the default sysctl configuration is
somewhat troubling.
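
For reference, a minimal sketch of how the settings under discussion
can be inspected on a receiver. The function name show_ipfrag_settings
and its optional directory argument are illustrative only; the three
file names are the standard IPv4 fragment sysctls.

```shell
# Print the IPv4 fragment-cache sysctls. The optional first argument
# (hypothetical, for illustration) overrides the directory to read from.
show_ipfrag_settings() {
    dir=${1:-/proc/sys/net/ipv4}
    for name in ipfrag_high_thresh ipfrag_low_thresh ipfrag_time; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            # Non-Linux host, restricted namespace, or removed knob.
            printf '%s = (not readable)\n' "$name"
        fi
    done
}

show_ipfrag_settings
```

The thresholds are in bytes and ipfrag_time is in seconds, so the
values can be compared directly with the "memory" field of the FRAG
line in /proc/net/sockstat.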

> > This looks nice, I'll try to test it in my use case and I'll report
> > here.

I tried the patch, but the results are not encouraging:

./super_netperf.sh 200 -H 192.168.101.2 -t UDP_STREAM -l 60
34.94

# on the receiver side:
echo 2 >  /proc/sys/net/ipv4/ipfrag_time

# on the sender side:
./super_netperf.sh 200 -H 192.168.101.2 -t UDP_STREAM -l 60
85.8

# still on receiver side, while the test is running:
nstat > /dev/null; sleep 1; nstat | grep IpReasm
IpReasmTimeout                  2128               0.0
IpReasmReqds                    754770             0.0
IpReasmOKs                      135                0.0
IpReasmFails                    752811             0.0

grep FRAG /proc/net/sockstat
FRAG: inuse 124 memory 5286144
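
To put the counters above in proportion, here is a small sketch that
derives two ratios from the one-second nstat sample. It assumes the
usual IP MIB meaning of the counters (IpReasmReqds counts received
fragments needing reassembly, IpReasmOKs counts datagrams successfully
reassembled); the function name reasm_summary is made up for this
example, and the sample text is copied from the output above.

```shell
# One-second nstat sample, copied verbatim from the receiver output.
sample='IpReasmTimeout                  2128               0.0
IpReasmReqds                    754770             0.0
IpReasmOKs                      135                0.0
IpReasmFails                    752811             0.0'

# Read nstat-style lines on stdin and print:
#   fragments_per_ok: fragments received per successfully reassembled datagram
#   fail_pct:         IpReasmFails as a percentage of IpReasmReqds
reasm_summary() {
    awk '
        $1 == "IpReasmReqds" { reqds = $2 }
        $1 == "IpReasmOKs"   { oks   = $2 }
        $1 == "IpReasmFails" { fails = $2 }
        END {
            if (reqds > 0 && oks > 0)
                printf "fragments_per_ok=%.0f fail_pct=%.2f\n",
                       reqds / oks, 100 * fails / reqds
        }'
}

printf '%s\n' "$sample" | reasm_summary
# prints: fragments_per_ok=5591 fail_pct=99.74
```

On a live system the same function can be fed directly, e.g.
"nstat > /dev/null; sleep 1; nstat | reasm_summary".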

The patch has some effect, as I saw basically no timeouts without it,
but it still does not look aggressive enough. Or possibly it's evicting
the fragments that are most likely to be used/completed (the most
recent ones).

> > I have doubt: under DDOS we will trigger <max numfrags> timeout per
> > jiffy, can that keep a CPU busy, too?
> 
> Yes, the cpu(s) handling the RX queue(s), which are already provisioned for networking stuff ;)
> 
> Even without any frag being received, these cpu can be 100% busy.

With:

schedule_work_on(smp_processor_id(), #... )

we can be sure to run exclusively on the CPU handling the RX queue, even with the worker.

Cheers,

Paolo