Message-ID: <20170901081006.6hfftamapa56m2jv@unicorn.suse.cz>
Date:   Fri, 1 Sep 2017 10:10:06 +0200
From:   Michal Kubecek <mkubecek@...e.cz>
To:     Jesper Dangaard Brouer <brouer@...hat.com>
Cc:     liujian56@...wei.com, netdev@...r.kernel.org,
        Florian Westphal <fw@...len.de>
Subject: Re: [RFC PATCH] net: frag limit checks need to use
 percpu_counter_compare

On Fri, Sep 01, 2017 at 09:41:56AM +0200, Jesper Dangaard Brouer wrote:
> On Thu, 31 Aug 2017 18:23:49 +0200 Michal Kubecek <mkubecek@...e.cz> wrote:
> 
> > If we go this way (which would IMHO require some benchmarks to make sure
> > it doesn't harm performance too much) we can drop the explicit checks
> > for zero thresholds which were added to work around the unreliability of
> > fast checks of percpu counters (or at least the second one was by commit
> > 30759219f562 ("net: disable fragment reassembly if high_thresh is zero")).
>   
> After much consideration, together with Florian, I'm now instead
> looking at reverting the use of percpu_counter for this memory
> accounting use-case.  The complexity and maintenance cost is not worth
> it.  And I'm of course testing the perf effect, and currently I'm _not_
> seeing any perf regression on my 10G + 100G testlab (although this is
> not a NUMA system, which was my original optimization case).

This sounds reasonable to me. It is indeed questionable whether percpu
counters are still worth the complexity once all checks have to be
changed to the exact version.

Perhaps there would be some gain for many CPUs if thresholds are large
enough to (almost) always avoid the need to calculate the sum. But once
we leave that safe area, I would be surprised if simple atomic_t
wouldn't be more efficient.
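
To illustrate the trade-off, here is a minimal userspace sketch of the
batching idea behind percpu counters. It is a simplified model and not the
kernel implementation: all sketch_* names, the CPU count and the batch size
are hypothetical, and there is no locking or atomicity. The fast read looks
only at the shared part, so it can be off by almost NCPUS * BATCH; the
compare helper falls back to summing every per-CPU delta whenever the limit
lies inside that error margin.

```c
#include <assert.h>

#define SKETCH_NCPUS 4   /* hypothetical number of CPUs */
#define SKETCH_BATCH 32  /* hypothetical per-CPU batch size */

struct sketch_pcpu_counter {
    long count;               /* shared part, cheap to read */
    long local[SKETCH_NCPUS]; /* per-CPU deltas not yet folded in */
};

/* Add to the CPU-local delta; fold it into the shared count only
 * once its magnitude reaches the batch size. */
void sketch_add(struct sketch_pcpu_counter *c, int cpu, long amount)
{
    assert(cpu >= 0 && cpu < SKETCH_NCPUS);
    c->local[cpu] += amount;
    if (c->local[cpu] >= SKETCH_BATCH || c->local[cpu] <= -SKETCH_BATCH) {
        c->count += c->local[cpu];
        c->local[cpu] = 0;
    }
}

/* Fast read: shared part only, ignores all unfolded per-CPU deltas. */
long sketch_read_fast(const struct sketch_pcpu_counter *c)
{
    return c->count;
}

/* Exact read: has to visit every CPU's delta. */
long sketch_sum(const struct sketch_pcpu_counter *c)
{
    long sum = c->count;

    for (int cpu = 0; cpu < SKETCH_NCPUS; cpu++)
        sum += c->local[cpu];
    return sum;
}

/* Compare against a limit: trust the fast read only if it is further
 * from the limit than the worst-case error, otherwise compute the sum.
 * Returns 1, 0 or -1 for greater than, equal to or less than the limit. */
int sketch_compare(const struct sketch_pcpu_counter *c, long limit)
{
    long rough = sketch_read_fast(c);
    long slack = (long)SKETCH_NCPUS * SKETCH_BATCH;

    if (rough - limit > slack)
        return 1;
    if (limit - rough > slack)
        return -1;

    long exact = sketch_sum(c);
    return (exact > limit) - (exact < limit);
}
```

In this model, a limit far above NCPUS * BATCH is decided from the shared
part alone, while a limit of zero or close to it always forces the full
sum. A plain atomic_t has no such slow path, at the price of every update
touching one shared value.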

Michal Kubecek
