Date:	Tue, 9 Aug 2011 20:48:04 +1000
From:	Dave Chinner <david@...morbit.com>
To:	Pekka Enberg <penberg@...nel.org>
Cc:	viro@...iv.linux.org.uk, linux-kernel@...r.kernel.org,
	linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 1/2] dcache: convert dentry_stat.nr_unused to per-cpu
 counters

On Mon, Aug 08, 2011 at 10:08:20AM +0300, Pekka Enberg wrote:
> On Mon, Aug 8, 2011 at 10:03 AM, Dave Chinner <david@...morbit.com> wrote:
> > From: Dave Chinner <dchinner@...hat.com>
> >
> > Before we split up the dcache_lru_lock, the unused dentry counter
> > needs to be made independent of the global dcache_lru_lock. Convert
> > it to per-cpu counters to do this.
> >
> > Signed-off-by: Dave Chinner <dchinner@...hat.com>
> > ---
> >  fs/dcache.c |   17 ++++++++++++++---
> >  1 files changed, 14 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/dcache.c b/fs/dcache.c
> > index b05aac3..5e5208d 100644
> > --- a/fs/dcache.c
> > +++ b/fs/dcache.c
> > @@ -116,6 +116,7 @@ struct dentry_stat_t dentry_stat = {
> >  };
> >
> >  static DEFINE_PER_CPU(unsigned int, nr_dentry);
> > +static DEFINE_PER_CPU(unsigned int, nr_dentry_unused);
> >
> >  #if defined(CONFIG_SYSCTL) && defined(CONFIG_PROC_FS)
> >  static int get_nr_dentry(void)
> > @@ -127,10 +128,20 @@ static int get_nr_dentry(void)
> >        return sum < 0 ? 0 : sum;
> >  }
> >
> > +static int get_nr_dentry_unused(void)
> > +{
> > +       int i;
> > +       int sum = 0;
> > +       for_each_possible_cpu(i)
> > +               sum += per_cpu(nr_dentry_unused, i);
> > +       return sum < 0 ? 0 : sum;
> > +}
> 
> Why not use struct percpu_counter and percpu_counter_sum_positive() for this?

That's what I originally implemented for all these VFS per-cpu
counters. e.g:

cffbc8a fs: Convert nr_inodes and nr_unused to per-cpu counters

but then this happened:

commit 86c8749ede0c59e590de9267066932a26f1ce796
Author: Nick Piggin <npiggin@...nel.dk>
Date:   Fri Jan 7 17:49:18 2011 +1100

    vfs: revert per-cpu nr_unused counters for dentry and inodes
    
    The nr_unused counters count the number of objects on an LRU, and as such they
    are synchronized with LRU object insertion and removal and scanning, and
    protected under the LRU lock.
    
    Making it per-cpu does not actually get any concurrency improvements
    because of this lock, and summing the counter is much slower, and
    incrementing/decrementing it costs more code size and is slower too.
    
    These counters should stay per-LRU, which currently means global.
    
    Signed-off-by: Nick Piggin <npiggin@...nel.dk>
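
(To illustrate the pattern that commit describes -- a simplified
sketch, not the exact fs/dcache.c code of that era: the counter
update already sits inside the dcache_lru_lock critical section, so
making the counter per-cpu buys no extra concurrency there.)

static void dentry_lru_add(struct dentry *dentry)
{
        spin_lock(&dcache_lru_lock);
        list_add(&dentry->d_lru, &dentry->d_sb->s_dentry_lru);
        dentry_stat.nr_unused++;        /* global counter, but already
                                         * serialized by the LRU lock */
        spin_unlock(&dcache_lru_lock);
}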

and then, because the per-cpu counters were actually necessary,
a couple of patches later in the series per-cpu counters were
re-introduced:

commit 3e880fb5e4bb6a012035e3edd0586ee2817c2e24
Author: Nick Piggin <npiggin@...nel.dk>
Date:   Fri Jan 7 17:49:19 2011 +1100

    fs: use fast counters for vfs caches
    
    percpu_counter library generates quite nasty code, so unless you need
    to dynamically allocate counters or take fast approximate value, a
    simple per cpu set of counters is much better.
    
    The percpu_counter can never be made to work as well, because it has an
    indirection from pointer to percpu memory, and it can't use direct
    this_cpu_inc interfaces because it doesn't use static PER_CPU data, so
    code will always be worse.

....
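
To make the codegen argument concrete, compare the two increment
paths -- a sketch, reusing the nr_dentry_unused counter from the
patch above; the function names here are mine, not from any patch:

#include <linux/percpu.h>
#include <linux/percpu_counter.h>

static DEFINE_PER_CPU(unsigned int, nr_dentry_unused);

static void count_unused_static(void)
{
        /* Static per-CPU data: this_cpu_inc() can compile down to a
         * single percpu-relative instruction (e.g. a %gs-prefixed
         * incl on x86) -- no pointer chase, no lock. */
        this_cpu_inc(nr_dentry_unused);
}

static struct percpu_counter nr_dentry_unused_fbc; /* needs percpu_counter_init() */

static void count_unused_generic(void)
{
        /* percpu_counter: the per-CPU storage is dynamically
         * allocated and reached via fbc->counters, so every update
         * dereferences a pointer, and once the local delta crosses
         * the batch size it takes fbc->lock to fold the delta into
         * the shared fbc->count. */
        percpu_counter_inc(&nr_dentry_unused_fbc);
}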

No point in making arguments about how using and improving the
generic counter helps everyone and keeps the maintenance burden
lower, or that we were summing the counters on every single shrinker
call (i.e. potentially thousands of times a second) and so needed
fast approximate values to be taken.
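
(That was the point of the generic counter: the hot shrinker path
can use the cheap approximate read, and only a rare exact read pays
for the full sum. A sketch, reusing the hypothetical
nr_dentry_unused_fbc from above:)

static void read_paths(void)
{
        /* O(1) approximate read: just the batched central count.
         * Cheap enough for a path hit thousands of times a second. */
        s64 approx = percpu_counter_read_positive(&nr_dentry_unused_fbc);

        /* Exact sum: takes the counter lock and folds in each CPU's
         * local delta -- fine for a rare /proc read, far too slow
         * for every shrinker call. */
        s64 exact = percpu_counter_sum_positive(&nr_dentry_unused_fbc);

        (void)approx;
        (void)exact;
}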

The change to per-sb LRUs actually takes away the need to sum the
VFS inode and dentry counters on every shrinker call, so now the use
of the roll-your-own counter structure makes a bit of sense because
they are only read when someone reads /proc/sys/fs/dentry-state. It
sure as hell didn't make sense back when this code was merged,
though....

Cheers,

Dave.

-- 
Dave Chinner
david@...morbit.com