linux-kernel - Re: [patch v2 0/5] percpu_counter: bug fix and enhancement

Message-ID: <1305261477.2373.45.camel@sli10-conroe>
Date:	Fri, 13 May 2011 12:37:57 +0800
From:	Shaohua Li <shaohua.li@...el.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Eric Dumazet <eric.dumazet@...il.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	"cl@...ux.com" <cl@...ux.com>,
	"npiggin@...nel.dk" <npiggin@...nel.dk>
Subject: Re: [patch v2 0/5] percpu_counter: bug fix and enhancement

On Thu, 2011-05-12 at 17:05 +0800, Tejun Heo wrote:
> Hello,
> 
> On Thu, May 12, 2011 at 11:02:15AM +0200, Eric Dumazet wrote:
> > > I don't think @maxfuzzy is necessary there.  I wrote this before but
> > > why can't we track the actual deviation instead of the number of
> > > deviation events?
> > 
> > Thats roughly same thing (BATCH multiplicator factor apart)
> > 
> > Most percpu_counter users for a given percpu_counter object use a given
> > BATCH, dont they ?
> 
> Well, @maxfuzzy is much harder than @batch.  It's way less intuitive.
> Although I haven't really thought about it that much, I think it might
> be possible to eliminate it.  Maybe I'm confused.  I'll take another
> look later but if someone can think of something, please jump right
> in.
Hmm, it looks like Eric's approach doesn't work. Because we want to remove
the lock in _add, checking seq in _sum still races with _add.

Can we do something like this:
void __percpu_counter_add(struct percpu_counter *fbc, s64 amount, s32 batch)
{
        s64 count;

        preempt_disable();
        count = __this_cpu_read(*fbc->counters) + amount;
        if (count >= batch || count <= -batch) {
                /* announce the fold; back off while a _sum is running */
                while (1) {
                        atomic_inc(&fbc->add_start);
                        if (atomic_read(&fbc->sum_start) != 0)
                                atomic_dec(&fbc->add_start);
                        else
                                break;
                        while (atomic_read(&fbc->sum_start) != 0)
                                cpu_relax();
                }

                /* fold the local delta into the global count */
                atomic64_add(count, &fbc->count);
                __this_cpu_write(*fbc->counters, 0);
                atomic_dec(&fbc->add_start);
        } else {
                /* fast path: only the per-cpu counter is touched */
                __this_cpu_write(*fbc->counters, count);
        }
        preempt_enable();
}

s64 __percpu_counter_sum(struct percpu_counter *fbc)
{
        s64 ret = 0;
        int cpu;

        /* block new folds, then wait for folds already in flight */
        atomic_inc(&fbc->sum_start);
        while (atomic_read(&fbc->add_start) != 0)
                cpu_relax();

        for_each_online_cpu(cpu) {
                s32 *pcount = per_cpu_ptr(fbc->counters, cpu);
                ret += *pcount;
        }
        ret += atomic64_read(&fbc->count);
        atomic_dec(&fbc->sum_start);
        return ret;
}
If _add finds that _sum is in progress, it gives up and waits for _sum. If
_sum finds that _add is in progress, it waits for _add to give up or
finish. We let _add wait for _sum here because _sum is seldom called. If
_sum waited for _add instead, _sum might spin forever. Maybe we need a
spinlock to serialize concurrent _sum calls too. Anything wrong here?
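
For anyone who wants to experiment with the handshake outside the kernel, here is a minimal userspace sketch of the same idea using C11 atomics. Everything named `sketch_*`, `struct counter_sketch` and `NR_SLOTS` is hypothetical and only stands in for the kernel pieces: a fixed array of slots replaces the per-cpu counters, and the caller is assumed to own its slot exclusively (the role preempt_disable() plays above). The sketch relies on the default sequentially consistent ordering of C11 atomics for the "increment my flag, then read the other flag" step. The kernel's atomic_inc() and atomic_read() do not imply a barrier, so the kernel version would presumably need explicit barriers at those points.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_SLOTS 4      /* hypothetical stand-in for the number of CPUs */

struct counter_sketch {
        atomic_int add_start;             /* folds currently in progress */
        atomic_int sum_start;             /* sums currently in progress */
        _Atomic int64_t count;            /* global count */
        _Atomic int32_t slots[NR_SLOTS];  /* stand-in for per-cpu deltas */
};

static void sketch_init(struct counter_sketch *c)
{
        atomic_init(&c->add_start, 0);
        atomic_init(&c->sum_start, 0);
        atomic_init(&c->count, 0);
        for (int i = 0; i < NR_SLOTS; i++)
                atomic_init(&c->slots[i], 0);
}

/* Assumption: each slot is only ever passed in by one thread. */
static void sketch_add(struct counter_sketch *c, int slot,
                       int64_t amount, int32_t batch)
{
        assert(slot >= 0 && slot < NR_SLOTS);

        int64_t local = atomic_load(&c->slots[slot]) + amount;

        if (local >= batch || local <= -batch) {
                /* announce the fold; back off while a sum is running */
                for (;;) {
                        atomic_fetch_add(&c->add_start, 1);
                        if (atomic_load(&c->sum_start) == 0)
                                break;
                        atomic_fetch_sub(&c->add_start, 1);
                        while (atomic_load(&c->sum_start) != 0)
                                ;       /* spin until the sum finishes */
                }
                /* fold the local delta into the global count */
                atomic_fetch_add(&c->count, local);
                atomic_store(&c->slots[slot], 0);
                atomic_fetch_sub(&c->add_start, 1);
        } else {
                /* fast path: only the local slot is touched */
                atomic_store(&c->slots[slot], (int32_t)local);
        }
}

static int64_t sketch_sum(struct counter_sketch *c)
{
        int64_t ret = 0;

        /* block new folds, then wait for folds already in flight */
        atomic_fetch_add(&c->sum_start, 1);
        while (atomic_load(&c->add_start) != 0)
                ;

        for (int i = 0; i < NR_SLOTS; i++)
                ret += atomic_load(&c->slots[i]);
        ret += atomic_load(&c->count);

        atomic_fetch_sub(&c->sum_start, 1);
        return ret;
}
```

As in the proposal, a fast-path add can still change a slot while a sum is walking the array, so the sum only excludes the fold step, where a delta moves from a slot into the global count.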