netdev - Re: [PATCH net-next-2.6] bridge: 64bit rx/tx counters

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-Id: <20100812151145.f5fa259b.akpm@linux-foundation.org>
Date:	Thu, 12 Aug 2010 15:11:45 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	David Miller <davem@...emloft.net>,
	Stephen Hemminger <shemminger@...ux-foundation.org>,
	netdev@...r.kernel.org, bhutchings@...arflare.com,
	Nick Piggin <npiggin@...e.de>
Subject: Re: [PATCH net-next-2.6] bridge: 64bit rx/tx counters

On Thu, 12 Aug 2010 23:47:37 +0200
Eric Dumazet <eric.dumazet@...il.com> wrote:

> Le jeudi 12 ao__t 2010 __ 08:07 -0700, Andrew Morton a __crit : 
> > On Thu, 12 Aug 2010 14:16:15 +0200 Eric Dumazet <eric.dumazet@...il.com> wrote:
> > 
> > > > And all this open-coded per-cpu counter stuff added all over the place.
> > > > Were percpu_counters tested or reviewed and found inadequate and unfixable?
> > > > If so, please do tell.
> > > > 
> > > 
> > > percpu_counters tries hard to maintain a view of the current value of
> > > the (global) counter. This adds a cost because of a shared cache line
> > > and locking. (__percpu_counter_sum() is not very scalable on big hosts,
> > > it locks the percpu_counter lock for a possibly long iteration)
> > 
> > Could be.  Is percpu_counter_read_positive() unsuitable?
> > 
> 
> I bet most people want precise counters when doing 'ifconfig lo'
> 
> SNMP applications would be very surprised to get non increasing values
> between two samples, or inexact values.

percpu_counter_read_positive() should be returning monotonically
increasing numbers - if it ever went backward that would be bad.  But
yes, the value will increase in a lumpy fashion.  Probably one would
need to make informed choices between percpu_counter_read_positive()
and percpu_counter_sum(), depending on the type of stat.

But that's all a bit academic.

>
> > > And this folding has zero effect on
> > > concurrent writers (counter updates)
> > 
> > The fastpath looks a little expensive in the code you've added.  The
> > write_seqlock() does an rmw and a wmb() and the stats inc is a 64-bit
> > rmw whereas percpu_counters do a simple 32-bit add.  So I'd expect that
> > at some suitable batch value, percpu-counters are faster on 32-bit. 
> > 
> 
> Hmm... 6 instructions (16 bytes of text) are a "little expensive" versus
> 120 instructions if we use percpu_counter ?
> 
> Following code from drivers/net/loopback.c
> 
> 	u64_stats_update_begin(&lb_stats->syncp);
> 	lb_stats->bytes += len;
> 	lb_stats->packets++;
> 	u64_stats_update_end(&lb_stats->syncp);
> 
> maps on i386 to :
> 
> 	ff 46 10             	incl   0x10(%esi)  // u64_stats_update_begin(&lb_stats->syncp);
> 	89 f8                	mov    %edi,%eax
> 	99                   	cltd   
> 	01 7e 08             	add    %edi,0x8(%esi)
> 	11 56 0c             	adc    %edx,0xc(%esi)
> 	83 06 01             	addl   $0x1,(%esi)
> 	83 56 04 00          	adcl   $0x0,0x4(%esi)
> 	ff 46 10             	incl   0x10(%esi) // u64_stats_update_end(&lb_stats->syncp);
> 
> 
> Exactly 6 added instructions compared to previous kernel (32bit
> counters), only on 32bit hosts. These instructions are not expensive (no
> conditional branches, no extra register pressure) and access private cpu
> data.
> 
> While two calls to __percpu_counter_add() add about 120 instructions,
> even on 64bit hosts, wasting precious cpu cycles.

Oy.  You omitted the per_cpu_ptr() evaluation and, I bet, included all
the executed-1/batch-times instructions.

> 
> > They'll usually be slower on 64-bit, until that num_possible_cpus walk
> > bites you.
> > 
> 
> But are you aware we already fold SNMP values using for_each_possible()
> macros, before adding 64bit counters ? Not related to 64bit stuff
> really...


> > percpu_counters might need some work to make them irq-friendly.  That
> > bare spin_lock().
> > 
> > btw, I worry a bit about seqlocks in the presence of interrupts:
> > 
> 
> Please note that nothing is assumed about interrupts and seqcounts
> 
> Both readers and writers must mask them if necessary.
> 
> In most situations, masking softirq is enough for networking cases
> (updates are performed from softirq handler, reads from process context)

Yup, write_seqcount_begin/end() are pretty dangerous-looking.  The
caller needs to protect the lock against other CPUs, against interrupts
and even against preemption.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html