lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20071112152857.a60643b0.akpm@linux-foundation.org>
Date:	Mon, 12 Nov 2007 15:28:57 -0800
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	David Brownell <david-b@...bell.net>
Cc:	Linux Kernel list <linux-kernel@...r.kernel.org>,
	Florian Fainelli <florian.fainelli@...ecomint.eu>,
	Haavard Skinnemoen <hskinnemoen@...el.com>,
	Ingo Molnar <mingo@...e.hu>,
	Nick Piggin <nickpiggin@...oo.com.au>
Subject: Re: [patch 2.6.24-rc2 1/3] generic gpio -- gpio_chip support

On Mon, 12 Nov 2007 14:32:53 -0800 David Brownell <david-b@...bell.net> wrote:

> On Monday 12 November 2007, Andrew Morton wrote:
> > On Fri, 9 Nov 2007 11:36:19 -0800 David Brownell <david-b@...bell.net> wrote:
> > 
> > > Provide new implementation infrastructure that platforms may choose to use
> > > when implementing the GPIO programming interface.  ...
> > > 
> > > ...
> > >
> > > +/* gpio_lock protects the table of chips and to gpio_chip->requested.
> > > + * While any gpio is requested, its gpio_chip is not removable.  It's
> > > + * a raw spinlock to ensure safe access from hardirq contexts, and to
> > > + * shrink bitbang overhead:  per-bit preemption would be very wrong.
> > > + */
> > > +static raw_spinlock_t gpio_lock = __RAW_SPIN_LOCK_UNLOCKED;
> > 
> > Well that's weird.
> > 
> > For starters, this initialisation will confound lockdep: it should use
> > DEFINE_SPINLOCK.
> 
> Then it wouldn't be "raw"!  It seems info on raw spinlocks is out
> of sync with current code.  Allegedly it should be possible to just
> pass a raw_spinlock_t pointer to spin_lock_irqsave() and friends
> and have GCC sort out the right stuff ... but that didn't work.
> 
> I speculate that either the design has changed (without fanfare),
> or else that stuff is in RT kernels and has not yet gone upstream.

Well whatever.  We shouldn't have to resort to caller-side party tricks
like this to get acceptable performance.

> 
> > And the rationale seems dubious.  All you're saving here is a couple of
> > accesses to task_struct at spin_unlock()-time.  If the current task has a
> > preemption pending then yes, we'll schedule away but that's a very rare
> > thing and that's just what we're supposed to do.
> 
> Unfortunately, that's not what I observed/measured.
> 
>   http://marc.info/?l=linux-kernel&m=119429680220361&w=2

There's something missing here.  It went from 6.4 usec/bit up to 11.2
usec/bit.  You seem to be saying that enabling preemption accounts for 1
usec.  What caused the rest of the slowdown?

> Plus, note the comment about hardirq context and RT environments.
> When that spinlock is really a mutex, and the non-hardirq contexts
> are tasks, non-raw spinlocks would wrongly prevent access to GPIOs
> from true hardirq contexts.

I don't understand all this about raw spinlocks and hardirq context. 
We use plain old spin_lock()/spin_unlock() in hardirq context all the time?

I'm still trying to understand what you've observed here.  Is it the case
that a single gpio operation went from 6.4 up to 11.2 usecs?  And that this
operation does a single spin_lock/spin_unlock?  If so, something would have
had to have gone seriously wrong for a spin_lock/unlock time to increase by
4.8 microseconds.

I assume these timings are from a reasonably fast machine?

> 
> > So please tell us more about this.  Perhaps there are performance problems
> > with the current core preemption machinery.
> 
> The above email summarizes significant slowdown I observed using
> just i2c-gpio, which happened to be easy to measure down to the
> microsecond level.
> 
> 
> > > +	local_irq_save(flags);
> > > +	__raw_spin_lock(&gpio_lock);
> > >
> > > ...
> > > +	__raw_spin_unlock(&gpio_lock);
> > > +	local_irq_restore(flags);
> > > +	return status;
> > > +}
> > 
> > And of course if this code is converted to conventional locking, the above
> > becomes spin_lock_irqsave()/spin_lock_irqrestore() in many places.
> 
> See above.  Allegedly we ought to be able to use those calls even
> with raw spinlocks ... but that didn't work when I tried it. I'd
> certainly prefer if it did work.

It would of course be better to not use raw spinlocks at all.  It's an
internal implementation detail and drivers shouldn't go poking at it:

box:/usr/src/linux-2.6.24-rc2> grep -rl raw_spinlock_t .
./arch/x86/kernel/traps_32.c
./arch/x86/kernel/traps_64.c
./arch/x86/kernel/tsc_sync.c
./arch/mips/kernel/gdb-stub.c
./arch/parisc/lib/bitops.c
./arch/powerpc/lib/locks.c
./arch/s390/lib/spinlock.c
./include/asm-alpha/spinlock.h
./include/asm-alpha/spinlock_types.h
./include/asm-arm/spinlock.h
./include/asm-arm/spinlock_types.h
./include/asm-generic/bitops/atomic.h
./include/asm-x86/spinlock_32.h
./include/asm-x86/spinlock_64.h
./include/asm-x86/spinlock_types.h
./include/asm-ia64/spinlock.h
./include/asm-ia64/spinlock_types.h
./include/asm-m32r/spinlock.h
./include/asm-m32r/spinlock_types.h
./include/asm-mips/spinlock.h
./include/asm-mips/spinlock_types.h
./include/asm-parisc/atomic.h
./include/asm-parisc/spinlock.h
./include/asm-parisc/spinlock_types.h
./include/asm-powerpc/spinlock.h
./include/asm-powerpc/spinlock_types.h
./include/asm-ppc/spinlock.h
./include/asm-s390/spinlock.h
./include/asm-s390/spinlock_types.h
./include/asm-sh/spinlock.h
./include/asm-sh/spinlock_types.h
./include/asm-sparc/spinlock.h
./include/asm-sparc/spinlock_types.h
./include/asm-sparc64/spinlock.h
./include/asm-sparc64/spinlock_types.h
./include/linux/spinlock.h
./include/linux/spinlock_types.h
./include/linux/spinlock_types_up.h
./include/linux/spinlock_up.h
./kernel/lockdep.c
./lib/spinlock_debug.c


So can we please drill down and work out what went wrong and see if we can
fix it all up properly?

> 
> > > +/* There's no value in inlining GPIO calls that may sleep.
> > 
> > There's no value in inlining anything, hardly ;)
> 
> Remember that GPIOs do get used for bitbanging, and there's lots
> of kernel code to make bit ops stay fast!
> 
> The canonical reason to inline GPIO operations is so that when
> you bitbang some serial protocol, you're closer to one instruction
> per bit toggle than fifty ... at one point I measured a factor of
> nearly three speedup for byte-level bitbanging for SPI (ca. 70 Kbit
> with quick function calls, vs about 2 Mbits inlined, ISTR) ... with
> most of the overhead being to pass the next byte down to the bitbang
> loop.  Using a bigger wordsize saw even more improvement.
> 
> 
> > > +postcore_initcall(gpiolib_debugfs_init);
> > 
> > postcore_initcall() is unusual, hence a comment describing why it was
> > employed would be a good thing to have.
> 
> Actually that one could easily be later, fs_initcall or whatnot.
> I think that's debris from other versions.
> 
> - Dave
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ