lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 6 Mar 2009 20:03:50 +0100
From:	Jens Axboe <jens.axboe@...cle.com>
To:	Geert Uytterhoeven <Geert.Uytterhoeven@...ycom.com>
Cc:	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Jim Paris <jim@...n.com>,
	Vivien Chappelier <vivien.chappelier@...e.fr>,
	David Woodhouse <dwmw2@...radead.org>,
	Arnd Bergmann <arnd@...db.de>,
	Geoff Levand <geoffrey.levand@...sony.com>,
	Linux/PPC Development <linuxppc-dev@...abs.org>,
	Cell Broadband Engine OSS Development 
	<cbe-oss-dev@...abs.org>,
	Linux Kernel Development <linux-kernel@...r.kernel.org>,
	linux-mtd@...ts.infradead.org
Subject: Re: [PATCH/RFC] ps3/block: Add ps3vram-ng driver for accessing
	video      RAM as block device

On Fri, Mar 06 2009, Geert Uytterhoeven wrote:
> On Fri, 6 Mar 2009, Jens Axboe wrote:
> > On Fri, Mar 06 2009, Geert Uytterhoeven wrote:
> > > On Fri, 6 Mar 2009, Jens Axboe wrote:
> > > > On Thu, Mar 05 2009, Geert Uytterhoeven wrote:
> > > > > But then I noticed ps3vram_make_request() may be called concurrently,
> > > > > so I had to add a mutex to avoid data corruption. This slows the
> > > > > driver down, and in the end, the version with a thread turns out to be
> > > > > ca. 1% faster. The version without a thread is about 50 lines less
> > > > > code, though.
> > > >
> > > > That is correct, ->make_request_fn may get reentered. I'm not surprised
> > > > that performance dropped if you just shoved everything under a mutex.
> > > > You could be a little more smart and queue concurrent bio's for
> > > > processing when the current one is complete though, there are several
> > > > approaches there that be a lot faster than going all the way through the
> > > > IO stack and scheduler just to avoid concurrency.
> > >
> > > Yes, using a spinlock and queueing requests on a list if the driver is
> > > busy can be done after 2.6.29...
> >
> > Certainly. Even just replacing your current mutex with a spinlock during
> > the memcpy() would surely be a lot faster. Or even just grabbing the
> > mutex before calling into the write for the duration of the bio. The way
> > you do it is certain context switch death :-)
> 
> It's not just the memcpy(). ps3vram_{up,down}load() call msleep(), so
> I cannot use a spinlock.

Ah right, I hadn't looked close enough. But putting the mutex_lock()
outside of the bio_for_each_segment() is going to be much faster than
getting/releasing it for each segment.

-- 
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ