linux-kernel - Re: [Patch] fix MTD CFI/LPDDR flash driver huge latency bug


Message-ID: <1268499644.27883.94.camel@wall-e>
Date:	Sat, 13 Mar 2010 18:00:44 +0100
From:	Stefani Seibold <stefani@...bold.net>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	linux-mtd@...ts.infradead.org,
	linux-kernel <linux-kernel@...r.kernel.org>,
	David Woodhouse <dwmw2@...radead.org>,
	"Kreuzer, Michael (NSN - DE/Ulm)" <michael.kreuzer@....com>
Subject: Re: [Patch] fix MTD CFI/LPDDR flash driver huge latency bug

On Saturday, 13.03.2010, at 06:25 -0500, Andrew Morton wrote:
> On Sat, 13 Mar 2010 13:31:30 +0100 Stefani Seibold <stefani@...bold.net> wrote:
> 
> > On Friday, 12.03.2010, at 14:23 -0800, Andrew Morton wrote:
> > > On Sat, 06 Mar 2010 17:48:57 +0100
> > > Stefani Seibold <stefani@...bold.net> wrote:
> > > 
> > > > The patch changes all uses of spin_lock operations on xxxx->mutex
> > > > into mutex operations, which is exactly what the name says and means.
> > > > 
> > > > There is no performance regression since the mutex is normally not
> > > > acquired.
> > > 
> > > hm, big scary patch.  Are you sure this mutex is never taken from
> > > atomic or irq contexts?  Is it fully tested with all relevant debug options
> > > and lockdep enabled?
> > > 
> > > 
> > 
> > I have analyzed these drivers and IMHO I don't think they will be used
> > from irq or atomic contexts. No interrupt is requested, and there are
> > a lot of msleep and add_wait_queue/schedule calls while holding the
> > mutex, which would not work in an irq or atomic context. But I don't
> > know the whole mtd stack.
> > 
> > I tested the patch with the following kernel debug options:
> > 
> > CONFIG_DEBUG_KERNEL=y
> > CONFIG_DETECT_SOFTLOCKUP=y
> > CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
> > CONFIG_SCHED_DEBUG=y
> > CONFIG_SCHEDSTATS=y
> > CONFIG_TIMER_STATS=y
> > CONFIG_DEBUG_MUTEXES=y
> > CONFIG_DEBUG_SPINLOCK_SLEEP=y
> > 
> 
> Neato.  As was mentioned, one thing to check is the mtdoops path. 
> oopses can happen with locks held, from IRQ context, etc.
> 

Okay, I didn't check that case. But the old code also deadlocks if the
oops occurs while the spinlock (xxx->mutex) is held. With the new mutex
solution the chance of running into that deadlock is bigger because of
the possible preemption.

But I ran a "grep" over the whole mtd code and there is no panic_write
function assigned to the mtd_info struct for the CFI flash chips. So this
problem can currently never occur.

> If we're trying to take that mutex in oops context then I guess that's
> fixable by just not taking it and hoping for the best.  Or, better,
> mutex_trylock() and conditional mutex_unlock() to try to be nice to
> possible concurrent activity on other CPUs.
> 

Concurrent accesses are dangerous and in most cases not possible;
that is what the spinlock (xxxx->mutex) was for.
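
For illustration only, here is a minimal userspace sketch of the
mutex_trylock() and conditional unlock idea quoted above. It is not code
from the patch or from the mtd drivers: struct demo_chip and
demo_panic_write() are hypothetical names, and pthreads stands in for the
kernel mutex API. Note that pthread_mutex_trylock() returns 0 on success,
whereas the kernel's mutex_trylock() returns 1 on success.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-chip state; not the real driver struct. */
struct demo_chip {
	pthread_mutex_t mutex;
	int writes;		/* counts writes that went through */
};

/*
 * Write path for an oops-like situation: try to take the lock, but carry
 * on without it if somebody else already holds it. Returns true if the
 * lock was held during the write, false if the write went ahead unlocked.
 */
static bool demo_panic_write(struct demo_chip *chip)
{
	/* pthread_mutex_trylock() returns 0 on success, EBUSY if locked */
	bool locked = (pthread_mutex_trylock(&chip->mutex) == 0);

	chip->writes++;		/* the actual flash write would go here */

	/* Unlock only if we really acquired the lock above. */
	if (locked)
		pthread_mutex_unlock(&chip->mutex);

	return locked;
}
```

When the lock is busy the write still proceeds, which is the "hoping for
the best" part: it avoids the deadlock but gives no protection against a
concurrent user of the chip.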

I also did some concurrency checks like:

cat /dev/zero >/flash/aa & cat /dev/zero >/flash/bb

without any side effects.

