Date:	Tue, 26 May 2009 22:47:04 +0200
From:	Jens Axboe <jens.axboe@...cle.com>
To:	Damien Wyart <damien.wyart@...e.fr>
Cc:	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	chris.mason@...cle.com, david@...morbit.com, hch@...radead.org,
	akpm@...ux-foundation.org, jack@...e.cz,
	yanmin_zhang@...ux.intel.com, richard@....demon.co.uk
Subject: Re: [PATCH 0/12] Per-bdi writeback flusher threads v7

On Tue, May 26 2009, Damien Wyart wrote:
> > > I have been playing with v7 since you sent it and after a while
> > > (short on laptop, longer on desktop, a few hours), writeback doesn't
> > > seem to work anymore. A manual call to sync hangs (process in D state)
> > > and the Dirty value in meminfo keeps growing. As previous versions had
> > > been heavily tested, I guess there is some regression in v7.
> 
> > Not good, the prime suspect is the sync notification stuff. I'll take
> > a look and get that fixed. You didn't happen to catch any sysrq-t back
> > traces or anything like that? Would be interesting to see where
> > bdi-default and the bdi-* threads are stuck.
> 
> No, as I was doing many things at the same time and not exclusively
> debugging, I just rebooted hard and went back to an unpatched kernel when
> the problems occurred. But I noticed only bdi-default was alive, the
> other bdi-* threads had disappeared and the sync commands I had tried
> were all in D state. Also I tried to reinstall a kernel .deb (these
> systems are Debian) and this got stuck during installation, when probing
> grub config (do not know if there is some sync syscall in there).
> 
> Can try to go further tomorrow but will not have a lot of time...

OK, I spotted the problem. If we fall back to the on-stack allocation in
bdi_writeback_all(), then we do the wait for the work completion with
the bdi_lock mutex held. This can deadlock with bdi_forker_task(), so if
we require that to be invoked to make progress (happens if a thread
needs to be restarted), then we have a deadlock on that mutex.
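
To make the pattern concrete, here is a rough userspace sketch using
pthreads. It is not the kernel code from the patch set: demo_forker(),
demo_writeback_all(), struct demo_work and the 200 ms timeout are all
made up for illustration, and a timed wait stands in for the real
(unbounded) wait so the example returns instead of hanging. Only the
name bdi_lock is borrowed from the description above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Userspace stand-in for the mutex discussed above. */
static pthread_mutex_t bdi_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical on-stack work item with a completion flag. */
struct demo_work {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

/* Stand-in for the forker: it needs bdi_lock before the work can finish. */
static void *demo_forker(void *arg)
{
	struct demo_work *w = arg;

	pthread_mutex_lock(&bdi_lock);   /* blocks while the waiter holds it */
	pthread_mutex_unlock(&bdi_lock);

	pthread_mutex_lock(&w->lock);
	w->done = true;
	pthread_cond_signal(&w->cond);
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

/* Wait for completion, giving up after timeout_ms. Returns true if done. */
static bool demo_wait(struct demo_work *w, long timeout_ms)
{
	struct timespec deadline;
	bool done;
	int rc = 0;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&w->lock);
	while (!w->done && rc == 0)
		rc = pthread_cond_timedwait(&w->cond, &w->lock, &deadline);
	done = w->done;
	pthread_mutex_unlock(&w->lock);
	return done;
}

/*
 * Queue the work under bdi_lock, then wait either with the lock still
 * held (the problematic ordering) or after dropping it. Returns true if
 * the work completed within the timeout.
 */
bool demo_writeback_all(bool wait_with_lock_held)
{
	struct demo_work w;
	pthread_t tid;
	bool completed;

	pthread_mutex_init(&w.lock, NULL);
	pthread_cond_init(&w.cond, NULL);
	w.done = false;

	pthread_mutex_lock(&bdi_lock);
	if (pthread_create(&tid, NULL, demo_forker, &w) != 0) {
		pthread_mutex_unlock(&bdi_lock);
		return false;
	}

	if (wait_with_lock_held) {
		completed = demo_wait(&w, 200);  /* forker cannot get bdi_lock */
		pthread_mutex_unlock(&bdi_lock);
	} else {
		pthread_mutex_unlock(&bdi_lock);
		completed = demo_wait(&w, 200);  /* forker is free to run */
	}

	pthread_join(tid, NULL);  /* w lives on our stack, so join before return */
	pthread_cond_destroy(&w.cond);
	pthread_mutex_destroy(&w.lock);
	return completed;
}
```

With wait_with_lock_held set, the wait can only time out, since the
helper thread never gets past bdi_lock; with an unbounded wait that
would be the hang. Dropping the lock before waiting is one way out in
this sketch, not necessarily what the eventual fix will look like.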

I'll cook up a fix for this, but probably not before the morning.

-- 
Jens Axboe
