lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sun, 11 Nov 2007 00:02:32 +0100
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Mikulas Patocka <mikulas@...ax.karlin.mff.cuni.cz>,
	linux-kernel@...r.kernel.org, WU Fengguang <wfg@...l.ustc.edu.cn>,
	Miklos Szeredi <miklos@...redi.hu>
Subject: Re: Temporary lockup on loopback block device


On Sat, 2007-11-10 at 14:54 -0800, Andrew Morton wrote:
> On Sat, 10 Nov 2007 20:51:31 +0100 (CET) Mikulas Patocka <mikulas@...ax.karlin.mff.cuni.cz> wrote:
> 
> > Hi
> > 
> > I am experiencing a transient lockup in 'D' state with loopback device. It 
> > happens when process writes to a filesystem in loopback with command like
> > dd if=/dev/zero of=/s/fill bs=4k 
> > 
> > CPU is idle, disk is idle too, yet the dd process is waiting in 'D' in 
> > congestion_wait called from balance_dirty_pages.
> > 
> > After about 30 seconds, the lockup is gone and dd resumes, but it locks up 
> > soon again.
> > 
> > I added a printk to the balance_dirty_pages
> > printk("wait: nr_reclaimable %d, nr_writeback %d, dirty_thresh %d, 
> > pages_written %d, write_chunk %d\n", nr_reclaimable, 
> > global_page_state(NR_WRITEBACK), dirty_thresh, pages_written, 
> > write_chunk);
> > 
> > and it shows this during the lockup:
> > 
> > wait: nr_reclaimable 3099, nr_writeback 0, dirty_thresh 2985, 
> > pages_written 1021, write_chunk 1522
> > wait: nr_reclaimable 3099, nr_writeback 0, dirty_thresh 2985, 
> > pages_written 1021, write_chunk 1522
> > wait: nr_reclaimable 3099, nr_writeback 0, dirty_thresh 2985, 
> > pages_written 1021, write_chunk 1522
> > 
> > What apparently happens:
> > 
> > writeback_inodes syncs inodes only on the given wbc->bdi, however 
> > balance_dirty_pages checks against global counts of dirty pages. So if 
> > there's nothing to sync on a given device, but there are other dirty pages 
> > so that the counts are over the limit, it will loop without doing any 
> > work.
> > 
> > To reproduce it, you need totally idle machine (no GUI, etc.) -- if 
> > something writes to the backing device, it flushes the dirty pages 
> > generated by the loopback and the lockup is gone. If you add printk, don't 
> > forget to stop klogd, otherwise logging would end the lockup.
> 
> erk.

known issue.

> > The hotfix (that I verified to work) is to not set wbc->bdi, so that all 
> > devices are flushed ... but the code probably needs some redesign (i.e. 
> > either account per-device and flush per-device, or account-global and 
> > flush-global).

.24 will have the per-device solution.

> > 
> > diff -u -r ../x/linux-2.6.23.1/mm/page-writeback.c mm/page-writeback.c
> > --- ../x/linux-2.6.23.1/mm/page-writeback.c     2007-10-12 18:43:44.000000000 +0200
> > +++ mm/page-writeback.c 2007-11-10 20:32:43.000000000 +0100
> > @@ -214,7 +214,6 @@
> > 
> > 	for (;;) {
> > 		struct writeback_control wbc = {
> > -			.bdi            = bdi,
> > 			.sync_mode      = WB_SYNC_NONE,
> > 			.older_than_this = NULL,
> > 			.nr_to_write    = write_chunk,
> 
> Arguably we just have the wrong backing-device here, and what we should do
> is to propagate the real backing device's pointer through up into the
> filesystem.  There's machinery for this which things like DM stacks use.
> 
> I wonder if the post-2.6.23 changes happened to make this problem go away.

The per BDI dirty stuff in 24 should make this work, I just checked and
loopback thingies seem to have their own BDI, so all should be well.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ