lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070219004537.GB9289@think.oraclecorp.com>
Date:	Sun, 18 Feb 2007 19:45:37 -0500
From:	Chris Mason <chris.mason@...cle.com>
To:	Miklos Szeredi <miklos@...redi.hu>
Cc:	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org
Subject: Re: dirty balancing deadlock

On Mon, Feb 19, 2007 at 01:25:21AM +0100, Miklos Szeredi wrote:
> > > > If so, writes to B will decrease the dirty memory threshold.
> > > 
> > > Yes, but not by enough.  Say A dirties a 1100 pages, limit is 1000.
> > > Some pages queued for writeback (doesn't matter how much).  B writes
> > > back 1, 1099 dirty remain in A, zero in B.  balance_dirty_pages() for
> > > B doesn't know that there's nothing more to write back for B, it's
> > > just waiting there for those 1099, which'll never get written.
> > 
> > hm, OK, arguable.  I guess something like this..
> 
> Doesn't help the fuse case, but does seem to help the loopback mount
> one.
> 
> For fuse it's worse with the patch: now the write triggered by the
> balance recurses into fuse, with disastrous results, since the fuse
> writeback is now blocked on the userspace queue.
> 
> fusexmp_fh_no D 40136678     0   505    494           506   504 (NOTLB)
> 08982b78 00000001 00000000 08f9f9b4 0805d8cb 089a75f8 08982b78 08f98000
>        08f98000 08f9f9dc 0805a38a 089a7100 08982680 08f9f9cc 08f98000 08f98000
>        085d8300 08982680 089a7100 08f9fa34 08183006 089a7100 08982680 089a7100 Call Trace:
> 08f9f9a0:  [<0805d8cb>] switch_to_skas+0x3b/0x83
> 08f9f9b8:  [<0805a38a>] _switch_to+0x49/0x99
> 08f9f9e0:  [<08183006>] schedule+0x246/0x547
> 08f9fa38:  [<08103c7e>] fuse_get_req_wp+0xe9/0x14a
> 08f9fa70:  [<08103d2e>] fuse_writepage+0x4f/0x12c

In general, writepage is supposed to do work without blocking on
expensive locks that will get pdflush and dirty reclaim stuck in this
fashion.  You'll probably have to take the same approach reiserfs does
in data=journal mode, which is leaving the page dirty if fuse_get_req_wp
is going to block without making progress.

Queue it somewhere else (ie an internal Fs cleaning thread) and leave
the page dirty so that we can move on to other pages that have a chance
of being cleaned.

-chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ