Message-ID: <20111202082950.GA19148@localhost>
Date:	Fri, 2 Dec 2011 16:29:50 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Matthew Wilcox <matthew@....cx>, Jan Kara <jack@...e.cz>,
	LKML <linux-kernel@...r.kernel.org>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Theodore Ts'o <tytso@....edu>,
	Christoph Hellwig <hch@...radead.org>
Subject: Re: [PATCH] writeback: permit through good bdi even when global
 dirty exceeded

On Fri, Dec 02, 2011 at 03:03:59PM +0800, Andrew Morton wrote:
> On Fri, 2 Dec 2011 14:36:03 +0800 Wu Fengguang <fengguang.wu@...el.com> wrote:
> 
> > --- linux-next.orig/mm/page-writeback.c	2011-12-02 10:16:21.000000000 +0800
> > +++ linux-next/mm/page-writeback.c	2011-12-02 14:28:44.000000000 +0800
> > @@ -1182,6 +1182,14 @@ pause:
> >  		if (task_ratelimit)
> >  			break;
> >  
> > +		/*
> > +		 * In the case of an unresponding NFS server and the NFS dirty
> > +		 * pages exceeds dirty_thresh, give the other good bdi's a pipe
> > +		 * to go through, so that tasks on them still remain responsive.
> > +		 */
> > +		if (bdi_dirty < 8)
> > +			break;
> 
> What happens if the local disk has nine dirty pages?

The 9 dirty pages will be cleaned by the flusher (likely in one shot),
so after a while the dirtier task can dirty 8 more pages. This
producer-consumer cycle can keep going as long as the magic number
chosen is >= 1.
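
To make that cycle concrete, here is a toy user-space model (not kernel
code). The function name simulate_dirtier() and the "one round = dirtier
runs until throttled, then the flusher cleans everything" schedule are
assumptions made only for illustration; the real flusher and throttling
paths are far more involved.

```c
#include <assert.h>

/*
 * Toy model of the proposed "bdi_dirty < N" escape on a healthy bdi.
 * Hypothetical helper, not part of mm/page-writeback.c.
 *
 * bdi_dirty:     dirty pages on this bdi at the start
 * escape_thresh: the magic number from the patch (8 there)
 * rounds:        number of dirtier/flusher rounds to simulate
 *
 * Returns how many pages the dirtier managed to dirty in total.
 */
unsigned long simulate_dirtier(unsigned long bdi_dirty,
			       unsigned long escape_thresh,
			       unsigned int rounds)
{
	unsigned long total_dirtied = 0;
	unsigned int i;

	for (i = 0; i < rounds; i++) {
		/* Dirtier: let through only while below the escape threshold. */
		while (bdi_dirty < escape_thresh) {
			bdi_dirty++;
			total_dirtied++;
		}
		/* The dirtier is throttled at this point. */
		assert(bdi_dirty >= escape_thresh);

		/* Flusher: assumed to clean all dirty pages in one shot. */
		bdi_dirty = 0;
	}
	return total_dirtied;
}
```

Starting from 9 dirty pages with a threshold of 8, the first round makes
no progress (the dirtier is blocked until the flusher runs), and every
later round dirties 8 pages. With a threshold of 0 the dirtier never
gets through, which is why the number has to be at least 1.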

> Also: please, no more magic numbers.  We have too many in there already.

Good point. Let's add a comment explaining the number chosen?

> What to do instead?  Perhaps arrange for devices which can block in
> this fashion to be identified as such in their backing_device and then
> prevent the kernel from ever permitting such devices to fully consume
> the dirty-page pool.

Yeah, that was considered too. Unfortunately it's not as simple and
elegant as the proposed patch. For example, if all NFS mounts are given
the same "lowered" limit, there is still the problem that when one NFS
mount breaks, the other NFS mounts are all impacted.

> If someone later comes along and decreases the dirty limits mid-flight,
> I guess the same problem occurs.  This can perhaps be handled by not
> permitting to limit to be set that low at that time.

Yes! Not long ago we introduced @global_dirty_limit and
update_dirty_limit() exactly for fixing that case. The comment says:

/*
 * The global dirtyable memory and dirty threshold could be suddenly knocked
 * down by a large amount (eg. on the startup of KVM in a swapless system).
 * This may throw the system into deep dirty exceeded state and throttle
 * heavy/light dirtiers alike. To retain good responsiveness, maintain
 * global_dirty_limit for tracking slowly down to the knocked down dirty
 * threshold.
 */
static void update_dirty_limit(unsigned long thresh, unsigned long dirty)
{
...


Thanks,
Fengguang
