Date:	Tue, 20 Feb 2007 09:47:11 +0100
From:	Miklos Szeredi <miklos@...redi.hu>
To:	chris.mason@...cle.com
CC:	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org
Subject: Re: dirty balancing deadlock

> > How about this?
> > 
> > Solves the FUSE deadlock, but not the throttle_vm_writeout() one.
> > I'll try to tackle that one as well.
> > 
> > If the per-bdi dirty counter goes below 16, balance_dirty_pages()
> > returns.
> > 
> > Does the constant need to be tunable?  If it's too large, then the global
> > threshold is more easily exceeded.  If it's too small, then in a tight
> > situation progress will be slower.
> 
> Ok, what is supposed to happen here is that filesystems are supposed to
> be throttled from making more dirty pages when the system is over the
> threshold.  Even if filesystem A doesn't have much to contribute, and
> filesystem B is the cause of 99% of the dirty pages, the goal of the
> threshold is to prevent more dirty data from happening, and filesystem A
> should block.

Which is the cause of the current deadlock.  But if we allow
filesystem A to go into the red just a little, the deadlock is
avoided, because A can continue to make progress with cleaning the
dirty pages produced by B.
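
As a rough illustration of that rule, here is a toy model in Python.
It is not the patch and not kernel code: should_throttle() and
BDI_MIN_DIRTY are made-up names, and the real balance_dirty_pages()
does a lot more than this.  It only shows the decision being proposed.

```python
# Toy model of the proposed early return; not kernel code.
# BDI_MIN_DIRTY and should_throttle() are hypothetical names.
BDI_MIN_DIRTY = 16  # per-bdi threshold proposed in this thread


def should_throttle(global_dirty, global_limit, bdi_dirty,
                    bdi_min=BDI_MIN_DIRTY):
    """Return True if a writer to this bdi should block."""
    if global_dirty <= global_limit:
        return False  # system is under the global threshold
    if bdi_dirty < bdi_min:
        return False  # this queue is nearly clean: let it make progress
    return True       # over the limit and this queue contributes to it


# Filesystem A has 3 dirty pages, B has 1990, global limit is 2000.
print(should_throttle(2100, 2000, bdi_dirty=3))     # writer to A
print(should_throttle(2100, 2000, bdi_dirty=1990))  # writer to B
```

In this model a writer to A is let through even though the system is
over the global limit, while a writer to B still blocks.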

The maximum amount by which filesystems can exceed the limit will be

  (16 + epsilon) * number-of-queues

This is usually insignificant compared to the limit itself (~2000
pages on a machine with 32MB of RAM).
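
To put numbers on that bound, a back-of-the-envelope sketch follows.
max_overshoot() is a hypothetical helper, and epsilon = 8 is only an
assumption (borrowed from the 8-page ratelimit figure); the real value
of epsilon is not pinned down here.

```python
# Back-of-the-envelope bound from the formula above; not kernel code.
# max_overshoot() and the epsilon default are assumptions for illustration.
def max_overshoot(num_queues, per_bdi_min=16, epsilon=8):
    """Worst-case pages over the global limit: (16 + epsilon) * queues."""
    return (per_bdi_min + epsilon) * num_queues


LIMIT_PAGES = 2000  # approximate limit on a 32MB machine, per the text

for queues in (1, 4, 16):
    over = max_overshoot(queues)
    share = over / LIMIT_PAGES
    print(f"{queues:3d} queues: up to {over} pages over ({share:.1%} of limit)")
```

With a handful of queues the overshoot stays at a few percent of the
limit under these assumptions.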

However, with thousands of FUSE mounts this may become a problem, as
each filesystem gets a separate queue.  In theory, just 2 pages per
queue are enough to always make progress, but the current dirty
balancing can't enforce this, as the ratelimit is at least 8 pages.
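
The scaling concern can be sketched the same way.  The helper below is
hypothetical and ignores epsilon; it just asks how many queues it takes
for the combined per-queue allowance to equal the ~2000-page limit
quoted above.

```python
# Rough scaling check; not kernel code. All names are hypothetical.
LIMIT_PAGES = 2000  # approximate global limit quoted above (32MB machine)


def queues_to_reach_limit(per_queue_pages, limit_pages=LIMIT_PAGES):
    """Number of queues whose combined allowance equals the global limit."""
    return limit_pages // per_queue_pages


print(queues_to_reach_limit(16))  # allowance proposed in this thread
print(queues_to_reach_limit(2))   # theoretical minimum for progress
```

Under these assumptions a 16-page allowance adds up to the whole limit
at about 125 queues, while a 2-page allowance would stretch that to
about 1000.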

So there may have to be some stricter page accounting within FUSE
itself, but I think that doesn't change the overall concept.

Miklos