Date:	Wed, 27 Apr 2016 20:31:38 +0200
From:	Lucas Stach <dev@...xeye.de>
To:	Dave Chinner <david@...morbit.com>
Cc:	Brian Foster <bfoster@...hat.com>, linux-kernel@...r.kernel.org,
	xfs@....sgi.com
Subject: Re: [PATCH] xfs: idle aild if the AIL is pushed up to the target LSN

On Tuesday, 26.04.2016 at 09:08 +1000, Dave Chinner wrote:
[...]
> > 
> > > 
> > > That said, I'm not sure whether there's a notable benefit of idling
> > > for 50ms over just scheduling out when we've hit the target lsn. It
> > > seems like that anybody who pushes the target forward again is going
> > > to wake up the thread anyways. On the other hand, if the fs is idle
> > > the thread will eventually schedule out indefinitely.
> > Is this a problem? The patch tries to do exactly that: schedule out
> > aild indefinitely when there is no more work to do as nobody is
> > pushing the target LSN forward.
> If the filesystem is slowly being dirtied, then the aild shouldn't
> really idle at all.
> 
> Keep in mind that the xfsaild has multiple functions, one of which
> is a watchdog that catches log space stalls that would otherwise
> hang the filesystem. Every time we've removed the watchdog function
> (i.e. aggressively idle the aild) we've had users report random,
> unreproducible hangs/stalls that have gone away when the watchdog
> function (i.e. don't idle until the log is covered and completely
> idle) was re-instated...
> 
I can only see xfsaild_push() doing any work after it has hit the
target LSN if something moves the target LSN forward. You say that
aggressively idling aild might produce log stalls, which would imply
there are races in the code where a code path that moves the target LSN
forward doesn't properly wake up aild.
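
To make the kind of race in question concrete, here is a minimal userspace
sketch in Python. It is not XFS code: the class AilWorker and its methods
are hypothetical names invented for illustration, and an integer stands in
for the LSN. It only shows the general pattern where a worker can sleep
without a timeout, provided that every update of the target and the
matching wake-up happen under the same lock the worker holds when it
decides to sleep.

```python
import threading


class AilWorker:
    """Toy model (hypothetical, not XFS) of a worker that idles at its target."""

    def __init__(self):
        self.cond = threading.Condition()
        self.target = 0    # stand-in for the push target LSN
        self.pushed = 0    # how far the worker has pushed so far
        self.stop = False

    def set_target(self, lsn):
        # Update and wake-up are done under the lock the worker holds while
        # checking its sleep condition, so the update cannot fall between
        # the worker's check and its sleep (no lost wake-up).
        with self.cond:
            if lsn > self.target:
                self.target = lsn
                self.cond.notify_all()

    def run(self):
        with self.cond:
            while not self.stop:
                if self.pushed >= self.target:
                    # Nothing to do: idle indefinitely, with no timeout.
                    self.cond.wait()
                    continue
                # "Push" up to the current target and report progress.
                self.pushed = self.target
                self.cond.notify_all()

    def wait_pushed(self, lsn, timeout=5.0):
        # Block until the worker has reached lsn, or the timeout expires.
        with self.cond:
            return self.cond.wait_for(lambda: self.pushed >= lsn, timeout)

    def shutdown(self):
        with self.cond:
            self.stop = True
            self.cond.notify_all()
```

If set_target() changed the target without taking the lock, or skipped the
notify, the worker could check its condition, miss the update, and then
sleep forever. A periodic timeout would only hide such a bug by bounding
the stall, which is the point of the question below.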

Wouldn't this problem also be present with non-aggressive idling of
aild, just with a significantly lower probability of hitting it? The
commit that re-enabled non-aggressive aild idling specifically mentions
some races that have been fixed, and I think those fixes should allow
for aggressive aild idling as well. If they are insufficient, it
wouldn't be safe to idle aild at all, right?

Regards,
Lucas
