linux-kernel - Re: [PATCH] xfs: idle aild if the AIL is pushed up to the target LSN

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20160425230833.GC18496@dastard>
Date:	Tue, 26 Apr 2016 09:08:33 +1000
From:	Dave Chinner <david@...morbit.com>
To:	Lucas Stach <dev@...xeye.de>
Cc:	Brian Foster <bfoster@...hat.com>, linux-kernel@...r.kernel.org,
	xfs@....sgi.com
Subject: Re: [PATCH] xfs: idle aild if the AIL is pushed up to the target LSN

On Mon, Apr 25, 2016 at 08:11:37PM +0200, Lucas Stach wrote:
> Am Montag, den 25.04.2016, 10:24 -0400 schrieb Brian Foster:
> > On Mon, Apr 25, 2016 at 09:42:43AM +0200, Lucas Stach wrote:
> > > 
> > > The current logic only idles aild if the AIL is empty, rescheduling
> > > the worker with a timeout of 50ms otherwise. If the target LSN
> > > isn't
> > > moved forward, the worker will not make any progress as it only
> > > pushes the AIL up to the target LSN, leading to the empty AIL
> > > condition to only be met after the log worker pushed the complete
> > > AIL. As the log worker runs on the scale of minutes, this leaves
> > > aild in the "no progress, 50ms timeout" state for extended periods
> > > of
> > > time, causing many unecessary wakeups.
> > > 
> > Where do you get the "scale of minutes" from? I believe the default
> > log
> > worker interval is 30s, though it can be increased by the admin.
> > 
> Uh, right. It's 30s, which is still a very long time when we consider
> that new items can be added to the AIL right after the log worker
> forced it out.

Standard writeback timeout for data and metadata.

> > The other thing to note is that the AIL target is also driven by
> > demand
> > on log space. E.g., see xlog_grant_push_ail(), which executes when a
> > transaction attempts to reserve log space that isn't currently
> > available
> > in the log.
> > 
> And that's where I think the problem stems from: the above function
> pushes the AIL only up to a target LSN to make room for the required
> log space. Only the log worker empties the AIL completely by issuing a
> xfs_ail_push_all_sync().

xfs_ail_push_all(), actually, and the log worker runs every 30s by
default. i.e. we move the target to the end to the AIL to try to
empty it completely every 30s.

> > That said, I'm not sure whether there's a notable benefit of idling
> > for
> > 50ms over just scheduling out when we've hit the target lsn. It seems
> > like that anybody who pushes the target forward again is going to
> > wake
> > up the thread anyways. On the other hand, if the fs is idle the
> > thread
> > will eventually schedule out indefinitely. 
> 
> Is this a problem? The patch tries to do exactly that: schedule out
> aild indefinitely when there is no more work to do as nobody is pushing
> the target LSN forward.

If the filesystem is slowly being dirtied, then the aild should't
really idle at all.i

Keep in mind that the xfsaild has multiple functions, one of which
is a watchdog that catches log space stalls that would otherwise
hang the filesystem. Every time we've removed the watchdog function
(i.e.  agressively idle the aild) we've had users report random,
unreproducable hangs/stalls that have gone away when the watchdog
function (i.e. don't idle until the log is covered and completely
idle) was re-instated...

> On an idle FS this will stop aild when the target LSN is hit, will wake
> it up again as soon as the log worker requests the complete AIL to be
> pushed out, then it goes back to sleep indefinitely until the log fills
> up again.

The behaviour suggests that your filesystem is not idle. The
filesystem takes up to 90s to be marked idle (log needs to be
covered, state machine takes 3x30s cycles to transition to idle
"covered" state.

If you want the filesytsem to idle quickly, then run the log worker
more frequently to get the target updated more quickly. This will
also speed up the log covering state machine as well.

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com