linux-kernel - Re: Sync writeback still broken

Message-ID: <20101105213324.GA25520@quack.suse.cz>
Date:	Fri, 5 Nov 2010 22:33:24 +0100
From:	Jan Kara <jack@...e.cz>
To:	Jan Engelhardt <jengelh@...ozas.de>
Cc:	Jan Kara <jack@...e.cz>, Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Jens Axboe <jens.axboe@...cle.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>, stable@...nel.org,
	gregkh@...e.de
Subject: Re: Sync writeback still broken

On Sun 31-10-10 23:40:12, Jan Kara wrote:
> On Sun 31-10-10 13:24:37, Jan Kara wrote:
> > On Mon 25-10-10 01:41:48, Jan Engelhardt wrote:
> > > On Sunday 2010-06-27 18:44, Jan Engelhardt wrote:
> > > >On Monday 2010-02-15 16:41, Jan Engelhardt wrote:
> > > >>On Monday 2010-02-15 15:49, Jan Kara wrote:
> > > >>>On Sat 13-02-10 13:58:19, Jan Engelhardt wrote:
> > > >>>> >> 
> > > >>>> >> This fixes it by using the passed in page writeback count, instead of
> > > >>>> >> doing MAX_WRITEBACK_PAGES batches, which gets us much better performance
> > > >>>> >> (Jan reports it's up from ~400KB/sec to 10MB/sec) and makes sync(1)
> > > >>>> >> finish properly even when new pages are being dirtied.
> > > >>>> >
> > > >>>> >This seems broken.
> > > >>>> 
> > > >>>> It seems so. Jens, Jan Kara, your patch does not entirely fix this.
> > > >>>> While there is no sync/fsync to be seen in these traces, I can
> > > >>>> tell there's a livelock, without Dirty decreasing at all.
> > > >
> > > >What ultimately became of the discussion and/or the patch? 
> > > >
> > > >Your original ad-hoc patch certainly still does its job; had no need to 
> > > >reboot in 86 days and still counting.
> > > 
> > > I still observe this behavior on 2.6.36-rc8. This is starting to 
> > > get frustrating, so I will be happily following akpm's advice to 
> > > poke people.
> >   Yes, that's a good way :)
> > 
> > > Thread entrypoint: http://lkml.org/lkml/2010/2/12/41
> > > 
> > > Previously, many concurrent extractions of tarballs and so on have been 
> > > one way to trigger the issue; I now also have a rather small testcase 
> > > (below) that freezes the box here (which has 24G RAM, so even if I
> > > neglect to call msync, I should be fine) sometime after memset finishes.
> >   I've tried your test but didn't succeed in freezing my laptop.
> > Everything was running smooth, the machine even felt reasonably responsive
> > although constantly reading and writing to disk. Also sync(1) finished in a
> > couple of seconds as one would expect in an optimistic case.
> >   Needless to say that my laptop has only 1G of ram so I had to downsize
> > the hash table from 16G to 1G to be able to run the test and the disk is
> > Intel SSD so the performance of the backing storage compared to the amount
> > of needed IO is much in my favor.
> >   OK, so I've taken a machine with standard rotational drive and 28GB of
> > ram and there I can see sync(1) hanging (but otherwise the machine looks
> > OK). Investigating further...
>   So with the writeback tracing, I verified that indeed the trouble is that
> work queued by sync(1) gets queued behind the background writeback which is
> just running. And background writeback won't stop because your process is
> dirtying pages so aggressively. Actually, it would stop after writing
> LONG_MAX pages but that's effectively infinity. I have a patch
> (e.g. http://www.kerneltrap.com/mailarchive/linux-fsdevel/2010/8/3/6886244)
> to stop background writeback when other work is queued but it's kind
> of hacky so I can see why Christoph doesn't like it ;)
>   So I'll have to code something different to fix this issue...
  OK, so at Kernel Summit we agreed to fix the issue the way I originally
wanted, with the patches
http://marc.info/?l=linux-fsdevel&m=128861735213143&w=2
and
http://marc.info/?l=linux-fsdevel&m=128861734813131&w=2

  I needed one more patch to resolve the issue (attached), which I've just
posted for review and possible inclusion. I had a similar one a long time ago
but now I'm better able to explain why it works because of tracepoints.
Yay! ;). With those three patches I'm not able to trigger livelocks (but
sync still takes 15 minutes because the throughput to disk is about 4MB/s -
no big surprise given the random nature of the load)
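
  To illustrate the queueing problem described in the quoted text above
(sync work stuck behind background writeback that never finishes), here is
a toy model in Python. It is only a sketch and not kernel code: the function
name, the threshold, the page counts and the single FIFO queue are all
hypothetical simplifications. It only shows why background work that keeps
running while pages are dirtied starves the queued sync work, and why
letting background work stop when other work is queued avoids that.

```python
from collections import deque

# Hypothetical value: background writeback runs while dirty pages exceed this.
BACKGROUND_THRESHOLD = 100


def ticks_until_sync_done(dirty_rate, write_rate, yield_to_queued_work,
                          max_ticks=10_000):
    """Toy model of one flusher thread serving a FIFO work queue.

    Returns the tick at which the sync work item completes, or None if it
    has not completed within max_ticks (i.e. it is effectively livelocked).
    """
    dirty = 1_000                            # pages dirty at the start (assumed)
    queue = deque(["background", "sync"])    # sync is queued behind background
    sync_remaining = None                    # pages sync still has to write

    for tick in range(1, max_ticks + 1):
        dirty += dirty_rate                  # the process keeps dirtying pages
        work = queue[0]

        if work == "background":
            over_threshold = dirty > BACKGROUND_THRESHOLD
            preempted = yield_to_queued_work and len(queue) > 1
            if not over_threshold or preempted:
                queue.popleft()              # background work ends, next item runs
                continue
            dirty -= min(write_rate, dirty)  # write one batch, stay in background
        else:
            if sync_remaining is None:
                # Sync only has to write what was dirty when it started.
                sync_remaining = dirty
            written = min(write_rate, sync_remaining)
            sync_remaining -= written
            dirty -= written
            if sync_remaining == 0:
                return tick

    return None


if __name__ == "__main__":
    # Dirtier keeps pace with the disk: background work never drops below
    # the threshold, so sync never gets to run.
    print(ticks_until_sync_done(50, 50, yield_to_queued_work=False))
    # Same load, but background work yields to queued work: sync completes.
    print(ticks_until_sync_done(50, 50, yield_to_queued_work=True))
```

  In this model the first call returns None and the second returns a finite
tick count. Note that the sync item here has a fixed amount of work (the
pages dirty when it started); that bound is an assumption of the sketch.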

								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR

View attachment "0001-mm-Avoid-livelocking-of-WB_SYNC_ALL-writeback.patch" of type "text/x-patch" (3001 bytes)