linux-kernel - Re: [PATCH 0/15] Per-bdi writeback flusher threads v10

Message-ID: <20090618051338.GH11363@kernel.dk>
Date:	Thu, 18 Jun 2009 07:13:39 +0200
From:	Jens Axboe <jens.axboe@...cle.com>
To:	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
Cc:	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	chris.mason@...cle.com, david@...morbit.com, hch@...radead.org,
	akpm@...ux-foundation.org, jack@...e.cz, richard@....demon.co.uk,
	damien.wyart@...e.fr, dedekind1@...il.com, fweisbec@...il.com
Subject: Re: [PATCH 0/15] Per-bdi writeback flusher threads v10

On Thu, Jun 18 2009, Zhang, Yanmin wrote:
> On Tue, 2009-06-16 at 21:53 +0200, Jens Axboe wrote:
> > On Tue, Jun 16 2009, Jens Axboe wrote:
> > > On Tue, Jun 16 2009, Zhang, Yanmin wrote:
> > > > On Fri, 2009-06-12 at 14:54 +0200, Jens Axboe wrote:
> > > > > Hi,
> > > > > 
> > > > > Here's the 10th version of the writeback patches. Changes since v9:
> > > > > 
> > > > > - Fix bdi task exit race leaving work on the list, flush it after we
> > > > >   know we cannot be found anymore.
> > > > > - Rename flusher tasks from bdi-foo to flush-foo. Should make it more
> > > > >   clear to the casual observer.
> > > > > - Fix a problem with the btrfs bdi register patch that would spew
> > > > >   warnings for > 1 mounted btrfs file system.
> > > > > - Rebase to current -git, there were some conflicts with the latest work
> > > > >   from viro/hch.
> > > > > - Fix a block layer core problem where stacked devices would overwrite
> > > > >   the bdi state, causing problems and warning spew.
> > > > > - In bdi_writeback_all(), in the rare occurrence of a work allocation
> > > > >   failure, restart scanning from the beginning. Then we can drop the
> > > > >   bdi_lock mutex before diving into bdi specific writeback.
> > > > > - Convert bdi_lock to a spinlock.
> > > > > - Use spin_trylock() in bdi_writeback_all(), if this isn't a data
> > > > >   integrity writeback. Debatable, I kind of like it...
> > > > > - Get rid of BDI_CAP_FLUSH_FORKER, just check for match with the
> > > > >   default_backing_dev_info.
> > > > > - Fix race in list checking in bdi_forker_task().
> > > > > 
> > > > > 
> > > > > For ease of patching, I've put the full diff here:
> > > > > 
> > > > >   http://kernel.dk/writeback-v10.patch
> > > > Jens,
> > > > 
> > > > I applied the patch to 2.6.30 and got a conflict. The attachment is
> > > > the patch I ported to 2.6.30. Did I miss anything?
> > > > 
> > > > 
> > > > With the patch, the kernel reports the messages below on 2 machines.
> > > > 
> > > > INFO: task sync:29984 blocked for more than 120 seconds.
> > > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > sync          D ffff88002805e300  6168 29984  24581
> > > >  ffff88022f84b780 0000000000000082 7fffffffffffffff ffff880133dbfe70
> > > >  0000000000000000 ffff88022e2b4c50 ffff88022e2b4fd8 00000001000c7bb8
> > > >  ffff88022f513fd0 ffff880133dbfde8 ffff880133dbfec8 ffff88022d5d13c8
> > > > Call Trace:
> > > >  [<ffffffff802b69e4>] ? bdi_sched_wait+0x0/0xd
> > > >  [<ffffffff80780fde>] ? schedule+0x9/0x1d
> > > >  [<ffffffff802b69ed>] ? bdi_sched_wait+0x9/0xd
> > > >  [<ffffffff8078158d>] ? __wait_on_bit+0x40/0x6f
> > > >  [<ffffffff802b69e4>] ? bdi_sched_wait+0x0/0xd
> > > >  [<ffffffff80781628>] ? out_of_line_wait_on_bit+0x6c/0x78
> > > >  [<ffffffff8024a426>] ? wake_bit_function+0x0/0x23
> > > >  [<ffffffff802b67ac>] ? bdi_writeback_all+0x12a/0x152
> > > >  [<ffffffff802b6805>] ? generic_sync_sb_inodes+0x31/0xde
> > > >  [<ffffffff802b6935>] ? sync_inodes_sb+0x83/0x88
> > > >  [<ffffffff802b6980>] ? __sync_inodes+0x46/0x8f
> > > >  [<ffffffff802b94f2>] ? do_sync+0x36/0x5a
> > > >  [<ffffffff802b9538>] ? sys_sync+0xe/0x12
> > > >  [<ffffffff8020b9ab>] ? system_call_fastpath+0x16/0x1b
> > > 
> > > I don't think it is your backport, for some reason the v10 missed a
> > > change that I think could solve this race. If not, there's another in
> > > there that I need to look at.
> > > 
> > > So against your current base, could you try with the below added as
> > > well? The printk() is just so we can see if this triggers for you or
> > > not.
> > 
> > OK, that won't work, since we need to actually wait for the work to be
> > flushed, otherwise we wreck things when we free the bdi immediately
> > after that.
> > 
> > Can you try with this patch?
> Jens,
> 
> I tested the patch below on 4 machines (running all fio sub-test cases
> twice, which takes more than 10 hours). The previous 2 machines didn't
> hang this time. Unfortunately, the 3rd machine hung. I double-checked the
> disassembled kernel code and made sure bdi_start_fn really calls
> wb_do_writeback.

Sorry, I should have made that clearer when posting v11. This patch
won't fully solve the problem; however, the v11 patch series should. So
if you test with that, hopefully all soft hangs will be gone.

-- 
Jens Axboe
