linux-kernel - Re: [PATCH 0/15] Per-bdi writeback flusher threads v10

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <1245202532.2560.402.camel@ymzhang>
Date:	Wed, 17 Jun 2009 09:35:32 +0800
From:	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
To:	Jens Axboe <jens.axboe@...cle.com>
Cc:	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	chris.mason@...cle.com, david@...morbit.com, hch@...radead.org,
	akpm@...ux-foundation.org, jack@...e.cz, richard@....demon.co.uk,
	damien.wyart@...e.fr, dedekind1@...il.com, fweisbec@...il.com
Subject: Re: [PATCH 0/15] Per-bdi writeback flusher threads v10

On Tue, 2009-06-16 at 10:00 +0200, Jens Axboe wrote:
> On Tue, Jun 16 2009, Zhang, Yanmin wrote:
> > On Fri, 2009-06-12 at 14:54 +0200, Jens Axboe wrote:
> > > Hi,
> > > 
> > > Here's the 10th version of the writeback patches. Changes since v9:
> > > 
> > > - Fix bdi task exit race leaving work on the list, flush it after we
> > >   know we cannot be found anymore.
> > > - Rename flusher tasks from bdi-foo to flush-foo. Should make it more
> > >   clear to the casual observer.
> > > - Fix a problem with the btrfs bdi register patch that would spew
> > >   warnings for > 1 mounted btrfs file system.
> > > - Rebase to current -git, there were some conflicts with the latest work
> > >   from viro/hch.
> > > - Fix a block layer core problem were stacked devices would overwrite
> > >   the bdi state, causing problems and warning spew.
> > > - In bdi_writeback_all(), in the race occurence of a work allocation
> > >   failure, restart scanning from the beginning. Then we can drop the
> > >   bdi_lock mutex before diving into bdi specific writeback.
> > > - Convert bdi_lock to a spinlock.
> > > - Use spin_trylock() in bdi_writeback_all(), if this isn't a data
> > >   integrity writeback. Debatable, I kind of like it...
> > > - Get rid of BDI_CAP_FLUSH_FORKER, just check for match with the
> > >   default_backing_dev_info.
> > > - Fix race in list checking in bdi_forker_task().
> > > 
> > > 
> > > For ease of patching, I've put the full diff here:
> > > 
> > >   http://kernel.dk/writeback-v10.patch
> > Jens,
> > 
> > I applied the patch to 2.6.30 and got a confliction. The attachment is
> > the patch I ported to 2.6.30. Did I miss anything?
> > 
> > 
> > With the patch, kernel reports below messages on 2 machines.
> > 
> > INFO: task sync:29984 blocked for more than 120 seconds.
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > sync          D ffff88002805e300  6168 29984  24581
> >  ffff88022f84b780 0000000000000082 7fffffffffffffff ffff880133dbfe70
> >  0000000000000000 ffff88022e2b4c50 ffff88022e2b4fd8 00000001000c7bb8
> >  ffff88022f513fd0 ffff880133dbfde8 ffff880133dbfec8 ffff88022d5d13c8
> > Call Trace:
> >  [<ffffffff802b69e4>] ? bdi_sched_wait+0x0/0xd
> >  [<ffffffff80780fde>] ? schedule+0x9/0x1d
> >  [<ffffffff802b69ed>] ? bdi_sched_wait+0x9/0xd
> >  [<ffffffff8078158d>] ? __wait_on_bit+0x40/0x6f
> >  [<ffffffff802b69e4>] ? bdi_sched_wait+0x0/0xd
> >  [<ffffffff80781628>] ? out_of_line_wait_on_bit+0x6c/0x78
> >  [<ffffffff8024a426>] ? wake_bit_function+0x0/0x23
> >  [<ffffffff802b67ac>] ? bdi_writeback_all+0x12a/0x152
> >  [<ffffffff802b6805>] ? generic_sync_sb_inodes+0x31/0xde
> >  [<ffffffff802b6935>] ? sync_inodes_sb+0x83/0x88
> >  [<ffffffff802b6980>] ? __sync_inodes+0x46/0x8f
> >  [<ffffffff802b94f2>] ? do_sync+0x36/0x5a
> >  [<ffffffff802b9538>] ? sys_sync+0xe/0x12
> >  [<ffffffff8020b9ab>] ? system_call_fastpath+0x16/0x1b
> 
> I don't think it is your backport, for some reason the v10 missed a
> change that I think could solve this race. If not, there's another in
> there that I need to look at.
> 
> So against your current base, could you try with the below added as
> well? The printk() is just so we can see if this triggers for you or
> not.
> 
> diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> index b3e80c5..a065da5 100644
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -384,6 +384,15 @@ static int bdi_start_fn(void *ptr)
>  	 */
>  	synchronize_srcu(&bdi->srcu);
>  
> +	/*
> +	 * Flush any pending work. No more can be added, since
> +	 * the bdi is no longer discoverable.
> +	 */
> +	if (!list_empty(&bdi->work_list)) {
> +		printk("bdi: flushing racy work\n");
I ran testings with the patch. 2 machines reported the same issues when do sync
just after mounting filesystems for testing.

These 2 machines did print out above info. One printed it just before dumping the
blocking info.

I ran a series of test cases. Every case does umount filesystems, then mount
them again for testing.

> +		wb_do_writeback(wb);
> +	}
> +
>  	bdi_put_wb(bdi, wb);
>  	return ret;
>  }
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/