Date:	Wed, 10 Dec 2014 23:11:47 -0000
From:	"Ming Lin" <mlin@...ggr.net>
To:	"Kent Overstreet" <kmo@...erainc.com>
Cc:	"Ming Lin" <mlin@...ggr.net>,
	"Dongsu Park" <dongsu.park@...fitbricks.com>,
	linux-fsdevel@...r.kernel.org,
	"lkml" <linux-kernel@...r.kernel.org>,
	"Jens Axboe" <axboe@...nel.dk>,
	"Christoph Hellwig" <hch@...radead.org>
Subject: Re: Block layer projects that I haven't had time for

> On Wed, Dec 10, 2014 at 02:42:14PM -0800, Ming Lin wrote:
>> On Mon, Dec 8, 2014 at 3:48 AM, Dongsu Park
>> <dongsu.park@...fitbricks.com> wrote:
>> > Thanks for the reply.
>> >
>> > On 05.12.2014 19:02, Kent Overstreet wrote:
>> >> On Thu, Dec 04, 2014 at 12:00:27PM +0100, Dongsu Park wrote:
>> >> > Playing a little with your block_stuff tree based on 3.15, however,
>> >> > I think there still seems to be a couple of issues.
>> >> > First of all, it doesn't work with virtio-blk. A testing Qemu VM panics
>> >> > at the very early stage of booting. This issue should be addressed as
>> >> > the first step, so that other parts can be tested.
>> >>
>> >> Really? I was testing with virtio-blk, that's odd..
>> >
>> > The culprit seems to be the plugging commit.
>> > Before that change, it works well also with virtio-blk.
>> > Though that's not the only issue...
>> >
>> >> > Moreover, I've already tried to rebase these patches on top of current
>> >> > mainline, 3.18-rc7. It's now compilable, but it seems to introduce
>> >> > more bugs about direct-IO. I didn't manage to find out the reason.
>> >> > I'd need to also look at the previous review comments in [1], [2].
>> >> >
>> >> > Don't you have other trees based on top of 3.17 or higher?
>> >> > If not, can I create my own tree based on 3.18-rc7 to publish?
>> >>
>> >> Yeah, I'd post what you have now and I'll try and take a look.
>> >
>> > I've created a git tree to include what I have right now.
>> > Please see <https://github.com/dongsupark/linux>.
>> >
>> > To be able to handle different issues one by one,
>> > I got the entire tree separated out into 4 branches based on 3.18.
>> >
>> > * block-generic-req-for-next : the most stable branch you can test with.
>> >   With this branch, you can test most of block drivers as well as file
>> >   systems with less critical bugs. Though it's not 100% perfect yet,
>> >   e.g. btrfs doesn't seem to work quite well. Thus more tests are needed.
>> >
>> > * block-mpage-bvecs-for-next : block-generic-req-for-next + multipage bvecs.
>> >   This branch shows a critical issue that writing blocks to ext4 rootfs
>> >   causes the whole system to crash. Need-to-investigate.
>>
>> I tried the block-mpage-bvecs-for-next branch on qemu-kvm with an ext4 rootfs.
>> Running "sync" gets stuck in the kernel.
>>
>> [  480.751901] INFO: task sync:4424 blocked for more than 120 seconds.
>> [  480.753064]       Not tainted 3.18.0-00025-g46c8231 #39
>> [  480.753720] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [  480.754737] sync            D ffff88001fc11180     0  4424   4338 0x00000000
>> [  480.755719]  ffff88001cdfbc98 0000000000000086 ffff88001cdfbba8 ffff880014adefc0
>> [  480.756810]  0000000000011180 0000000000004000 ffffffff81813460 ffff880014adefc0
>> [  480.758102]  ffff88001cdfbbc8 ffffffff812f08be ffff88001cdfbc18 ffff880014adf028
>> [  480.759454] Call Trace:
>> [  480.759852]  [<ffffffff812f08be>] ? debug_smp_processor_id+0x17/0x19
>> [  480.760609]  [<ffffffff8106093e>] ? __enqueue_entity+0x69/0x6b
>> [  480.761318]  [<ffffffff8106017e>] ? __dequeue_entity+0x33/0x38
>> [  480.762026]  [<ffffffff810601ab>] ? set_next_entity+0x28/0x7d
>> [  480.762739]  [<ffffffff8105a4fb>] ? get_parent_ip+0xf/0x3f
>> [  480.763425]  [<ffffffff8108562b>] ? ktime_get+0x50/0x8f
>> [  480.763848]  [<ffffffff8148abdb>] ? bit_wait_timeout+0x60/0x60
>> [  480.764555]  [<ffffffff8148a6be>] schedule+0x6a/0x6c
>> [  480.765186]  [<ffffffff8148a74f>] io_schedule+0x8f/0xcd
>> [  480.765841]  [<ffffffff8148ac19>] bit_wait_io+0x3e/0x42
>> [  480.766493]  [<ffffffff8148ae80>] __wait_on_bit+0x4d/0x86
>> [  480.767183]  [<ffffffff810d4302>] ? find_get_pages_tag+0x106/0x133
>> [  480.767847]  [<ffffffff810d4a63>] wait_on_page_bit+0x76/0x78
>> [  480.768532]  [<ffffffff8106ab59>] ? wake_atomic_t_function+0x2d/0x2d
>> [  480.769262]  [<ffffffff810d511f>] filemap_fdatawait_range+0x7e/0x11d
>> [  480.769992]  [<ffffffff8148a639>] ? preempt_schedule+0x36/0x51
>> [  480.770677]  [<ffffffff8105a4fb>] ? get_parent_ip+0xf/0x3f
>> [  480.771848]  [<ffffffff810d51df>] filemap_fdatawait+0x21/0x23
>> [  480.772530]  [<ffffffff811458ce>] sync_inodes_sb+0x158/0x1aa
>> [  480.773201]  [<ffffffff81480303>] ? br_mdb_dump+0x225/0x495
>> [  480.773885]  [<ffffffff81149ad8>] ? fdatawrite_one_bdev+0x18/0x18
>> [  480.774592]  [<ffffffff81149aec>] sync_inodes_one_sb+0x14/0x16
>> [  480.775278]  [<ffffffff81125937>] iterate_supers+0x6f/0xc4
>> [  480.775847]  [<ffffffff81149bf4>] sys_sync+0x35/0x83
>> [  480.776460]  [<ffffffff8148da52>] system_call_fastpath+0x12/0x17
>>
>>
>> Here is a quick hack.
>>
>> diff --git a/block/bio.c b/block/bio.c
>> index 4020ccc..fbc7108 100644
>> --- a/block/bio.c
>> +++ b/block/bio.c
>> @@ -829,6 +829,11 @@ int bio_add_page(struct bio *bio, struct page *page,
>>                 if (bvec_to_phys(bv) + bv->bv_len ==
>>                     page_to_phys(page) + offset) {
>>                         bv->bv_len += len;
>> +                       /*
>> +                        * Page is not added to bio vec.
>> +                        * Clear PG_writeback so filemap_fdatawait_range()
>> +                        * won't wait for it.
>> +                        */
>> +                       TestClearPageWriteback(page);
>>                         goto done;
>>                 }
>>         }
>>
>> Thanks,
>> Ming
>
> Try this fix:

Yes, that fixed the ext4 problem.

I then tried editing a file on btrfs and got these errors:

[   45.216351] BTRFS error (device sdb1): partial page write in btrfs with offset 0 and length 8192
[   45.217522] BTRFS critical (device sdb1): bad ordered accounting left 0 size 4096

Thanks,
Ming

>
> diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
> index b24a2541a9..3d2610b02e 100644
> --- a/fs/ext4/page-io.c
> +++ b/fs/ext4/page-io.c
> @@ -63,15 +63,15 @@ static void buffer_io_error(struct buffer_head *bh)
>
>  static void ext4_finish_bio(struct bio *bio)
>  {
> -	int i;
>  	int error = !test_bit(BIO_UPTODATE, &bio->bi_flags);
> -	struct bio_vec *bvec;
> +	struct bio_vec bvec;
> +	struct bvec_iter iter;
>
> -	bio_for_each_segment_all(bvec, bio, i) {
> -		struct page *page = bvec->bv_page;
> +	bio_for_each_page_all(bvec, bio, iter) {
> +		struct page *page = bvec.bv_page;
>  		struct buffer_head *bh, *head;
> -		unsigned bio_start = bvec->bv_offset;
> -		unsigned bio_end = bio_start + bvec->bv_len;
> +		unsigned bio_start = bvec.bv_offset;
> +		unsigned bio_end = bio_start + bvec.bv_len;
>  		unsigned under_io = 0;
>  		unsigned long flags;
>
>
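
For readers following along, here is a minimal userspace sketch (not kernel
code) of why the completion loop matters once bio_add_page() merges
physically contiguous pages into one bvec. It assumes, as the comment in the
quick hack above suggests, that a page merged into an existing bvec is never
visited by a loop that only looks at each entry's first page, so its
writeback flag stays set and sync waits on it. The names struct seg,
SKETCH_PAGE_SIZE, end_io_per_segment(), end_io_per_page() and count_stuck()
are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for the sketch */

/* Hypothetical stand-in for a multipage bvec: one entry may span pages. */
struct seg {
	unsigned first_page;	/* index of the first page in the run */
	unsigned offset;	/* byte offset within the first page */
	unsigned len;		/* total bytes, may exceed one page */
};

/* Old-style completion: touches only the first page of each entry. */
static void end_io_per_segment(const struct seg *segs, size_t n, bool *wb)
{
	assert(segs && wb);
	for (size_t i = 0; i < n; i++)
		wb[segs[i].first_page] = false;
}

/* Per-page completion: splits each entry at page boundaries. */
static void end_io_per_page(const struct seg *segs, size_t n, bool *wb)
{
	assert(segs && wb);
	for (size_t i = 0; i < n; i++) {
		unsigned page = segs[i].first_page +
				segs[i].offset / SKETCH_PAGE_SIZE;
		unsigned off = segs[i].offset % SKETCH_PAGE_SIZE;
		unsigned left = segs[i].len;

		while (left > 0) {
			unsigned chunk = SKETCH_PAGE_SIZE - off;

			if (chunk > left)
				chunk = left;
			wb[page++] = false;	/* end writeback on this page */
			left -= chunk;
			off = 0;
		}
	}
}

/* Number of pages still marked as under writeback. */
static size_t count_stuck(const bool *wb, size_t npages)
{
	size_t stuck = 0;

	for (size_t i = 0; i < npages; i++)
		if (wb[i])
			stuck++;
	return stuck;
}
```

With a single entry covering 8192 bytes from page 0 and both pages marked as
under writeback, end_io_per_segment() leaves one page stuck, while
end_io_per_page() clears both. The same length of 8192 appears in the btrfs
message above, which would be consistent with a btrfs completion path that
still expects one page per bvec, though that is a guess and not something
confirmed in this thread.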

