Message-ID: <x49tyrpkjmq.fsf@segfault.boston.devel.redhat.com>
Date: Mon, 05 Apr 2010 11:24:13 -0400
From: Jeff Moyer <jmoyer@...hat.com>
To: Jan Kara <jack@...e.cz>
Cc: linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org,
jens.axboe@...cle.com, esandeen@...hat.com
Subject: Re: [patch/rft] jbd2: tag journal writes as metadata I/O

Jan Kara <jack@...e.cz> writes:
> Hi,
>
>> In running iozone for writes to small files, we noticed a pretty big
>> discrepancy between the performance of the deadline and cfq I/O
>> schedulers. Investigation showed that I/O was being issued from 2
>> different contexts: the iozone process itself, and the jbd2/sdh-8 thread
>> (as expected). Because of the way cfq performs slice idling, the delays
>> introduced between the metadata and data I/Os were significant. For
>> example, cfq would see about 7MB/s versus deadline's 35 for the same
>> workload. I also tested fs_mark with writing and fsyncing 1000 64k
>> files, and a similar 5x performance difference was observed. Eric
>> Sandeen suggested that I flag the journal writes as metadata, and once I
>> did that, the performance difference went away completely (cfq has
>> special logic to prioritize metadata I/O).
>>
>> So, I'm submitting this patch for comments and testing. I have a
>> similar patch for jbd that I will submit if folks agree that this is a
>> good idea.
> This looks like a good idea to me. I'd just be careful about data=journal
> mode where even data is written via journal and thus you'd incorrectly
> prioritize all the IO. I suppose that could have a negative impact on performance
> of other filesystems on the same disk. So for data=journal mode, I'd leave
> write_op to be just WRITE / WRITE_SYNC_PLUG.

Hi, Jan, thanks for the review! I'm trying to figure out the best way
to relay the journal mode from ext3 or ext4 to jbd or jbd2. Would a new
journal flag, set in journal_init_inode, be appropriate? This wouldn't
cover the case of data journalling set per inode, though. It also puts
some ext3-specific code into the purportedly fs-agnostic jbd code
(specifically, testing the superblock for the data journal mount flag).
Do you have any suggestions?
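
To make the journal-flag idea concrete, here is a rough userspace sketch
of the decision being discussed: tag commit writes as metadata unless the
journal has been marked as carrying data. It is only a model under stated
assumptions, not a patch. Every name in it (the SKETCH_* constants,
struct sketch_journal, sketch_commit_write_op) is hypothetical and stands
in for the real jbd/jbd2 and block-layer symbols, and it does not address
per-inode data journalling.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative stand-ins for block-layer write flags. These are NOT the
 * kernel's definitions; the values are arbitrary bits for the model.
 */
enum {
	SKETCH_WRITE     = 1 << 0, /* plain write */
	SKETCH_SYNC_PLUG = 1 << 1, /* models the extra bits of WRITE_SYNC_PLUG */
	SKETCH_META      = 1 << 2  /* models the metadata tag */
};

/* Hypothetical journal flag; no such flag is claimed to exist in jbd/jbd2. */
#define SKETCH_JFLAG_DATA_JOURNAL 0x1u

/* Minimal model of a journal: only the flags word matters here. */
struct sketch_journal {
	unsigned int j_flags;
};

/*
 * Pick the write op for a commit. Journal writes are tagged as metadata
 * only when the journal is not flagged as carrying file data, so that
 * data=journal mounts do not get all of their I/O prioritized.
 */
static unsigned int sketch_commit_write_op(const struct sketch_journal *j,
					   int sync)
{
	unsigned int op = SKETCH_WRITE;

	assert(j != NULL);

	if (sync)
		op |= SKETCH_SYNC_PLUG;
	if (!(j->j_flags & SKETCH_JFLAG_DATA_JOURNAL))
		op |= SKETCH_META;

	return op;
}
```

In this model the filesystem would set the hypothetical flag once at
journal setup time, which keeps the mount-option test out of the journal
code itself but still leaves the per-inode case open.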
Thanks!
Jeff