Message-ID: <0d6e3c02-1952-2177-02d7-10ebeb133940@csail.mit.edu>
Date: Thu, 30 May 2019 01:29:23 -0700
From: "Srivatsa S. Bhat" <srivatsa@...il.mit.edu>
To: Paolo Valente <paolo.valente@...aro.org>
Cc: linux-fsdevel@...r.kernel.org,
linux-block <linux-block@...r.kernel.org>,
linux-ext4@...r.kernel.org, cgroups@...r.kernel.org,
kernel list <linux-kernel@...r.kernel.org>,
Jens Axboe <axboe@...nel.dk>, Jan Kara <jack@...e.cz>,
jmoyer@...hat.com, Theodore Ts'o <tytso@....edu>,
amakhalov@...are.com, anishs@...are.com, srivatsab@...are.com
Subject: Re: CFQ idling kills I/O performance on ext4 with blkio cgroup
controller
On 5/29/19 12:41 AM, Paolo Valente wrote:
>
>
>> On 29 May 2019, at 03:09, Srivatsa S. Bhat <srivatsa@...il.mit.edu> wrote:
>>
>> On 5/23/19 11:51 PM, Paolo Valente wrote:
>>>
>>>> On 24 May 2019, at 01:43, Srivatsa S. Bhat <srivatsa@...il.mit.edu> wrote:
>>>>
>>>> When trying to run multiple dd tasks simultaneously, I get the kernel
>>>> panic shown below (mainline is fine, without these patches).
>>>>
>>>
>>> Could you please send me the output of list *(bfq_serv_to_charge+0x21)?
>>>
>>
>> Hi Paolo,
>>
>> Sorry for the delay! Here you go:
>>
>> (gdb) list *(bfq_serv_to_charge+0x21)
>> 0xffffffff814bad91 is in bfq_serv_to_charge (./include/linux/blkdev.h:919).
>> 914
>> 915 extern unsigned int blk_rq_err_bytes(const struct request *rq);
>> 916
>> 917 static inline unsigned int blk_rq_sectors(const struct request *rq)
>> 918 {
>> 919 return blk_rq_bytes(rq) >> SECTOR_SHIFT;
>> 920 }
>> 921
>> 922 static inline unsigned int blk_rq_cur_sectors(const struct request *rq)
>> 923 {
>> (gdb)
>>
>>
>> For some reason, I've not been able to reproduce this issue after
>> reporting it here. (Perhaps I got lucky when I hit the kernel panic
>> a bunch of times last week).
>>
>> I'll test with your fix applied and see how it goes.
>>
>
> Great! The offending line above gives me hope that my fix is correct.
> If no more failures occur, then I'm eager (and a little worried ...)
> to see how it goes with throughput :)
>
Your fix held up well under my testing :)

As for throughput, with low_latency = 1, I get around 1.4 MB/s with
bfq (vs 1.6 MB/s with mq-deadline). This is a huge improvement
compared to what it was before (70 KB/s).

With tracing on, the throughput is a bit lower (as expected I guess),
about 1 MB/s, and the corresponding trace file
(trace-waker-detection-1MBps) is available at:

https://www.dropbox.com/s/3roycp1zwk372zo/bfq-traces.tar.gz?dl=0

Thank you so much for your tireless efforts in fixing this issue!

Regards,
Srivatsa
VMware Photon OS