Message-Id: <20D79D51-468A-4FA7-9213-F0EC2AD3D78A@linaro.org>
Date:   Mon, 19 Aug 2019 19:00:56 +0200
From:   Paolo Valente <paolo.valente@...aro.org>
To:     Josef Bacik <josef@...icpanda.com>
Cc:     linux-block <linux-block@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Jens Axboe <axboe@...nel.dk>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        noreply-spamdigest via bfq-iosched 
        <bfq-iosched@...glegroups.com>, Tejun Heo <tj@...nel.org>
Subject: Re: io.latency controller apparently not working



> On 19 Aug 2019, at 18:41, Paolo Valente <paolo.valente@...aro.org> wrote:
> 
> 
> 
>> On 16 Aug 2019, at 20:17, Paolo Valente <paolo.valente@...aro.org> wrote:
>> 
>> 
>> 
>>> On 16 Aug 2019, at 19:59, Josef Bacik <josef@...icpanda.com> wrote:
>>> 
>>> On Fri, Aug 16, 2019 at 07:52:40PM +0200, Paolo Valente wrote:
>>>> 
>>>> 
>>>>> On 16 Aug 2019, at 15:21, Josef Bacik <josef@...icpanda.com> wrote:
>>>>> 
>>>>> On Fri, Aug 16, 2019 at 12:57:41PM +0200, Paolo Valente wrote:
>>>>>> Hi,
>>>>>> I happened to test the io.latency controller, to make a comparison
>>>>>> between this controller and BFQ.  But io.latency does not seem to
>>>>>> work, i.e., it does not reduce latency compared with what happens
>>>>>> with no I/O control at all.  Here is a summary of the results for
>>>>>> one of the workloads I tested, on three different devices
>>>>>> (latencies in ms):
>>>>>> 
>>>>>>          no I/O control        io.latency         BFQ
>>>>>> NVMe SSD     1.9                   1.9                0.07
>>>>>> SATA SSD     39                    56                 0.7
>>>>>> HDD          4500                  4500               11
>>>>>> 
>>>>>> I have put all details on hardware, OS, scenarios and results in the
>>>>>> attached pdf.  For your convenience, I'm pasting the source file too.
>>>>>> 
>>>>> 
>>>>> Do you have the fio jobs you use for this?
>>>> 
>>>> The script mentioned in the draft (executed with the command line
>>>> reported in the draft) runs one fio instance for the target process,
>>>> and one fio instance for each interferer.  I couldn't use just one
>>>> fio instance to run all the jobs, because the weight parameter
>>>> doesn't work in fio jobfiles for some reason, and because the ioprio
>>>> class cannot be set for individual jobs.
>>>> 
>>>> In particular, the script generates a job with the following
>>>> parameters for the target process:
>>>> 
>>>> ioengine=sync
>>>> loops=10000
>>>> direct=0
>>>> readwrite=randread
>>>> fdatasync=0
>>>> bs=4k
>>>> thread=0
>>>> filename=/mnt/scsi_debug/largefile_interfered0
>>>> iodepth=1
>>>> numjobs=1
>>>> invalidate=1
>>>> 
>>>> and a job with the following parameters for each of the interferers,
>>>> in case, e.g., of a workload made of reads:
>>>> 
>>>> ioengine=sync
>>>> direct=0
>>>> readwrite=read
>>>> fdatasync=0
>>>> bs=4k
>>>> filename=/mnt/scsi_debug/largefileX
>>>> invalidate=1
>>>> 
>>>> Should you fail to reproduce this issue by creating groups, setting
>>>> latencies and starting fio jobs manually, could you try just
>>>> executing my script?  Maybe that would help us spot the culprit more
>>>> quickly.
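>>>> 
>>>> In case it helps while reproducing, here is a minimal sketch (in
>>>> Python, not the actual script mentioned above) of the kind of setup
>>>> the script performs: one cgroup with an io.latency target for the
>>>> interfered process, one unprotected cgroup for the interferers, and
>>>> one fio instance per process.  The cgroup paths, the 259:0 device
>>>> numbers, the 2000 target and the jobfile names are all placeholders.
>>>> 
>>>> import os, subprocess
>>>> 
>>>> CG_ROOT = "/sys/fs/cgroup"                      # cgroup v2 mount point
>>>> TARGET_CG = os.path.join(CG_ROOT, "interfered")
>>>> INTERF_CG = os.path.join(CG_ROOT, "interferers")
>>>> 
>>>> # Enable the io controller for the child groups, then create them.
>>>> with open(os.path.join(CG_ROOT, "cgroup.subtree_control"), "w") as f:
>>>>     f.write("+io")
>>>> for cg in (TARGET_CG, INTERF_CG):
>>>>     os.makedirs(cg, exist_ok=True)
>>>> 
>>>> # Protect only the interfered group; target= should be in microseconds
>>>> # according to Documentation/admin-guide/cgroup-v2.rst.
>>>> with open(os.path.join(TARGET_CG, "io.latency"), "w") as f:
>>>>     f.write("259:0 target=2000")
>>>> 
>>>> def run_fio(cgroup, jobfile):
>>>>     def enter_cgroup():
>>>>         # Runs in the child between fork and exec, so fio is already
>>>>         # in its cgroup when it starts issuing I/O.
>>>>         with open(os.path.join(cgroup, "cgroup.procs"), "w") as f:
>>>>             f.write(str(os.getpid()))
>>>>     return subprocess.Popen(["fio", jobfile], preexec_fn=enter_cgroup)
>>>> 
>>>> procs = [run_fio(TARGET_CG, "interfered.fio")]
>>>> procs += [run_fio(INTERF_CG, "interferer%d.fio" % i) for i in range(4)]
>>>> for p in procs:
>>>>     p.wait()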
>>> 
>>> Ah ok, you are doing it on a mountpoint.
>> 
>> Yep
>> 
>>> Are you using btrfs?
>> 
>> ext4
>> 
>>> Cause otherwise
>>> you are going to have a sad time.
>> 
>> Could you elaborate more on this?  I/O seems to be controllable on ext4.
>> 
>>> The other thing is you are using buffered,
>> 
>> Actually, the problem affects sync random reads, which always
>> hit the disk in this test.
>> 
>>> which may or may not hit the disk.  This is what I use to test io.latency
>>> 
>>> https://patchwork.kernel.org/patch/10714425/
>>> 
>>> I had to massage it since it didn't apply directly, but running it against the
>>> actual block device, with O_DIRECT so I'm sure I'm measuring the actual impact
>>> of the controller, it all works out fine.
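>>> 
>>> In other words, the general approach (a minimal Python sketch, not the
>>> actual test from the patch above; the device path and iteration count
>>> are placeholders) is to time O_DIRECT random reads issued straight to
>>> the block device:
>>> 
>>> import os, mmap, random, time
>>> 
>>> DEV = "/dev/nvme0n1"   # placeholder: block device under test
>>> BS = 4096              # offsets and buffer must be aligned for O_DIRECT
>>> 
>>> fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
>>> size = os.lseek(fd, 0, os.SEEK_END)        # device size in bytes
>>> buf = mmap.mmap(-1, BS)                    # anonymous mapping => page-aligned
>>> 
>>> lat_ms = []
>>> for _ in range(1000):
>>>     off = random.randrange(size // BS) * BS
>>>     t0 = time.monotonic()
>>>     os.preadv(fd, [buf], off)              # each read really hits the device
>>>     lat_ms.append((time.monotonic() - t0) * 1000.0)
>>> 
>>> os.close(fd)
>>> print("avg %.3f ms  max %.3f ms" % (sum(lat_ms) / len(lat_ms), max(lat_ms)))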
>> 
>> I'm not getting why non-direct sync reads, or buffered writes, should
>> be uncontrollable.  As a trivial example, BFQ in these tests controls
>> I/O as expected, and keeps latency extremely low.
>> 
>> What am I missing?
>> 
> 
> While waiting for your answer, I've also added the direct-I/O case to
> my test.  This new case too is now reproduced by the command line
> reported in the draft.
> 
> Even with direct I/O, nothing changes with writers as interferers,
> apart from the HDD, where latency with io.latency now at least matches
> the case of no I/O control.  Summing up, with writers as interferers
> (latency in ms):
> 
>            no I/O control        io.latency         BFQ
> NVMe SSD     3                     3                 0.2
> SATA SSD     3                     3                 0.2
> HDD          56                    56                13
> 
> In contrast, there are significant improvements with the SSDs, in the
> case of readers as interferers.  This is the new situation (latency
> still in ms):
> 
>            no I/O control        io.latency         BFQ
> NVMe SSD     1.9                   0.08              0.07
> SATA SSD     39                    0.2               0.7
> HDD          4500                  118               11
> 

I'm sorry, I had not repeated the tests with direct I/O for BFQ too.
Results change for BFQ as well in the case of readers as interferers.
Here are the corrected figures for readers as interferers (latency in ms):

           no I/O control        io.latency         BFQ
NVMe SSD     1.9                   0.08              0.07
SATA SSD     39                    0.2               0.2
HDD          4500                  118               10

Thanks,
Paolo


> Thanks,
> Paolo
> 
>> Thanks,
>> Paolo
>> 
>>> Thanks,
>>> 
>>> Josef
