Message-ID: <5728C48F.9010102@fb.com>
Date: Tue, 3 May 2016 09:32:31 -0600
From: Jens Axboe <axboe@...com>
To: Jan Kara <jack@...e.cz>
CC: <linux-kernel@...r.kernel.org>, <linux-fsdevel@...r.kernel.org>,
<linux-block@...r.kernel.org>, <dchinner@...hat.com>,
<sedat.dilek@...il.com>
Subject: Re: [PATCH 7/8] wbt: add general throttling mechanism
On 05/03/2016 09:22 AM, Jan Kara wrote:
> On Tue 03-05-16 08:23:27, Jens Axboe wrote:
>> On 05/03/2016 03:34 AM, Jan Kara wrote:
>>> On Thu 28-04-16 12:53:50, Jens Axboe wrote:
>>>>> 2) As far as I can see in patch 8/8, you have plugged the throttling above
>>>>> the IO scheduler. When there are e.g. multiple cgroups with different IO
>>>>> limits operating, this throttling can lead to strange results (like a
>>>>> cgroup with a low limit using up all available background "slots" and thus
>>>>> effectively stopping background writeback for other cgroups). So wouldn't
>>>>> it make more sense to plug this below the IO scheduler? Now, I understand
>>>>> there may be other problems with this, but I think we should put more
>>>>> thought into it and provide some justification in the changelogs.
>>>>
>>>> One complexity is that we have to do this early for blk-mq, since once you
>>>> get a request, you're already sitting on the hw tag. CoDel should actually
>>>> work fine at each hop, so hopefully this will as well.
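
To illustrate that constraint with a minimal, self-contained sketch (the
names throttle_writeback(), get_request(), put_request() and the HW_TAGS
count are made up for illustration, not the actual block-layer API): in a
blk-mq-style path, allocating a request and acquiring the hardware tag are
the same step, so any throttling hook has to run before the allocation,
otherwise the caller would sleep while already holding a scarce tag.

/*
 * Toy model, not kernel code: request allocation and hw tag
 * acquisition are one step, so the throttle hook must run first.
 */
#include <semaphore.h>
#include <stdio.h>

#define HW_TAGS 2                       /* shallow hardware queue */

static sem_t tag_pool;                  /* models the driver tag space */

struct request { int tag; };

/* Stand-in for a wbt-style wait; purely illustrative. */
static void throttle_writeback(void)
{
        /* e.g. sleep here until the writeback window has room */
}

static struct request get_request(void)
{
        struct request rq;

        sem_wait(&tag_pool);            /* a request is also a hw tag */
        rq.tag = 0;
        return rq;
}

static void put_request(struct request rq)
{
        (void)rq;
        sem_post(&tag_pool);
}

int main(void)
{
        sem_init(&tag_pool, 0, HW_TAGS);

        throttle_writeback();           /* #1: throttle *before* allocation */
        struct request rq = get_request();
        printf("dispatching request on tag %d\n", rq.tag);
        put_request(rq);
        return 0;
}
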
>>>
>>> OK, I see. But then this suggests that any IO scheduling and/or
>>> cgroup-related throttling should happen before we get a request for blk-mq
>>> as well? And then we could still do writeback throttling below that layer?
>>
>> Not necessarily. For IO scheduling, basically we care about two parts:
>>
>> 1) Are you allowed to allocate the resources to queue some IO
>> 2) Are you allowed to dispatch
>
> But then it seems suboptimal to waste a relatively scarce resource (which
> a HW tag is, AFAIU) just because you happen to run from a cgroup that is
> bandwidth-limited and thus not allowed to dispatch?
For some cases, you are absolutely right, and #1 is the main one. For
your case of QD=1, that's obviously so. For SATA it's a bit more of a
grey area, and for others (nvme, scsi, etc.) the tags aren't really a
scarce resource, so #2 is the bigger part of it.
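
As a toy model of that trade-off (the device names, tag counts and
writeback window below are made-up numbers, not measurements, and the
structures are not kernel code): whichever of the two limits is smaller
is the one that actually gates background writeback, and that flips
with queue depth.

/*
 * Toy model with made-up numbers: whichever limit is smaller is the
 * one that actually gates background writeback.
 */
#include <stdio.h>

struct dev {
        const char *name;
        int hw_tags;            /* #1: consumed at request allocation */
        int wb_window;          /* #2: dispatch-side writeback limit */
};

int main(void)
{
        struct dev devs[] = {
                { "QD=1 device",       1,    8 },
                { "deep-queue device", 1024, 8 },
        };

        for (unsigned int i = 0; i < sizeof(devs) / sizeof(devs[0]); i++) {
                const struct dev *d = &devs[i];
                int by_tags = d->hw_tags < d->wb_window;
                int limit = by_tags ? d->hw_tags : d->wb_window;

                printf("%s: writeback capped at %d in flight by %s\n",
                       d->name, limit,
                       by_tags ? "the tag pool (#1)"
                               : "the dispatch window (#2)");
        }
        return 0;
}

In the QD=1 case the tag pool is the binding limit, so throttling at
allocation is what matters; in the deep-queue case only the
dispatch-side window constrains anything.
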
--
Jens Axboe