linux-kernel - Re: [PATCH 12/15] block: introduce blk-iolatency io controller


Message-ID: <69aaf06b-ab1a-9982-a547-fcab7daff55f@kernel.dk>
Date:   Thu, 28 Jun 2018 09:35:33 -0600
From:   Jens Axboe <axboe@...nel.dk>
To:     Josef Bacik <josef@...icpanda.com>, Jens Axboe <axboe@...nel.dk>
Cc:     linux-block@...r.kernel.org, kernel-team@...com,
        akpm@...ux-foundation.org, hannes@...xchg.org,
        linux-kernel@...r.kernel.org, tj@...nel.org,
        linux-fsdevel@...r.kernel.org, Josef Bacik <jbacik@...com>
Subject: Re: [PATCH 12/15] block: introduce blk-iolatency io controller

On 6/28/18 7:26 AM, Josef Bacik wrote:
> On Wed, Jun 27, 2018 at 01:24:55PM -0600, Jens Axboe wrote:
>> On 6/27/18 1:20 PM, Josef Bacik wrote:
>>> On Wed, Jun 27, 2018 at 01:06:31PM -0600, Jens Axboe wrote:
>>>> On 6/25/18 9:12 AM, Josef Bacik wrote:
>>>>> +static void __blkcg_iolatency_throttle(struct rq_qos *rqos,
>>>>> +				       struct iolatency_grp *iolat,
>>>>> +				       spinlock_t *lock, bool issue_as_root,
>>>>> +				       bool use_memdelay)
>>>>> +	__releases(lock)
>>>>> +	__acquires(lock)
>>>>> +{
>>>>> +	struct rq_wait *rqw = &iolat->rq_wait;
>>>>> +	unsigned use_delay = atomic_read(&lat_to_blkg(iolat)->use_delay);
>>>>> +	DEFINE_WAIT(wait);
>>>>> +	bool first_block = true;
>>>>> +
>>>>> +	if (use_delay)
>>>>> +		blkcg_schedule_throttle(rqos->q, use_memdelay);
>>>>> +
>>>>> +	/*
>>>>> +	 * To avoid priority inversions we want to just take a slot if we are
>>>>> +	 * issuing as root.  If we're being killed off there's no point in
>>>>> +	 * delaying things, we may have been killed by OOM so throttling may
>>>>> +	 * make recovery take even longer, so just let the IO's through so the
>>>>> +	 * task can go away.
>>>>> +	 */
>>>>> +	if (issue_as_root || fatal_signal_pending(current)) {
>>>>> +		atomic_inc(&rqw->inflight);
>>>>> +		return;
>>>>> +	}
>>>>> +
>>>>> +	if (iolatency_may_queue(iolat, &wait, first_block))
>>>>> +		return;
>>>>> +
>>>>> +	do {
>>>>> +		prepare_to_wait_exclusive(&rqw->wait, &wait,
>>>>> +					  TASK_UNINTERRUPTIBLE);
>>>>> +
>>>>> +		iolatency_may_queue(iolat, &wait, first_block);
>>>>> +		first_block = false;
>>>>> +
>>>>> +		if (lock) {
>>>>> +			spin_unlock_irq(lock);
>>>>> +			io_schedule();
>>>>> +			spin_lock_irq(lock);
>>>>> +		} else {
>>>>> +			io_schedule();
>>>>> +		}
>>>>> +	} while (1);
>>>>
>>>> So how does this wait loop ever exit?
>>>>
>>>
>>> Sigh, I cleaned this up from what we're using in production and did it poorly,
>>> I'll fix it up.  Thanks,
>>
>> Also may want to consider NOT using exclusive add if first_block == false, as
>> you'll end up at the tail of the waitqueue after sleeping and being denied.
>> This is similar to the wbt change I posted last week.
>>
> 
> This isn't how it works though.  You aren't removed from the list until you do
> finish_wait(), so you don't lose your spot on the list.  We only get added to
> the end of the list if
> 
>         if (list_empty(&wq_entry->entry))
> 
> otherwise nothing changes.

I missed that you don't call finish_wait() inside the loop; I had experimented
with that to see whether it fixed things. But yeah, as it stands, you are right.
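
For readers following along, the list behaviour discussed above can be
illustrated with a minimal userspace sketch. It is a toy model, not kernel
code: the toy_* and node_* names are hypothetical stand-ins, and the only
thing it tries to show is that a prepare_to_wait_exclusive()-style call
queues the entry at the tail only when the entry is not already on the list,
so a waiter keeps its position until a finish_wait()-style call removes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel's list_head and wait queue entry. */
struct node { struct node *prev, *next; };
struct waiter { struct node entry; };

static void node_init(struct node *n) { n->prev = n->next = n; }
static bool node_empty(const struct node *n) { return n->next == n; }

static void add_tail(struct node *head, struct node *n)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

/* Models prepare_to_wait_exclusive(): queue at the tail only if not queued. */
static void toy_prepare_to_wait(struct node *head, struct waiter *w)
{
	if (node_empty(&w->entry))
		add_tail(head, &w->entry);
}

/* Models finish_wait(): the only place where the entry leaves the list. */
static void toy_finish_wait(struct waiter *w)
{
	if (!node_empty(&w->entry))
		del_init(&w->entry);
}

/* Returns the 0-based position of w in the queue, or -1 if not queued. */
static int toy_position(struct node *head, struct waiter *w)
{
	int pos = 0;

	for (struct node *n = head->next; n != head; n = n->next, pos++)
		if (n == &w->entry)
			return pos;
	return -1;
}
```

In this model, calling toy_prepare_to_wait() repeatedly for the same waiter
leaves its position unchanged. Only a toy_finish_wait() followed by another
toy_prepare_to_wait() sends it to the tail, which is the case I had in mind.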

>> For may_queue(), your wq_has_sleeper() is also going to be always true
>> inside your loop, since you call it after doing the prepare_to_wait()
>> which adds you to the queue. That's why wbt does the list checks, but
>> it'd be nicer to have a wq_has_other_sleepers() for that. So your
>> first iolatency_may_queue() inside the loop will always be false.
> 
> Ah yeah that's a good point, I'll go back to using what you had to catch that
> case.  Thanks,

Basically we need to do the same thing in wbt and blk-iolatency for this,
so we should sync them up.
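
As a rough illustration of the wq_has_sleeper() problem, here is a small
userspace sketch. Everything in it is an assumption made for the example:
the toy_* names are hypothetical, the queue is a plain array of waiter ids,
and toy_may_queue() only mimics the general shape of a may_queue() check
(refuse if there are sleepers, otherwise compare inflight against a depth
limit). It is not the blk-iolatency or wbt implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_WAITERS 8

/* Toy wait queue: waiter ids in FIFO order. */
struct toy_wq {
	int ids[MAX_WAITERS];
	int len;
};

static bool toy_queued(const struct toy_wq *wq, int id)
{
	for (int i = 0; i < wq->len; i++)
		if (wq->ids[i] == id)
			return true;
	return false;
}

/* Models prepare_to_wait(): adds the caller unless it is already queued. */
static void toy_prepare_to_wait(struct toy_wq *wq, int id)
{
	if (!toy_queued(wq, id) && wq->len < MAX_WAITERS)
		wq->ids[wq->len++] = id;
}

/* Models wq_has_sleeper(): true if anyone is queued, caller included. */
static bool toy_has_sleeper(const struct toy_wq *wq)
{
	return wq->len > 0;
}

/* Hypothetical helper: true only if someone other than self is queued. */
static bool toy_has_other_sleepers(const struct toy_wq *wq, int self)
{
	for (int i = 0; i < wq->len; i++)
		if (wq->ids[i] != self)
			return true;
	return false;
}

/* Toy may_queue(): ignore_self selects which sleeper check is used. */
static bool toy_may_queue(const struct toy_wq *wq, int self, int inflight,
			  int max_depth, bool ignore_self)
{
	bool blocked = ignore_self ? toy_has_other_sleepers(wq, self)
				   : toy_has_sleeper(wq);

	if (blocked)
		return false;
	return inflight < max_depth;
}
```

With a single waiter that has already called toy_prepare_to_wait(), the
toy_has_sleeper() variant always refuses, even with free slots, because the
waiter sees itself on the queue. The variant that ignores the caller lets it
through as long as inflight is below the limit.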

-- 
Jens Axboe