Message-ID: <7081765f-28d7-f594-1221-6034b9e88899@huaweicloud.com>
Date: Tue, 10 Dec 2024 14:22:35 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Yu Kuai <yukuai1@...weicloud.com>, Bart Van Assche <bvanassche@....org>,
axboe@...nel.dk, akpm@...ux-foundation.org, yang.yang@...o.com,
ming.lei@...hat.com, osandov@...com, paolo.valente@...aro.org
Cc: linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
yi.zhang@...wei.com, yangerkun@...wei.com, "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH RFC 1/3] block/mq-deadline: Revert "block/mq-deadline: Fix
the tag reservation code"
Hi,
On 2024/12/10 9:50, Yu Kuai wrote:
> Hi,
>
> On 2024/12/10 2:02, Bart Van Assche wrote:
>> This is not correct. dd->async_depth can be modified via sysfs.
>
> How about the following patch to fix min_shallow_depth for deadline?
>
> Thanks,
> Kuai
>
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index a9cf8e19f9d1..040ebb0b192d 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -667,8 +667,7 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
>          struct blk_mq_tags *tags = hctx->sched_tags;
>  
>          dd->async_depth = q->nr_requests;
> -
> -        sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, 1);
> +        sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, dd->async_depth);
>  }
>  
>  /* Called by blk_mq_init_hctx() and blk_mq_init_sched(). */
> @@ -1012,6 +1011,47 @@ SHOW_INT(deadline_fifo_batch_show, dd->fifo_batch);
>  #undef SHOW_INT
>  #undef SHOW_JIFFIES
>  
> +static ssize_t deadline_async_depth_store(struct elevator_queue *e,
> +                                          const char *page, size_t count)
> +{
> +        struct deadline_data *dd = e->elevator_data;
> +        struct request_queue *q = dd->q;
> +        struct blk_mq_hw_ctx *hctx;
> +        unsigned long i;
> +        int v;
> +        int ret = kstrtoint(page, 0, &v);
> +
> +        if (ret < 0)
> +                return ret;
> +
> +        if (v < 1)
> +                v = 1;
> +        else if (v > dd->q->nr_requests)
> +                v = dd->q->nr_requests;
> +
> +        if (v == dd->async_depth)
> +                return count;
> +
> +        blk_mq_freeze_queue(q);
> +        blk_mq_quiesce_queue(q);
> +
> +        dd->async_depth = v;
> +        if (blk_mq_is_shared_tags(q->tag_set->flags)) {
> +                sbitmap_queue_min_shallow_depth(
> +                        &q->sched_shared_tags->bitmap_tags, dd->async_depth);
> +        } else {
> +                queue_for_each_hw_ctx(q, hctx, i)
> +                        sbitmap_queue_min_shallow_depth(
> +                                &hctx->sched_tags->bitmap_tags,
> +                                dd->async_depth);
> +        }
Just realized that this is not OK: q->sysfs_lock must be held to protect
against concurrent changes to the hctxs. However, the established lock
ordering is q->sysfs_lock before eq->sysfs_lock, and this context
already holds eq->sysfs_lock, so it can't take q->sysfs_lock here.
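To spell out the inversion (a simplified sketch from my reading of
blk-sysfs.c and elevator.c; the exact call chains may differ):

        /* writing a queue attribute, e.g. queue/nr_requests */
        queue_attr_store()
                mutex_lock(&q->sysfs_lock);     /* outer lock, may update hctxs */

        /* writing an elevator attribute, e.g. queue/iosched/async_depth */
        elv_attr_store()
                mutex_lock(&eq->sysfs_lock);    /* inner lock */
                deadline_async_depth_store()
                        /* taking q->sysfs_lock here would invert the
                         * q->sysfs_lock -> eq->sysfs_lock ordering */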
First of all, are we in agreement that it's not acceptable to sacrifice
performance in the default scenario just to guarantee functional
correctness when async_depth is set to 1? (As I understand it, keeping
min_shallow_depth at 1 so that async_depth=1 works caps the sbitmap
wake_batch at 1, which hurts the common case.)
If so, the following are the options I can think of to fix this:

1) Make async_depth read-only (rough sketch after this list); if
limiting async requests to 75% of the tags hurts performance in some
cases, the user can increase nr_requests to prevent it.
2) Refactor the elevator sysfs API: remove eq->sysfs_lock and replace it
with q->sysfs_lock, so that deadline_async_depth_store() is protected
against concurrent changes to the hctxs and min_shallow_depth can be
updated there.
3) Other options?
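For 1), a rough sketch of what I mean (untested; DD_ATTR_RO is a new
helper made up for illustration, mirroring the existing DD_ATTR macro in
mq-deadline.c):

        /* read-only variant of DD_ATTR: no store method */
        #define DD_ATTR_RO(name) \
                __ATTR(name, 0444, deadline_##name##_show, NULL)

        static struct elv_fs_entry deadline_attrs[] = {
                /* ... other attributes unchanged ... */
                DD_ATTR_RO(async_depth),        /* was DD_ATTR(async_depth) */
                /* ... */
                __ATTR_NULL
        };

That way dd_depth_updated() remains the only place that writes
async_depth and min_shallow_depth, so the two can never get out of sync.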
Thanks,
Kuai