linux-kernel - Re: [PATCH -next v2 8/9] block: fix null-pointer dereference in ioc_pd

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <96487803-12cc-a694-0099-784106596fd1@huaweicloud.com>
Date:   Mon, 5 Dec 2022 17:32:17 +0800
From:   Yu Kuai <yukuai1@...weicloud.com>
To:     Tejun Heo <tj@...nel.org>, Yu Kuai <yukuai1@...weicloud.com>,
        Christoph Hellwig <hch@...radead.org>
Cc:     Li Nan <linan122@...wei.com>, josef@...icpanda.com,
        axboe@...nel.dk, cgroups@...r.kernel.org,
        linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
        yi.zhang@...wei.com, "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH -next v2 8/9] block: fix null-pointer dereference in
 ioc_pd_init

Hi, Tejun!

While reviewing rq_qos code, I found that there are some other possible
defects:

1) queue_lock is held to protect rq_qos_add() and rq_qos_del(), whlie
it's not held to protect rq_qos_exit(), which is absolutely not safe
because they can be called concurrently by configuring iocost and
removing device.
I'm thinking about holding the lock to fetch the list and reset
q->rq_qos first:

diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
index 88f0fe7dcf54..271ad65eebd9 100644
--- a/block/blk-rq-qos.c
+++ b/block/blk-rq-qos.c
@@ -288,9 +288,15 @@ void rq_qos_wait(struct rq_wait *rqw, void 
*private_data,

  void rq_qos_exit(struct request_queue *q)
  {
-       while (q->rq_qos) {
-               struct rq_qos *rqos = q->rq_qos;
-               q->rq_qos = rqos->next;
+       struct rq_qos *rqos;
+
+       spin_lock_irq(&q->queue_lock);
+       rqos = q->rq_qos;
+       q->rq_qos = NULL;
+       spin_unlock_irq(&q->queue_lock);
+
+       while (rqos) {
                 rqos->ops->exit(rqos);
+               rqos = rqos->next;
         }
  }

2) rq_qos_add() can still succeed after rq_qos_exit() is done, which
will cause memory leak. Hence a checking is required beforing adding
to q->rq_qos. I'm thinking about flag QUEUE_FLAG_DYING first, but the
flag will not set if disk state is not marked GD_OWNS_QUEUE. Since
blk_unregister_queue() is called before rq_qos_exit(), use the queue
flag QUEUE_FLAG_REGISTERED should be OK.

For the current problem that device can be removed while initializing
, I'm thinking about some possible solutions:

Since bfq is initialized in elevator initialization, and others are
in queue initialization, such problem is only possible in iocost, hence
it make sense to fix it in iocost:

1) use open mutex to prevet concurrency, however, this will cause
that configuring iocost will block some other operations that is relied
on open_mutex.

@@ -2889,7 +2889,15 @@ static int blk_iocost_init(struct gendisk *disk)
         if (ret)
                 goto err_free_ioc;

+       mutex_lock(&disk->open_mutex);
+       if (!disk_live(disk)) {
+               mutex_unlock(&disk->open_mutex);
+               ret = -ENODEV;
+               goto err_del_qos;
+       }
+
         ret = blkcg_activate_policy(q, &blkcg_policy_iocost);
+       mutex_unlock(&disk->open_mutex);
         if (ret)
                 goto err_del_qos;

2) like 1), the difference is that define a new mutex just in iocst.

3) Or is it better to fix it in the higher level? For example:
add a new restriction that blkcg_deactivate_policy() should be called
with blkcg_activate_policy() in pairs, and blkcg_deactivate_policy()
will wait for blkcg_activate_policy() to finish. Something like:

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index ef4fef1af909..6266f702157f 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1410,7 +1410,7 @@ int blkcg_activate_policy(struct request_queue *q,
         struct blkcg_gq *blkg, *pinned_blkg = NULL;
         int ret;

-       if (blkcg_policy_enabled(q, pol))
+       if (WARN_ON_ONCE(blkcg_policy_enabled(q, pol)))
                 return 0;

         if (queue_is_mq(q))
@@ -1477,6 +1477,8 @@ int blkcg_activate_policy(struct request_queue *q,
                 blkg_put(pinned_blkg);
         if (pd_prealloc)
                 pol->pd_free_fn(pd_prealloc);
+       if (!ret)
+               wake_up(q->policy_waitq);
         return ret;

  enomem:
@@ -1512,7 +1514,7 @@ void blkcg_deactivate_policy(struct request_queue *q,
         struct blkcg_gq *blkg;

         if (!blkcg_policy_enabled(q, pol))
-               return;
+               wait_event(q->policy_waitq, blkcg_policy_enabled(q, pol));
    wait_event(q->xxx, blkcg_policy_enabled(q, pol));