[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <786aacda-b25d-67f6-bad3-0030b0e2637e@kernel.dk>
Date: Mon, 28 Nov 2022 08:42:51 -0700
From: Jens Axboe <axboe@...nel.dk>
To: Waiman Long <longman@...hat.com>, Tejun Heo <tj@...nel.org>
Cc: cgroups@...r.kernel.org, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, Ming Lei <ming.lei@...hat.com>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Michal Koutný <mkoutny@...e.com>,
Hillf Danton <hdanton@...a.com>, Yi Zhang <yi.zhang@...hat.com>
Subject: Re: [PATCH-block] blk-cgroup: Use css_tryget() in
blkcg_destroy_blkgs()
On 11/28/22 8:38?AM, Waiman Long wrote:
> On 11/28/22 09:14, Jens Axboe wrote:
>> On 11/27/22 8:30?PM, Waiman Long wrote:
>>> Commit 951d1e94801f ("blk-cgroup: Flush stats at blkgs destruction
>>> path") incorrectly assumes that css_get() will always succeed. That may
>>> not be true if there is no blkg associated with the blkcg. If css_get()
>>> fails, the subsequent css_put() call may lead to data corruption as
>>> was illustrated in a test system that it crashed on bootup when that
>>> commit was included. Also blkcg may be freed at any time leading to
>>> use-after-free. Fix it by using css_tryget() instead and bail out if
>>> the tryget fails.
>>>
>>> Fixes: 951d1e94801f ("blk-cgroup: Flush stats at blkgs destruction path")
>>> Reported-by: Yi Zhang <yi.zhang@...hat.com>
>>> Signed-off-by: Waiman Long <longman@...hat.com>
>>> ---
>>> ? block/blk-cgroup.c | 7 ++++++-
>>> ? 1 file changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
>>> index 57941d2a8ba3..74fefc8cbcdf 100644
>>> --- a/block/blk-cgroup.c
>>> +++ b/block/blk-cgroup.c
>>> @@ -1088,7 +1088,12 @@ static void blkcg_destroy_blkgs(struct blkcg *blkcg)
>>> ? ????? might_sleep();
>>> ? -??? css_get(&blkcg->css);
>>> +??? /*
>>> +???? * If css_tryget() fails, there is no blkg to destroy.
>>> +???? */
>>> +??? if (!css_tryget(&blkcg->css))
>>> +??????? return;
>>> +
>>> ????? spin_lock_irq(&blkcg->lock);
>>> ????? while (!hlist_empty(&blkcg->blkg_list)) {
>>> ????????? struct blkcg_gq *blkg = hlist_entry(blkcg->blkg_list.first,
>> This doesn't seem safe to me, but maybe I'm missing something. A tryget
>> operation can be fine if we're under RCU lock and the entity is freed
>> appropriately, but what makes it safe here? Could blkcg already be gone
>> at this point?
>
> The actual freeing of the blkcg structure is under RCU protection. So
> the structure won't be freed immediately even if css_tryget() fails. I
> suspect what Michal found may be the root cause of this problem. If
> so, this is an existing bug which gets exposed by my patch.
But what prevents it from going away here since you're not under RCU
lock for the tryget? Doesn't help that the freeing side is done in an
RCU safe manner, if the ref attempt is not.
--
Jens Axboe
Powered by blists - more mailing lists