Message-ID: <5125cfc1-7710-9145-bf42-1826a30514e9@redhat.com>
Date: Thu, 6 Oct 2022 17:34:30 -0400
From: Waiman Long <longman@...hat.com>
To: Hillf Danton <hdanton@...a.com>
Cc: Tejun Heo <tj@...nel.org>, Jens Axboe <axboe@...nel.dk>,
cgroups@...r.kernel.org, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, Ming Lei <ming.lei@...hat.com>,
linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v8 3/3] blk-cgroup: Optimize blkcg_rstat_flush()
On 10/6/22 06:11, Hillf Danton wrote:
> On 4 Oct 2022 11:17:48 -0400 Waiman Long <longman@...hat.com>
>> For a system with many CPUs and block devices, the time to do
>> blkcg_rstat_flush() from cgroup_rstat_flush() can be rather long. It
>> can be especially problematic as interrupts are disabled during the flush.
>> It was reported that it might take seconds to complete in some extreme
>> cases leading to hard lockup messages.
>>
>> As it is likely that not all the percpu blkg_iostat_set's have been
>> updated since the last flush, those stale blkg_iostat_set's don't need
>> to be flushed in this case. This patch optimizes blkcg_rstat_flush()
>> by keeping a lockless list of recently updated blkg_iostat_set's in a
>> newly added percpu blkcg->lhead pointer.
>>
>> The blkg_iostat_set is added to a sentinel lockless list on the update
>> side in blk_cgroup_bio_start(). It is removed from the sentinel lockless
>> list when flushed in blkcg_rstat_flush(). Due to racing, it is possible
>> that blkg_iostat_set's in the lockless list may have no new IO stats to
>> be flushed, but that is OK.
> So it is likely that another flag, updated when bis is added to/deleted
> from llist, can cut 1/3 off without raising the risk of getting your patch
> over complicated.
>
>>
>>  struct blkg_iostat_set {
>>      struct u64_stats_sync sync;
>> +    struct llist_node lnode;
>> +    struct blkcg_gq *blkg;
> +    atomic_t queued;
>
>>      struct blkg_iostat cur;
>>      struct blkg_iostat last;
>>  };
Yes, by introducing a flag to record the lockless list state, it is
possible to just use the current llist implementation. Maybe I can
rework it for now without the sentinel variant and post a separate llist
patch for that later on.
Cheers,
Longman
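
For readers following the thread, here is a minimal userspace sketch of the
flag idea discussed above. It is not the kernel patch: it uses C11 atomics
and a hand-rolled lock-free LIFO list in place of the kernel's llist, and
the names stat_node, stat_list, stat_update() and stat_flush() are
hypothetical. It assumes any number of concurrent updaters but only one
flusher at a time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-CPU stats entry (not the kernel struct). */
struct stat_node {
	struct stat_node *next;   /* link in the lockless list */
	atomic_int queued;        /* 1 while the node is on the list */
	_Atomic uint64_t cur;     /* running counter, bumped by updaters */
	uint64_t last;            /* value seen at the previous flush */
};

/* Head of a minimal lock-free LIFO list, similar in spirit to llist. */
struct stat_list {
	_Atomic(struct stat_node *) first;
};

static void list_push(struct stat_list *l, struct stat_node *n)
{
	struct stat_node *old = atomic_load(&l->first);

	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak(&l->first, &old, n));
}

/* Update side: bump the counter, queue the node only on a 0 -> 1 change. */
static void stat_update(struct stat_list *l, struct stat_node *n,
			uint64_t delta)
{
	atomic_fetch_add(&n->cur, delta);
	if (atomic_exchange(&n->queued, 1) == 0)
		list_push(l, n);
}

/*
 * Flush side (single flusher assumed): detach the whole list, then fold
 * each node's delta into *total. Returns the number of nodes visited.
 */
static size_t stat_flush(struct stat_list *l, uint64_t *total)
{
	struct stat_node *n = atomic_exchange(&l->first, NULL);
	size_t visited = 0;

	while (n) {
		/* Read the link first: the node may be re-queued once the
		 * flag is cleared, which overwrites n->next. */
		struct stat_node *next = n->next;
		uint64_t cur;

		assert(atomic_load(&n->queued) == 1);
		/* Clear the flag before reading the stats so that a racing
		 * update either is seen here or re-queues the node. */
		atomic_store(&n->queued, 0);

		cur = atomic_load(&n->cur);
		*total += cur - n->last;
		n->last = cur;

		n = next;
		visited++;
	}
	return visited;
}
```

The flag keeps a node from being pushed twice, which is what lets a plain
LIFO list be used without a sentinel. A node that is re-queued right after
its flag is cleared may later be flushed with no new stats, which matches
the "that is OK" remark in the commit message.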