Message-ID: <5125cfc1-7710-9145-bf42-1826a30514e9@redhat.com>
Date:   Thu, 6 Oct 2022 17:34:30 -0400
From:   Waiman Long <longman@...hat.com>
To:     Hillf Danton <hdanton@...a.com>
Cc:     Tejun Heo <tj@...nel.org>, Jens Axboe <axboe@...nel.dk>,
        cgroups@...r.kernel.org, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org, Ming Lei <ming.lei@...hat.com>,
        linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v8 3/3] blk-cgroup: Optimize blkcg_rstat_flush()

On 10/6/22 06:11, Hillf Danton wrote:
> On 4 Oct 2022 11:17:48 -0400 Waiman Long <longman@...hat.com>
>> For a system with many CPUs and block devices, the time to do
>> blkcg_rstat_flush() from cgroup_rstat_flush() can be rather long. It
> >> can be especially problematic as interrupts are disabled during the flush.
>> It was reported that it might take seconds to complete in some extreme
>> cases leading to hard lockup messages.
>>
> >> As it is likely that not all the percpu blkg_iostat_set's have been
>> updated since the last flush, those stale blkg_iostat_set's don't need
>> to be flushed in this case. This patch optimizes blkcg_rstat_flush()
>> by keeping a lockless list of recently updated blkg_iostat_set's in a
>> newly added percpu blkcg->lhead pointer.
>>
>> The blkg_iostat_set is added to a sentinel lockless list on the update
>> side in blk_cgroup_bio_start(). It is removed from the sentinel lockless
>> list when flushed in blkcg_rstat_flush(). Due to racing, it is possible
> >> that blkg_iostat_set's in the lockless list may have no new IO stats to
>> be flushed, but that is OK.
> So it is likely that another flag, updated when a bis is added to or deleted
> from the llist, can cut 1/3 off without the risk of overcomplicating your
> patch.
>
>>   
>>   struct blkg_iostat_set {
>>   	struct u64_stats_sync		sync;
>> +	struct llist_node		lnode;
>> +	struct blkcg_gq		       *blkg;
> +	atomic_t			queued;
>
>>   	struct blkg_iostat		cur;
>>   	struct blkg_iostat		last;
>>   };

Yes, by introducing a flag to record the lockless list state, it is 
possible to just use the current llist implementation. Maybe I can 
rework it for now without the sentinel variant and post a separate llist 
patch for that later on.
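
To make the flag idea concrete, here is a rough userspace sketch, not the
actual rework. It assumes C11 atomics in place of the kernel's llist and
atomic_t helpers, and the names bis_node, bis_list, bis_queue() and
bis_flush() are hypothetical stand-ins rather than anything in the patch.
The update side queues a node only if it wins the 0 -> 1 transition of
"queued"; the flush side detaches the whole list and clears the flag before
reading the stats, so a racing update may re-queue a node that has nothing
new to flush, which is harmless.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for blkg_iostat_set; not the kernel struct. */
struct bis_node {
	struct bis_node *next;	/* plays the role of llist_node */
	atomic_int queued;	/* 1 while the node is on the list */
	atomic_ulong cur;	/* running counter, bumped by updaters */
	unsigned long last;	/* value seen at the previous flush */
};

/* Hypothetical stand-in for the percpu list head. */
struct bis_list {
	_Atomic(struct bis_node *) first;
};

/* Update side: add the node at most once. Returns true if it was added. */
static bool bis_queue(struct bis_list *l, struct bis_node *n)
{
	int expected = 0;

	assert(l != NULL && n != NULL);

	/* Only the caller that flips queued from 0 to 1 may link the node. */
	if (!atomic_compare_exchange_strong(&n->queued, &expected, 1))
		return false;

	/* Lockless push onto the head of the list. */
	struct bis_node *old = atomic_load(&l->first);
	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak(&l->first, &old, n));

	return true;
}

/* Flush side: detach everything, fold deltas into *total, return count. */
static int bis_flush(struct bis_list *l, unsigned long *total)
{
	struct bis_node *n = atomic_exchange(&l->first, NULL);
	int count = 0;

	while (n) {
		/* Read next first: the node may be re-queued once the flag clears. */
		struct bis_node *next = n->next;
		unsigned long cur;

		/* Clear the flag before sampling so a later update re-queues. */
		atomic_store(&n->queued, 0);
		cur = atomic_load(&n->cur);
		*total += cur - n->last;
		n->last = cur;

		n = next;
		count++;
	}
	return count;
}
```

With something like this, a plain NULL-terminated list is enough, because
membership is tracked by the flag rather than by the node's next pointer.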

Cheers,
Longman
