[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <00b74229-f943-fb4b-ade4-3f4511c33146@acm.org>
Date: Fri, 29 May 2020 12:55:43 -0700
From: Bart Van Assche <bvanassche@....org>
To: paulmck@...nel.org, Ming Lei <ming.lei@...hat.com>
Cc: Christoph Hellwig <hch@....de>, linux-block@...r.kernel.org,
John Garry <john.garry@...wei.com>,
Hannes Reinecke <hare@...e.com>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 8/8] blk-mq: drain I/O when all CPUs in a hctx are offline
On 2020-05-29 11:13, Paul E. McKenney wrote:
> On Fri, May 29, 2020 at 11:53:15AM +0800, Ming Lei wrote:
>> Another pair is in blk_mq_get_tag(), and we expect the following two
>> memory OPs are ordered:
>>
>> 1) set bit in successful test_and_set_bit_lock(), which is called
>> from sbitmap_get()
>>
>> 2) test_bit(BLK_MQ_S_INACTIVE, &data->hctx->state)
>>
>> Do you think that the above two OPs are ordered?
>
> Given that he has been through the code, I would like to hear Bart's
> thoughts, actually.
Hi Paul,
My understanding of the involved instructions is as follows (see also
https://lore.kernel.org/linux-block/b98f055f-6f38-a47c-965d-b6bcf4f5563f@huawei.com/T/#t
for the entire e-mail thread):
* blk_mq_hctx_notify_offline() sets the BLK_MQ_S_INACTIVE bit in
hctx->state, calls smp_mb__after_atomic() and waits in a loop until all
tags have been freed. Each tag is an integer number that has a 1:1
correspondence with a block layer request structure. The code that
iterates over block layer request tags relies on
__sbitmap_for_each_set(). That function examines both the 'word' and
'cleared' members of struct sbitmap_word.
* What blk_mq_hctx_notify_offline() waits for is freeing of tags by
blk_mq_put_tag(). blk_mq_put_tag() frees a tag by setting a bit in
sbitmap_word.cleared (see also sbitmap_deferred_clear_bit()).
* Tag allocation by blk_mq_get_tag() relies on test_and_set_bit_lock().
The actual allocation happens by sbitmap_get() that sets a bit in
sbitmap_word.word. blk_mg_get_tag() tests the BLK_MQ_S_INACTIVE bit
after tag allocation succeeded.
What confuses me is that blk_mq_hctx_notify_offline() uses
smp_mb__after_atomic() to enforce the order of memory accesses while
blk_mq_get_tag() relies on the acquire semantics of
test_and_set_bit_lock(). Usually ordering is enforced by combining two
smp_mb() calls or by combining a store-release with a load-acquire.
Does the Linux memory model provide the expected ordering guarantees
when combining load-acquire with smp_mb__after_atomic() as used in patch
8/8 of this series?
Thanks,
Bart.
Powered by blists - more mailing lists