Date:   Fri, 29 May 2020 12:55:43 -0700
From:   Bart Van Assche <bvanassche@....org>
To:     paulmck@...nel.org, Ming Lei <ming.lei@...hat.com>
Cc:     Christoph Hellwig <hch@....de>, linux-block@...r.kernel.org,
        John Garry <john.garry@...wei.com>,
        Hannes Reinecke <hare@...e.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 8/8] blk-mq: drain I/O when all CPUs in a hctx are offline

On 2020-05-29 11:13, Paul E. McKenney wrote:
> On Fri, May 29, 2020 at 11:53:15AM +0800, Ming Lei wrote:
>> Another pair is in blk_mq_get_tag(), and we expect the following two
>> memory OPs are ordered:
>>
>> 1) set bit in successful test_and_set_bit_lock(), which is called
>> from sbitmap_get()
>>
>> 2) test_bit(BLK_MQ_S_INACTIVE, &data->hctx->state)
>>
>> Do you think that the above two OPs are ordered?
> 
> Given that he has been through the code, I would like to hear Bart's
> thoughts, actually.

Hi Paul,

My understanding of the involved instructions is as follows (see also
https://lore.kernel.org/linux-block/b98f055f-6f38-a47c-965d-b6bcf4f5563f@huawei.com/T/#t
for the entire e-mail thread):
* blk_mq_hctx_notify_offline() sets the BLK_MQ_S_INACTIVE bit in
hctx->state, calls smp_mb__after_atomic() and waits in a loop until all
tags have been freed. Each tag is an integer that corresponds 1:1 to a
block layer request structure. The code that iterates over block layer
request tags relies on __sbitmap_for_each_set(), which examines both the
'word' and 'cleared' members of struct sbitmap_word.
* What blk_mq_hctx_notify_offline() waits for is freeing of tags by
blk_mq_put_tag(). blk_mq_put_tag() frees a tag by setting a bit in
sbitmap_word.cleared (see also sbitmap_deferred_clear_bit()).
* Tag allocation by blk_mq_get_tag() relies on test_and_set_bit_lock().
The actual allocation happens in sbitmap_get(), which sets a bit in
sbitmap_word.word. blk_mq_get_tag() tests the BLK_MQ_S_INACTIVE bit
after tag allocation has succeeded. Both code paths are condensed in the
sketch after this list.
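
For clarity, this is how I would condense the two code paths involved (a
paraphrase from memory, not the actual source; error handling, the hctx
iteration and the sbitmap word/bit bookkeeping are omitted, and the local
variable names are shorthand):

/* CPU hotplug path, blk_mq_hctx_notify_offline() (simplified): */
set_bit(BLK_MQ_S_INACTIVE, &hctx->state);
smp_mb__after_atomic();
/* ... wait until every tag of this hctx has been freed by blk_mq_put_tag() ... */

/* Tag allocation path, blk_mq_get_tag() -> sbitmap_get() (simplified): */
if (!test_and_set_bit_lock(nr, &word->word)) {
	/* tag allocated */
	if (test_bit(BLK_MQ_S_INACTIVE, &data->hctx->state)) {
		/* hctx is going away: free the tag again and fail */
	}
}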

What confuses me is that blk_mq_hctx_notify_offline() uses
smp_mb__after_atomic() to enforce the order of memory accesses while
blk_mq_get_tag() relies on the acquire semantics of
test_and_set_bit_lock(). Usually ordering is enforced by combining two
smp_mb() calls or by combining a store-release with a load-acquire.
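
To make the pattern I am asking about concrete, below is a minimal
userspace model written with C11 atomics. The mapping of set_bit() plus
smp_mb__after_atomic() onto a relaxed RMW followed by a full fence, and
of a successful test_and_set_bit_lock() onto an acquire RMW, is my own
approximation rather than anything the kernel defines, and the variable
names are made up. The outcome the patch depends on being forbidden, as
I understand it, is that both threads print 0 for the other side's flag.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_int hctx_state;	/* stands in for BLK_MQ_S_INACTIVE in hctx->state */
static atomic_int tag_word;	/* stands in for the sbitmap word holding the tag bit */

/* Models blk_mq_hctx_notify_offline(): set_bit() + smp_mb__after_atomic(),
 * then look at whether the tag has been allocated. */
static void *offline_path(void *arg)
{
	atomic_fetch_or_explicit(&hctx_state, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	printf("offline path: tag bit = %d\n",
	       atomic_load_explicit(&tag_word, memory_order_relaxed));
	return NULL;
}

/* Models blk_mq_get_tag(): a successful test_and_set_bit_lock() (acquire),
 * followed by a plain test_bit() of BLK_MQ_S_INACTIVE. */
static void *alloc_path(void *arg)
{
	if (atomic_exchange_explicit(&tag_word, 1, memory_order_acquire) == 0)
		printf("alloc path: INACTIVE = %d\n",
		       atomic_load_explicit(&hctx_state, memory_order_relaxed));
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, offline_path, NULL);
	pthread_create(&t2, NULL, alloc_path, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return 0;
}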

Does the Linux memory model provide the expected ordering guarantees
when combining load-acquire with smp_mb__after_atomic() as used in patch
8/8 of this series?

Thanks,

Bart.
