Message-ID: <ZshyPVEc9w4sqXJy@fedora>
Date: Fri, 23 Aug 2024 19:27:57 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Muchun Song <songmuchun@...edance.com>
Cc: axboe@...nel.dk, linux-block@...r.kernel.org,
	linux-kernel@...r.kernel.org, ming.lei@...hat.com
Subject: Re: [PATCH 4/4] block: fix fix ordering between checking
 QUEUE_FLAG_QUIESCED and adding requests to hctx->dispatch

On Sun, Aug 11, 2024 at 06:19:21PM +0800, Muchun Song wrote:
> Supposing the following scenario.
> 
> CPU0                                                                CPU1
> 
> blk_mq_request_issue_directly()                                     blk_mq_unquiesce_queue()
>     if (blk_queue_quiesced())                                           blk_queue_flag_clear(QUEUE_FLAG_QUIESCED)   3) store
>         blk_mq_insert_request()                                         blk_mq_run_hw_queues()
>             /*                                                              blk_mq_run_hw_queue()
>              * Add request to dispatch list or set bitmap of                    if (!blk_mq_hctx_has_pending())     4) load
>              * software queue.                  1) store                            return
>              */
>         blk_mq_run_hw_queue()
>             if (blk_queue_quiesced())           2) load
>                 return
>             blk_mq_sched_dispatch_requests()
> 
> The full memory barrier should be inserted between 1) and 2), as well as
> between 3) and 4) to make sure that either CPU0 sees QUEUE_FLAG_QUIESCED is
> cleared or CPU1 sees dispatch list or setting of bitmap of software queue.
> Otherwise, either CPU will not re-run the hardware queue causing starvation.
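
The scenario quoted above is the classic store-buffering pattern: each side stores one variable and then loads the other. A minimal userspace sketch with C11 atomics, not kernel code, may make the required ordering concrete. The names quiesced, pending, submit_side() and unquiesce_side() are hypothetical stand-ins for QUEUE_FLAG_QUIESCED, the dispatch list/software queue bitmap, and the two code paths in the diagram.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the kernel state in the diagram. */
static atomic_bool quiesced = true;	/* models QUEUE_FLAG_QUIESCED */
static atomic_bool pending = false;	/* models "hctx has pending work" */

/*
 * CPU0 side: 1) store "request pending", full fence, 2) load quiesced.
 * Returns true if this side would go on to dispatch.
 */
static bool submit_side(void)
{
	atomic_store_explicit(&pending, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return !atomic_load_explicit(&quiesced, memory_order_relaxed);
}

/*
 * CPU1 side: 3) clear quiesced, full fence, 4) load pending.
 * Returns true if this side would re-run the hardware queue.
 */
static bool unquiesce_side(void)
{
	atomic_store_explicit(&quiesced, false, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&pending, memory_order_relaxed);
}
```

With both fences in place, at least one of the two functions returns true when they run concurrently. Without them, each load may be satisfied before the other side's store becomes visible, and both can return false, which corresponds to the starvation described above.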

A memory barrier shouldn't serve as the bug fix for two slow code paths.

One simple fix is to add a helper, blk_queue_quiesced_lock(), and
use the following check on CPU0:

	if (blk_queue_quiesced_lock())
		blk_mq_run_hw_queue();
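
A rough userspace sketch of the locking idea follows. It assumes the helper reads the flag under a lock that the unquiesce path also holds while clearing it. Which lock that would be in the kernel is not stated here, and struct queue_model, queue_quiesced_locked(), locked_submit() and locked_unquiesce() are hypothetical names used only for illustration.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the queue state; not the kernel's data structure. */
struct queue_model {
	pthread_mutex_t lock;
	bool quiesced;		/* protected by lock */
	atomic_bool pending;	/* set before the locked check, read after the clear */
};

/* Read the quiesced flag under the same lock used to clear it. */
static bool queue_quiesced_locked(struct queue_model *q)
{
	pthread_mutex_lock(&q->lock);
	bool ret = q->quiesced;
	pthread_mutex_unlock(&q->lock);
	return ret;
}

/* CPU0 side: publish the request, then do the locked check.
 * Returns true if this side would dispatch. */
static bool locked_submit(struct queue_model *q)
{
	atomic_store_explicit(&q->pending, true, memory_order_relaxed);
	return !queue_quiesced_locked(q);
}

/* CPU1 side: clear the flag under the lock, then look for pending work.
 * Returns true if this side would re-run the hardware queue. */
static bool locked_unquiesce(struct queue_model *q)
{
	pthread_mutex_lock(&q->lock);
	q->quiesced = false;
	pthread_mutex_unlock(&q->lock);
	return atomic_load_explicit(&q->pending, memory_order_relaxed);
}
```

The two critical sections are ordered by the lock. If CPU0's check comes first, its earlier store to pending is visible to CPU1 after CPU1 takes the lock; if CPU1's clear comes first, CPU0 sees the flag cleared. Either way one side runs the queue, without an explicit barrier in either path.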


thanks,
Ming

