Message-ID: <20201222132428.GA2938310@T590>
Date:   Tue, 22 Dec 2020 21:24:28 +0800
From:   Ming Lei <ming.lei@...hat.com>
To:     John Garry <john.garry@...wei.com>
Cc:     Bart Van Assche <bvanassche@....org>, axboe@...nel.dk,
        linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
        hch@....de, hare@...e.de, kashyap.desai@...adcom.com,
        linuxarm@...wei.com
Subject: Re: [RFC PATCH v2 2/2] blk-mq: Lockout tagset iter when freeing rqs

On Tue, Dec 22, 2020 at 11:22:19AM +0000, John Garry wrote:
> Resend without ppvk@...eaurora.org, which bounces for me
> 
> On 22/12/2020 02:13, Bart Van Assche wrote:
> > On 12/21/20 10:47 AM, John Garry wrote:
> >> Yes, I agree, and I'm not sure what I wrote to give that impression.
> >>
> >> About "root partition", above, I'm just saying that / is mounted on a
> >> sda partition:
> >>
> >> root@...ntu:/home/john# mount | grep sda
> >> /dev/sda2 on / type ext4 (rw,relatime,errors=remount-ro,stripe=32)
> >> /dev/sda1 on /boot/efi type vfat
> >> (rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
> > Hi John,
> >
> 
> Hi Bart, Ming,
> 
> > Thanks for the clarification. I want to take back my suggestion about
> > adding rcu_read_lock() / rcu_read_unlock() in blk_mq_tagset_busy_iter()
> > since it is not allowed to sleep inside an RCU read-side critical
> > section, since blk_mq_tagset_busy_iter() is used in request timeout
> > handling and since there may be blk_mq_ops.timeout implementations that
> > sleep.
> 
> Yes, that's why I was going with atomic, rather than some synchronization
> primitive which may sleep.
> 
> >
> > Ming's suggestion to serialize blk_mq_tagset_busy_iter() and
> > blk_mq_free_rqs() looks interesting to me.
> >
> 
> So then we could have something like this:
> 
> ---8<---
> 
> @@ -435,9 +444,13 @@ void blk_mq_queue_tag_busy_iter(struct request_queue *q,
> busy_iter_fn *fn,
>     if (!blk_mq_hw_queue_mapped(hctx))
>             continue;
> 
> +    while (!atomic_inc_not_zero(&tags->iter_usage_counter));
> +
>     if (tags->nr_reserved_tags)
>         bt_for_each(hctx, tags->breserved_tags, fn, priv, true);
>     bt_for_each(hctx, tags->bitmap_tags, fn, priv, false);
> 
> +    atomic_dec(&tags->iter_usage_counter);
> }

That is then just a spin_lock variant, so you may have to consider
lock validation.

For example, scsi_host_busy() is called from scsi_log_completion()<-scsi_softirq_done(),
which may run in irq context, so a deadlock can be triggered if the irq
fires while requests are being freed.
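
To make the concern concrete, here is a minimal userspace sketch of that scenario. It is not kernel code: struct tags_model and the model_* helpers are hypothetical names, C11 atomics stand in for the kernel's atomic_t, and the freeing side is assumed to lock iterators out by taking the counter from 1 to 0 (the diff above shows only the iterator side). The iterator spin is bounded here only so that the model terminates; the proposed loop has no such bound.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for tags->iter_usage_counter in the patch.
 * Assumed encoding: 1 = idle, >1 = iterators active, 0 = freeing. */
struct tags_model {
    atomic_int iter_usage_counter;
};

/* Userspace model of atomic_inc_not_zero(): increment unless zero. */
static bool model_inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}

/* Assumed freeing side: move 1 -> 0 to lock out new iterators. */
static bool model_free_begin(struct tags_model *t)
{
    int expected = 1;

    return atomic_compare_exchange_strong(&t->iter_usage_counter,
                                          &expected, 0);
}

static void model_free_end(struct tags_model *t)
{
    atomic_store(&t->iter_usage_counter, 1);
}

/* Iterator entry; bounded so the model terminates (the patch spins forever). */
static bool model_iter_enter(struct tags_model *t, int max_spins)
{
    for (int i = 0; i < max_spins; i++) {
        if (model_inc_not_zero(&t->iter_usage_counter))
            return true;
    }
    return false;
}

static void model_iter_exit(struct tags_model *t)
{
    int old = atomic_fetch_sub(&t->iter_usage_counter, 1);

    assert(old > 1); /* an iterator must hold a reference here */
    (void)old;
}

/* Replays "irq fires on the same CPU while requests are being freed".
 * The freeing code cannot resume until the handler returns, so the
 * counter stays 0 for as long as the handler keeps spinning. */
static bool irq_iter_during_free_makes_progress(void)
{
    struct tags_model t;
    bool entered;

    atomic_init(&t.iter_usage_counter, 1);
    if (!model_free_begin(&t))
        return false;

    entered = model_iter_enter(&t, 1000); /* the "irq handler" runs here */
    if (entered)
        model_iter_exit(&t);

    model_free_end(&t); /* reached only because the spin was bounded */
    return entered;
}
```

In this model the handler never gets in, however many spins it is given, because the only code that can restore the counter is the code it interrupted. That is the same shape as taking a plain spin_lock in process context and again from an irq handler on the same CPU.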

thanks,
Ming
