lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200415131915.GV11244@42.do-not-panic.com>
Date:   Wed, 15 Apr 2020 13:19:15 +0000
From:   Luis Chamberlain <mcgrof@...nel.org>
To:     Christoph Hellwig <hch@...radead.org>
Cc:     Alan Jenkins <alan.christopher.jenkins@...il.com>, axboe@...nel.dk,
        viro@...iv.linux.org.uk, bvanassche@....org,
        gregkh@...uxfoundation.org, rostedt@...dmis.org, mingo@...hat.com,
        jack@...e.cz, ming.lei@...hat.com, nstange@...e.de,
        akpm@...ux-foundation.org, mhocko@...e.com, yukuai3@...wei.com,
        linux-block@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Omar Sandoval <osandov@...com>,
        Hannes Reinecke <hare@...e.com>,
        Michal Hocko <mhocko@...nel.org>
Subject: Re: [PATCH 4/5] mm/swapfile: refcount block and queue before using
 blkcg_schedule_throttle()

On Wed, Apr 15, 2020 at 12:34:43AM -0700, Christoph Hellwig wrote:
> On Wed, Apr 15, 2020 at 12:27:12AM -0700, Christoph Hellwig wrote:
> > On Wed, Apr 15, 2020 at 05:42:34AM +0000, Luis Chamberlain wrote:
> > > > I don't understand the atomic part of the comment.  How does
> > > > bdgrab/bdput help us there?
> > > 
> > > The commit log above did a better job at explaining this in terms of our
> > > goal to use the request_queue and how this use would prevent the risk of
> > > releasing the request_queue, which could sleep.
> > 
> > So bdput eventually does and iput, but what leads to an out of context
> > offload?
> > 
> > But anyway, isn't the original problem better solved by simply not
> > releasing the queue from atomic context to start with?  There isn't
> > really any good reason we keep holding the spinlock once we have a
> > reference on the queue, so something like this (not even compile tested)
> > should do the work:
> 
> Actually - mem_cgroup_throttle_swaprate already checks for a non-NULL
> current->throttle_queue above, so we should never even call
> blk_put_queue here.  Was this found by code inspection, or was there
> a real report?

No but report, this code path came up as a possible use of
blk_put_queue() which already exists in atomic context. So yes, it
already uses blk_put_queue(), but how are we *sure* its not going to be
the last one? Because if it is that mean the release will happen in
atomic context, defeating the goal of the last patch in this series.

Using bdgrab() however ensures that during the lifecycle of this path,
blk_put_queue() won't be the last if used after, but instead we are
sure it will be upon disk release.

In fact, with bdgrab() isn't the blk_get_queue() on
blkcg_schedule_throttle() no longer needed?

> In the latter case we need to figure out what protects >throttle_queue,
> as the way blkcg_schedule_throttle first put the reference and only then
> assign a value to it already looks like it introduces a tiny race
> window.
> 
> Otherwise just open coding the applicable part of blkcg_schedule_throttle
> in mem_cgroup_throttle_swaprate seems easiest:
> 
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 5871a2aa86a5..e16051ef074c 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -3761,15 +3761,20 @@ void mem_cgroup_throttle_swaprate(struct mem_cgroup *memcg, int node,
>  	 */
>  	if (current->throttle_queue)
>  		return;
> +	if (unlikely(current->flags & PF_KTHREAD))
> +		return;
>  
>  	spin_lock(&swap_avail_lock);
>  	plist_for_each_entry_safe(si, next, &swap_avail_heads[node],
>  				  avail_lists[node]) {
> -		if (si->bdev) {
> -			blkcg_schedule_throttle(bdev_get_queue(si->bdev),
> -						true);
> -			break;
> +		if (!si->bdev)
> +			continue;
> +		if (blk_get_queue(dev_get_queue(si->bdev))) {
> +			current->throttle_queue = dev_get_queue(si->bdev);
> +			current->use_memdelay = true;
> +			set_notify_resume(current);
>  		}
> +		break;
>  	}
>  	spin_unlock(&swap_avail_lock);
>  }

Sorry, its not clear to me  who calls the respective blk_put_queue()
here?

  Luis

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ