Message-ID: <bu3ppl5dpe4kf5ykx4mkkg3vccsqkie335oa7wywv6eyvbp2fk@f4yyphvoczty>
Date: Wed, 16 Apr 2025 11:32:18 +0200
From: Jan Kara <jack@...e.cz>
To: Davidlohr Bueso <dave@...olabs.net>
Cc: jack@...e.cz, tytso@....edu, adilger.kernel@...ger.ca, 
	brauner@...nel.org, mcgrof@...nel.org, willy@...radead.org, hare@...e.de, 
	djwong@...nel.org, linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org, 
	linux-mm@...ck.org
Subject: Re: [PATCH 1/7] fs/buffer: split locking for pagecache lookups

On Tue 15-04-25 16:16:29, Davidlohr Bueso wrote:
> Callers of __find_get_block() may or may not allow for blocking
> semantics, and it is currently assumed that they do not. Lay out
> two paths based on this. The private_lock scheme will continue
> to be used for atomic contexts; otherwise take the folio lock
> instead, which protects the buffers, e.g. against migration and
> try_to_free_buffers().
> 
> Per the "hack idea", the latter can alleviate contention on
> the private_lock for bdev mappings. For reasons of determinism
> and to avoid making bugs hard to reproduce, trylocking is not
> attempted.
> 
> No change in semantics. All lookup users still take the spinlock.
> 
> Signed-off-by: Davidlohr Bueso <dave@...olabs.net>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@...e.cz>

								Honza

> ---
>  fs/buffer.c | 41 +++++++++++++++++++++++++----------------
>  1 file changed, 25 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/buffer.c b/fs/buffer.c
> index b99dc69dba37..c72ebff1b3f0 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -176,18 +176,8 @@ void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
>  }
>  EXPORT_SYMBOL(end_buffer_write_sync);
>  
> -/*
> - * Various filesystems appear to want __find_get_block to be non-blocking.
> - * But it's the page lock which protects the buffers.  To get around this,
> - * we get exclusion from try_to_free_buffers with the blockdev mapping's
> - * i_private_lock.
> - *
> - * Hack idea: for the blockdev mapping, i_private_lock contention
> - * may be quite high.  This code could TryLock the page, and if that
> - * succeeds, there is no need to take i_private_lock.
> - */
>  static struct buffer_head *
> -__find_get_block_slow(struct block_device *bdev, sector_t block)
> +__find_get_block_slow(struct block_device *bdev, sector_t block, bool atomic)
>  {
>  	struct address_space *bd_mapping = bdev->bd_mapping;
>  	const int blkbits = bd_mapping->host->i_blkbits;
> @@ -204,7 +194,16 @@ __find_get_block_slow(struct block_device *bdev, sector_t block)
>  	if (IS_ERR(folio))
>  		goto out;
>  
> -	spin_lock(&bd_mapping->i_private_lock);
> +	/*
> +	 * Folio lock protects the buffers. Callers that cannot block
> +	 * will fallback to serializing vs try_to_free_buffers() via
> +	 * the i_private_lock.
> +	 */
> +	if (atomic)
> +		spin_lock(&bd_mapping->i_private_lock);
> +	else
> +		folio_lock(folio);
> +
>  	head = folio_buffers(folio);
>  	if (!head)
>  		goto out_unlock;
> @@ -236,7 +235,10 @@ __find_get_block_slow(struct block_device *bdev, sector_t block)
>  		       1 << blkbits);
>  	}
>  out_unlock:
> -	spin_unlock(&bd_mapping->i_private_lock);
> +	if (atomic)
> +		spin_unlock(&bd_mapping->i_private_lock);
> +	else
> +		folio_unlock(folio);
>  	folio_put(folio);
>  out:
>  	return ret;
> @@ -1388,14 +1390,15 @@ lookup_bh_lru(struct block_device *bdev, sector_t block, unsigned size)
>   * it in the LRU and mark it as accessed.  If it is not present then return
>   * NULL
>   */
> -struct buffer_head *
> -__find_get_block(struct block_device *bdev, sector_t block, unsigned size)
> +static struct buffer_head *
> +find_get_block_common(struct block_device *bdev, sector_t block,
> +			unsigned size, bool atomic)
>  {
>  	struct buffer_head *bh = lookup_bh_lru(bdev, block, size);
>  
>  	if (bh == NULL) {
>  		/* __find_get_block_slow will mark the page accessed */
> -		bh = __find_get_block_slow(bdev, block);
> +		bh = __find_get_block_slow(bdev, block, atomic);
>  		if (bh)
>  			bh_lru_install(bh);
>  	} else
> @@ -1403,6 +1406,12 @@ __find_get_block(struct block_device *bdev, sector_t block, unsigned size)
>  
>  	return bh;
>  }
> +
> +struct buffer_head *
> +__find_get_block(struct block_device *bdev, sector_t block, unsigned size)
> +{
> +	return find_get_block_common(bdev, block, size, true);
> +}
>  EXPORT_SYMBOL(__find_get_block);
>  
>  /**
> -- 
> 2.39.5
> 
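For illustration only, a minimal sketch of how the blocking path could be
exposed to callers on top of the find_get_block_common() helper introduced
above. The wrapper name __find_get_block_nonatomic() is an assumption here,
not something this patch adds; a later patch in the series would presumably
provide the actual caller-facing helper:

	struct buffer_head *
	__find_get_block_nonatomic(struct block_device *bdev, sector_t block,
				   unsigned size)
	{
		/* May sleep: takes the folio lock rather than i_private_lock */
		return find_get_block_common(bdev, block, size, false);
	}
	EXPORT_SYMBOL(__find_get_block_nonatomic);

Callers in atomic context keep using __find_get_block(), which passes
atomic=true and therefore still serializes against try_to_free_buffers()
via i_private_lock, exactly as before.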
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
