lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z7Oqy2j4xew7FW9Z@dread.disaster.area>
Date: Tue, 18 Feb 2025 08:31:55 +1100
From: Dave Chinner <david@...morbit.com>
To: Yunsheng Lin <linyunsheng@...wei.com>
Cc: Yishai Hadas <yishaih@...dia.com>, Jason Gunthorpe <jgg@...pe.ca>,
	Shameer Kolothum <shameerali.kolothum.thodi@...wei.com>,
	Kevin Tian <kevin.tian@...el.com>,
	Alex Williamson <alex.williamson@...hat.com>,
	Chris Mason <clm@...com>, Josef Bacik <josef@...icpanda.com>,
	David Sterba <dsterba@...e.com>, Gao Xiang <xiang@...nel.org>,
	Chao Yu <chao@...nel.org>, Yue Hu <zbestahu@...il.com>,
	Jeffle Xu <jefflexu@...ux.alibaba.com>,
	Sandeep Dhavale <dhavale@...gle.com>,
	Carlos Maiolino <cem@...nel.org>,
	"Darrick J. Wong" <djwong@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Jesper Dangaard Brouer <hawk@...nel.org>,
	Ilias Apalodimas <ilias.apalodimas@...aro.org>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Simon Horman <horms@...nel.org>,
	Trond Myklebust <trondmy@...nel.org>,
	Anna Schumaker <anna@...nel.org>,
	Chuck Lever <chuck.lever@...cle.com>,
	Jeff Layton <jlayton@...nel.org>, Neil Brown <neilb@...e.de>,
	Olga Kornievskaia <okorniev@...hat.com>,
	Dai Ngo <Dai.Ngo@...cle.com>, Tom Talpey <tom@...pey.com>,
	Luiz Capitulino <luizcap@...hat.com>,
	Mel Gorman <mgorman@...hsingularity.net>, kvm@...r.kernel.org,
	virtualization@...ts.linux.dev, linux-kernel@...r.kernel.org,
	linux-btrfs@...r.kernel.org, linux-erofs@...ts.ozlabs.org,
	linux-xfs@...r.kernel.org, linux-mm@...ck.org,
	netdev@...r.kernel.org, linux-nfs@...r.kernel.org
Subject: Re: [RFC] mm: alloc_pages_bulk: remove assumption of populating only
 NULL elements

On Mon, Feb 17, 2025 at 08:31:23PM +0800, Yunsheng Lin wrote:
> As mentioned in [1], it seems odd to check NULL elements in
> the middle of page bulk allocating, and it seems caller can
> do a better job of bulk allocating pages into a whole array
> sequentially without checking NULL elements first before
> doing the page bulk allocation.
....

IMO, the new API is a poor one, and you've demonstrated it clearly
in this patch.

.....

> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 15bb790359f8..9e1ce0ab9c35 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -377,16 +377,17 @@ xfs_buf_alloc_pages(
>  	 * least one extra page.
>  	 */
>  	for (;;) {
> -		long	last = filled;
> +		long	alloc;
>  
> -		filled = alloc_pages_bulk(gfp_mask, bp->b_page_count,
> -					  bp->b_pages);
> +		alloc = alloc_pages_bulk(gfp_mask, bp->b_page_count - refill,
> +					 bp->b_pages + refill);
> +		refill += alloc;
>  		if (filled == bp->b_page_count) {
>  			XFS_STATS_INC(bp->b_mount, xb_page_found);
>  			break;
>  		}
>  
> -		if (filled != last)
> +		if (alloc)
>  			continue;

You didn't even compile this code - refill is not defined
anywhere.

Even if it did complile, you clearly didn't test it. The logic is
broken (what updates filled?) and will result in the first
allocation attempt succeeding and then falling into an endless retry
loop.

i.e. you stepped on the API landmine of your own creation where
it is impossible to tell the difference between alloc_pages_bulk()
returning "memory allocation failed, you need to retry" and
it returning "array is full, nothing more to allocate". Both these
cases now return 0.

The existing code returns nr_populated in both cases, so it doesn't
matter why alloc_pages_bulk() returns with nr_populated != full, it
is very clear that we still need to allocate more memory to fill it.

The whole point of the existing API is to prevent callers from
making stupid, hard to spot logic mistakes like this. Forcing
callers to track both empty slots and how full the array is itself,
whilst also constraining where in the array empty slots can occur
greatly reduces both the safety and functionality that
alloc_pages_bulk() provides. Anyone that has code that wants to
steal a random page from the array and then refill it now has a heap
more complex code to add to their allocator wrapper.

IOWs, you just demonstrated why the existing API is more desirable
than a highly constrained, slightly faster API that requires callers
to get every detail right. i.e. it's hard to get it wrong with the
existing API, yet it's so easy to make mistakes with the proposed
API that the patch proposing the change has serious bugs in it.

-Dave.
-- 
Dave Chinner
david@...morbit.com

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ