linux-kernel - Re: [PATCH] zswap: do not crash the kernel on decompression failure

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20250226035730.GA1775487@cmpxchg.org>
Date: Tue, 25 Feb 2025 22:57:30 -0500
From: Johannes Weiner <hannes@...xchg.org>
To: Yosry Ahmed <yosry.ahmed@...ux.dev>
Cc: Nhat Pham <nphamcs@...il.com>, akpm@...ux-foundation.org,
	chengming.zhou@...ux.dev, linux-mm@...ck.org, kernel-team@...a.com,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] zswap: do not crash the kernel on decompression failure

On Wed, Feb 26, 2025 at 02:45:41AM +0000, Yosry Ahmed wrote:
> On Tue, Feb 25, 2025 at 07:51:49PM -0500, Johannes Weiner wrote:
> > On Tue, Feb 25, 2025 at 01:32:00PM -0800, Nhat Pham wrote:
> > > +	}
> > >  	mutex_unlock(&acomp_ctx->mutex);
> > >  
> > >  	if (src != acomp_ctx->buffer)
> > >  		zpool_unmap_handle(zpool, entry->handle);
> > > +	return ret;
> > >  }
> > >  
> > >  /*********************************
> > > @@ -1018,6 +1028,7 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
> > >  	struct writeback_control wbc = {
> > >  		.sync_mode = WB_SYNC_NONE,
> > >  	};
> > > +	int ret = 0;
> > >  
> > >  	/* try to allocate swap cache folio */
> > >  	mpol = get_task_policy(current);
> > > @@ -1034,8 +1045,8 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
> > >  	 * and freed when invalidated by the concurrent shrinker anyway.
> > >  	 */
> > >  	if (!folio_was_allocated) {
> > > -		folio_put(folio);
> > > -		return -EEXIST;
> > > +		ret = -EEXIST;
> > > +		goto put_folio;
> > >  	}
> > >  
> > >  	/*
> > > @@ -1048,14 +1059,17 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
> > >  	 * be dereferenced.
> > >  	 */
> > >  	tree = swap_zswap_tree(swpentry);
> > > -	if (entry != xa_cmpxchg(tree, offset, entry, NULL, GFP_KERNEL)) {
> > > -		delete_from_swap_cache(folio);
> > > -		folio_unlock(folio);
> > > -		folio_put(folio);
> > > -		return -ENOMEM;
> > > +	if (entry != xa_load(tree, offset)) {
> > > +		ret = -ENOMEM;
> > > +		goto fail;
> > >  	}
> > >  
> > > -	zswap_decompress(entry, folio);
> > > +	if (!zswap_decompress(entry, folio)) {
> > > +		ret = -EIO;
> > > +		goto fail;
> > > +	}
> > > +
> > > +	xa_erase(tree, offset);
> > >  
> > >  	count_vm_event(ZSWPWB);
> > >  	if (entry->objcg)
> > > @@ -1071,9 +1085,14 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
> > >  
> > >  	/* start writeback */
> > >  	__swap_writepage(folio, &wbc);
> > > -	folio_put(folio);
> > > +	goto put_folio;
> > >  
> > > -	return 0;
> > > +fail:
> > > +	delete_from_swap_cache(folio);
> > > +	folio_unlock(folio);
> > > +put_folio:
> > > +	folio_put(folio);
> > > +	return ret;
> > >  }
> > 
> > Nice, yeah it's time for factoring out the error unwinding. If you
> > write it like this, you can save a jump in the main sequence:
> > 
> > 	__swap_writepage(folio, &wbc);
> > 	ret = 0;
> > put:
> > 	folio_put(folio);
> > 	return ret;
> > delete_unlock:
> 
> (I like how you sneaked the label rename in here, I didn't like 'fail'
> either :P)
> 
> > 	delete_from_swap_cache(folio);
> > 	folio_unlock(folio);
> > 	goto put;
> 
> I would go even further and avoid gotos completely (and make it super
> clear what gets executed in the normal path vs the failure path):
> 
> 	__swap_writepage(folio, &wbc);
> 	folio_put(folio);
> 	if (ret) {
> 		delete_from_swap_cache(folio);
> 		folio_unlock(folio);
> 	}
> 	return ret;

The !folio_was_allocated case only needs the put. I guess that could
stay open-coded.

And I think you still need one goto for the other two error legs to
jump past the __swap_writepage.

> > Something like this?
> > 
> > 	if (!zswap_decompress(entry, folio)) {
> > 		/*
> > 		 * The zswap_load() return value doesn't indicate success or
> > 		 * failure, but whether zswap owns the swapped out contents.
> > 		 * This MUST return true here, otherwise swap_readpage() will
> > 		 * read garbage from the backend.
> > 		 *
> > 		 * Success is signaled by marking the folio uptodate.
> > 		 */
> 
> We use the same trick in the folio_test_large() branch, so maybe this
> should be moved to above the function definition. Then we can perhaps
> refer to it in places where we return true wihout setting uptodate for
> added clarity if needed.

That makes sense to me. Nhat, what do you think?