Date:	Thu, 22 Dec 2011 17:39:00 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Tejun Heo <tj@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: + mempool-fix-first-round-failure-behavior.patch added to -mm
	tree

Hello,

> For the initial allocation, mempool passes modified gfp mask to the
> backing allocator so that it doesn't try too hard when there are reserved
> elements waiting in the pool; however, when that allocation fails and pool
> is empty too, it either waits for the pool to be replenished before
> retrying or fails if !__GFP_WAIT.
>
> * If the caller was calling in with GFP_ATOMIC, it never gets to try
>   emergency reserve.  Allocations which would have succeeded without
>   mempool may fail, which is just wrong.
>
> * Allocation which could have succeeded after a bit of reclaim now has
>   to wait on the reserved items and it's not like mempool doesn't retry
>   with the original gfp mask.  It just does that *after* someone returns
>   an element, pointlessly delaying things.
>
> Fix it by retrying immediately with the gfp mask requested by the caller
> if the first round of allocation attempts fails with modified mask.

I can't even explain why this (simple!) logic looks confusing to me,
with or without the patch. A couple of questions:

	1. Why do we remove __GFP_WAIT unconditionally before the
	   very 1st allocation?

	2. Why do we always restore it after io_schedule(), even if
	   we have the reserved items?
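
To make the questions concrete, the current flow is roughly the following
(a simplified sketch, locking and waiting details elided):

	gfp_temp = gfp_mask & ~(__GFP_WAIT|__GFP_IO);	/* stripped even for the 1st attempt */
repeat_alloc:
	element = pool->alloc(gfp_temp, pool->pool_data);
	if (element)
		return element;

	/* take pool->lock; if pool->curr_nr, return a reserved element */

	if (!(gfp_mask & __GFP_WAIT))
		return NULL;

	gfp_temp = gfp_mask;	/* restored before we sleep */
	/* prepare_to_wait() + io_schedule(), then */
	goto repeat_alloc;	/* retried with the full mask, even if @pool
				   was refilled meanwhile */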

IOW, what do you think about the patch below instead?

Oleg.

--- x/mm/mempool.c
+++ x/mm/mempool.c
@@ -212,10 +212,12 @@ void * mempool_alloc(mempool_t *pool, gf
 	gfp_mask |= __GFP_NORETRY;	/* don't loop in __alloc_pages */
 	gfp_mask |= __GFP_NOWARN;	/* failures are OK */
 
-	gfp_temp = gfp_mask & ~(__GFP_WAIT|__GFP_IO);
-
 repeat_alloc:
 
+	gfp_temp = gfp_mask;
+	if (pool->curr_nr)
+		gfp_temp &= ~(__GFP_WAIT|__GFP_IO);
+
 	element = pool->alloc(gfp_temp, pool->pool_data);
 	if (likely(element != NULL))
 		return element;
@@ -229,13 +231,15 @@ repeat_alloc:
 	}
 
 	/* We must not sleep in the GFP_ATOMIC case */
-	if (!(gfp_mask & __GFP_WAIT)) {
+	if (!(gfp_temp & __GFP_WAIT)) {
 		spin_unlock_irqrestore(&pool->lock, flags);
+		/* raced with another mempool_alloc? */
+		if (gfp_mask & __GFP_WAIT)
+			goto repeat_alloc;
 		return NULL;
 	}
 
 	/* Let's wait for someone else to return an element to @pool */
-	gfp_temp = gfp_mask;
 	init_wait(&wait);
 	prepare_to_wait(&pool->wait, &wait, TASK_UNINTERRUPTIBLE);
 
