Message-ID: <20111222163900.GA1448@redhat.com>
Date: Thu, 22 Dec 2011 17:39:00 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: Tejun Heo <tj@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-kernel@...r.kernel.org
Subject: Re: + mempool-fix-first-round-failure-behavior.patch added to -mm tree
Hello,
> For the initial allocation, mempool passes modified gfp mask to the
> backing allocator so that it doesn't try too hard when there are reserved
> elements waiting in the pool; however, when that allocation fails and pool
> is empty too, it either waits for the pool to be replenished before
> retrying or fails if !__GFP_WAIT.
>
> * If the caller was calling in with GFP_ATOMIC, it never gets to try
> the emergency reserve. Allocations which would have succeeded without
> mempool may fail, which is just wrong.
>
> * Allocation which could have succeeded after a bit of reclaim now has
> to wait on the reserved items and it's not like mempool doesn't retry
> with the original gfp mask. It just does that *after* someone returns
> an element, pointlessly delaying things.
>
> Fix it by retrying immediately with the gfp mask requested by the caller
> if the first round of allocation attempts fails with modified mask.
I can't even explain why this (simple!) logic looks confusing to me,
with or without the patch. A couple of questions:
1. Why do we remove __GFP_WAIT unconditionally before the
very first allocation?
2. Why do we always restore it after io_schedule(), even if
we have the reserved items?
IOW, what do you think about the patch below instead?
Oleg.
--- x/mm/mempool.c
+++ x/mm/mempool.c
@@ -212,10 +212,12 @@ void * mempool_alloc(mempool_t *pool, gf
 	gfp_mask |= __GFP_NORETRY;	/* don't loop in __alloc_pages */
 	gfp_mask |= __GFP_NOWARN;	/* failures are OK */
 
-	gfp_temp = gfp_mask & ~(__GFP_WAIT|__GFP_IO);
-
 repeat_alloc:
+	gfp_temp = gfp_mask;
+	if (pool->curr_nr)
+		gfp_temp &= ~(__GFP_WAIT|__GFP_IO);
+
 	element = pool->alloc(gfp_temp, pool->pool_data);
 	if (likely(element != NULL))
 		return element;
 
@@ -229,13 +231,15 @@ repeat_alloc:
 	}
 
 	/* We must not sleep in the GFP_ATOMIC case */
-	if (!(gfp_mask & __GFP_WAIT)) {
+	if (!(gfp_temp & __GFP_WAIT)) {
 		spin_unlock_irqrestore(&pool->lock, flags);
+		/* raced with another mempool_alloc? */
+		if (gfp_mask & __GFP_WAIT)
+			goto repeat_alloc;
 		return NULL;
 	}
 
 	/* Let's wait for someone else to return an element to @pool */
-	gfp_temp = gfp_mask;
 	init_wait(&wait);
 	prepare_to_wait(&pool->wait, &wait, TASK_UNINTERRUPTIBLE);
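For what it's worth, the mask handling the patch proposes can be modelled in
userspace. This is a sketch only: the SIM_GFP_* constants and both helper names
are invented here (the kernel code manipulates gfp_temp inline, not through
helpers), but the logic mirrors the two decisions above -- relax the mask only
while the pool holds reserves, and loop back rather than sleep when the failed
attempt used the relaxed mask but the caller itself may wait.

```c
#include <stdbool.h>

/*
 * Userspace sketch only: SIM_GFP_WAIT/SIM_GFP_IO stand in for the
 * kernel's __GFP_WAIT/__GFP_IO bits.
 */
#define SIM_GFP_WAIT 0x1u
#define SIM_GFP_IO   0x2u

/*
 * Mask used for the backing-allocator attempt at repeat_alloc: under
 * the patch above, the relaxed (no-sleep, no-IO) mask is used only
 * while the pool still holds reserved elements (curr_nr != 0).
 */
unsigned int first_try_mask(unsigned int gfp_mask, int curr_nr)
{
	unsigned int gfp_temp = gfp_mask;

	if (curr_nr)
		gfp_temp &= ~(SIM_GFP_WAIT | SIM_GFP_IO);
	return gfp_temp;
}

/*
 * The "raced with another mempool_alloc?" branch: the attempt that just
 * failed used a mask without SIM_GFP_WAIT, but the caller's own mask
 * allows sleeping, so jump back to repeat_alloc (which will now use the
 * full mask if the pool is empty) instead of returning NULL.
 */
bool should_retry_immediately(unsigned int gfp_mask, unsigned int gfp_temp)
{
	return !(gfp_temp & SIM_GFP_WAIT) && (gfp_mask & SIM_GFP_WAIT);
}
```

A GFP_ATOMIC-style caller (no SIM_GFP_WAIT in its mask) never retries and
never sleeps; a sleeping caller that raced with the pool being drained gets
one more pass with its original mask before touching the waitqueue.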