If we exhaust the reserves in the page allocator while PF_MEMALLOC is set,
no longer give up. Instead, call into reclaim with PF_MEMALLOC still set.

This is in essence a recursive call back into page reclaim with an
additional gfp flag (__GFP_NOMEMALLOC) set. The recursion is bounded because
any allocation made with __GFP_NOMEMALLOC set will not enter that branch
again.

This means that allocations under PF_MEMALLOC no longer fail outright once
the reserves are exhausted. They do a limited form of reclaim instead.

The reclaim is of particular importance to stacked filesystems, which may do
a lot of allocations in the write path. Reclaim keeps working as long as
there are clean file-backed pages to reclaim.

Signed-off-by: Christoph Lameter

---
 mm/page_alloc.c |   11 +++++++++++
 1 file changed, 11 insertions(+)

Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c	2007-08-13 23:50:01.000000000 -0700
+++ linux-2.6/mm/page_alloc.c	2007-08-13 23:58:43.000000000 -0700
@@ -1306,6 +1306,17 @@ nofail_alloc:
 					zonelist, ALLOC_NO_WATERMARKS);
 			if (page)
 				goto got_pg;
+			/*
+			 * If we are already in reclaim then the environment
+			 * is already set up. We can simply call
+			 * try_to_free_pages(). Just make sure that
+			 * we do not allocate anything.
+			 */
+			if (p->flags & PF_MEMALLOC && wait &&
+				try_to_free_pages(zonelist->zones, order,
+					gfp_mask | __GFP_NOMEMALLOC))
+				goto restart;
+
 			if (gfp_mask & __GFP_NOFAIL) {
 				congestion_wait(WRITE, HZ/50);
 				goto nofail_alloc;
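To illustrate why the recursion described above stays bounded, here is a minimal userspace sketch in C. It is not kernel code: the SIM_* flags, struct sim_state, sim_alloc_page() and sim_try_to_free_pages() are hypothetical stand-ins invented for this illustration, and the page pool is reduced to simple counters. It assumes, as the description states, that any allocation made during the nested reclaim carries the no-memalloc flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel flags; values are illustrative. */
#define SIM_PF_MEMALLOC    0x1u	/* task is already inside reclaim */
#define SIM_GFP_WAIT       0x2u	/* caller may sleep ("wait" in the patch) */
#define SIM_GFP_NOMEMALLOC 0x4u	/* do not fall back to reserves or reclaim */

struct sim_state {
	unsigned task_flags;	/* models p->flags */
	int free_pages;		/* pages left, reserves included */
	int clean_file_pages;	/* pages reclaimable without writeback */
	int reclaim_depth;	/* current nesting of simulated reclaim */
	int max_depth;		/* deepest nesting observed */
};

static bool sim_alloc_page(struct sim_state *s, unsigned gfp);

/* Simulated reclaim: returns true if it freed at least one page. */
static bool sim_try_to_free_pages(struct sim_state *s, unsigned gfp)
{
	bool progress = false;

	s->reclaim_depth++;
	if (s->reclaim_depth > s->max_depth)
		s->max_depth = s->reclaim_depth;

	/*
	 * Assumption: reclaim may itself try to allocate. Because gfp
	 * carries SIM_GFP_NOMEMALLOC here, that attempt cannot recurse.
	 */
	(void)sim_alloc_page(s, gfp);
	assert(s->reclaim_depth == 1);

	if (s->clean_file_pages > 0) {
		s->clean_file_pages--;
		s->free_pages++;
		progress = true;
	}
	s->reclaim_depth--;
	return progress;
}

/* Simulated allocation; only the PF_MEMALLOC slow path is modelled. */
static bool sim_alloc_page(struct sim_state *s, unsigned gfp)
{
	for (;;) {	/* each iteration models one "goto restart" */
		if (s->free_pages > 0) {
			s->free_pages--;
			return true;
		}
		/* Pool exhausted: decide whether reclaim is permitted. */
		if (!(s->task_flags & SIM_PF_MEMALLOC) ||
		    !(gfp & SIM_GFP_WAIT) ||
		    (gfp & SIM_GFP_NOMEMALLOC))
			return false;
		if (!sim_try_to_free_pages(s, gfp | SIM_GFP_NOMEMALLOC))
			return false;	/* no clean pages left to reclaim */
	}
}
```

In this model, a caller with SIM_PF_MEMALLOC set and an empty pool succeeds exactly as many times as there are clean file pages, and the recorded nesting depth never exceeds one. Once clean_file_pages reaches zero the allocation fails again, which mirrors the caveat that reclaim only works while clean file-backed pages remain.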