[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20111014162933.d8fead58.akpm@linux-foundation.org>
Date: Fri, 14 Oct 2011 16:29:33 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Marek Szyprowski <m.szyprowski@...sung.com>
Cc: linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-media@...r.kernel.org, linux-mm@...ck.org,
linaro-mm-sig@...ts.linaro.org,
Michal Nazarewicz <mina86@...a86.com>,
Kyungmin Park <kyungmin.park@...sung.com>,
Russell King <linux@....linux.org.uk>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Ankita Garg <ankita@...ibm.com>,
Daniel Walker <dwalker@...eaurora.org>,
Mel Gorman <mel@....ul.ie>, Arnd Bergmann <arnd@...db.de>,
Jesse Barker <jesse.barker@...aro.org>,
Jonathan Corbet <corbet@....net>,
Shariq Hasnain <shariq.hasnain@...aro.org>,
Chunsang Jeong <chunsang.jeong@...aro.org>,
Dave Hansen <dave@...ux.vnet.ibm.com>
Subject: Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added
On Thu, 06 Oct 2011 15:54:42 +0200
Marek Szyprowski <m.szyprowski@...sung.com> wrote:
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
>
> This commit introduces alloc_contig_freed_pages() function
The "freed" seems redundant to me. Wouldn't "alloc_contig_pages" be a
better name?
> which allocates (ie. removes from buddy system) free pages
> in range. Caller has to guarantee that all pages in range
> are in buddy system.
>
> Along with this function, a free_contig_pages() function is
> provided which frees all (or a subset of) pages allocated
> with alloc_contig_free_pages().
>
> Michal Nazarewicz has modified the function to make it easier
> to allocate not MAX_ORDER_NR_PAGES aligned pages by making it
> return pfn of one-past-the-last allocated page.
>
>
> ...
>
> +#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
> +/*
> + * Both PFNs must be from the same zone! If this function returns
> + * true, pfn_to_page(pfn1) + (pfn2 - pfn1) == pfn_to_page(pfn2).
> + */
> +static inline bool zone_pfn_same_memmap(unsigned long pfn1, unsigned long pfn2)
> +{
> + return pfn_to_section_nr(pfn1) == pfn_to_section_nr(pfn2);
> +}
> +
> +#else
> +
> +#define zone_pfn_same_memmap(pfn1, pfn2) (true)
Do this in C, please. It's nicer and can prevent unused-var warnings.
> +#endif
> +
>
> ...
>
> +unsigned long alloc_contig_freed_pages(unsigned long start, unsigned long end,
> + gfp_t flag)
> +{
> + unsigned long pfn = start, count;
> + struct page *page;
> + struct zone *zone;
> + int order;
> +
> + VM_BUG_ON(!pfn_valid(start));
> + page = pfn_to_page(start);
> + zone = page_zone(page);
> +
> + spin_lock_irq(&zone->lock);
> +
> + for (;;) {
> + VM_BUG_ON(page_count(page) || !PageBuddy(page) ||
> + page_zone(page) != zone);
> +
> + list_del(&page->lru);
> + order = page_order(page);
> + count = 1UL << order;
> + zone->free_area[order].nr_free--;
> + rmv_page_order(page);
> + __mod_zone_page_state(zone, NR_FREE_PAGES, -(long)count);
__mod_zone_page_state() generally shouldn't be used - it bypasses the
per-cpu magazines and can introduce high lock contentions.
That's hopefully not an issue on this callpath but it is still a red
flag. I'd suggest at least the addition of a suitably apologetic code
comment here - we don't want people to naively copy this code.
Plus such a comment would let me know why this was done ;)
> + pfn += count;
> + if (pfn >= end)
> + break;
> + VM_BUG_ON(!pfn_valid(pfn));
> +
> + if (zone_pfn_same_memmap(pfn - count, pfn))
> + page += count;
> + else
> + page = pfn_to_page(pfn);
> + }
> +
> + spin_unlock_irq(&zone->lock);
> +
> + /* After this, pages in the range can be freed one be one */
> + count = pfn - start;
> + pfn = start;
> + for (page = pfn_to_page(pfn); count; --count) {
> + prep_new_page(page, 0, flag);
> + ++pfn;
> + if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
> + ++page;
> + else
> + page = pfn_to_page(pfn);
> + }
> +
> + return pfn;
> +}
> +
> +void free_contig_pages(unsigned long pfn, unsigned nr_pages)
> +{
> + struct page *page = pfn_to_page(pfn);
> +
> + while (nr_pages--) {
> + __free_page(page);
> + ++pfn;
> + if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
> + ++page;
> + else
> + page = pfn_to_page(pfn);
> + }
> +}
You're sure these functions don't need EXPORT_SYMBOL()? Maybe the
design is that only DMA core calls into here (if so, that's good).
> #ifdef CONFIG_MEMORY_HOTREMOVE
> /*
> * All pages in the range must be isolated before calling this.
> --
> 1.7.1.569.g6f426
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists