Message-ID: <CADtm3G4CEhpmrohufmthB_1a49bKEVdVUAQxjWtigq07G4QeTQ@mail.gmail.com>
Date:	Mon, 5 Jan 2015 20:01:45 -0800
From:	Gregory Fong <gregory.0xf0@...il.com>
To:	Joonsoo Kim <iamjoonsoo.kim@....com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Rik van Riel <riel@...hat.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Mel Gorman <mgorman@...e.de>,
	Laura Abbott <lauraa@...eaurora.org>,
	Minchan Kim <minchan@...nel.org>,
	Heesub Shin <heesub.shin@...sung.com>, Marek@...per.es,
	linux-mm@...ck.org,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 2/3] CMA: aggressively allocate the pages on cma
 reserved memory when not used

+linux-mm and linux-kernel (not sure how those got removed from cc,
sorry about that)

On Mon, Jan 5, 2015 at 7:58 PM, Gregory Fong <gregory.0xf0@...il.com> wrote:
> Hi Joonsoo,
>
> On Wed, May 28, 2014 at 12:04 AM, Joonsoo Kim <iamjoonsoo.kim@....com> wrote:
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 674ade7..ca678b6 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -788,6 +788,56 @@ void __init __free_pages_bootmem(struct page *page, unsigned int order)
>>  }
>>
>>  #ifdef CONFIG_CMA
>> +void adjust_managed_cma_page_count(struct zone *zone, long count)
>> +{
>> +       unsigned long flags;
>> +       long total, cma, movable;
>> +
>> +       spin_lock_irqsave(&zone->lock, flags);
>> +       zone->managed_cma_pages += count;
>> +
>> +       total = zone->managed_pages;
>> +       cma = zone->managed_cma_pages;
>> +       movable = total - cma - high_wmark_pages(zone);
>> +
>> +       /* No cma pages, so do only movable allocation */
>> +       if (cma <= 0) {
>> +               zone->max_try_movable = pageblock_nr_pages;
>> +               zone->max_try_cma = 0;
>> +               goto out;
>> +       }
>> +
>> +       /*
>> +        * We want to consume cma pages with well balanced ratio so that
>> +        * we have consumed enough cma pages before the reclaim. For this
>> +        * purpose, we can use the ratio, movable : cma. And we doesn't
>> +        * want to switch too frequently, because it prevent allocated pages
>> +        * from beging successive and it is bad for some sorts of devices.
>> +        * I choose pageblock_nr_pages for the minimum amount of successive
>> +        * allocation because it is the size of a huge page and fragmentation
>> +        * avoidance is implemented based on this size.
>> +        *
>> +        * To meet above criteria, I derive following equation.
>> +        *
>> +        * if (movable > cma) then; movable : cma = X : pageblock_nr_pages
>> +        * else (movable <= cma) then; movable : cma = pageblock_nr_pages : X
>> +        */
>> +       if (movable > cma) {
>> +               zone->max_try_movable =
>> +                       (movable * pageblock_nr_pages) / cma;
>> +               zone->max_try_cma = pageblock_nr_pages;
>> +       } else {
>> +               zone->max_try_movable = pageblock_nr_pages;
>> +               zone->max_try_cma = cma * pageblock_nr_pages / movable;
>
> I don't know if anyone has already pointed this out (I didn't see
> anything when searching LKML), but while testing this patch I noticed
> that it can divide by zero under memory pressure, when movable drops
> to 0. That is not unlikely when the majority of pages are in CMA
> regions (this may seem pathological, but we actually do this right now).
>
> [    0.249674] Division by zero in kernel.
> [    0.249682] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.14.13-1.3pre-00368-g4d90957-dirty #10
> [    0.249710] [<c001619c>] (unwind_backtrace) from [<c0011fa4>] (show_stack+0x10/0x14)
> [    0.249725] [<c0011fa4>] (show_stack) from [<c0538d6c>] (dump_stack+0x80/0x90)
> [    0.249740] [<c0538d6c>] (dump_stack) from [<c025e9d0>] (Ldiv0+0x8/0x10)
> [    0.249751] [<c025e9d0>] (Ldiv0) from [<c0094ba4>] (adjust_managed_cma_page_count+0x64/0xd8)
> [    0.249762] [<c0094ba4>] (adjust_managed_cma_page_count) from [<c00cb2f4>] (cma_release+0xa8/0xe0)
> [    0.249776] [<c00cb2f4>] (cma_release) from [<c0721698>] (cma_drvr_probe+0x378/0x470)
> [    0.249787] [<c0721698>] (cma_drvr_probe) from [<c02ce9cc>] (platform_drv_probe+0x18/0x48)
> [    0.249799] [<c02ce9cc>] (platform_drv_probe) from [<c02ccfb0>] (driver_probe_device+0xac/0x3a4)
> [    0.249808] [<c02ccfb0>] (driver_probe_device) from [<c02cd378>] (__driver_attach+0x8c/0x90)
> [    0.249817] [<c02cd378>] (__driver_attach) from [<c02cb390>] (bus_for_each_dev+0x60/0x94)
> [    0.249825] [<c02cb390>] (bus_for_each_dev) from [<c02cc674>] (bus_add_driver+0x15c/0x218)
> [    0.249834] [<c02cc674>] (bus_add_driver) from [<c02cd9a0>] (driver_register+0x78/0xf8)
> [    0.249841] [<c02cd9a0>] (driver_register) from [<c02cea24>] (platform_driver_probe+0x20/0xa4)
> [    0.249849] [<c02cea24>] (platform_driver_probe) from [<c0008958>] (do_one_initcall+0xd4/0x17c)
> [    0.249857] [<c0008958>] (do_one_initcall) from [<c0719d00>] (kernel_init_freeable+0x13c/0x1dc)
> [    0.249864] [<c0719d00>] (kernel_init_freeable) from [<c0534578>] (kernel_init+0x8/0xe8)
> [    0.249873] [<c0534578>] (kernel_init) from [<c000ed78>] (ret_from_fork+0x14/0x3c)
>
> We could probably just add a check above this, similar to the "no cma
> pages" case, like:
>
> /* No movable pages, so only do CMA allocation */
> if (movable <= 0) {
>         zone->max_try_cma = pageblock_nr_pages;
>         zone->max_try_movable = 0;
>         goto out;
> }
>
>> +       }
>> +
>> +out:
>> +       zone->nr_try_movable = zone->max_try_movable;
>> +       zone->nr_try_cma = zone->max_try_cma;
>> +
>> +       spin_unlock_irqrestore(&zone->lock, flags);
>> +}
>> +
>
> Best regards,
> Gregory
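
For reference, the ratio computation with the proposed guard can be
sketched as a standalone userspace function. This is only an
illustration, not the kernel code: struct zone_sketch,
compute_try_limits(), and the PAGEBLOCK_NR_PAGES value of 1024 are
hypothetical stand-ins for the struct zone fields and
pageblock_nr_pages used in the patch, and the zone->lock handling and
nr_try_* bookkeeping are left out.

```c
/* Hypothetical stand-in for the struct zone fields the patch touches. */
struct zone_sketch {
	long managed_pages;     /* total managed pages in the zone */
	long managed_cma_pages; /* managed pages that sit in CMA regions */
	long high_wmark;        /* stand-in for high_wmark_pages(zone) */
	long max_try_movable;
	long max_try_cma;
};

/* Assumed value for illustration; the kernel derives pageblock_nr_pages. */
#define PAGEBLOCK_NR_PAGES 1024L

static void compute_try_limits(struct zone_sketch *z)
{
	long cma = z->managed_cma_pages;
	long movable = z->managed_pages - cma - z->high_wmark;

	/* No cma pages, so do only movable allocation */
	if (cma <= 0) {
		z->max_try_movable = PAGEBLOCK_NR_PAGES;
		z->max_try_cma = 0;
		return;
	}

	/*
	 * Proposed guard: no movable pages, so only do CMA allocation.
	 * Without it, the else branch below divides by movable == 0.
	 */
	if (movable <= 0) {
		z->max_try_cma = PAGEBLOCK_NR_PAGES;
		z->max_try_movable = 0;
		return;
	}

	/* Both values are positive here, so neither division can trap. */
	if (movable > cma) {
		z->max_try_movable = (movable * PAGEBLOCK_NR_PAGES) / cma;
		z->max_try_cma = PAGEBLOCK_NR_PAGES;
	} else {
		z->max_try_movable = PAGEBLOCK_NR_PAGES;
		z->max_try_cma = (cma * PAGEBLOCK_NR_PAGES) / movable;
	}
}
```

With managed_pages = 1000, managed_cma_pages = 900 and high_wmark = 100,
movable comes out as 0; the guard then sets max_try_cma to
PAGEBLOCK_NR_PAGES and max_try_movable to 0 instead of reaching the
division. The check uses <= rather than == because the subtraction can
also go negative when the watermark exceeds the non-CMA pages.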