lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 2 Jun 2014 09:37:49 +0530
From:	Ritesh Harjani <ritesh.list@...il.com>
To:	Joonsoo Kim <js1304@...il.com>
Cc:	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Rik van Riel <riel@...hat.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Mel Gorman <mgorman@...e.de>,
	Laura Abbott <lauraa@...eaurora.org>,
	Minchan Kim <minchan@...nel.org>,
	Heesub Shin <heesub.shin@...sung.com>,
	Marek Szyprowski <m.szyprowski@...sung.com>,
	Michal Nazarewicz <mina86@...a86.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
	Linux Memory Management List <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Nagachandra P <nagachandra@...il.com>,
	Vinayak Menon <menon.vinayak@...il.com>,
	Ritesh Harjani <ritesh.harjani@...il.com>,
	t.stanislaws@...sung.com
Subject: Re: [PATCH v2 3/3] CMA: always treat free cma pages as non-free on
 watermark checking

Hi Joonsoo,

CC'ing the developer of the patch (Tomasz Stanislawski)


On Fri, May 30, 2014 at 8:16 PM, Joonsoo Kim <js1304@...il.com> wrote:
> 2014-05-30 19:40 GMT+09:00 Ritesh Harjani <ritesh.list@...il.com>:
>> Hi Joonsoo,
>>
>> I think you will be loosing the benefit of below patch with your changes.
>> I am no expert here so please bear with me. I tried explaining in the
>> inline comments, let me know if I am wrong.
>>
>> commit 026b08147923142e925a7d0aaa39038055ae0156
>> Author: Tomasz Stanislawski <t.stanislaws@...sung.com>
>> Date:   Wed Jun 12 14:05:02 2013 -0700
>
> Hello, Ritesh.
>
> Thanks for notifying that.
>
>>
>> On Wed, May 28, 2014 at 12:34 PM, Joonsoo Kim <iamjoonsoo.kim@....com> wrote:
>>> commit d95ea5d1('cma: fix watermark checking') introduces ALLOC_CMA flag
>>> for alloc flag and treats free cma pages as free pages if this flag is
>>> passed to watermark checking. Intention of that patch is that movable page
>>> allocation can be be handled from cma reserved region without starting
>>> kswapd. Now, previous patch changes the behaviour of allocator that
>>> movable allocation uses the page on cma reserved region aggressively,
>>> so this watermark hack isn't needed anymore. Therefore remove it.
>>>
>>> Acked-by: Michal Nazarewicz <mina86@...a86.com>
>>> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@....com>
>>>
>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>> index 627dc2e..36e2fcd 100644
>>> --- a/mm/compaction.c
>>> +++ b/mm/compaction.c
>>> @@ -1117,10 +1117,6 @@ unsigned long try_to_compact_pages(struct zonelist *zonelist,
>>>
>>>         count_compact_event(COMPACTSTALL);
>>>
>>> -#ifdef CONFIG_CMA
>>> -       if (allocflags_to_migratetype(gfp_mask) == MIGRATE_MOVABLE)
>>> -               alloc_flags |= ALLOC_CMA;
>>> -#endif
>>>         /* Compact each zone in the list */
>>>         for_each_zone_zonelist_nodemask(zone, z, zonelist, high_zoneidx,
>>>                                                                 nodemask) {
>>> diff --git a/mm/internal.h b/mm/internal.h
>>> index 07b6736..a121762 100644
>>> --- a/mm/internal.h
>>> +++ b/mm/internal.h
>>> @@ -384,7 +384,6 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
>>>  #define ALLOC_HARDER           0x10 /* try to alloc harder */
>>>  #define ALLOC_HIGH             0x20 /* __GFP_HIGH set */
>>>  #define ALLOC_CPUSET           0x40 /* check for correct cpuset */
>>> -#define ALLOC_CMA              0x80 /* allow allocations from CMA areas */
>>> -#define ALLOC_FAIR             0x100 /* fair zone allocation */
>>> +#define ALLOC_FAIR             0x80 /* fair zone allocation */
>>>
>>>  #endif /* __MM_INTERNAL_H */
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index ca678b6..83a8021 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -1764,20 +1764,22 @@ static bool __zone_watermark_ok(struct zone *z, int order, unsigned long mark,
>>>         long min = mark;
>>>         long lowmem_reserve = z->lowmem_reserve[classzone_idx];
>>>         int o;
>>> -       long free_cma = 0;
>>>
>>>         free_pages -= (1 << order) - 1;
>>>         if (alloc_flags & ALLOC_HIGH)
>>>                 min -= min / 2;
>>>         if (alloc_flags & ALLOC_HARDER)
>>>                 min -= min / 4;
>>> -#ifdef CONFIG_CMA
>>> -       /* If allocation can't use CMA areas don't use free CMA pages */
>>> -       if (!(alloc_flags & ALLOC_CMA))
>>> -               free_cma = zone_page_state(z, NR_FREE_CMA_PAGES);
>>> -#endif
>>> +       /*
>>> +        * We don't want to regard the pages on CMA region as free
>>> +        * on watermark checking, since they cannot be used for
>>> +        * unmovable/reclaimable allocation and they can suddenly
>>> +        * vanish through CMA allocation
>>> +        */
>>> +       if (IS_ENABLED(CONFIG_CMA) && z->managed_cma_pages)
>>> +               free_pages -= zone_page_state(z, NR_FREE_CMA_PAGES);
>>
>> make this free_cma instead of free_pages.
>>
>>>
>>> -       if (free_pages - free_cma <= min + lowmem_reserve)
>>> +       if (free_pages <= min + lowmem_reserve)
>> free_pages - free_cma <= min + lowmem_reserve
>>
>> Because in for loop you subtract nr_free which includes the CMA pages.
>> So if you have subtracted NR_FREE_CMA_PAGES
>> from free_pages above then you will be subtracting cma pages again in
>> nr_free (below in for loop).
>
> Yes, I understand the problem you mentioned.
>
> I think that this is complicated issue.
>
> Comit '026b081' you mentioned makes watermark_ok() loose for high order
> allocation compared to kernel that CMA isn't enabled, since free_pages includes
> free_cma pages and most of high order allocation except THP would be
> non-movable allocation. This non-movable allocation can't use cma pages,
> so we shouldn't include free_cma pages.
>
> If most of free cma pages are 0 order, that commit works correctly. We subtract
> nr of free cma pages at the first loop, so there is no problem. But,
> if the system
> have some free high-order cma pages, watermark checking allow high-order
> allocation more easily.
>
> I think that loosing the watermark check is right solution so will takes your
> comment on v2. But I want to know other developer's opinion.

Thanks for giving this a thought for your v2 patch.


> If needed, I can implement to track free_area[o].nr_cma_free and use it for
> precise freepage calculation in watermark check.

I guess implementing nr_cma_free would be the correct solution.
Because currently for other than 0 order allocation
we still consider high order free_cma pages as free pages in the for
loop which from the code looks incorrect.

This can lead to situation when we have more high order free CMA pages
but very less unmovable pages, but zone_watermark returns
ok for unmovable page, thus leading to allocation failure every time
instead of recovering from this situation.

But its better if experts comment on this.


>
> Thanks.


Thanks
Ritesh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ