Message-Id: <E35C2C89-8D5C-4A6C-8750-3D6C3432EF4F@gmail.com>
Date:	Thu, 28 May 2015 01:00:13 +0900
From:	Jungseok Lee <jungseoklee85@...il.com>
To:	Minchan Kim <minchan@...nel.org>
Cc:	Arnd Bergmann <arnd@...db.de>,
	linux-arm-kernel@...ts.infradead.org,
	Catalin Marinas <catalin.marinas@....com>, barami97@...il.com,
	Will Deacon <will.deacon@....com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [RFC PATCH 2/2] arm64: Implement vmalloc based thread_info allocator

On May 27, 2015, at 1:24 PM, Minchan Kim wrote:

Hi, Minchan,

> On Tue, May 26, 2015 at 09:10:11PM +0900, Jungseok Lee wrote:
>> On May 25, 2015, at 11:58 PM, Minchan Kim wrote:
>>> On Mon, May 25, 2015 at 07:01:33PM +0900, Jungseok Lee wrote:
>>>> On May 25, 2015, at 2:49 AM, Arnd Bergmann wrote:
>>>>> On Monday 25 May 2015 01:02:20 Jungseok Lee wrote:
>>>>>> The fork routine sometimes fails to get a physically contiguous region for
>>>>>> thread_info on 4KB page systems even though enough memory is free. That is,
>>>>>> the physically contiguous region, currently 16KB, is not available because
>>>>>> system memory is fragmented.
>>>>>> 
>>>>>> This patch tries to solve the problem by allocating thread_info memory
>>>>>> from vmalloc space instead of the 1:1 mapping. The downside is one additional
>>>>>> page allocation in the vmalloc case. However, vmalloc space is large enough,
>>>>>> around 240GB, with a combination of 39-bit VA and 4KB pages, so it is not a
>>>>>> big tradeoff for the fork routine.
>>>>> 
>>>>> vmalloc has a rather large runtime cost. I'd argue that failing to allocate
>>>>> thread_info structures means something has gone very wrong.
>>>> 
>>>> That is why the feature is marked "N" by default.
>>>> I focused on fork-routine stability rather than performance.
>>> 
>>> If the VM has trouble with order-2 allocations, your system will be in
>>> trouble soon even if fork manages to succeed at the moment, because such
>>> small high-order (e.g., order <= PAGE_ALLOC_COSTLY_ORDER) allocations
>>> are common in the kernel, so the VM should handle them smoothly.
>>> If the VM doesn't, it means we should fix the VM itself, not a specific
>>> allocation site. Fork is just one victim of that.
>> 
>> The problem I observed is in user space, not on the kernel side. As user
>> applications fail to create threads to distribute their work properly, they
>> slowly get into trouble and then die.
>> 
>> Yes, fork is just one victim, but the failure damages user applications
>> seriously. At that snapshot, free memory is plentiful.
> 
> Yes, it's the one you found.
> 
>        *Free memory is enough, so why did forking fail?*
> 
> You should find the exact reason for it rather than papering over the
> problem by hiding the fork failure.
> 
> 1. Investigate the movable/unmovable page ratio at that moment
> 2. Investigate why compaction doesn't work
> 3. Investigate why reclaim couldn't produce an order-2 page
> 
> 
>> 
>>>> Could you give me an idea of how to evaluate the performance degradation?
>>>> Running some benchmarks would be helpful, but I would like to try to
>>>> gather data based on a meaningful methodology.
>>>> 
>>>>> Can you describe the scenario that leads to fragmentation this bad?
>>>> 
>>>> Android, but I cannot describe an exact step-by-step reproduction procedure
>>>> since the behaviour appears and reproduces randomly. As the following thread
>>>> from the mm mailing list shows, a similar symptom is observed on other systems:
>>>> 
>>>> https://lkml.org/lkml/2015/4/28/59
>>>> 
>>>> Although I do not know the details of the system mentioned in that thread,
>>>> even order-2 page allocations do not proceed smoothly due to fragmentation
>>>> on low-memory systems.
>>> 
>>> What Joonsoo has tackled is the generic fragmentation problem, not *a* fork
>>> failure, which is the better approach to handling the small high-order
>>> allocation problem.
>> 
>> I totally agree with that point. One of the best ways is to work out generic
>> anti-fragmentation through VM system improvements. Reducing the stack size to
>> 8KB is also a really good approach. My intention is not to overlook them or to
>> settle for a workaround.
>> 
>> IMHO, vmalloc would be another option for ARM64 on low-memory systems, since
>> *fork failure from fragmentation* is a nontrivial issue.
>> 
>> Do you think the patch set doesn't need to be considered?
> 
> I don't know, because the changelog doesn't have a full description
> of your problem. You just wrote "forking was failed so we want
> to avoid that by vmalloc because forking is important".

Technical feedback is always welcome.
I really thank everyone who has left comments in this thread.

However, it is pretty disappointing to see my commit log distorted like that.

[The fork routine sometimes fails to get a physically contiguous region for
thread_info on 4KB page systems even though enough memory is free. That is,
the physically contiguous region, currently 16KB, is not available because
system memory is fragmented.

This patch tries to solve the problem by allocating thread_info memory
from vmalloc space instead of the 1:1 mapping. The downside is one additional
page allocation in the vmalloc case. However, vmalloc space is large enough,
around 240GB, with a combination of 39-bit VA and 4KB pages, so it is not a
big tradeoff for the fork routine.]

Is "forking was failed so we want to avoid that by vmalloc because forking is
important" your paraphrase of the above paragraphs?

Best Regards
Jungseok Lee
