Date:	Mon, 14 Dec 2009 13:19:58 +0900
From:	Minchan Kim <minchan.kim@...il.com>
To:	Rik van Riel <riel@...hat.com>
Cc:	lwoodman@...hat.com, akpm@...ux-foundation.org,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] vmscan: limit concurrent reclaimers in shrink_zone

On Mon, Dec 14, 2009 at 1:09 PM, Rik van Riel <riel@...hat.com> wrote:
> On 12/13/2009 07:14 PM, Minchan Kim wrote:
>
>> On Sat, Dec 12, 2009 at 6:46 AM, Rik van Riel<riel@...hat.com>  wrote:
>
>>> If too many processes are active doing page reclaim in one zone,
>>> simply go to sleep in shrink_zone().
>
>> I am worried about one thing.
>>
>> Now we can put too many processes on the reclaim_wait list in
>> TASK_UNINTERRUPTIBLE state.
>> If OOM happens, the OOM killer will kill many innocent processes, since
>> an uninterruptible task can't handle the kill signal until it is
>> removed from the reclaim_wait list.
>>
>> I think the time spent on the reclaim_wait list might be long if VM
>> pressure is heavy.
>> Is this an exaggeration?
>>
>> If it is serious problem, how about this?
>>
>> We could add a new PF_RECLAIM_BLOCK flag and skip such processes
>> in select_bad_process().
>
> A simpler solution may be to use sleep_on_interruptible, and
> simply have the process continue into shrink_zone() if it
> gets a signal.
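
For what it's worth, the interruptible variant might look roughly like this. This is a sketch only, not the actual patch: the names zone->reclaim_wait, zone->concurrent_reclaimers and max_zone_concurrent_reclaimers are assumed from this thread, and wait_event_interruptible() is used as the idiom rather than the older sleep_on-style helpers.

```c
/* Sketch only, not the actual patch.  The names zone->reclaim_wait,
 * zone->concurrent_reclaimers and max_zone_concurrent_reclaimers are
 * assumptions based on this thread. */
if (!current_is_kswapd()) {
	/* TASK_INTERRUPTIBLE instead of TASK_UNINTERRUPTIBLE: a signal
	 * wakes the task, wait_event_interruptible() returns
	 * -ERESTARTSYS, and we simply fall through into reclaim, so the
	 * task stays killable and the OOM killer concern goes away. */
	wait_event_interruptible(zone->reclaim_wait,
		atomic_read(&zone->concurrent_reclaimers) <
			max_zone_concurrent_reclaimers);
}
atomic_inc(&zone->concurrent_reclaimers);
/* ... do the actual shrinking ... */
atomic_dec(&zone->concurrent_reclaimers);
wake_up(&zone->reclaim_wait);
```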

I thought about that, but I was not sure.
Okay, if that works, it's simpler.
Could you repost the patch with that change?


Sorry, but I have one request.


===

        +The default value is 8.
        +
        +=============================================================


    I like this, but why did you pick the constant 8 as the default value?
    Do you have any reason?

    I think it would be better to choose a number proportional to NR_CPU,
    e.g. NR_CPU * 2 or something.

    Otherwise looks good to me.


Pessimistically, I assume that the pageout code spends maybe
10% of its time on locking (we have seen far, far worse than
this with thousands of processes in the pageout code).  That
means if we have more than 10 threads in the pageout code,
we could end up spending more time on locking and less doing
real work - slowing everybody down.

I rounded it down to the closest power of 2 to come up with
an arbitrary number that looked safe :)
===

We discussed this above.
I want to add your description to the changelog.
Otherwise, a long time from now, nobody will know why '8' was chosen
as the default value.
Your description in the changelog will explain it to people who come
later. :)

Sorry for bothering you.


> --
> All rights reversed.
>



-- 
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
