linux-kernel - Re: [PATCH v2] vmscan: limit concurrent reclaimers in shrink

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <28c262360912131614h62d8e0f7qf6ea9ab882f446d4@mail.gmail.com>
Date:	Mon, 14 Dec 2009 09:14:39 +0900
From:	Minchan Kim <minchan.kim@...il.com>
To:	Rik van Riel <riel@...hat.com>
Cc:	lwoodman@...hat.com, akpm@...ux-foundation.org,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] vmscan: limit concurrent reclaimers in shrink_zone

Hi, Rik.

On Sat, Dec 12, 2009 at 6:46 AM, Rik van Riel <riel@...hat.com> wrote:
> Under very heavy multi-process workloads, like AIM7, the VM can
> get into trouble in a variety of ways.  The trouble start when
> there are hundreds, or even thousands of processes active in the
> page reclaim code.
>
> Not only can the system suffer enormous slowdowns because of
> lock contention (and conditional reschedules) between thousands
> of processes in the page reclaim code, but each process will try
> to free up to SWAP_CLUSTER_MAX pages, even when the system already
> has lots of memory free.
>
> It should be possible to avoid both of those issues at once, by
> simply limiting how many processes are active in the page reclaim
> code simultaneously.
>
> If too many processes are active doing page reclaim in one zone,
> simply go to sleep in shrink_zone().
>
> On wakeup, check whether enough memory has been freed already
> before jumping into the page reclaim code ourselves.  We want
> to use the same threshold here that is used in the page allocator
> for deciding whether or not to call the page reclaim code in the
> first place, otherwise some unlucky processes could end up freeing
> memory for the rest of the system.

I am worried about one.

Now, we can put too many processes reclaim_wait with NR_UNINTERRUBTIBLE state.
If OOM happens, OOM will kill many innocent processes since
uninterruptible task
can't handle kill signal until the processes free from reclaim_wait list.

I think reclaim_wait list staying time might be long if VM pressure is heavy.
Is this a exaggeration?

If it is serious problem, how about this?

We add new PF_RECLAIM_BLOCK flag and don't pick the process
in select_bad_process.

-- 
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/