Message-ID: <AANLkTi=VnTkuyYht8D+2MPO1d4mXR1ah-0aQeAjZsTaq@mail.gmail.com>
Date: Fri, 29 Oct 2010 08:28:23 +0900
From: Minchan Kim <minchan.kim@...il.com>
To: Mandeep Singh Baines <msb@...omium.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Rik van Riel <riel@...hat.com>, Mel Gorman <mel@....ul.ie>,
Johannes Weiner <hannes@...xchg.org>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org, wad@...omium.org,
olofj@...omium.org, hughd@...omium.org
Subject: Re: [PATCH] RFC: vmscan: add min_filelist_kbytes sysctl for
protecting the working set
On Fri, Oct 29, 2010 at 7:03 AM, Mandeep Singh Baines <msb@...omium.org> wrote:
> Andrew Morton (akpm@...ux-foundation.org) wrote:
>> On Thu, 28 Oct 2010 12:15:23 -0700
>> Mandeep Singh Baines <msb@...omium.org> wrote:
>>
>> > On ChromiumOS, we do not use swap.
>>
>> Well that's bad. Why not?
>>
>
> We're using SSDs. We're still in the "make it work" phase so we wanted
> to avoid swap unless/until we learn how to use it effectively with
> an SSD.
>
> You'll want to tune swap differently if you're using an SSD. Not sure
> if swappiness is the answer. Maybe a new tunable to control how aggressive
> swap is, unless such a thing already exists?
>
>> > When memory is low, the only way to
>> > free memory is to reclaim pages from the file list. This results in a
>> > lot of thrashing under low memory conditions. We see the system become
>> > unresponsive for minutes before it eventually OOMs. We also see very
>> > slow browser tab switching under low memory. Instead of an unresponsive
>> > system, we'd really like the kernel to OOM as soon as it starts to
>> > thrash. If it can't keep the working set in memory, then OOM.
>> > Losing one of many tabs is a better behaviour for the user than an
>> > unresponsive system.
>> >
>> > This patch creates a new sysctl, min_filelist_kbytes, which disables reclaim
>> > of file-backed pages when there are less than min_filelist_kbytes worth
>> > of such pages in the cache. This tunable is handy for low memory systems
>> > using solid-state storage where interactive response is more important
>> > than not OOMing.
>> >
>> > With this patch and min_filelist_kbytes set to 50000, I see very little
>> > block layer activity during low memory. The system stays responsive under
>> > low memory and browser tab switching is fast. Eventually, a process gets
>> > killed by OOM. Without this patch, the system gets wedged for minutes
>> > before it eventually OOMs. Below is the vmstat output from my test runs.
>> >
>> > BEFORE (notice the high bi and wa, also how long it takes to OOM):
>>
>> That's an interesting result.
>>
>> Having the machine "wedged for minutes" thrashing away paging
>> executable text is pretty bad behaviour. I wonder how to fix it.
>> Perhaps simply declaring oom at an earlier stage.
>>
>> Your patch is certainly simple enough but a bit sad. It says "the VM
>> gets this wrong, so let's just disable it all". And thereby reduces the
>> motivation to fix it for real.
>>
>
> Yeah, I used the RFC label because we're thinking this is just a temporary
> bandaid until something better comes along.
>
> Couple of other nits I have with our patch:
> * Not really sure what to do for the cgroup case. We do something
> reasonable for now.
> * One of my colleagues also brought up the point that we might want to do
> something different if swap was enabled.
>
>> But the patch definitely improves the situation in real-world
>> situations and there's a case to be made that it should be available at
>> least as an interim thing until the VM gets fixed for real. Which
>> means that the /proc tunable might disappear again (or become a no-op)
>> some time in the future.
I think this feature amounts to saying "poor system response time is not
acceptable, but OOM is".
While we can keep a process from being killed by OOM using
/oom_score_adj, we can't control response time directly.
But on a mobile system, we have to control response time. One of the
reasons to avoid swap is response time.
How about using memcg?
Isolate the processes that affect system response (e.g., rendering engine,
IPC engine and so on) into a separate group.
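
A minimal sketch of what that isolation could look like from userspace.
It assumes a cgroup v1 memory controller; the mount point
/sys/fs/cgroup/memory, the group name, and the helper isolate() are
illustrative assumptions, not part of the patch under discussion. It only
creates a group, optionally caps it, and moves PIDs into it.

```python
import os

# Assumption: the cgroup v1 memory controller is mounted here.
# The actual mount point varies between systems.
CGROUP_ROOT = "/sys/fs/cgroup/memory"


def isolate(group, pids, limit_bytes=None, root=CGROUP_ROOT):
    """Move pids into a memory cgroup of their own (hypothetical helper).

    group       -- name of the cgroup directory to create under root
    pids        -- process IDs to move into the group
    limit_bytes -- optional cap written to memory.limit_in_bytes
    """
    path = os.path.join(root, group)
    # On a cgroup filesystem, creating the directory creates the group.
    os.makedirs(path, exist_ok=True)

    if limit_bytes is not None:
        with open(os.path.join(path, "memory.limit_in_bytes"), "w") as f:
            f.write(str(limit_bytes))

    for pid in pids:
        # One PID per write: reopen the "tasks" file for every process.
        with open(os.path.join(path, "tasks"), "a") as f:
            f.write(f"{pid}\n")

    return path
```

For example, isolate("interactive", [renderer_pid]) would keep the
rendering engine in its own group, while a second call with limit_bytes
set could cap a group holding the remaining processes. Both renderer_pid
and the split into two groups are assumptions made for illustration.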
--
Kind regards,
Minchan Kim