Message-Id: <20080220114317.642F.KOSAKI.MOTOHIRO@jp.fujitsu.com>
Date: Wed, 20 Feb 2008 11:48:41 +0900
From: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
To: Rik van Riel <riel@...hat.com>
Cc: kosaki.motohiro@...fujitsu.com, Pavel Machek <pavel@....cz>,
Paul Jackson <pj@....com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, marcelo@...ck.org,
daniel.spang@...il.com, akpm@...ux-foundation.org,
alan@...rguk.ukuu.org.uk, linux-fsdevel@...r.kernel.org,
a1426z@...ab.com, jonathan@...masters.org, zlynx@....org
Subject: Re: [PATCH 0/8][for -mm] mem_notify v6
Hi Rik
> > Sounds like a job for memory limits (ulimit?), not for OOM
> > notification, right?
>
> I suspect one problem could be that an HPC job scheduling program
> does not know exactly how much memory each job can take, so it can
> sometimes end up making a mistake and overcommitting the memory on
> one HPC node.
>
> In that case the user is better off having that job killed and
> restarted elsewhere, than having all of the jobs on that node
> crawl to a halt due to swapping.
>
> Paul, is this guess correct? :)
Yes.
Fujitsu HPC middleware watches the total memory consumption of each job
and, if over-consumption happens, kills the process and removes the job
from the schedule. I think that is a common HPC requirement.
But we watch against a user-defined memory limit, not swap.
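As a rough illustration of that kind of check (a minimal sketch only, not
the actual Fujitsu middleware), the code below sums the resident memory of
a job's processes and compares it with a user-defined limit. It assumes a
Linux /proc filesystem and reads the VmRSS field of /proc/<pid>/status.
The function names, the list of pids, and the limit value are all
hypothetical.

```python
import re


def parse_vmrss_kb(status_text):
    """Return VmRSS in kB from /proc/<pid>/status text, or 0 if the field is absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else 0


def read_proc_status(pid):
    """Read /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return f.read()


def job_rss_kb(pids, read_status=read_proc_status):
    """Sum the resident memory of every process that belongs to one job."""
    total = 0
    for pid in pids:
        try:
            total += parse_vmrss_kb(read_status(pid))
        except OSError:
            # The process exited between listing and reading; skip it.
            continue
    return total


def over_limit(pids, limit_kb, read_status=read_proc_status):
    """True when the job's total resident memory exceeds the user-defined limit."""
    return job_rss_kb(pids, read_status) > limit_kb
```

A scheduler-side loop could call over_limit() periodically for each job
and, when it returns True, kill the job's processes and drop the job from
the schedule. How the pids of a job are tracked is left out here.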
Thanks.