lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20130701190222.GA10367@sergelap>
Date:	Mon, 1 Jul 2013 14:02:22 -0500
From:	Serge Hallyn <serge.hallyn@...ntu.com>
To:	Johannes Weiner <hannes@...xchg.org>
Cc:	Aaron Staley <aaron@...loud.com>,
	containers@...ts.linux-foundation.org,
	Paul Menage <menage@...gle.com>,
	Li Zefan <lizf@...fujitsu.com>, Michal Hocko <mhocko@...e.cz>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: PROBLEM: Processes writing large files in memory-limited LXC
 container are killed by OOM

Quoting Johannes Weiner (hannes@...xchg.org):
> On Mon, Jul 01, 2013 at 01:01:01PM -0500, Serge Hallyn wrote:
> > Quoting Aaron Staley (aaron@...loud.com):
> > > This is better explained here:
> > > http://serverfault.com/questions/516074/why-are-applications-in-a-memory-limited-lxc-container-writing-large-files-to-di
> > > (The
> > > highest-voted answer believes this to be a kernel bug.)
> > 
> > Hi,
> > 
> > in irc it has been suggested that indeed the kernel should be slowing
> > down new page creates while waiting for old page cache entries to be
> > written out to disk, rather than ooming.
> > 
> > With a 3.0.27-1-ac100 kernel, doing dd if=/dev/zero of=xxx bs=1M
> > count=100 is immediately killed.  In contrast, doing the same from a
> > 3.0.8 kernel did the right thing for me.  But I did reproduce your
> > experiment below on ec2 with the same result.
> >
> > So, cc:ing linux-mm in the hopes someone can tell us whether this
> > is expected behavior, known mis-behavior, or an unknown bug.
> 
> It's a known issue that was fixed/improved in e62e384 'memcg: prevent

Ah ok, I see the commit says:

    The solution is far from being ideal - long term solution is memcg aware
    dirty throttling - but it is meant to be a band aid until we have a real
    fix.  We are seeing this happening during nightly backups which are placed

... and ...

    The issue is more visible with slower devices for output.

I'm guessing we see it on ec2 because of slowed fs.

> OOM with too many dirty pages', included in 3.6+.

Is anyone actively working on the long term solution?

thanks,
-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ