Date:	Thu, 25 Nov 2010 15:02:53 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>,
	linux-mm <linux-mm@...ck.org>, pageexec@...email.hu,
	Solar Designer <solar@...nwall.com>,
	Eugene Teo <eteo@...hat.com>,
	Brad Spengler <spender@...ecurity.net>,
	Roland McGrath <roland@...hat.com>
Subject: Re: [resend][PATCH 4/4] oom: don't ignore rss in nascent mm

On 11/25, KOSAKI Motohiro wrote:
>
> > > > It is very simple. copy_strings() increments MM_ANONPAGES every
> > > > time we add a new page into bprm->vma. This makes this memory
> > > > visible to select_bad_process().
> > > >
> > > > When exec changes ->mm (or if it fails), we change MM_ANONPAGES
> > > > counter back.
> > > >
> > > > Most probably I missed something, but what do you think?
> > >
> > > Because if the argv pages are swapped out while execve is being
> > > processed, this accounting doesn't work.
> >
> > Why?
> >
> > If copy_strings() inserts the new page into bprm->vma and then
> > this page is swapped out, inc_mm_counter(current->mm, MM_ANONPAGES)
> > becomes incorrect, yes. And we can't turn it into MM_SWAPENTS.
> >
> > But does this really matter? oom_badness() counts MM_ANONPAGES +
> > MM_SWAPENTS, and result is the same.
>
> Ah, I got it. I was too stuck on getting the accounting exactly right,
> but you mean that is not a must.

Yes. In fact, I _think_ this patch makes accounting better, even if
the extra MM_ANONPAGES numbers are not 100% correct.
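
To illustrate the argument, here is a toy userspace model, not kernel code.
Every toy_* name is hypothetical, and the badness function is a deliberate
simplification that sums only the two counters mentioned in this thread.
It compares "exact" accounting (swapped-out pages move to MM_SWAPENTS) with
the approximate accounting discussed here (they stay in MM_ANONPAGES):

```c
#include <assert.h>

/* Toy userspace model, not kernel code. Counter names follow the thread. */
enum { MM_ANONPAGES, MM_SWAPENTS, NR_TOY_COUNTERS };

struct toy_mm {
	long counter[NR_TOY_COUNTERS];
};

/* Simplified assumption: only the two counters discussed here are summed. */
static long toy_badness(const struct toy_mm *mm)
{
	return mm->counter[MM_ANONPAGES] + mm->counter[MM_SWAPENTS];
}

/* Charge argv/envp pages of the nascent mm to the execing task's mm. */
static void toy_charge_exec_pages(struct toy_mm *mm, long pages)
{
	mm->counter[MM_ANONPAGES] += pages;
}

/* Undo the charge when exec switches ->mm or fails. */
static void toy_uncharge_exec_pages(struct toy_mm *mm, long pages)
{
	mm->counter[MM_ANONPAGES] -= pages;
}

/* Exact accounting: swapped-out pages move from MM_ANONPAGES to MM_SWAPENTS. */
static void toy_swap_out_exact(struct toy_mm *mm, long pages)
{
	assert(pages >= 0 && pages <= mm->counter[MM_ANONPAGES]);
	mm->counter[MM_ANONPAGES] -= pages;
	mm->counter[MM_SWAPENTS] += pages;
}

/* Returns 1 if exact and approximate accounting yield the same badness. */
static int toy_same_badness(long base_anon, long argv_pages, long swapped)
{
	struct toy_mm exact = { { base_anon, 0 } };
	struct toy_mm approx = { { base_anon, 0 } };

	toy_charge_exec_pages(&exact, argv_pages);
	toy_charge_exec_pages(&approx, argv_pages);

	/* The approximate variant deliberately skips the swap conversion. */
	toy_swap_out_exact(&exact, swapped);

	return toy_badness(&exact) == toy_badness(&approx);
}
```

Calling toy_same_badness() with any swapped <= base_anon + argv_pages returns
1: the split between the two counters differs, but the sum does not.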

Even if we add signal->in_exec_mm, nobody except oom_badness() will
look at it.

With this patch, say, /proc/pid/statm or /proc/pid/status will report
the memory allocated by the execing task. Even if technically this is
not correct (and 'swap' part may be wrong), this makes sense imho.
Otherwise, there is no way to see that this task allocates memory
(maybe a lot of it).

This can "confuse" update_hiwater_rss(), but imho this is fine too.


> > > Is this enough explanation? Please don't hesitate to say "no". If people
> > > don't like my approach, I won't hesitate to change my thinking.
> >
> > Well, certainly I can't say no ;)
> >
> > But it would be nice to find a simpler fix (if it can work,
> > of course).
> >
> >
> > And I need a simple solution for the older kernels.
>
> Alright. It is certainly one worth considering.

Great! I'll send the patch tomorrow.

Even if you prefer another fix for 2.6.37/stable, I'd like to see
your review to know if it is correct or not (for backporting).

Oleg.

