Message-ID: <2c0942db0707260043h18d878baq9b3be72c01e2680a@mail.gmail.com>
Date: Thu, 26 Jul 2007 00:43:05 -0700
From: "Ray Lee" <ray-lk@...rabbit.org>
To: "Andrew Morton" <akpm@...ux-foundation.org>
Cc: "Nick Piggin" <nickpiggin@...oo.com.au>,
"Eric St-Laurent" <ericstl34@...patico.ca>,
"Rene Herman" <rene.herman@...il.com>,
"Jesper Juhl" <jesper.juhl@...il.com>,
"ck list" <ck@....kolivas.org>, "Ingo Molnar" <mingo@...e.hu>,
"Paul Jackson" <pj@....com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: -mm merge plans for 2.6.23
On 7/25/07, Andrew Morton <akpm@...ux-foundation.org> wrote:
> On Wed, 25 Jul 2007 23:33:24 -0700 "Ray Lee" <ray-lk@...rabbit.org> wrote:
> > If you think that adding that API and maintaining it is
> > simpler/better than including a variation on the above heuristic I
> > offered, then yeah, I guess we are. It'll all have that vague
> > userspace s2ram odor about it, but I'm sure it could be made to work.
>
> Actually, I overdesigned the API, I suspect. What we _could_ do is to
> provide a way of allowing userspace to say "pretend process A touched page
> B": adopt its mm and go touch the page. We in fact already have that:
> PTRACE_PEEKTEXT.
Huh. All right.
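
For concreteness, roughly what I imagine that looks like from userspace
(completely untested, and the function name and address list are made up
for illustration): attach, PEEKTEXT one word per page you want faulted
back in, detach.

#include <errno.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Fault a target's pages back in from userspace: attach with ptrace,
 * read one word from each page of interest (the value is thrown away,
 * the side effect of the read is the fault), then detach. The address
 * list would come from an earlier maps2-style snapshot.
 */
static int touch_pages(pid_t pid, const unsigned long *addrs, size_t naddrs)
{
        size_t i;

        if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
                return -1;
        waitpid(pid, NULL, 0);  /* wait for the tracee to stop */

        for (i = 0; i < naddrs; i++) {
                errno = 0;
                ptrace(PTRACE_PEEKTEXT, pid, (void *)addrs[i], NULL);
                if (errno)
                        fprintf(stderr, "peek at %#lx: %m\n", addrs[i]);
        }

        ptrace(PTRACE_DETACH, pid, NULL, NULL);
        return 0;
}

The downside, of course, is that the target sits stopped for the
duration of the loop.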
> So I suspect this could all be done by polling maps2 and using PEEKTEXT.
> The tricky part would be working out when to poll, and when to reestablish.
Welllllll.... there is the taskstats interface. It's not required
right now, though, and lacks most of what userspace would need, I
think. It does at least currently provide a notification of process
exit, which is a clue for when to start reestablishment. Gotta be
another way we can get at that...
Oh, stat on /proc, does that work? Huh, it does, sort of. It seems to
be off by 12 or 13, but hey, that's something.
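
(The trick, for anyone following along: st_nlink on /proc is roughly
the live process count plus a constant for the non-PID entries, which
would explain the fixed offset. Something along these lines, untested:)

#include <stdio.h>
#include <sys/stat.h>

/*
 * st_nlink on /proc is the number of PID directories plus a small
 * constant of non-PID entries, so a drop between two polls is a cheap
 * hint that a process went away.
 */
int main(void)
{
        struct stat st;

        if (stat("/proc", &st) == -1) {
                perror("stat /proc");
                return 1;
        }
        printf("/proc nlink: %lu\n", (unsigned long)st.st_nlink);
        return 0;
}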
Wish I had the time to look at the maps2 stuff, but regardless, it
probably currently provides too much detail for continual polling? I
suspect what we'd want to do is to take a detailed snapshot a little
after the beginning of a process's lifetime (once the block-in counts
subside), then poll aggregate residency or eviction counts to know
which processes are suffering the burden of the transient workload.
Eh, wait, that doesn't help with inodes. No matter, I guess; I'm the
one who said targeting swap-in would be good enough for a first pass.
On process exit, if userspace can get a hold of an estimate of the
size of what just freed up, it could then spend
min(that,evicted_count) on repopulation. That's probably already
available by polling whatever `free` calls.
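
(Which is /proc/meminfo, if memory serves; the sketch below just pulls
MemFree out of it, helper name mine:)

#include <stdio.h>

/*
 * Return MemFree in kB from /proc/meminfo (what free(1) reads), or -1
 * on error. Sampling it before and after a process exit gives a rough
 * idea of how much just came free.
 */
static long memfree_kb(void)
{
        FILE *f = fopen("/proc/meminfo", "r");
        char line[128];
        long kb = -1;

        if (!f)
                return -1;
        while (fgets(line, sizeof(line), f))
                if (sscanf(line, "MemFree: %ld kB", &kb) == 1)
                        break;
        fclose(f);
        return kb;
}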
> A neater implementation than PEEKTEXT would be to make the maps2 files
> writeable(!) so as a party trick you could tar 'em up and then, when you
> want to reestablish firefox's previous working set, do a untar in
> /proc/$(pidof firefox)/
I'm going to get into trouble if I wake up the other person in the
house with my laughter. That's laughter in a positive sense, not a
"you're daft" kind of way.
Huh. <thinks> So, to go back a little bit, I guess one of my problems
with polling is that it means that userspace can only approximate an
MRU of what's been evicted. Perhaps an approximation is good enough, I
don't know, but that's something to keep in mind. (Hmm, how many pages
can an average desktop evict per second? If we poll everything once
per second, that's how off we could be.)
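(Back-of-the-envelope, with a number pulled out of thin air: if the box
is pushing, say, 10MB/s out to swap, that's ~2500 4K pages a second, so
a once-a-second poll could be stale by a couple of thousand pages.
Order of magnitude only.)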
Another is a more philosophical hangup -- running a process that polls
periodically to improve system performance seems backward. Okay, so
that's my problem to get over, not yours.
Another problem is what poor sod would be willing to write and test
this, given that there's already a written and tested kernel patch to
do much the same thing? Yeah, that's sorta rhetorical, but it's sorta
not. Given that swap prefetch could be ripped out of 2.6.n+1 if it's
introduced in 2.6.n, and nothing in userspace would be the wiser,
where's the burden? There is some, just as any kernel code has some,
and as it's core code (versus, say, a driver) the burden is
correspondingly greater per line. But given the massive changesets
flowing through each release now, I have to think the burden this
introduces is marginal compared to the rest of the bulk sweeping
through the kernel weekly.
This is obviously where I'm totally conjecturing, and you'll know far,
far better than I.
Offline for about 20 hours or so, not that anyone would probably notice :-).
Ray