Message-ID: <20090609111719.GA4463@elte.hu>
Date: Tue, 9 Jun 2009 13:17:19 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Nick Piggin <npiggin@...e.de>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Rusty Russell <rusty@...tcorp.com.au>,
Jeremy Fitzhardinge <jeremy@...p.org>,
"H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Avi Kivity <avi@...hat.com>,
Arjan van de Ven <arjan@...radead.org>
Subject: Re: [benchmark] 1% performance overhead of paravirt_ops on native kernels

* Nick Piggin <npiggin@...e.de> wrote:

> On Thu, Jun 04, 2009 at 08:02:14AM -0700, Linus Torvalds wrote:
> >
> >
> > On Thu, 4 Jun 2009, Rusty Russell wrote:
> > > >
> > > > Turn off HIGHMEM64G, please (and HIGHMEM4G too, for that matter - you
> > > > can't compare it to a no-highmem case).
> > >
> > > Thanks, your point is demonstrated below. I don't think HIGHMEM4G is
> > > unreasonable for a distro tho, so I turned that on instead.
> >
> > Well, I agree that HIGHMEM4G is a _reasonable_ thing to turn on.
> >
> > The thing I disagree with is that it's at all valid to then compare to
> > some all-software feature thing. HIGHMEM doesn't expand any esoteric
> > capability that some people might use - it's about regular RAM for regular
> > users.
> >
> > And don't get me wrong - I don't like HIGHMEM. I detest the damn thing. I
> > hated having to merge it, and I still hate it. It's a stupid, ugly, and
> > very invasive config option. It's just that it's there to support a
> > stupid, ugly and very annoying fundamental hardware problem.
>
> I was looking forward to be able to get rid of it... unfortunately
> other 32-bit architectures are starting to use it again :(
>
> I guess it is not incredibly intrusive for generic mm code. A bit
> of kmap sprinkled around which is actually quite a useful
> delimiter of where pagecache is addressed via its kernel mapping.
>
> Do you hate more the x86 code? Maybe that can be removed?

IMHO what hurts most about highmem isn't even its direct source code
overhead, but three factors:

 - The buddy allocator allocates top down, with highmem pages first.
   So a lot of critical apps (the first ones started) will have
   highmem footprint, and that shows up every time they use it for
   file IO or other ops. kmap() overhead and more.

 - Highmem is not really a 'solvable' problem in terms of good VM
   balancing. It gives conflicting constraints and there's no single
   'good VM' that can really work - just a handful of bad solutions
   that differ in their level and area of suckiness.

 - The kmap() cache itself can be depleted, and using atomic kmaps
   is fragile and error-prone. I think we still have a FIXME of a
   possibly triggerable deadlock somewhere in the core MM code ...
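
To make the third point concrete, here is a toy user-space model of a
fixed-size mapping pool. It is not kernel code: every name in it
(toy_kmap, toy_kunmap, toy_free_slots, toy_kmap_demo, TOY_KMAP_SLOTS)
is made up for this sketch, and the pool size of 4 is arbitrary. It
only shows that once every slot is held, the next mapping request
cannot be satisfied until somebody unmaps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_KMAP_SLOTS 4        /* hypothetical pool size, tiny on purpose */

struct toy_page {
	int id;
};

/* One entry per mapping slot; NULL means the slot is free. */
static const struct toy_page *toy_slots[TOY_KMAP_SLOTS];

/*
 * Claim a free slot for 'page'. Returns the slot index, or -1 when the
 * pool is depleted. Returning an error is a simplification of this toy;
 * it is not a claim about what the kernel's kmap() does in that case.
 */
static int toy_kmap(const struct toy_page *page)
{
	for (int i = 0; i < TOY_KMAP_SLOTS; i++) {
		if (toy_slots[i] == NULL) {
			toy_slots[i] = page;
			return i;
		}
	}
	return -1;
}

/* Release a slot previously returned by toy_kmap(). */
static void toy_kunmap(int slot)
{
	assert(slot >= 0 && slot < TOY_KMAP_SLOTS);
	toy_slots[slot] = NULL;
}

/* Count the slots that are currently free. */
static int toy_free_slots(void)
{
	int n = 0;

	for (int i = 0; i < TOY_KMAP_SLOTS; i++)
		if (toy_slots[i] == NULL)
			n++;
	return n;
}

/*
 * Try to hold more mappings at once than there are slots, then release
 * everything. Returns how many mapping attempts failed.
 */
static int toy_kmap_demo(void)
{
	struct toy_page pages[TOY_KMAP_SLOTS + 2];
	int slots[TOY_KMAP_SLOTS + 2];
	int failed = 0;

	for (int i = 0; i < TOY_KMAP_SLOTS + 2; i++) {
		pages[i].id = i;
		slots[i] = toy_kmap(&pages[i]);
		if (slots[i] < 0) {
			failed++;
			printf("page %d: no free slot\n", pages[i].id);
		}
	}

	for (int i = 0; i < TOY_KMAP_SLOTS + 2; i++)
		if (slots[i] >= 0)
			toy_kunmap(slots[i]);

	return failed;
}
```

Calling toy_kmap_demo() from a main() asks for six mappings against
four slots, so the last two requests find the pool empty. If the
requester blocked instead of getting -1 back, and the slot holders were
themselves waiting for another slot, nobody would make progress; that
is the general shape of the deadlock worry, stated here as an
assumption about the failure mode rather than a description of the
actual FIXME.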

OTOH, highmem is clearly a useful hardware enablement feature with a
slowly receding upside and a constant downside. The outcome is
clear: when a critical threshold is reached distros will stop
enabling it. (or more likely, there will be pure 64-bit x86 distros)

Highmem simply enables a sucky piece of hardware so the code itself
has an intrinsic level of suckage, so to speak. There's not much to
be done about it but it's not a _big_ problem either: this type of
hw is moving fast out of the distro attention span.

( What scares/worries me much more than sucky hardware is sucky
  _software_ ABIs. Those have a half-life measured not in years but
  in decades and they get put into new products stubbornly, again
  and again. There's no Moore's Law getting rid of sucky software
  really and unlike the present set of sucky highmem hardware
  there's no influx of cosmic particles chipping away on their
  installed base either. )

Ingo