Message-ID: <20191107082541.GF30739@gmail.com>
Date: Thu, 7 Nov 2019 09:25:41 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>,
the arch/x86 maintainers <x86@...nel.org>,
Stephen Hemminger <stephen@...workplumber.org>,
Willy Tarreau <w@....eu>, Juergen Gross <jgross@...e.com>,
Sean Christopherson <sean.j.christopherson@...el.com>,
"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [patch 5/9] x86/ioport: Reduce ioperm impact for sane usage
further
* Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Wed, Nov 6, 2019 at 12:57 PM Thomas Gleixner <tglx@...utronix.de> wrote:
> >
> > Calculate both the position of the first zero bit and the last zero bit to
> > limit the range which needs to be copied. This does not solve the problem
> > when the previous task had only byte 0 cleared and the next one has only
> > byte 65535 cleared, but trying to solve that would be too complex and
> > heavyweight for the context switch path. As the ioperm() usage is very rare
> > the case which is optimized is the single task/process which uses ioperm().
>
> Hmm.
>
> I may read this patch wrong, but from what I can tell, if we really
> have just one process with an io bitmap, we're doing unnecessary
> copies.
>
> If we really have just one process that has an iobitmap, I think we
> could just keep the bitmap of that process entirely unchanged. Then,
> when we switch away from it, we set the io_bitmap_base to an invalid
> base outside the TSS segment, and when we switch back, we set it back
> to the valid one. No actual bitmap copies at all.
>
> So I think that rather than the "begin/end offset" games, we should
> perhaps have a "what was the last process that used the IO bitmap for
> this TSS" pointer (and, I think, some sequence counter, so that when
> the process updates its bitmap, it invalidates that case)?
>
> Of course, you can do *both*, but if we really think that the common
> case is "one special process", then I think the begin/end offset is
> useless, but a "last bitmap process" would be very useful.
>
> Am I missing something?
In fact, on SMP systems this would result in a very nice optimization:
pretty quickly *all* TSS's would be populated with that single task's
bitmap, and it would persist even across migrations from CPU to CPU.
I'd love to get rid of the offset caching and bit scanning games as well
- they don't really help in a number of common scenarios, and they
complicate this non-trivial piece of code a *LOT* - and we probably
don't have the natural testing density for this code anymore to find
any regressions quickly.
So intuitively I'd suggest we gravitate towards the simplest
implementation, with a good caching optimization for the single-task
case.
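To make the caching model concrete, here is a minimal user-space sketch
of the "last bitmap task + sequence counter" idea - not the real kernel
code; the struct layouts, the IO_BITMAP_OFFSET_* values and the `copies`
instrumentation counter are all illustrative:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IO_BITMAP_BYTES 8192            /* 65536 ports / 8 */
#define IO_BITMAP_OFFSET_VALID 0x68     /* illustrative in-TSS offset */
#define IO_BITMAP_OFFSET_INVALID 0x8000 /* outside the TSS segment limit */

struct task {
	uint8_t io_bitmap[IO_BITMAP_BYTES];
	uint64_t bitmap_seq;            /* bumped on every ioperm() update */
};

struct tss {
	uint16_t io_bitmap_base;
	struct task *last_bitmap_task;  /* whose bitmap is installed */
	uint64_t installed_seq;         /* its sequence number at install time */
	uint8_t io_bitmap[IO_BITMAP_BYTES];
	uint64_t copies;                /* instrumentation: count real copies */
};

static void switch_to(struct tss *tss, struct task *next)
{
	if (!next) {
		/* Task without a bitmap: just point the base out of bounds. */
		tss->io_bitmap_base = IO_BITMAP_OFFSET_INVALID;
		return;
	}
	if (tss->last_bitmap_task == next &&
	    tss->installed_seq == next->bitmap_seq) {
		/* Cache hit: bitmap already in place, no copy needed. */
		tss->io_bitmap_base = IO_BITMAP_OFFSET_VALID;
		return;
	}
	/* Cache miss: copy the bitmap and remember who owns it. */
	memcpy(tss->io_bitmap, next->io_bitmap, IO_BITMAP_BYTES);
	tss->last_bitmap_task = next;
	tss->installed_seq = next->bitmap_seq;
	tss->io_bitmap_base = IO_BITMAP_OFFSET_VALID;
	tss->copies++;
}
```

In the single-task case every switch back to that task hits the cache
and skips the memcpy entirely; only an intervening ioperm() (modeled
here as a sequence-counter bump) forces a recopy.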
I.e. the model I'm suggesting is that if a task uses ioperm() or iopl()
then it should have a bitmap from that point on until exit(), even if
it's all zeroes or all ones. Most applications that are using those
primitives really need it all the time and are using just a few ioports,
so all the tracking doesn't help much anyway.
On a related note, another simplification would be that in principle we
could also use just a single bitmap and emulate iopl() as ioperm(all)
or ioperm(none) calls. Yeah, it's not fully ABI compatible for mixed
ioperm()/iopl() uses, but is that ABI actually being relied on in
practice?
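The emulation amounts to nothing more than a full-range bitmap update.
A minimal user-space sketch of what I mean (the helper names are
illustrative, not the kernel's; in the x86 TSS I/O bitmap a cleared bit
grants access and a set bit denies it):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IO_BITMAP_BITS 65536

/* Clear bits to grant port access, set them to deny it. */
static void set_bitmap(uint8_t *bitmap, unsigned int from,
		       unsigned int num, int turn_on)
{
	for (unsigned int port = from; port < from + num; port++) {
		if (turn_on)
			bitmap[port / 8] &= (uint8_t)~(1u << (port % 8));
		else
			bitmap[port / 8] |= (uint8_t)(1u << (port % 8));
	}
}

/* iopl(3) becomes ioperm(0, 65536, 1); any lower level revokes it all. */
static int emulated_iopl(uint8_t *bitmap, int level)
{
	set_bitmap(bitmap, 0, IO_BITMAP_BITS, level == 3);
	return 0;
}
```

The ABI gap is exactly the mixed case: after iopl(0) this wipes any
per-port grants a prior ioperm() call set up, whereas real iopl() leaves
the bitmap untouched.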
Thanks,
Ingo