lists.openwall.net - Open Source and information security mailing list archives
Message-ID: <6cac6943-2f6c-d48a-658e-08b3bf87921a@zytor.com>
Date:   Thu, 7 Nov 2019 17:12:04 -0800
From:   "H. Peter Anvin" <hpa@...or.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Brian Gerst <brgerst@...il.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>,
        the arch/x86 maintainers <x86@...nel.org>,
        Stephen Hemminger <stephen@...workplumber.org>,
        Willy Tarreau <w@....eu>, Juergen Gross <jgross@...e.com>,
        Sean Christopherson <sean.j.christopherson@...el.com>
Subject: Re: [patch 5/9] x86/ioport: Reduce ioperm impact for sane usage
 further

On 2019-11-07 13:44, Linus Torvalds wrote:
> On Thu, Nov 7, 2019 at 1:00 PM Brian Gerst <brgerst@...il.com> wrote:
>>
>> There wouldn't have to be a flush on every task switch.
> 
> No. But we'd have to flush on any switch that currently does that memcpy.
> 
> And my point is that a tlb flush (even the single-page case) is likely
> more expensive than the memcpy.
> 
>> Going a step further, we could track which task is mapped to the
>> current cpu like proposed above, and only flush when a different task
>> needs the IO bitmap, or when the bitmap is being freed on task exit.
> 
> Well, that's exactly my "track the last task" optimization for copying
> the thing.
> 
> IOW, it's the same optimization as avoiding the memcpy.
> 
> Which I think is likely very effective, but also makes it fairly
> pointless to then try to be clever..
> 
> So the basic issue remains that playing VM games has almost
> universally been slower and more complex than simply not playing VM
> games. TLB flushes - even invlpg - tend to be pretty slow.
> 
> Of course, we probably end up invalidating the TLB's anyway, so maybe
> in this case we don't care. The ioperm bitmap is _technically_
> per-thread, though, so it should be flushed even if the VM isn't
> flushed...
> 
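
The "track the last task" optimization described above could be sketched roughly as follows. This is a hedged, user-space illustration with invented names (`io_bitmap_owner`, `switch_io_bitmap`), not the actual kernel data structures: a per-CPU pointer remembers which task's bitmap is currently loaded, so the 8K memcpy happens only when a different bitmap-using task is switched in.

```c
#include <string.h>

#define IO_BITMAP_BYTES (65536 / 8)   /* 8K: one bit per I/O port */

/* Hypothetical, simplified types -- not the actual kernel structures. */
struct task {
    unsigned char io_bitmap[IO_BITMAP_BYTES];
};

struct cpu_state {
    unsigned char tss_io_bitmap[IO_BITMAP_BYTES]; /* what the CPU consults */
    struct task *io_bitmap_owner;  /* task whose bitmap is currently loaded */
};

static int io_bitmap_copies;       /* for illustration: counts real copies */

/* Copy the bitmap into the TSS only when a different task needs it. */
static void switch_io_bitmap(struct cpu_state *cpu, struct task *next)
{
    if (cpu->io_bitmap_owner == next)
        return;                    /* common case: skip the 8K memcpy */
    memcpy(cpu->tss_io_bitmap, next->io_bitmap, IO_BITMAP_BYTES);
    io_bitmap_copies++;
    cpu->io_bitmap_owner = next;
}
```

In this sketch, back-to-back switches involving the same ioperm() user cost only a pointer compare, which is the same observation that makes further cleverness (VM games, TLB flushes) largely pointless.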

One option, probably a lot saner (if we care at all; after all, copying 8K
really isn't that much, but it might have some impact on real-time processes,
which are one of the rather few use cases for direct I/O), would be to keep
the bitmask in a pre-formatted per-thread TSS (ioperm being per thread, there
is no concern about the TSS being in use on another processor), copy the live
TSS fields (88 bytes) over if and only if the thread has been migrated to a
different CPU, and then switch the TSS rather than copying the bitmap. For
the common case (no ioperms) we use the standard per-cpu TSS.
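
A rough user-space model of that idea, with invented names (`struct tss`, `switch_in`, `last_cpu`) standing in for the real x86 structures: each ioperm()-using thread owns a pre-formatted TSS whose bitmap is never copied at switch time; only the small block of live fields is refreshed, and only on migration.

```c
#include <string.h>

#define TSS_LIVE_BYTES 88             /* the mutable TSS fields (sp0 etc.) */
#define IO_BITMAP_BYTES (65536 / 8)   /* 8K I/O permission bitmap */

/* Hypothetical layout: a private TSS for threads that called ioperm(),
 * with the bitmap pre-formatted once so the context switch never copies it. */
struct tss {
    unsigned char live[TSS_LIVE_BYTES];
    unsigned char io_bitmap[IO_BITMAP_BYTES];
};

struct thread {
    struct tss tss;       /* per-thread, pre-formatted */
    int last_cpu;         /* CPU this TSS was last active on */
};

static int tss_live_copies;   /* for illustration: counts 88-byte copies */

/* On switch-in: refresh the 88 live bytes only if the thread migrated,
 * then hand back the private TSS (a stand-in for executing LTR). */
static struct tss *switch_in(struct thread *t, int cpu,
                             const unsigned char *cpu_live_state)
{
    if (t->last_cpu != cpu) {
        memcpy(t->tss.live, cpu_live_state, TSS_LIVE_BYTES);
        tss_live_copies++;
        t->last_cpu = cpu;
    }
    return &t->tss;       /* real code would load this TSS via LTR */
}
```

Whether this actually wins depends on the question raised below: an 88-byte copy plus an LTR is not obviously cheaper than a straight 8K memcpy.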

That being said, I don't actually know that copying 88 bytes + LTR is any
cheaper than copying 8K.

	-hpa
