linux-kernel - Re: [patch 5/9] x86/ioport: Reduce ioperm impact for sane usage further

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <54ae3c3a-f49a-f5cc-d174-de6e64afe899@kernel.org>
Date:   Sun, 10 Nov 2019 09:17:39 -0800
From:   Andy Lutomirski <luto@...nel.org>
To:     Ingo Molnar <mingo@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>,
        the arch/x86 maintainers <x86@...nel.org>,
        Stephen Hemminger <stephen@...workplumber.org>,
        Willy Tarreau <w@....eu>, Juergen Gross <jgross@...e.com>,
        Sean Christopherson <sean.j.christopherson@...el.com>,
        "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [patch 5/9] x86/ioport: Reduce ioperm impact for sane usage
 further

On 11/7/19 12:25 AM, Ingo Molnar wrote:
> 
> * Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> 
>> On Wed, Nov 6, 2019 at 12:57 PM Thomas Gleixner <tglx@...utronix.de> wrote:
>>>
>>> Calculate both the position of the first zero bit and the last zero bit to
>>> limit the range which needs to be copied. This does not solve the problem
>>> when the previous tasked had only byte 0 cleared and the next one has only
>>> byte 65535 cleared, but trying to solve that would be too complex and
>>> heavyweight for the context switch path. As the ioperm() usage is very rare
>>> the case which is optimized is the single task/process which uses ioperm().
>>
>> Hmm.
>>
>> I may read this patch wrong, but from what I can tell, if we really
>> have just one process with an io bitmap, we're doing unnecessary
>> copies.
>>
>> If we really have just one process that has an iobitmap, I think we
>> could just keep the bitmap of that process entirely unchanged. Then,
>> when we switch away from it, we set the io_bitmap_base to an invalid
>> base outside the TSS segment, and when we switch back, we set it back
>> to the valid one. No actual bitmap copies at all.
>>
>> So I think that rather than the "begin/end offset" games, we should
>> perhaps have a "what was the last process that used the IO bitmap for
>> this TSS" pointer (and, I think, some sequence counter, so that when
>> the process updates its bitmap, it invalidates that case)?
>>
>>  Of course, you can do *nboth*, but if we really think that the common
>> case is "one special process", then I think the begin/end offset is
>> useless, but a "last bitmap process" would be very useful.
>>
>> Am I missing something?
> 
> In fact on SMP systems this would result in a very nice optimization: 
> pretty quickly *all* TSS's would be populated with that single task's 
> bitmap, and it would persist even across migrations from CPU to CPU.
> 
> I'd love to get rid of the offset caching and bit scanning games as well 
> - it doesn't really help in a number of common scenarios and it 
> complicates this non-trivial piece of code a *LOT* - and we probably 
> don't really have the natural testing density of this code anymore to 
> find any regressions quickly.

I think we should not over-optimize this.  I am all for penalizing
ioperm() and iopl() users as much as is convenient for us.  There is
simply no legitimate use case.  Sorry, DPDK, but "virtio-legacy sucks,
let's optimize the crap out of something that is slow anyway and use
iopl()" is not a good excuse.  Just use the %*!7 syscall to write to
/sys/.../resource0 and suck up the probably negligible performance hit.
And tell your customers to upgrade their hypervisors.  And quite
kvetching about performance of the control place on an old
software-emulated NIC while you're at it.

For the TLB case, it's worth tracking who last used which ASID and
whether it's still up to date, since *everyone* uses the MMU.  For
ioperm, I don't really believe this is worth it.

> 
> So intuitively I'd suggest we gravitate towards the simplest 
> implementation, with a good caching optimization for the single-task 
> case.

I agree with the first bit, but caching on an SMP system is necessarily
subtle.  Some kind of invalidation is needed.

> 
> I.e. the model I'm suggesting is that if a task uses ioperm() or iopl() 
> then it should have a bitmap from that point on until exit(), even if 
> it's all zeroes or all ones. Most applications that are using those 
> primitives really need it all the time and are using just a few ioports, 
> so all the tracking doesn't help much anyway.
> 
> On a related note, another simplification would be that in principle we 
> could also use just a single bitmap and emulate iopl() as an ioperm(all) 
> or ioperm(none) calls. Yeah, it's not fully ABI compatible for mixed 
> ioperm()/iopl() uses, but is that ABI actually being relied on in 
> practice?
> 

Let's please keep the ABI.  Or rather, let's attempt to eventually
remove the ABI, but let's not change it in the mean time please.