lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87mu3yq6sf.fsf@nanos.tec.linutronix.de>
Date:   Fri, 17 Jul 2020 21:29:36 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Kees Cook <keescook@...omium.org>
Cc:     LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
        linux-arch@...r.kernel.org, Will Deacon <will@...nel.org>,
        Arnd Bergmann <arnd@...db.de>,
        Mark Rutland <mark.rutland@....com>,
        Keno Fischer <keno@...iacomputing.com>,
        Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
        Gabriel Krisman Bertazi <krisman@...labora.com>
Subject: Re: [patch V3 01/13] entry: Provide generic syscall entry functionality

Kees Cook <keescook@...omium.org> writes:
> On Thu, Jul 16, 2020 at 11:55:59PM +0200, Thomas Gleixner wrote:
>> Kees Cook <keescook@...omium.org> writes:
>> >> +/*
>> >> + * Define dummy _TIF work flags if not defined by the architecture or for
>> >> + * disabled functionality.
>> >> + */
>> >
>> > When I was thinking about this last week I was pondering having a split
>> > between the arch-agnositc TIF flags and the arch-specific TIF flags, and
>> > that each arch could have a single "there is agnostic work to be done"
>> > TIF in their thread_info, and the agnostic flags could live in
>> > task_struct or something. Anyway, I'll keep reading...
>> 
>> That's going to be nasty. We rather go and expand the TIF storage to
>> 64bit. And then do the following in a generic header:
>
> I though the point was to make the TIF_WORK check as fast as possible,
> even on the 32-bit word systems. I mean it's not a huge performance hit,
> but *shrug*

For 64bit it's definitely faster to have 64bit flags.

For 32bit it's debatable whether having to fiddle with two words and
taking care about ordering is faster or not. It's a pain on 32bit in any
case.

But what we can do is distangle the flags into those which really need
to be grouped into a single 32bit TIF word and architecture specific
ones which can live in a different place.

Looking at x86 which has the most pressure on TIF bits:

Real TIF bits
TIF_SYSCALL_TRACE       Entry/exit
TIF_SYSCALL_TRACEPOINT	Entry/exit
TIF_NOTIFY_RESUME       Entry/exit
TIF_SIGPENDING          Entry/exit
TIF_NEED_RESCHED        Entry/exit
TIF_SINGLESTEP          Entry/exit
TIF_SYSCALL_EMU         Entry/exit
TIF_SYSCALL_AUDIT       Entry/exit
TIF_SECCOMP             Entry/exit
TIF_USER_RETURN_NOTIFY	Entry/exit      (x86 specific)
TIF_UPROBE		Entry/exit
TIF_PATCH_PENDING       Entry/exit
TIF_POLLING_NRFLAG      Scheduler (related to TIF_NEED_RESCHED)
TIF_MEMDIE              Historical, but not required to be a real one
TIF_FSCHECK		Historical, but not required to be a real one

Context switch group
TIF_NOCPUID             X86 Context switch
TIF_NOTSC               X86 Context switch
TIF_SLD                 X86 Context switch
TIF_BLOCKSTEP           X86 Context switch
TIF_IO_BITMAP           X86 Context switch + Entry/exit (see below)
TIF_NEED_FPU_LOAD       X86 Context switch + Entry/exit (see below)
TIF_SSBD                X86 Context switch
TIF_SPEC_IB             X86 Context switch
TIF_SPEC_FORCE_UPDATE   X86 Context switch

No group requirements
TIF_IA32                X86 random
TIF_FORCED_TF           X86 random
TIF_LAZY_MMU_UPDATES    X86 random
TIF_ADDR32              X86 random
TIF_X32                 X86 random

So the only interesting ones are TIF_IO_BITMAP and TIF_NEED_FPU_LOAD,
but those are really not required to be in the real TIF group because
they are independently evaluated _after_ the real TIF flags on exit to
user space and that requires a reread of the flags anyway. So if we put
the context switch and the random bits into a seperate word right after
thread_info->flags then the second word is in the same cacheline and it
wont matter. That way we gain plenty of free bits on 32 bit and have no
dependency between the two words at all.

The alternative is to play nasty games with TIF_IA32, TIF_ADDR32 and
TIF_X32 to free up bits for 32bit and make the flags field 64 bit on 64
bit kernels, but I prefer to do the above seperation.

Thanks,

        tglx

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ