Message-ID: <CALCETrU4U+4FTpje0wJfCDyQ3+5SJ56+gW_6ezqfFM3mu53YYw@mail.gmail.com>
Date:	Fri, 19 Feb 2016 23:30:43 -0800
From:	Andy Lutomirski <luto@...capital.net>
To:	"Luis R. Rodriguez" <mcgrof@...nl.org>
Cc:	"Luis R. Rodriguez" <mcgrof@...nel.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
	X86 ML <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Boris Ostrovsky <boris.ostrovsky@...cle.com>,
	Rusty Russell <rusty@...tcorp.com.au>,
	David Vrabel <david.vrabel@...rix.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	Michael Brown <mcb30@...e.org>,
	Juergen Gross <jgross@...e.com>,
	Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
	Xen Devel <xen-devel@...ts.xensource.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
	Mauro Carvalho Chehab <mchehab@....samsung.com>,
	vegard.nossum@...il.com
Subject: Re: [RFC v2 2/6] x86/init: use linker tables to simplify x86 init and
 annotate dependencies

On Fri, Feb 19, 2016 at 4:42 PM, Luis R. Rodriguez <mcgrof@...nl.org> wrote:
> On Fri, Feb 19, 2016 at 08:40:49AM -0800, Andy Lutomirski wrote:
>> On Fri, Feb 19, 2016 at 6:15 AM, Luis R. Rodriguez <mcgrof@...nel.org> wrote:
>> > Any failure on the x86 init path can be catastrophic.
>> > A simple shift of a call from one place to another can
>> > easily break things. Likewise adding a new call to
>> > one path without considering all x86 requirements
>> > can make certain x86 run time environments crash.
>> > We currently account for these requirements through
>> > peer code review and run time testing. We could do
>> > much better if we had a clean and simple way to annotate
>> > strong semantics for run time requirements, init sequence
>> > dependencies, and detection mechanisms for additions of
>> > new x86 init sequences.
>>
>> Please document this in a way that will be useful for people trying to
>> understand what the code does as opposed to just people who are trying
>> to understand why you wrote it.  More below.
>
> Sure. I'm kind of glad people are getting tired of me trying to
> explain *why* I wrote it though, I figured that might be the harder
> thing to explain. Hopefully it sinks in.
>
>> > +/**
>> > + * struct x86_init_fn - x86 generic kernel init call
>> > + *
>> > + * Linux x86 features vary in complexity, features may require work done at
>> > + * different levels of the full x86 init sequence. Today there are also two
>> > + * different possible entry points for Linux on x86, one for bare metal, KVM
>> > + * and Xen HVM, and another for Xen PV guests / dom0.  Assuming a bootloader
>> > + * has set up 64-bit mode, roughly the x86 init sequence follows this path:
>> > + *
>> > + * Bare metal, KVM, Xen HVM                      Xen PV / dom0
>> > + *       startup_64()                             startup_xen()
>> > + *              \                                     /
>> > + *      x86_64_start_kernel()                 xen_start_kernel()
>> > + *                           \               /
>> > + *                      x86_64_start_reservations()
>> > + *                                   |
>> > + *                              start_kernel()
>> > + *                              [   ...        ]
>> > + *                              [ setup_arch() ]
>> > + *                              [   ...        ]
>> > + *                                  init
>> > + *
>>
>>
>> I don't think this paragraph below is necessary.  I also think it's
>> very confusing.  Keep in mind that the reader has no idea what a
>> "level" is at this point and that the reader also doesn't need to
>> think about terms like "paravirtualization yielding".
>>
>> > + * x86_64_start_kernel() and xen_start_kernel() are the respective first C code
>> > + * entry starting points. The different entry points exist to enable Xen to
>> > + * skip a lot of hardware setup already done and managed on behalf of the
>> > + * hypervisor, we refer to this as "paravirtualization yielding". The different
>> > + * levels of init calls on the x86 init sequence exist to account for these
>> > + * slight differences and requirements. These different entry points also share
>> > + * a common entry x86 specific path, x86_64_start_reservations().
>
> OK sure, I'll nuke.
>
>> > + *
>>
>> And here, I don't even know what a "feature" is.
>
> Kasan is an example.

Is kasan different from the cr4 shadow in this context?  I don't
really feel like calling the cr4 shadow a "feature" is meaningful.  Am
I misunderstanding something?

>
>> > + * A generic x86 feature can have different initialization calls, one on each
>> > + * of the different main x86 init sequences, but must also address both entry
>> > + * points in order to work properly across the board on all supported x86
>> > + * subarchitectures. Since x86 features can also have dependencies on other
>> > + * setup code or features, x86 features can at times be subordinate to other
>> > + * x86 features, or conditions. struct x86_init_fn enables feature developers
>> > + * to annotate dependency relationships to ensure subsequent init calls only
>> > + * run once a subordinate's dependencies have run. When needed custom
>> > + * dependency requirements can also be spelled out through a custom dependency
>> > + * checker. In order to account for the dual entry point nature of x86-64 Linux
>> > + * for "paravirtualization yielding" and to make annotations for support for
>> > + * these explicit each struct x86_init_fn must specify supported
>> > + * subarchitectures. The earliest x86-64 code can read the subarchitecture
>> > + * though is after load_idt(), as such the earliest we can currently rely on
>> > + * subarchitecture for semantics and a common init sequence is on the shared
>> > + * common x86_64_start_reservations().  Each struct x86_init_fn must also
>> > + * declare a two-digit decimal number to impose an ordering relative to other
>> > + * features when required.
>>
>> I'm totally lost in the paragraph above.
>
> As an example Kasan could be defined as a struct x86_init_fn with a respective
> callback on x86_64_start_kernel() (kasan_early_init()) and later setup_arch()
> (kasan_init()). Since we can't support struct x86_init_fn calls starting all
> the way from x86_64_start_kernel() (given the issues I had noted before),
> the earliest a struct x86_init_fn can be used for is a callback on
> x86_64_start_reservations().

I'm still lost.  Kasan is a pile of code that does something.  Kasan
has some initialization code.  Do you mean that kasan's initialization
code could be encapsulated in one or more x86_init_fn instances?
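For illustration, here is a minimal standalone sketch of what encapsulating one init step in an x86_init_fn instance might look like. The struct fields mirror the patch under review; the BIT()/subarch constants and the kasan_early_setup() stub are illustrative stand-ins, not the kernel's actual code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the struct from the patch; __u32 replaced with uint32_t
 * so this compiles standalone. */
struct x86_init_fn {
	uint32_t order_level;
	uint32_t supp_hardware_subarch;
	bool (*detect)(void);
	bool (*depend)(void);
	void (*early_init)(void);
	uint32_t flags;
};

/* Illustrative subarch bits; the real values come from the
 * hardware_subarch enum in asm/bootparam.h. */
#define BIT(n)			(1U << (n))
#define X86_SUBARCH_PC		0
#define X86_SUBARCH_XEN		2

/* Stand-in for a feature's early init hook (kasan_early_init() in
 * the real tree); here it just records that it ran. */
static bool kasan_ran;
static void kasan_early_setup(void)
{
	kasan_ran = true;
}

/* One x86_init_fn instance encapsulating the feature's earliest init
 * step, limited to bare metal; order level 30 is an arbitrary example. */
static struct x86_init_fn kasan_init_fn = {
	.order_level		= 30,
	.supp_hardware_subarch	= BIT(X86_SUBARCH_PC),
	.early_init		= kasan_early_setup,
};

/* How the core might dispatch it once the subarch is known. */
static void maybe_run(struct x86_init_fn *fn, uint32_t subarch)
{
	if (fn->supp_hardware_subarch & BIT(subarch))
		fn->early_init();
}
```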

>
> It just means we can have different callbacks used for different parts of the
> x86 init process; it explains the limits of the callbacks we can have right now.

Do you mean that there would be different linker tables containing
x86_init_fn instances for each of these callback points?  Or is
"early_init" the only such callback point?

BTW, why is early_init called early_init?  Why not call it "callback"?
Then you could reuse this for other things.

>
>> > + *
>> > + * x86_init_fn enables strong semantics and dependencies to be defined and
>> > + * implemented on the full x86 initialization sequence.
>>
>> Please try explaining what this is, instead.  For example:
>>
>> An x86_init_fn represents a function to be called at a certain point
>> during initialization on a specific set of subarchitectures.
>
> OK thanks.
>
>> > + *
>> > + * @order_level: must be set, linker order level, this corresponds to the table
>> > + *     section sub-table index, we record this only for semantic validation
>> > + *     purposes.
>>
>> I read this as "this is purely a debugging feature".  Is it?  I think it's not.
>
> Nope, the order_level really is used for linker table sorting, but since we can
> modify the table for these structs with a run time sort later, we want to ensure
> that the order level is still respected after we sort at run time.
>
>> >  Order-level is always required however you typically would
>> > + *     only use X86_INIT_NORMAL*() and leave ordering to be done by placement
>> > + *     of code in a C file and the order of objects through a Makefile. Custom
>> > + *     order-levels can be used when order on C file and order of objects on
>> > + *     Makefiles does not suffice or much further refinements are needed.
>>
>>
>> Assuming I understand this correctly, here's how I'd write it:
>>
>> @order_level: The temporal order of this x86_init_fn.  Lower
>> order_level numbers are called first.  Ties are broken by order found
>> in the Makefile and then by order in the C file.
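The tie-breaking behavior proposed above amounts to a stable sort keyed on order_level: lower levels run first, and equal levels keep their link (table) order. A standalone sketch, not the kernel's actual sort:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ord_fn {
	uint32_t order_level;
	int id;		/* stand-in for the callback; records link order */
};

/* Stable insertion sort: lower order_level runs first, and entries
 * with equal order_level keep their original table order. */
static void sort_by_order_level(struct ord_fn *t, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		struct ord_fn key = t[i];
		size_t j = i;

		/* Strict '>' keeps equal-level entries in place. */
		while (j > 0 && t[j - 1].order_level > key.order_level) {
			t[j] = t[j - 1];
			j--;
		}
		t[j] = key;
	}
}
```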
>
> Thanks... yes.
>
>>
>> Note, however, that my proposed explanation can't be right because it
>> appears to conflict with "depend".  Please adjust accordingly.
>>
>> > + * @supp_hardware_subarch: must be set, it represents the bitmask of supported
>> > + *     subarchitectures.  We require each struct x86_init_fn to have this set
>> > + *     to require developer considerations for each supported x86
>> > + *     subarchitecture and to build strong annotations of different possible
>> > + *     run time states particularly in consideration for the two main
>> > + *     different entry points for x86 Linux, to account for paravirtualization
>> > + *     yielding.
>>
>> Too much motivation, too little documentation.
>>
>> @supp_hardware_subarch: A bitmask of subarchitectures on which to call
>> this init function.
>
> OK thanks.
>
>> > + *
>> > + * @detect: optional, if set returns true if the feature has been detected to
>> > + *     be required, it returns false if the feature has been detected to not
>> > + *     be required.
>>
>> I have absolutely no idea what this means.
>
> Well, the order level stuff is used at link time; there are, however, some
> run time environment conditions that might be important to consider as well.
> The detect callback lets you write a C routine which will check whether the
> feature should run; it returns false if the feature should not run or is
> not required to run.

So how is:

static bool detect_foo(void) { return whatever; }
static void init_foo(void) { do something; }

.detect = detect_foo,
.early_init = init_foo,

different from:

static void init_foo(void)
{
  if (whatever) do something;
}

.early_init = init_foo,


>
> A cheesy way to implement a cheesy graph is to then use the same callback
> as a pointer for dependency, so that if a feature depends on another, both
> are on the same order level, and if they have a run time dependency
> annotation, they can further annotate this relationship by having the
> thing that depends, set its depends pointer to refer to the other
> feature's detect callback.
>
> This is borrowed from the IOMMU init code in the kernel -- same logic.
> The only difference is that we also get link order support. Link order
> semantics and logic should suffice for most things, but not all, as with
> the IOMMU init stuff.
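The depend-points-at-detect annotation described above can be sketched standalone like this. The bubble pass mimics the shape of the kernel's IOMMU table sort but is simplified (a single pass handles only direct dependencies); all names are illustrative:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct init_fn {
	bool (*detect)(void);
	bool (*depend)(void);
	void (*early_init)(void);
};

static int run_order[4], nr_run;

static bool base_detect(void) { return true; }
static void base_init(void)   { run_order[nr_run++] = 0; }

static bool child_detect(void) { return true; }
static void child_init(void)   { run_order[nr_run++] = 1; }

static struct init_fn fns[] = {
	/* Deliberately listed child-first to show the sort fixing it.
	 * The dependent entry annotates its dependency by pointing
	 * ->depend at the *other* feature's ->detect callback. */
	{ .detect = child_detect, .depend = base_detect, .early_init = child_init },
	{ .detect = base_detect, .early_init = base_init },
};

/* Simplified IOMMU-style pass: move any entry after the entry whose
 * ->detect its ->depend names. */
static void sort_by_depend(struct init_fn *t, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		for (size_t j = i + 1; j < n; j++) {
			if (t[i].depend && t[i].depend == t[j].detect) {
				struct init_fn tmp = t[i];
				t[i] = t[j];
				t[j] = tmp;
			}
		}
	}
}
```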

But this doesn't work at all if .detect isn't set.

>> And what happens if the depend-implied order is
>> inconsistent with order_level.
>
> Ah -- that's what I was trying to explain above in the comment
> about inconsistent state -- a run time check *after* we do run
> time sorting goes through and checks for sanity.
>
> All this is a light weight feature graph, if you will; other
> folks are working on different solutions for this, and while
> I have considered sharing a solution for early init code this
> seems a bit more ideal for now. Later in the future though we
> should probably get together and talk about a proper feature
> graph, with more consistent checks, etc. It's another reason
> why I am also curious to see a SAT solver at run time in the
> future, but this is super long term [0]...
>
> [0] http://kernelnewbies.org/KernelProjects/kconfig-sat
>
>> I would hope that order_level breaks ties in the topological sort and
>> that there's a bit fat warning if the order_level ordering is
>> inconsistent with the topological ordering.  Preferably the warning
>> would be at build time, in which case it could be an error.
>
> Order level will always be respected at link time. The depend
> and detect things are heuristics used at run time, and since
> they depend on run time state we can't do this at build time.
> The big fat hairy nasty warning is there though. Curiously
> enough I ended up finding out that the sanity check for
> IOMMU was intended to only be run at debug time, but the
> debug thing was *always* set to true, so we've been running
> the sanity checker in the kernel for ages. It's why I also
> feel a bit more comfortable in keeping it live and matching
> more closely to its implementation. I've gone to pains to
> build off of the IOMMU init solution and simply extend
> the semantics there. Turns out there was much room for
> enhancement. All the fine tuning is documented in the
> userspace tree.

So is this what you're saying:

Callbacks are called in an order consistent with the dependencies
specified by @depend.  Ties are broken first by @order_level and then
by linker order.  It is an error for an x86_init_fn with lower
@order_level to depend on an x86_init_fn with higher @order_level.
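The last rule in that restatement could be enforced by a post-sort sanity pass along these lines. This is a hedged standalone sketch of the idea only; the in-kernel check inherited from the IOMMU code differs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct chk_fn {
	uint32_t order_level;
	bool (*detect)(void);
	bool (*depend)(void);
};

static bool prov_detect(void) { return true; }

/* Returns false if some entry depends on an entry with a *higher*
 * order_level, i.e. the @depend graph contradicts @order_level. */
static bool table_is_sane(const struct chk_fn *t, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!t[i].depend)
			continue;
		for (size_t j = 0; j < n; j++) {
			if (t[i].depend == t[j].detect &&
			    t[j].order_level > t[i].order_level)
				return false;
		}
	}
	return true;
}
```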

>
>> > + * @early_init: required, routine which will run in x86_64_start_reservations()
>> > + *     after we ensure boot_params.hdr.hardware_subarch is accessible and
>> > + *     properly set. Memory is not yet available. This is the earliest we can
>> > + *     currently define a common shared callback since all callbacks need to
>> > + *     check for boot_params.hdr.hardware_subarch, and this only becomes
>> > + *     accessible on x86-64 after load_idt().
>>
>> What's this for?  Under what conditions is it called?  What order?
>> Why is it part of x86_init_fn at all?
>
> It's one of the reasons we are adding this: callbacks for features.
> They're called if their requirements are met (subarch and depends).
> I'll work on extending this to be clearer.

I think this should just be ".callback".  Let the context be implied
by the linker table in which it appears.
>
>> > + */
>> > +struct x86_init_fn {
>> > +       __u32 order_level;
>> > +       __u32 supp_hardware_subarch;
>> > +       bool (*detect)(void);
>> > +       bool (*depend)(void);
>> > +       void (*early_init)(void);
>> > +       __u32 flags;
>> > +};
>> > +
>> > +/**
>> > + * enum x86_init_fn_flags: flags for init sequences
>> > + *
>> > + * X86_INIT_FINISH_IF_DETECTED: tells the core that once this init sequence
>> > + *     has completed it can break out of the loop for init sequences on
>> > + *     its own level.
>>
>> What does this mean?
>
> It borrows the logic from the IOMMU stuff: it enables the core to detect
> that if *any* routine in the linker table has completed, we can then just
> bail and stop going through the rest of the linker table.
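The loop-break behavior described above might look roughly like the following standalone sketch; the flag value and all names are illustrative, not the patch's actual code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define X86_INIT_FINISH_IF_DETECTED (1U << 0)

struct seq_fn {
	bool (*detect)(void);
	void (*early_init)(void);
	uint32_t flags;
};

static int nr_ran;
static bool hit(void)  { return true; }
static bool miss(void) { return false; }
static void run(void)  { nr_ran++; }

/* Core loop: try each entry; once a FINISH_IF_DETECTED entry's
 * detect fires and its init runs, stop walking the table. */
static void run_table(const struct seq_fn *t, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (t[i].detect && !t[i].detect())
			continue;
		t[i].early_init();
		if (t[i].flags & X86_INIT_FINISH_IF_DETECTED)
			break;
	}
}
```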

Can you give an example for which this would be used?

--Andy
