Message-ID: <20150717082359.GA13442@gmail.com>
Date: Fri, 17 Jul 2015 10:23:59 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Dave Hansen <dave@...1.net>
Cc: tglx@...utronix.de, mingo@...hat.com, hpa@...or.com,
x86@...nel.org, peterz@...radead.org, bp@...en8.de,
luto@...capital.net, torvalds@...ux-foundation.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
* Dave Hansen <dave@...1.net> wrote:
> The FPU rewrite removed the dynamic allocations of 'struct fpu'.
> But, this potentially wastes massive amounts of memory (2k per
> task on systems that do not have AVX-512 for instance).
>
> Instead of having a separate slab, this patch just appends the
> space that we need to the 'task_struct' which we dynamically
> allocate already. This saves from doing an extra slab allocation
> at fork(). The only real downside here is that we have to stick
> everything at the end of the task_struct. But, I think the
> BUILD_BUG_ON()s I stuck in there should keep that from being too
> fragile.
>
> This survives a quick build and boot in a VM. Does anyone see any
> real downsides to this?
So considering the complexity of the other patch that makes the static allocation,
I'd massively prefer this patch as it solves the real bug.
It should also work on future hardware a lot better.
This was the dynamic approach I suggested in our discussion of the big FPU code
rework.
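To make the idea concrete, here is a minimal userspace sketch of appending run-time-sized state to the end of a dynamically allocated structure, with a compile-time check that the member really is last. All names here (demo_task, demo_fpu, demo_task_size() and so on) are hypothetical stand-ins and not part of the patch, and C11 _Static_assert is used in place of the kernel's BUILD_BUG_ON():

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for 'struct fpu': a fixed header, with the
 * register state living in the bytes that follow the allocation's
 * fixed part. */
struct demo_fpu {
	size_t state_size;
};

/* Hypothetical stand-in for task_struct; 'fpu' must stay last. */
struct demo_task {
	int pid;
	long counter;
	struct demo_fpu fpu;
};

/* Fails the build if anything (including padding) follows MEMBER. */
#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER)				\
	_Static_assert(sizeof(TYPE) ==					\
		       offsetof(TYPE, MEMBER) +				\
		       sizeof(((TYPE *)0)->MEMBER),			\
		       #MEMBER " must be the last member of " #TYPE)

CHECK_MEMBER_AT_END_OF(struct demo_task, fpu);

/* Size to allocate: the fixed part plus the state size known only at
 * run time (in the kernel this would come from CPU feature detection). */
static size_t demo_task_size(size_t fpu_state_size)
{
	return sizeof(struct demo_task) + fpu_state_size;
}

/* One allocation covers both the task and its appended state. */
static struct demo_task *demo_task_alloc(int pid, size_t fpu_state_size)
{
	struct demo_task *t = calloc(1, demo_task_size(fpu_state_size));

	if (!t)
		return NULL;
	t->pid = pid;
	t->fpu.state_size = fpu_state_size;
	return t;
}

/* The appended state starts right after the fixed-size structure. */
static unsigned char *demo_fpu_state(struct demo_task *t)
{
	return (unsigned char *)t + sizeof(struct demo_task);
}
```

A caller would use demo_task_alloc(pid, size) once the state size is known and reach the state via demo_fpu_state(). If a new member were added after 'fpu', the static assertion would break the build rather than letting the appended state silently overlap it.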
> --- a/arch/x86/kernel/fpu/init.c~dynamically-allocate-struct-fpu 2015-07-16 10:50:42.355571648 -0700
> +++ b/arch/x86/kernel/fpu/init.c 2015-07-16 12:02:15.284280976 -0700
> @@ -136,6 +136,45 @@ static void __init fpu__init_system_gene
> unsigned int xstate_size;
> EXPORT_SYMBOL_GPL(xstate_size);
>
> +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \
> + BUILD_BUG_ON((sizeof(TYPE) - \
> + offsetof(TYPE, MEMBER) - \
> + sizeof(((TYPE *)0)->MEMBER)) > \
> + 0) \
> +
> +/*
> + * We append the 'struct fpu' to the task_struct.
> + */
> +int __weak arch_task_struct_size(void)
This should not be __weak, otherwise we risk getting the generic version:
> --- a/kernel/fork.c~dynamically-allocate-struct-fpu 2015-07-16 10:50:42.357571739 -0700
> +++ b/kernel/fork.c 2015-07-16 11:25:53.873498188 -0700
> @@ -287,15 +287,21 @@ static void set_max_threads(unsigned int
> max_threads = clamp_t(u64, threads, MIN_THREADS, MAX_THREADS);
> }
>
> +int __weak arch_task_struct_size(void)
> +{
> + return sizeof(struct task_struct);
> +}
> +
Your system probably worked because link order happened to prefer the x86 version,
but I'm not sure.
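For illustration, a minimal sketch of the intended pattern, assuming a GCC/Clang-style toolchain that supports __attribute__((weak)). The names (demo_thread, demo_arch_task_size()) are hypothetical and only mirror the shape of the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the generic task structure. */
struct demo_thread {
	int pid;
	long counter;
};

/*
 * Generic fallback, as kernel/fork.c would provide it. Being weak, it
 * is used only when no strong definition of the same symbol is linked
 * in.
 */
__attribute__((weak)) size_t demo_arch_task_size(void)
{
	return sizeof(struct demo_thread);
}

/*
 * An architecture would then supply a normal (strong) definition in
 * its own file, for example:
 *
 *	size_t demo_arch_task_size(void)
 *	{
 *		return sizeof(struct demo_thread) + demo_state_size;
 *	}
 *
 * where demo_state_size is a hypothetical run-time value. If that
 * definition were marked weak as well, the linker would see two
 * equally weak candidates, and which one it keeps would depend on
 * link order rather than on intent.
 */
```

Built on its own, with no strong definition present, this falls back to the generic size.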
Other than this bug it looks good to me in principle.
Lemme check it on various hardware.
Thanks,
Ingo