[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wijYb1Ci2Sx2xxggxdqHuMT4HJ5WtW+eGJP-+Fa9b1YtA@mail.gmail.com>
Date: Mon, 22 Apr 2019 15:40:00 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Alexey Dobriyan <adobriyan@...il.com>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Peter Anvin <hpa@...or.com>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
"the arch/x86 maintainers" <x86@...nel.org>
Subject: Re: [PATCH] x86_64: uninline TASK_SIZE
On Sun, Apr 21, 2019 at 9:06 AM Alexey Dobriyan <adobriyan@...il.com> wrote:
>
> TASK_SIZE macro is quite deceptive: it looks like a constant but in fact
> compiles to 50+ bytes.
Honestly, if you are interested in improving TASK_SIZE, I'd really
like to see you try to go even further than this.
TASK_SIZE _used_ to just be a fixed constant, which is why it has that
name and why the usage patterns are what they are.
But since that isn't true any more, I'd much rather fix the _name_,
and I'd much rather fix the nasty complex hidden behavior, rather than
just keep the name and keep the behavior, but turning it from an
inline macro to a function call.
And as Ingo points out, we should be able to just make it a field of
its own, instead of that complex dance of TIF_ADDR32 etc.
However, I think it would be better if that field would be in "struct
mm_struct" instead of Ingo's suggestion of the thread. Because while
it's currently a per-thread flag, I think it is only set by execve(),
so it always ends up being the same per-mm. No?
Also, we could/should just make the existing *users* of TASK_SIZE know
that it's no longer a simple constant, so all those functions that use
it many times could just do
unsigned long task_size = TASK_SIZE;
rather than re-compute it multiple times like they do now.
In fact, making it a function call in many ways makes things *worse*,
although maybe we could at least mark the function "pure" so that gcc
would be able to cache the end result. But that would actually be
wrong for the sequences that maybe do change the thread flags, so I
hate that idea too.
Much better to just cache it explicitly in the cases where we see that
it's currently generating bad code.
Linus
Powered by blists - more mailing lists