[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <4FF2C19D.4040608@parallels.com>
Date: Tue, 3 Jul 2012 13:55:41 +0400
From: Glauber Costa <glommer@...allels.com>
To: Jan Beulich <JBeulich@...e.com>, <jeremy@...p.org>,
Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: __force_order usage on x86's CRn accesses
>
> Pretty soon after the introduction of this via "x86: unify paravirt
> parts of system.h" Jeremy had posted a patch to correct the
> (reversed in direction of access) constraints using that auxiliary
> variable. What happened to that change? Was it dropped for a
> reason?
About the change itself, I don't know. Maybe Jeremy, who proposed it,
can give you a better answer.
> Furthermore, the addition of these constraints happened
> without any real explanation - the code comment that was added
> doesn't really help understand why "volatile" isn't sufficient here.
If my memory dont't fail me, I believe this is because gcc will feel
free to reorder a sequence of instructions that does not access memory.
Specially since it has no knowledge of what's in the inline assembly,
and what are its constraints. It only knows that it is an register
operation, and treats it like one.
Also, I believe what we are concerned with here is not arbitrary reorder
between that and other instructions, which we welcome, but reordering
between a read and a write to the same crX - specially of concern for
things doing read-modify-writes of control registers.
This is something I heavily looked at the time, but never again, so
confirmation from people with more knowledge on the compilers world can
be probably helpful here.
>
> Next, I also saw it being suggested somewhere to replace the
> static definition of the variable (causing an instance to appear
> everywhere any of the referencing inline function being used)
> by an extern one. That also never got carried out, causing
> about two dozen pointless instances of this variable to be
> scattered around, likely consuming cache bandwidth here and
> there.
I indeed see no reason not to do it.
> Now, rather than exporting an auxiliary variable like this, how
> about using an existing, rarely referenced (at least in affected
> contexts) but already exported one instead?
> empty_zero_page, __supported_pte_mask, x86_platform, or
> high_memory might be candidates.
A similar technique is used to determine a safe address for access
somewhere in the scheduler IIRC,
> Finally (and this is because I lack the explanation why the
> artificial constraint is needed in the first place), why is it that
> clts() doesn't need one too?
>
Because we're not using it to do read-modify-write of the control register.
But let me disclaim again that I might be wrong here...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists