Message-ID: <20070921133107.GB13129@Krystal>
Date:	Fri, 21 Sep 2007 09:31:07 -0400
From:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To:	Denys Vlasenko <vda.linux@...glemail.com>
Cc:	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
	Andi Kleen <ak@....de>, "H. Peter Anvin" <hpa@...or.com>,
	Chuck Ebbert <cebbert@...hat.com>,
	Christoph Hellwig <hch@...radead.org>
Subject: Re: [patch 4/7] Immediate Values - i386 Optimization

* Denys Vlasenko (vda.linux@...glemail.com) wrote:
> On Tuesday 18 September 2007 22:07, Mathieu Desnoyers wrote:
> > i386 optimization of the immediate values which uses a movl with code patching
> > to set/unset the value used to populate the register used as variable source.
> > 
> > Changelog:
> > - Use text_poke_early with cr0 WP save/restore to patch the bypass. We are doing
> >   non atomic writes to a code region only touched by us (nobody can execute it
> >   since we are protected by the immediate_mutex).
> > - Put immediate_set and _immediate_set in the architecture independent header.
> 
> > +struct __immediate {
> > +	long var;		/* Pointer to the identifier variable of the
> > +				 * immediate value
> > +				 */
> > +	long immediate;		/*
> > +				 * Pointer to the memory location of the
> > +				 * immediate value within the instruction.
> > +				 */
> > +	long size;		/* Type size. */
> > +};
> 
> 
> > +		case 2:							\
> > +			asm (	".section __immediate, \"a\", @progbits;\n\t" \
> > +					".long %1, (0f)+2, 2;\n\t"	\
> > +					".previous;\n\t"		\
> > +					"1:\n\t"			\
> > +					".align 2;\n\t"			\
> > +					"0:\n\t"			\
> > +					"mov %2,%0;\n\t"		\
> > +				: "=r" (value)				\
> > +				: "m" (name##__immediate),		\
> > +				  "i" (0));				\
> 
> Instead of letting gcc use whatever instruction it sees fit best
> for accessing the variable (like add/cmp/test...)
> now we force it to use mov imm,reg first. Maybe with preceding nop
> due to "align 2".
> 

Yes, this is true. So, the following branch:

char x;

void testb(void)
{
        if (x > 5)
                testa();
}

Would turn into:
  56:   b0 00                   mov    $0x0,%al
  58:   3c 05                   cmp    $0x5,%al
  5a:   7e 05                   jle    61 <testb+0x11>


Rather than:

  56:   80 3d 00 00 00 00 05    cmpb   $0x5,0x0
  5d:   7e 05                   jle    64 <testb+0x14>
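
To make the code-size side of that comparison explicit, here is a minimal
user-space sketch that counts the bytes of the two sequences. The byte values
are copied from the disassembly above; the array and function names are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Immediate-value variant, bytes as shown at offsets 0x56..0x5b. */
const unsigned char imm_variant[] = {
	0xb0, 0x00,	/* mov $0x0,%al  (the 0x00 is the byte that gets patched) */
	0x3c, 0x05,	/* cmp $0x5,%al */
	0x7e, 0x05,	/* jle +5 */
};

/* Memory-operand variant, bytes as shown at offsets 0x56..0x5e. */
const unsigned char mem_variant[] = {
	0x80, 0x3d, 0x00, 0x00, 0x00, 0x00, 0x05,	/* cmpb $0x5,0x0 */
	0x7e, 0x05,					/* jle +5 */
};

/* Length in bytes of the immediate-value sequence. */
size_t imm_variant_len(void)
{
	return sizeof(imm_variant);
}

/* Length in bytes of the memory-operand sequence. */
size_t mem_variant_len(void)
{
	return sizeof(mem_variant);
}
```

With these bytes, the immediate variant is 6 bytes of code and the memory
variant is 9, and the immediate variant performs no data load on the read side.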

> And then we use 12 more bytes in __immediate section
> *for each* place where you read the variable.
> 

Yes, but consider that this section is only read when updating the
variable. The read side never touches it, and it therefore does not
consume data cache on hot paths.
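
The split between the update side and the read side can be sketched in plain
user-space C. This is only an analogy under stated assumptions: struct
imm_entry, imm_update() and imm_read_byte() are hypothetical names, the "code"
is an ordinary byte buffer, and none of the real patching machinery
(text_poke_early, the mutex) is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of the __immediate table. */
struct imm_entry {
	const void *var;	/* identifies which immediate value this site reads */
	unsigned char *site;	/* location of the immediate inside the instruction */
	size_t size;		/* operand size in bytes */
};

/*
 * Update side: walk the whole table and rewrite every site that belongs
 * to var. This is the only place where the table is read.
 * Returns the number of sites patched.
 */
size_t imm_update(struct imm_entry *tab, size_t n, const void *var,
		  const void *val)
{
	size_t i, patched = 0;

	for (i = 0; i < n; i++) {
		if (tab[i].var != var)
			continue;
		memcpy(tab[i].site, val, tab[i].size);
		patched++;
	}
	return patched;
}

/*
 * Read side: only the instruction bytes are touched, never the table.
 * For "mov $imm8,%al" (b0 ib) the immediate is the byte after the opcode.
 */
unsigned char imm_read_byte(const unsigned char *insn)
{
	return insn[1];
}
```

A caller would build one entry per read site, call imm_update() on the rare
update, and use only the patched instruction bytes on the hot path.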

> Do you plan to use the same approach on x86_64?
> I mean, longs there are twice as long.
> 

Yup. It's a tradeoff between total memory footprint and active cacheline
footprint. When GCC optimizes for size and we see kernel speedups, it is
not because the code "consumes" less memory, but because there is less
junk polluting the cachelines. So unless you are an embedded system
developer worried about a few K of data, I really don't see why you worry
about this. Oh, and by the way, I provide the ability to disable immediate
values in the EMBEDDED menu.
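
For a rough idea of the table cost being discussed, the three-field entry can
be modelled with fixed-width integers standing in for "long" on each
architecture. This is a sketch, not kernel code: the struct names, the helper
and the site count used in the note below are assumptions made up for the
arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Entry layout with 32-bit longs (i386). */
struct imm_entry32 {
	int32_t var;
	int32_t immediate;
	int32_t size;
};

/* Entry layout with 64-bit longs (x86_64). */
struct imm_entry64 {
	int64_t var;
	int64_t immediate;
	int64_t size;
};

/* Total table size for a given entry size and number of read sites. */
size_t imm_table_bytes(size_t entry_size, size_t nr_sites)
{
	return entry_size * nr_sites;
}
```

That is 12 bytes per read site on i386 and 24 on x86_64, so a hypothetical 100
read sites would cost 1200 or 2400 bytes of table, none of it on the hot path.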

> Can this be made conditional, on CONFIG_CC_OPTIMIZE_FOR_SIZE perhaps?

No. As I just stated, only embedded developers would have an interest in
disabling this feature, because they have so little memory available on
their systems. The memory consumed by the immediate values table is
outside the hot path cachelines and therefore does not impact overall
performance.

> --
> vda

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68