Date:	Wed, 19 Sep 2007 14:22:14 -0400
From:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To:	Jeremy Fitzhardinge <jeremy@...p.org>
Cc:	"H. Peter Anvin" <hpa@...or.com>, akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org, Andi Kleen <ak@....de>,
	Chuck Ebbert <cebbert@...hat.com>,
	Christoph Hellwig <hch@...radead.org>
Subject: Re: [patch 4/7] Immediate Values - i386 Optimization

* Jeremy Fitzhardinge (jeremy@...p.org) wrote:
> H. Peter Anvin wrote:
> > Mathieu Desnoyers wrote:
> >   
> >> Ok, let's have a good look at what we want:
> >>
> >> 1 - get a pointer to the beginning of the immediate value within the
> >>     instruction.
> >> 2 - make sure that the immediate value, within the instruction, is
> >>     written to atomically wrt all CPUs, even on older architectures
> >>     where non aligned writes are not atomic.
> >>
> >>     
> >
> > I think you'll find that even on modern architectures cross-cacheline
> > writes aren't atomic.
> >   
> 
> Cross-cache-line, sure.  But what about just not sizeof aligned?  If its
> enough to avoid cross-cache-line, then that's simpler.
> 

Being aligned on a cache line (e.g. 32-byte boundaries) is a stricter
condition than being aligned on sizeof multiples (e.g. 4 bytes): every
cache-line boundary is also a sizeof boundary. Therefore, data aligned
on its sizeof never crosses a cache line, whereas data that is not
sizeof aligned is not cache-line aligned either and may straddle a
boundary. (unless I am utterly wrong..) :)
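
To make the alignment argument concrete, here is a minimal userspace
sketch. The helper names and the 32-byte CACHE_LINE_SIZE are assumptions
chosen for illustration only; real cache-line sizes vary by CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache-line size, for illustration only (real CPUs vary). */
#define CACHE_LINE_SIZE 32u

/* True if the byte range [addr, addr + size) touches more than one line. */
static bool crosses_cache_line(uintptr_t addr, size_t size)
{
	assert(size > 0);
	return (addr / CACHE_LINE_SIZE) !=
	       ((addr + size - 1) / CACHE_LINE_SIZE);
}

/* True if addr is a multiple of size; size must be a power of two. */
static bool is_sizeof_aligned(uintptr_t addr, size_t size)
{
	assert(size > 0 && (size & (size - 1)) == 0);
	return (addr & (size - 1)) == 0;
}
```

With these helpers, a 4-byte value at address 28 is sizeof aligned and
stays inside one line, a value at address 30 is unaligned and straddles
the boundary at 32, and a value at address 25 is unaligned but happens
not to cross. Only the sizeof-aligned case is safe for every address.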

> Which is something I was going to comment on: Mathieu, you try to align
> the constant itself, but you don't prevent the instruction overall from
> crossing a cache line.  Given how delicate all this stuff is, it seems
> like a good idea to do that.
> 

We just can't, because movl is 5 bytes in total: 1 byte for the opcode,
4 bytes for the immediate value. But since we do not modify the opcode
at all, CPUs will see either the old or the new immediate value (each of
those will be coherent because of the atomic update) and, in every case,
they will use it with the same opcode, which hasn't been touched.
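
The idea can be modelled in userspace as follows. This is only a sketch
of the byte layout and of the single aligned store, not the patch
itself: struct insn_buf and the insn_* helpers are hypothetical names,
the buffer is plain data rather than executable code, and the __atomic
builtins are GCC/Clang extensions assumed to be available.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of a 5-byte "movl $imm32, %eax":
 * one opcode byte (0xB8) followed by a 4-byte immediate.
 * Three padding bytes put the opcode at offset 3, so the
 * immediate lands at offset 4 and is naturally aligned.
 */
struct insn_buf {
	uint8_t pad[3];
	uint8_t opcode;
	uint32_t imm;
};

static void insn_init(struct insn_buf *b, uint32_t imm)
{
	memset(b->pad, 0x90, sizeof(b->pad));	/* NOP bytes as padding */
	b->opcode = 0xB8;			/* MOV eax, imm32 */
	b->imm = imm;
}

/* Update only the immediate, with one aligned 32-bit store. */
static void insn_set_imm(struct insn_buf *b, uint32_t new_imm)
{
	assert(((uintptr_t)&b->imm & 3) == 0);
	__atomic_store_n(&b->imm, new_imm, __ATOMIC_RELAXED);
}

/* A reader gets either the old or the new value, never a mix. */
static uint32_t insn_get_imm(const struct insn_buf *b)
{
	return __atomic_load_n(&b->imm, __ATOMIC_RELAXED);
}
```

The opcode byte is written once in insn_init() and never again, which is
the property the argument above relies on. The model says nothing about
how a CPU fetches instructions from live code.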

> 
> >> * 4 bytes
> >> B8 + rd         MOV r32, imm32   (1 byte opcode)
> >> C7 /0           MOV r/m32, imm32 (2 bytes opcode)
> >> (the 2 bytes opcode can be a problem)
> >>
> >>     
> >
> > If gas generates the C7 opcodes by default, then that's a bug, nothing less.
> >   
> 
> Well, in this case, it might be preferred if it brings the constant into
> alignment without explicit padding :)
> 

It will need explicit padding too. We would have to align the 4-byte
immediate value on a multiple of 4 bytes. Therefore, this 2-byte opcode
followed by a 4-byte immediate value would have to start 2 bytes before
a 4-byte boundary, i.e. at an address equal to 2 modulo 4.
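
The padding arithmetic can be sketched like this. imm_align_padding() is
a hypothetical helper, and opcode_len stands for every byte that
precedes the immediate (1 for the B8 form, 2 for C7 /0 with a register
operand; a memory operand would add displacement bytes).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of padding bytes to emit before an instruction that would
 * otherwise start at addr, so that its immediate, located opcode_len
 * bytes into the instruction, is aligned on imm_size.
 * imm_size must be a power of two.
 */
static unsigned int imm_align_padding(uintptr_t addr,
				      unsigned int opcode_len,
				      unsigned int imm_size)
{
	uintptr_t mask, imm_addr;

	assert(imm_size != 0 && (imm_size & (imm_size - 1)) == 0);
	mask = (uintptr_t)imm_size - 1;
	imm_addr = addr + opcode_len;

	/* Distance from imm_addr up to the next multiple of imm_size. */
	return (unsigned int)((imm_size - (imm_addr & mask)) & mask);
}
```

For example, at address 0 the 1-byte-opcode form needs 3 bytes of
padding (instruction at 3, immediate at 4), while the 2-byte-opcode
form needs 2 bytes (instruction at 2, immediate at 4).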

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68