Message-ID: <4FA2CAD9.6010808@zytor.com>
Date:	Thu, 03 May 2012 11:13:45 -0700
From:	"H. Peter Anvin" <hpa@...or.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
CC:	Al Viro <viro@...iv.linux.org.uk>, Nick Piggin <npiggin@...il.com>,
	Jana Saout <jana@...ut.de>, Joel Becker <jlbec@...lplan.org>,
	linux-kernel@...r.kernel.org
Subject: Re: Oops with DCACHE_WORD_ACCESS and ocfs2, autofs4

On 05/03/2012 10:30 AM, Linus Torvalds wrote:
> On Thu, May 3, 2012 at 9:15 AM, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
>>
>> So I guess I need to do the exception handling that I was hoping I
>> wouldn't have to. Give me a jiffy.
> 
> Ok, that took longer than a jiffy, the asm was just nasty to get right
> with all the proper suffixes for 32-bit vs 64-bit, and the fact that
> gas apparently really needs %cl for the shift count, and doesn't like
> %rcx. Silly assembler.
> 

Yes, although since it's a fixed register you can also just write it as %%cl.

> Also, the asm would have been much simpler if I didn't care so much
> about the regular fast-path. I wanted the fast-path for the asm to be
> a single load, with no downside, and everything fixed up in the
> exception case.
> 
> And it's close. It's a single load, and the only downside is that
> register '%rcx' is marked as used, because *if* the exception happens,
> we want to use %rcx to do the alignment fixup.
> 
> Peter, in particular, can you double (and triple-) check my asm, to
> see if I missed anything? It does that "lea" of the address into %rcx
> twice, because that way we don't need any other register temporaries.

Just from a cleanliness point of view, I don't think you need the
__WORDSUFFIX for any of these instructions (it is only required if it
would be ambiguous, but the register names should deal with it.)

>  - fast-path single-instruction unaligned load (with gcc free to pick
> registers and addressing modes):
> 
>       movl (%edi,%edx),%eax
> 
>  - with the exception fixup code becoming:
> 
>         leal (%edi,%edx),%ecx
>         andl $-4,%ecx
>         movl (%ecx),%eax
>         leal (%edi,%edx),%ecx
>         andl $3,%ecx
>         shll $3,%ecx
>         shll %cl,%eax
>         shrl %cl,%eax
>         jmp 2b

I think you want to drop the shl instruction.  You're loading what
should end up at the LSB end of the register into the MSB end of the
register, so shr is all you should need.

Let's say %edi+%edx points to 0xcccccffd, with the bytes 66 77 88 99
starting at 0xcccccffc.  If the next page were present and zero, the
unaligned load would give %eax = 0x00998877, so the fixup should produce
the same value.

	lea (%edi,%edx),%ecx	-> %ecx = 0xcccccffd
	and $-4,%ecx		-> %ecx = 0xcccccffc
	mov (%ecx),%eax		-> %eax = 0x99887766
	lea (%edi,%edx),%ecx	-> %ecx = 0xcccccffd
	and $3,%ecx		-> %ecx = 1
	shl $3,%ecx		-> %ecx = 8
	shr %cl,%eax		-> %eax = 0x00998877
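
For anyone who wants to check the arithmetic outside the kernel, here is a
rough user-space C sketch of the same computation.  It is not the kernel
code: the function names (load_le32, fixup_shr_only, fixup_shl_then_shr)
are made up for illustration, and the aligned word is passed in as four
bytes rather than read through a faulting load.  It models the 32-bit
little-endian case only.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a 32-bit value from four bytes the way a little-endian
 * x86 "movl (%ecx),%eax" would see them. (Hypothetical helper.) */
uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fixup as suggested above: load the aligned word, then shift right
 * by 8 * (address & 3) so the wanted bytes land at the LSB end and
 * the top is zero-filled. */
uint32_t fixup_shr_only(const uint8_t *aligned_word, unsigned offset)
{
    assert(offset < 4);                 /* offset models "andl $3,%ecx" */
    uint32_t value = load_le32(aligned_word);
    unsigned count = offset << 3;       /* models "shll $3,%ecx" */
    return value >> count;              /* models "shrl %cl,%eax" */
}

/* The quoted sequence, with the extra shl before the shr. The left
 * shift discards the high bytes that the load was supposed to keep. */
uint32_t fixup_shl_then_shr(const uint8_t *aligned_word, unsigned offset)
{
    assert(offset < 4);
    uint32_t value = load_le32(aligned_word);
    unsigned count = offset << 3;
    value <<= count;                    /* models "shll %cl,%eax" */
    return value >> count;              /* models "shrl %cl,%eax" */
}
```

With the bytes 66 77 88 99 and an offset of 1, fixup_shr_only() returns
0x00998877, matching the trace above, while fixup_shl_then_shr() returns
0x00887766.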

	-hpa



-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.
