Date:	Thu, 3 May 2012 10:30:38 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Al Viro <viro@...iv.linux.org.uk>, "H. Peter Anvin" <hpa@...or.com>
Cc:	Nick Piggin <npiggin@...il.com>, Jana Saout <jana@...ut.de>,
	Joel Becker <jlbec@...lplan.org>, linux-kernel@...r.kernel.org
Subject: Re: Oops with DCACHE_WORD_ACCESS and ocfs2, autofs4

On Thu, May 3, 2012 at 9:15 AM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> So I guess I need to do the exception handling that I was hoping I
> wouldn't have to. Give me a jiffy.

Ok, that took longer than a jiffy, the asm was just nasty to get right
with all the proper suffixes for 32-bit vs 64-bit, and the fact that
gas apparently really needs %cl for the shift count, and doesn't like
%rcx. Silly assembler.

Also, the asm would have been much simpler if I didn't care so much
about the regular fast-path. I wanted the fast-path for the asm to be
a single load, with no downside, and everything fixed up in the
exception case.

And it's close. It's a single load, and the only downside is that
register '%rcx' is marked as used, because *if* the exception happens,
we want to use %rcx to do the alignment fixup.

Peter, in particular, can you double (and triple-) check my asm, to
see if I missed anything? It does that "lea" of the address into %rcx
twice, because that way we don't need any other register temporaries.

On 32-bit, this results in:

 - fast-path single-instruction unaligned load (with gcc free to pick
registers and addressing modes):

      movl (%edi,%edx),%eax

 - with the exception fixup code becoming:

        leal (%edi,%edx),%ecx
        andl $-4,%ecx
        movl (%ecx),%eax
        leal (%edi,%edx),%ecx
        andl $3,%ecx
        shll $3,%ecx
        shll %cl,%eax
        shrl %cl,%eax
        jmp 2b

which looks ok. I don't worry about the efficiency of the fixup code,
because if that code is ever entered we will already have taken a page
fault etc, so the only thing to worry about is that the fixup doesn't
need or pin down any unnecessary extra registers, so that the fast-path
case doesn't get less flexible.
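
To make the fixup sequence easier to check, here is a rough C model of
what it is meant to compute for a 32-bit little-endian word: load the
aligned word that contains the address, then shift out the bytes below
the address so the result is zero-padded at the top. The function name
is hypothetical, the word is assembled byte by byte so the model does
not depend on host endianness, and it uses only a logical right shift,
which is an assumption about the intent rather than a transcription of
the listing above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the exception fixup for a 32-bit word.
 * p may be unaligned; only the aligned word containing p is read. */
static uint32_t fixup_load_zeropad(const unsigned char *p)
{
    uintptr_t addr = (uintptr_t)p;

    /* leal/andl $-4: round the address down to a 4-byte boundary. */
    const unsigned char *aligned =
        (const unsigned char *)(addr & ~(uintptr_t)3);

    /* movl (%ecx),%eax: little-endian load of the aligned word,
     * written out bytewise so the model is host-independent. */
    uint32_t word = (uint32_t)aligned[0]
                  | (uint32_t)aligned[1] << 8
                  | (uint32_t)aligned[2] << 16
                  | (uint32_t)aligned[3] << 24;

    /* leal/andl $3/shll $3: byte offset within the word, times 8. */
    unsigned shift = (unsigned)(addr & 3) * 8;
    assert(shift <= 24);

    /* Move the bytes at and above p down to bit 0; the top is zero-filled. */
    return word >> shift;
}
```

For an aligned buffer holding the bytes 0x11 0x22 0x33 0x44, the model
gives 0x44332211 at offset 0, 0x00443322 at offset 1 and 0x00000044 at
offset 3, i.e. the bytes that exist in the mapped word followed by
zeroes.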

Does anybody see anything wrong with this?

Anyway, with this, I guess we could enable word-at-a-time even with
CONFIG_DEBUG_PAGEALLOC on x86, and that might even be a good idea for
coverage.
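
As a rough illustration of when the fixup path can trigger at all: an
unaligned word load can only fault on a page the string doesn't live in
when the word straddles a page boundary, while the aligned-down load
used by the fixup never does. The sketch below assumes 4096-byte pages,
and the names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

/* True if a load of word_size bytes at addr touches the next page. */
static bool word_load_crosses_page(uintptr_t addr, size_t word_size)
{
    assert(word_size <= MODEL_PAGE_SIZE);
    uintptr_t offset_in_page = addr & (MODEL_PAGE_SIZE - 1);
    return offset_in_page + word_size > MODEL_PAGE_SIZE;
}

/* Address used by the fixup: rounded down to a word boundary.
 * word_size must be a power of two. */
static uintptr_t align_down(uintptr_t addr, size_t word_size)
{
    return addr & ~((uintptr_t)word_size - 1);
}
```

With 4-byte words, a load at 0x1ffd crosses into the page at 0x2000,
but the aligned-down address 0x1ffc does not.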

Jana - does the attached patch work for you?

                     Linus

Download attachment "patch.diff" of type "application/octet-stream" (3573 bytes)
