Date:	Mon, 17 Nov 2008 19:23:40 +0100
From:	Andi Kleen <andi@...stfloor.org>
To:	Alexander van Heukelum <heukelum@...tmail.fm>
Cc:	Andi Kleen <andi@...stfloor.org>, Ingo Molnar <mingo@...e.hu>,
	LKML <linux-kernel@...r.kernel.org>,
	Alexander van Heukelum <heukelum@...i.uni-sb.de>,
	Glauber Costa <gcosta@...hat.com>
Subject: Re: [RFC] x86: save_args out of line

> I liked this way of calling an external function, because it
> circumvents exactly most of this ugly setup. For two bytes

It's not too ugly; just one field is swapped.

> extra per stub, one can make this setup function a completely
> normal one (without relocating the return address on the stack).
> I intended the stubs to fit in one cache line (32 bytes) as it

Cache lines are 64 bytes on pretty much all modern x86 CPUs.

> does not make much sense to make them smaller than that. A
> further advantage is that no indirect call is needed, but maybe
> this is not as slow anymore as it used to be? B.t.w., I intended

It depends on the CPU. But an indirect call always requires resources
that someone else might already have swamped.

> to change the exception handler stubs in a similar way to get
> rid of the indirect call.

Ok. Hopefully it's worth the effort. The branch misprediction bubble
should not be too bad; perhaps avoiding it will make up for the other
cycles you're adding. But even if it doesn't, decreasing the cache line
footprint is always a good thing.
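
As a rough illustration of the footprint point, here is a minimal sketch
that counts how many cache lines a stub touches. The function name and
parameters are made up for this example; only the 32-byte stub size and
the 64-byte line size come from the discussion above.

```c
#include <stddef.h>

/*
 * Number of cache lines touched by a code region that starts at byte
 * 'offset' and is 'size' bytes long, for lines of 'line_size' bytes.
 * Hypothetical helper for illustration only.
 */
size_t cache_lines_touched(size_t offset, size_t size, size_t line_size)
{
    if (size == 0 || line_size == 0)
        return 0;
    /* index of the last line touched minus index of the first, plus one */
    return (offset + size - 1) / line_size - offset / line_size + 1;
}
```

With 64-byte lines, a 32-byte stub aligned to 32 bytes stays within one
line and two such stubs share a line, while a 32-byte stub starting at
offset 48 straddles two lines.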

> The second copy of %rbp is indeed placed at the 'correct' position
> inside the pt_regs struct. However, at this point only a partial
> stack frame is saved: the C argument registers and the scratch
> registers. r12-r15, rbp, and rbx will be saved only later if necessary.
> A problem could arise if some code uses pt_regs.bp of this partial
> stack frame, because it will contain a bogus value. Glauber's patch

Well they shouldn't -- 

> makes pt_regs.bp contain the right value most of the time... Which
> means that if the patch fixed something for him, the problem has only
> been made unlikely to happen. The place where things should be fixed
> are the places where pt_regs.bp is used, but not filled in.

Hmm, I first thought the point was to make backtracing over the exception
stubs with frame pointers (FPs) work better. But it didn't even set up
a correct frame for that, so it might have been something else.

The original design was that you should only use these extended
registers in special calls that go through the PTREGS stubs.
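
To make the partial frame concrete, here is a simplified sketch of the
layout under discussion. The struct is modeled on the x86-64 pt_regs
field order, but its name and the helper are invented for this example,
and the split between "partial" and "full" follows the description
quoted above rather than any particular kernel source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for pt_regs; not the kernel's definition. */
struct sketch_pt_regs {
    /* callee-saved: filled in only by paths that save the full frame */
    uint64_t r15, r14, r13, r12, bp, bx;
    /* C argument and scratch registers: present in the partial frame */
    uint64_t r11, r10, r9, r8, ax, cx, dx, si, di;
    uint64_t orig_ax;
    /* hardware-pushed part of the frame */
    uint64_t ip, cs, flags, sp, ss;
};

/*
 * Returns true if the field at 'field_offset' is written when only the
 * partial frame has been saved (assumption: everything from r11 upward).
 */
bool in_partial_frame(size_t field_offset)
{
    return field_offset >= offsetof(struct sketch_pt_regs, r11);
}
```

Under these assumptions bp sits at offset 32, below the saved part, so a
reader of pt_regs.bp on a partial frame sees whatever happened to be in
that slot.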

But then sysprof/oprofile application profiling was added, which
wants to do user space backtracing using FPs. (That seemed pointless to me,
because most or all of 64-bit user space, and recently pretty much all
32-bit user space too, is compiled without FPs by default.)
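
For reference, an FP-based backtrace amounts to following a chain of
saved frame pointers. The sketch below does this over a simulated stack
(an array of 64-bit words with a made-up base address), assuming the
usual "push %rbp; mov %rsp,%rbp" prologue so that each frame record holds
the caller's frame pointer followed by the return address. It is not the
profilers' actual code, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk a frame pointer chain in a simulated stack.
 * stack[i] holds the word at address stack_base + 8*i.
 * Each frame record is { saved caller fp, return address }.
 * Stores up to 'max' return addresses in 'out' and returns the count.
 */
size_t fp_backtrace(const uint64_t *stack, size_t nwords, uint64_t stack_base,
                    uint64_t fp, uint64_t *out, size_t max)
{
    size_t n = 0;
    uint64_t stack_end;

    if (nwords < 2)
        return 0;
    stack_end = stack_base + nwords * 8;

    while (n < max) {
        size_t idx;
        uint64_t next_fp;

        /* reject misaligned or out-of-range frame pointers */
        if (fp < stack_base || fp % 8 != 0 || fp > stack_end - 16)
            break;

        idx = (size_t)((fp - stack_base) / 8);
        next_fp = stack[idx];
        out[n++] = stack[idx + 1];

        /* the stack grows down, so caller frames sit at higher addresses */
        if (next_fp <= fp)
            break;
        fp = next_fp;
    }
    return n;
}
```

If the starting value is not a real frame pointer, which is the case
when the code was built without FPs or when pt_regs.bp was never filled
in, the walk either stops immediately or follows garbage.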

The only way to make this work, though, is to always save RBP.
But there is no need for the strange double saving; you
can always just store it directly.

But I personally doubt it's worth making every interrupt and syscall
slower, since it won't work with most apps anyway.
The only sure way to do these backtraces is to do dwarf2 unwinding
in user space too, which doesn't need hacks like this.

-Andi