linux-kernel - Re: [RFC PATCH 2/6] jump label v3 - x86: Introduce generic jump patching without stop

Date:	Fri, 20 Nov 2009 19:06:13 -0500
From:	Masami Hiramatsu <mhiramat@...hat.com>
To:	"H. Peter Anvin" <hpa@...or.com>
CC:	Jason Baron <jbaron@...hat.com>, linux-kernel@...r.kernel.org,
	mingo@...e.hu, mathieu.desnoyers@...ymtl.ca, tglx@...utronix.de,
	rostedt@...dmis.org, andi@...stfloor.org, roland@...hat.com,
	rth@...hat.com
Subject: Re: [RFC PATCH 2/6] jump label v3 - x86: Introduce generic jump patching
 without stop_machine

Hi Peter,

H. Peter Anvin wrote:
> On 11/18/2009 02:43 PM, Jason Baron wrote:
>> Add text_poke_fixup() which takes a fixup address to where a processor
>> jumps if it hits the modifying address while code modifying.
>> text_poke_fixup() does following steps for this purpose.
>>
>>   1. Setup int3 handler for fixup.
>>   2. Put a breakpoint (int3) on the first byte of modifying region,
>>      and synchronize code on all CPUs.
>>   3. Modify other bytes of modifying region, and synchronize code on all CPUs.
>>   4. Modify the first byte of modifying region, and synchronize code
>>      on all CPUs.
>>   5. Clear int3 handler.
>>
>> Thus, if some other processor execute modifying address when step2 to step4,
>> it will be jumped to fixup code.
>>
>> This still has many limitations for modifying multi-instructions at once.
>> However, it is enough for 'a 5 bytes nop replacing with a jump' patching,
>> because;
>>   - Replaced instruction is just one instruction, which is executed atomically.
>>   - Replacing instruction is a jump, so we can set fixup address where the jump
>>     goes to.
>>
>
> I just had a thought about this... regardless of if this is safe or not
> (which still remains to be determined)... I have a bit more of a
> fundamental question about it:
>
> This code ends up taking *two* global IPIs for each instruction
> modification.  Each of those requires whole-system synchronization.

As Mathieu and I discussed, the first IPI is for synchronizing code, and
the second is for waiting until all int3 handling is done.

>  How
> is this better than taking one IPI and having the other CPUs wait until
> the modification is complete before returning?

Do you mean using stop_machine()? :-)

If we didn't care about NMIs, we could use stop_machine(). (For
this reason, kprobe-jump-optimization can use stop_machine(),
because kprobes can't probe NMI code.) But tracepoints have
to support NMIs.

Actually, it might be possible, even though it will be complicated.
If one-byte modification (int3 injection/removal) is always
synchronized, I assume the timechart below can work
(and it can support NMI/SMI too).

----
        <CPU0>                  <CPU1>
flag = 0
setup int3 handler
int3 injection[sync]
other-bytes modifying
smp_call_function(func)    func()
wait_until(flag==1)        irq_disable()
                           sync_core() for other-bytes modifying
                           flag = 1
first-byte modifying[sync] wait_until(flag==2)
flag = 2
wait_until(flag==3)        irq_enable()
                           flag = 3
cleanup int3 handler       return
return
----

I'm not so sure that this flag-based step-by-step code can
run faster than 2 IPIs :-(
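
To make the ordering in the timechart easier to check, here is a rough
userspace sketch of the same flag handshake. It is only a simulation under
several assumptions: two pthreads stand in for CPU0 and CPU1,
pthread_create() stands in for smp_call_function(), a plain byte array
stands in for the patch site, and the int3 handler, irq_disable(),
sync_core() and irq_enable() are reduced to comments. The names
simulate_patch(), cpu1_func() and site[] are made up for this illustration
and are not part of the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define INT3_OPCODE 0xcc   /* one-byte breakpoint */
#define JMP_OPCODE  0xe9   /* jmp rel32 */

/* Simulated 5-byte patch site, initially a 5-byte NOP (hypothetical). */
static uint8_t site[5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };
static atomic_int flag;

/* What "CPU1" saw in the first byte at each stage of the handshake. */
static uint8_t seen_while_patching;
static uint8_t seen_after_patching;

static void wait_until(int value)
{
    while (atomic_load(&flag) != value)
        sched_yield();              /* spin politely in userspace */
}

/* Runs on "CPU1"; stands in for func() in the timechart. */
static void *cpu1_func(void *arg)
{
    (void)arg;
    /* irq_disable() would go here */
    /* sync_core() for other-bytes modifying would go here */
    seen_while_patching = site[0];  /* must still be int3 */
    atomic_store(&flag, 1);
    wait_until(2);
    /* irq_enable() would go here */
    seen_after_patching = site[0];  /* must now be the jump opcode */
    atomic_store(&flag, 3);
    return NULL;
}

/* Runs on "CPU0"; returns 0 if CPU1 observed the expected bytes. */
int simulate_patch(int32_t rel32)
{
    pthread_t cpu1;

    atomic_store(&flag, 0);
    /* setup int3 handler (omitted in this simulation) */
    site[0] = INT3_OPCODE;                    /* int3 injection */
    memcpy(&site[1], &rel32, sizeof(rel32));  /* other-bytes modifying */

    /* stand-in for smp_call_function(func) */
    if (pthread_create(&cpu1, NULL, cpu1_func, NULL) != 0)
        return -1;

    wait_until(1);
    site[0] = JMP_OPCODE;                     /* first-byte modifying */
    atomic_store(&flag, 2);
    wait_until(3);
    /* cleanup int3 handler (omitted in this simulation) */
    pthread_join(cpu1, NULL);

    return (seen_while_patching == INT3_OPCODE &&
            seen_after_patching == JMP_OPCODE) ? 0 : -1;
}
```

Calling simulate_patch() with any rel32 value should return 0 when the
handshake runs in the order shown in the timechart. It says nothing about
real cross-modification safety or about the cost compared with 2 IPIs; it
only shows that the flag steps themselves do not deadlock and that CPU1
sees int3 before flag = 2 and the jump opcode after it.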

Any comments are welcome! :-)

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@...hat.com
