[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20091121162145.GE12100@Krystal>
Date: Sat, 21 Nov 2009 11:21:45 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To: Masami Hiramatsu <mhiramat@...hat.com>
Cc: "H. Peter Anvin" <hpa@...or.com>, Jason Baron <jbaron@...hat.com>,
linux-kernel@...r.kernel.org, mingo@...e.hu, tglx@...utronix.de,
rostedt@...dmis.org, andi@...stfloor.org, roland@...hat.com,
rth@...hat.com
Subject: Re: [RFC PATCH 2/6] jump label v3 - x86: Introduce generic jump
patching without stop_machine
* Masami Hiramatsu (mhiramat@...hat.com) wrote:
> Hi Peter,
>
> H. Peter Anvin wrote:
>> On 11/18/2009 02:43 PM, Jason Baron wrote:
>>> Add text_poke_fixup() which takes a fixup address to where a processor
>>> jumps if it hits the modifying address while code modifying.
>>> text_poke_fixup() does following steps for this purpose.
>>>
>>> 1. Setup int3 handler for fixup.
>>> 2. Put a breakpoint (int3) on the first byte of modifying region,
>>> and synchronize code on all CPUs.
>>> 3. Modify other bytes of modifying region, and synchronize code on all CPUs.
>>> 4. Modify the first byte of modifying region, and synchronize code
>>> on all CPUs.
>>> 5. Clear int3 handler.
>>>
>>> Thus, if some other processor execute modifying address when step2 to step4,
>>> it will be jumped to fixup code.
>>>
>>> This still has many limitations for modifying multi-instructions at once.
>>> However, it is enough for 'a 5 bytes nop replacing with a jump' patching,
>>> because;
>>> - Replaced instruction is just one instruction, which is executed atomically.
>>> - Replacing instruction is a jump, so we can set fixup address where the jump
>>> goes to.
>>>
>>
>> I just had a thought about this... regardless of if this is safe or not
>> (which still remains to be determined)... I have a bit more of a
>> fundamental question about it:
>>
>> This code ends up taking *two* global IPIs for each instruction
>> modification. Each of those requires whole-system synchronization.
>
> As Mathieu and I talked, first IPI is for synchronizing code, and
> second is for waiting for all int3 handling is done.
>
>> How
>> is this better than taking one IPI and having the other CPUs wait until
>> the modification is complete before returning?
>
> Would you mean using stop_machine()? :-)
>
> If we don't care about NMI, we can use stop_machine() (for
> this reason, kprobe-jump-optimization can use stop_machine(),
> because kprobes can't probe NMI code), but tracepoint has
> to support NMI.
>
> Actually, it might be possible, even it will be complicated.
> If one-byte modifying(int3 injection/removing) is always
> synchronized, I assume below timechart can work
> (and it can support NMI/SMI too).
>
> ----
> <CPU0> <CPU1>
> flag = 0
> setup int3 handler
> int3 injection[sync]
> other-bytes modifying
> smp_call_function(func) func()
> wait_until(flag==1) irq_disable()
> sync_core() for other-bytes modifying
> flag = 1
> first-byte modifying[sync] wait_until(flag==2)
Hrm, I don't like this too much. In terms of latency, we can get:
CPU 0: CPU 1
interrupts off
* wait_util(flag == 2)
interrupted
softirq runs...
(we have a drink, network bh
processing, etc etc)
back to standard execution
flag = 2
So, as you see, we increase the interrupt latency on all other CPUs of
the duration of a softirq. This is, I think, an unwanted side-effect.
We should really do performance benchmarks comparing stop_machine() and
the int3-based approach rather than to try to come up with tricky
schemes. It's not a real problem until we prove there is indeed a
performance regression. I suspect that the combined effect of cache-line
bouncing, worker thread overhead and the IPI of stop_machine is probably
comparable to the two IPIs we propose for int3.
Thanks,
Mathieu
> flag = 2
> wait_until(flag==3) irq_enable()
> flag = 3
> cleanup int3 handler return
> return
> ----
>
> I'm not so sure that this flag-based step-by-step code can
> work faster than 2 IPIs :-(
>
> Any comments are welcome! :-)
>
> Thank you,
>
> --
> Masami Hiramatsu
>
> Software Engineer
> Hitachi Computer Products (America), Inc.
> Software Solutions Division
>
> e-mail: mhiramat@...hat.com
>
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists