lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 19 Nov 2009 20:00:12 -0500
From:	Masami Hiramatsu <mhiramat@...hat.com>
To:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
CC:	Jason Baron <jbaron@...hat.com>, mingo@...e.hu,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	linux-kernel@...r.kernel.org, hpa@...or.com, tglx@...utronix.de,
	rostedt@...dmis.org, andi@...stfloor.org, roland@...hat.com,
	rth@...hat.com
Subject: Re: [RFC PATCH 2/6] jump label v3 - x86: Introduce generic jump	patching
 without stop_machine

Mathieu Desnoyers wrote:
[...]
>>>> +	if (unlikely(len<= 1))
>>>> +		return text_poke(addr, opcode, len);
>>>> +
>>>> +	/* Preparing */
>>>> +	patch_fixup_addr = fixup;
>>>> +	wmb();
>>>
>>> hrm, missing comment ?
>>
>> Ah, it's a barrier between patch_fixup_addr and patch_fixup_from.
>> int3 trap handler checks patch_fixup_from first and refers patch_fixup_addr.
>
> When a smp_wmb() is probably enough, and the matching smp_rmb() is
> missing in the int3 handler.

OK, thank you.

> But why to you care about the order of these two ? I agree that an
> unrelated int3 handler (from kprobes ?) could be running concurrently at
> that point, but it clearly cannot be called for this specific address
> until the int3 is written by text_poke.

Ah, it's my fault. I fixed that a month ago, and forgot to push it...
Actually, we don't need to care the order of those two. Instead,
we have to update the patch_fixup_* before int3 embedding.

>
> What I am pretty much certain is missing would be a smp_wmb()...

Agreed.

>
>>
>>>
>>>> +	patch_fixup_from = (u8 *)addr + int3_size; /* IP address after int3 */
>
> ..right here, between where you write to the data used by the int3
> handler and where you write the actual breakpoint. On the read-side,
> this might be a problem with architectures like alpha needing
> smp_read_barrier_depends(), but not for Intel. However, in a spirit to
> make this code solid, what I did in the immed. val. is:
>
>
>                  target_after_int3 = insn + BREAKPOINT_INS_LEN;
>                  /* register_die_notifier has memory barriers */
>                  register_die_notifier(&imv_notify);
>                  /* The breakpoint will single-step the bypass */
>                  text_poke((void *)insn,
>                          ((unsigned char[]){BREAKPOINT_INSTRUCTION}), 1);

Hmm, it strongly depends on arch. Is smp_wmb() right after setting
patch_fixup_from enough on x86?

> And I unregister the die notifier at the end after having reached
> quiescent state. At least we know that the die notifier chain read-side
> has the proper memory barriers, which is not the case for the breakpoint
> instruction itself.
>
>
>>>> +
>>>> +	/* Cap by an int3 */
>>>> +	text_poke(addr,&int3_insn, int3_size);
>>>> +	sync_core_all();
>>>> +
>>>> +	/* Replace tail bytes */
>>>> +	text_poke((char *)addr + int3_size, (const char *)opcode + int3_size,
>>>> +		  len - int3_size);
>>>> +	sync_core_all();
>>>> +
>>>> +	/* Replace int3 with head byte */
>>>> +	text_poke(addr, opcode, int3_size);
>>>> +	sync_core_all();
>>>> +
>>>> +	/* Cleanup */
>>>> +	patch_fixup_from = NULL;
>>>> +	wmb();
>>>
>>> missing comment here too.
>>>
>>>> +	return addr;
>>>
>>> Little quiz question:
>>>
>>> When patch_fixup_from is set to NULL, what ensures that the int3
>>> handlers have completed their execution ?
>>>
>>> I think it's probably OK, because the int3 is an interrupt gate, which
>>> therefore disables interrupts as soon as it runs, and executes the
>>> notifier while irqs are off. When we run sync_core_all() after replacing
>>> the int3 by the new 1st byte, we only return when all other cores have
>>> executed an interrupt, which implies that all int3 handlers previously
>>> running should have ended. Is it right ? It looks to me as if this 3rd
>>> sync_core_all() is only needed because of that. Probably that adding a
>>> comment would be good.
>>
>> Thanks, it's a good point and that's more what I've thought.
>> As you said, it is probably safe. Even if it's not safe,
>> we can add some int3 fixup handler (with lowest priority)
>> which set regs->ip-1 if there is no int3 anymore, for safety.
>
> Well, just ensuring that the we reaches a "disabled IRQ code quiescent
> state" should be enough. Another way would be to use
> synchronize_sched(), but it might take longer. Actively poking the other
> CPUs with IPIs seems quicker. So I would be tempted to leave your code
> as is in this respect, but to add a comment.

Agreed. synchronize_sched() waits too long for this purpose.
OK, I'll add a comment for that "waiting for disabled IRQ code quiescent state" :-)
Thanks for the good advice!

>>> Another thing: I've recently noticed that the following locking seems to
>>> hang the system with doing stress-testing concurrently with cpu
>>> hotplug/hotunplug:
>>>
>>> mutex_lock(&text_mutex);
>>>    on_each_cpu(something, NULL, 1);
>>>
>>> The hang seems to be caused by the fact that alternative.c has:
>>>
>>> within cpu hotplug (cpu hotplug lock held)
>>>    mutex_lock(&text_mutex);
>>>
>>> It might also be caused by the interaction with the stop_machine()
>>> performed within the cpu hotplug lock. I did not find the root cause of
>>> the problem, but this probably calls for lockdep improvements.
>>
>> Hmm, would you mean it will happen even if we use stop_machine()
>> under text_mutex locking?
>> It seems that bigger problem of cpu-hotplug and on_each_cpu() etc.
>
> Yes, but, again.. this calls for more testing. Hopefully it's not
> something else in my own code I haven't seen. For not I can just say
> that I've been noticing hangs involving cpu hotplug and text mutex, and
> taking the cpu hotplug mutex around text mutex (in my immediate values
> code) fixed the problem.

Hmm, I guess that we'd better merge those two mutexes since
text modification always requires disabling cpu-hotplug...

Thank you,


-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@...hat.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ