Message-ID: <4CA0A76B.6000803@redhat.com>
Date: Mon, 27 Sep 2010 16:17:15 +0200
From: Avi Kivity <avi@...hat.com>
To: Joerg Roedel <joro@...tes.org>
CC: x86@...nel.org, linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr
On 09/27/2010 12:31 PM, Joerg Roedel wrote:
> On Sun, Sep 19, 2010 at 06:28:19PM +0200, Avi Kivity wrote:
> > On machines without monitor/mwait we use an sti; hlt sequence to atomically
> > enable interrupts and put the cpu to sleep. The sequence uses the "interrupt
> > shadow" property of the sti instruction: interrupts are enabled only after
> > the instruction following sti has been executed. This means an interrupt
> > cannot happen in the middle of the sequence, which would leave us with
> > the interrupt processed but the cpu halted.
> >
> > The interrupt shadow, however, can be broken by an nmi; the following
> > sequence
> >
> > sti
> > nmi ... iret
> > # interrupt shadow disabled
> > intr ... iret
> > hlt
> >
> > puts the cpu to sleep, even though the interrupt may need additional
> > processing after the hlt (like scheduling a task).
>
> Doesn't the interrupt return path check for a re-schedule condition
> before iret? So to my belief the handler would not jump back to the
> idle task if something else becomes running in the interrupt handler,
> no?
>
Perhaps on preemptible kernels? But at least on non-preemptible
kernels, you can't just switch tasks while running kernel code.

void cpu_idle(void)
{
	current_thread_info()->status |= TS_POLLING;

	/*
	 * If we're the non-boot CPU, nothing set the stack canary up
	 * for us. CPU0 already has it initialized but no harm in
	 * doing it again. This is a good place for updating it, as
	 * we wont ever return from this function (so the invalid
	 * canaries already on the stack wont ever trigger).
	 */
	boot_init_stack_canary();

	/* endless idle loop with no priority at all */
	while (1) {
		tick_nohz_stop_sched_tick(1);
		while (!need_resched()) {

			rmb();

			if (cpu_is_offline(smp_processor_id()))
				play_dead();
			/*
			 * Idle routines should keep interrupts disabled
			 * from here on, until they go to idle.
			 * Otherwise, idle callbacks can misfire.
			 */
			local_irq_disable();
			enter_idle();
			/* Don't trace irqs off for idle */
			stop_critical_timings();
			pm_idle();
			start_critical_timings();

			trace_power_end(smp_processor_id());

			/* In many cases the interrupt that ended idle
			   has already called exit_idle. But some idle
			   loops can be woken up without interrupt. */
			__exit_idle();
		}

		tick_nohz_restart_sched_tick();
		preempt_enable_no_resched();
		schedule();
		preempt_disable();
	}
}

Looks like we rely on an explicit schedule() - pm_idle() is called with
preemption disabled.
(pm_idle eventually calls safe_halt() if no other idle method is used)
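
To make the failure mode concrete, here is a rough user-space toy model of the sequence from the patch description. It is only a sketch: struct toy_cpu and all the toy_* helpers are made-up names for illustration, not kernel code, and the nmi is simply assumed to end the interrupt shadow as described above. The interrupt handler is modeled as doing nothing but setting need_resched.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one cpu; every name here is hypothetical. */
struct toy_cpu {
	bool irqs_enabled;	/* models EFLAGS.IF */
	bool shadow;		/* sti interrupt shadow is active */
	bool intr_pending;	/* a maskable interrupt is waiting */
	bool need_resched;	/* set by the interrupt handler */
	bool halted;		/* cpu is sitting in hlt */
};

/* Deliver the pending interrupt if the cpu would accept it right now. */
static void toy_try_intr(struct toy_cpu *cpu)
{
	if (cpu->intr_pending && cpu->irqs_enabled && !cpu->shadow) {
		cpu->intr_pending = false;
		cpu->need_resched = true;	/* handler woke up a task */
		cpu->halted = false;		/* an interrupt ends hlt */
	}
}

/* sti: enable interrupts, but hold them off for one more instruction. */
static void toy_sti(struct toy_cpu *cpu)
{
	cpu->irqs_enabled = true;
	cpu->shadow = true;
}

/* nmi ... iret: assumed to leave the interrupt shadow disabled. */
static void toy_nmi(struct toy_cpu *cpu)
{
	cpu->shadow = false;
	toy_try_intr(cpu);
}

/* hlt: the shadow ends here; a still-pending interrupt wakes the cpu. */
static void toy_hlt(struct toy_cpu *cpu)
{
	cpu->halted = true;
	cpu->shadow = false;
	toy_try_intr(cpu);
}

/*
 * Run sti; hlt with an interrupt already pending.  Returns true if the
 * cpu ends up halted even though need_resched is set (a lost wakeup).
 */
bool toy_lost_wakeup(bool nmi_in_shadow)
{
	struct toy_cpu cpu = { .intr_pending = true };

	assert(!cpu.irqs_enabled);	/* idle is entered with irqs off */
	toy_sti(&cpu);
	toy_try_intr(&cpu);		/* blocked by the shadow */
	if (nmi_in_shadow)
		toy_nmi(&cpu);		/* shadow broken, intr runs now */
	toy_hlt(&cpu);
	return cpu.halted && cpu.need_resched;
}
```

In this model toy_lost_wakeup(false) returns false, because the interrupt is delivered once the cpu is already in hlt and wakes it. toy_lost_wakeup(true) returns true: the interrupt has already been handled before hlt, so nothing is left to wake the cpu and the need_resched() check in the loop above is never reached.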
--
error compiling committee.c: too many arguments to function