Message-ID: <928CFBE8E7CB0040959E56B4EA41A77E926D46A7@irsmsx504.ger.corp.intel.com>
Date: Mon, 30 Mar 2009 14:55:38 +0100
From: "Metzger, Markus T" <markus.t.metzger@...el.com>
To: Oleg Nesterov <oleg@...hat.com>
CC: "Kleen, Andi" <andi.kleen@...el.com>, Ingo Molnar <mingo@...e.hu>,
Roland McGrath <roland@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Markus Metzger <markus.t.metzger@...glemail.com>
Subject: RE: [rfc] x86, bts: fix crash
>-----Original Message-----
>From: Oleg Nesterov [mailto:oleg@...hat.com]
>Sent: Monday, March 30, 2009 3:29 PM
>To: Metzger, Markus T
>> >The benefit would be that I don't need to hook into do_exit() anymore.
>
>Metzger, I got lost ;) And I didn't sleep today, so most probably I missed
>what you mean...
The way I understood you, I should defer the release of the bts tracer.
I can schedule the work in __ptrace_unlink(), so I don't need the changes
that hook into do_exit().
The work will be done at a later time with interrupts enabled.
I'm looking into schedule_work() right now, since I don't need all the other
features of RCU.
>do you mean the helper below will be called under write_lock_irq(tasklist)?
>In that case,
>
>> >This would rid us of the nasty ->ptraced loop.
>> >I will give it a try.
>> >
>> >
>> >I use something like this to wait for the context switch:
>> >	nvcsw = task->nvcsw + 1;
>> >	nivcsw = task->nivcsw + 1;
>> >	for (;;) {
>> >		if (nvcsw < task->nvcsw)
>> >			break;
>> >		if (nivcsw < task->nivcsw)
>> >			break;
>
>Not exactly right, schedule() increments nvcsw/nivcsw before context_switch().
>But this is fixable.
That's why I added +1. There's still the overflow problem. I now use
	nvcsw = task->nvcsw;
	for (;;) {
		if ((task->nvcsw - nvcsw) > 1)
			break;
		...
		if (!task_is_running(task))
			break;
		schedule();
	}
This should work even for overflowing counters. It waits for two
context switches or, preferably, for the task to no longer be running
on any cpu (see below for task_is_running()).
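
A minimal userspace sketch of why the subtraction form tolerates a wrapping
counter while the earlier "+1 and compare" form does not. It assumes the
counters are unsigned long; the helper names switches_since(),
seen_two_switches() and naive_seen_two_switches() are hypothetical and only
illustrate the arithmetic, they are not part of the patch.

```c
#include <assert.h>	/* only needed for the checks in the usage note */
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: number of increments between a snapshot and the
 * current counter value. Unsigned subtraction is defined modulo
 * ULONG_MAX + 1, so the result stays correct across one wraparound.
 */
static unsigned long switches_since(unsigned long snapshot, unsigned long now)
{
	return now - snapshot;
}

/* Subtraction form: true once at least two increments have been seen. */
static bool seen_two_switches(unsigned long snapshot, unsigned long now)
{
	return switches_since(snapshot, now) > 1;
}

/*
 * Earlier form: threshold = snapshot + 1, then compare with '<'.
 * If the threshold lands on ULONG_MAX, no counter value can exceed it.
 */
static bool naive_seen_two_switches(unsigned long snapshot, unsigned long now)
{
	unsigned long threshold = snapshot + 1;

	return threshold < now;
}
```

With a snapshot of ULONG_MAX - 1 and a current value of 0 (two increments,
one of them wrapping), assert(seen_two_switches(ULONG_MAX - 1, 0)) holds,
whereas naive_seen_two_switches(ULONG_MAX - 1, 0) returns false.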
>However. What if this task spins in TASK_RUNNING waiting for tasklist_lock ?
>This is deadlockable even with CONFIG_PREEMPT, we take tasklist for reading
>in interrupt context.
That code is executed with interrupts enabled and tasklist lock not held.
That's why I added the ptrace_bts_exit_tracer() and ptrace_bts_exit_tracee()
calls - to be able to call ds_release_bts() with interrupts enabled.
>Afaics, we can also deadlock if task_cpu(task) sends IPI to us (with wait = 1),
>the sender spins with preemption disabled.
>
>> >		if (task->state != TASK_RUNNING)
>> >			break;
>> >	}
>> >
>>
>> That is not quite right, as well. There's a race on the task state.
>> In my example, I got TASK_DEAD before the dying task could complete its
>> final schedule(), and the cpu continued tracing.
>
>But we still have the same problems.
>
>If the tracee doesn't call a blocking syscall, its ->state is always RUNNING.
Agreed.
Meanwhile, I added a function task_is_running() to sched.h that checks whether
the given task is currently running on any cpu.
I use that instead of checking ->state.
The function is essentially:
int task_is_running(struct task_struct *p)
{
	struct rq *rq;
	unsigned long flags;
	int running;

	rq = task_rq_lock(p, &flags);
	running = task_running(rq, p);
	task_rq_unlock(rq, &flags);

	return running;
}
thanks and regards,
markus.