Message-ID: <928CFBE8E7CB0040959E56B4EA41A77E926D46A7@irsmsx504.ger.corp.intel.com>
Date:	Mon, 30 Mar 2009 14:55:38 +0100
From:	"Metzger, Markus T" <markus.t.metzger@...el.com>
To:	Oleg Nesterov <oleg@...hat.com>
CC:	"Kleen, Andi" <andi.kleen@...el.com>, Ingo Molnar <mingo@...e.hu>,
	Roland McGrath <roland@...hat.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Markus Metzger <markus.t.metzger@...glemail.com>
Subject: RE: [rfc] x86, bts: fix crash

>-----Original Message-----
>From: Oleg Nesterov [mailto:oleg@...hat.com]
>Sent: Monday, March 30, 2009 3:29 PM
>To: Metzger, Markus T


>> >The benefit would be that I don't need to hook into do_exit() anymore.
>
>Metzger, I got lost ;) And I didn't sleep today, so most probably I missed
>what you mean...

As I understood you, I should defer the release of the bts tracer.
I can schedule that work in __ptrace_unlink(), so I no longer need the
changes that hook into do_exit().
The work will then run at a later time, with interrupts enabled.

I'm looking into schedule_work() right now, since I don't need all the other
features of RCU.


>do you mean the helper below will be called under write_lock_irq(tasklist)?
>In that case,
>
>> >This would rid us of the nasty ->ptraced loop.
>> >I will give it a try.
>> >
>> >
>> >I use something like this to wait for the context switch:
>> >  nvcsw = task->nvcsw + 1;
>> >  nivcsw = task->nivcsw + 1;
>> >  for (;;) {
>> >	if (nvcsw < task->nvcsw)
>> >		break;
>> >	if (nivcsw < task->nivcsw)
>> >		break;
>
>Not exactly right, schedule() increments nvcsw/nivcsw before context_switch().
>But this is fixable.

That's why I added the +1. That still leaves the overflow problem, so I now use:

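nvcsw = task->nvcsw;	/* snapshot of the counter */
for (;;) {
	/* unsigned difference, so a wrapped counter still compares correctly */
	if ((task->nvcsw - nvcsw) > 1)
		break;
	...
	/* preferred exit: task is not running on any cpu */
	if (!task_is_running(task))
		break;
	schedule();
}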

This should work even for overflowing counters. It waits for two
context switches or, preferably, for the task to be currently not running
on any cpu (see below for task_is_running()).
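
To illustrate why the difference-based check survives a counter wrap while
the "+1" variant does not, here is a minimal user-space sketch. It assumes
the counters are unsigned long, as nvcsw is in task_struct. The helper names
delta_check_done() and plus_one_check_done() are made up for this example
and are not part of the patch.

```c
#include <limits.h>

/*
 * Hypothetical helper: the check used in the loop above.
 * Unsigned subtraction is performed modulo ULONG_MAX + 1, so the
 * difference is the number of increments since the snapshot even if
 * the counter wrapped in between.
 */
static int delta_check_done(unsigned long now, unsigned long snapshot)
{
	return (now - snapshot) > 1;
}

/*
 * Hypothetical helper: the earlier "+1" variant.
 * The target value snapshot + 1 can itself wrap, which makes the
 * comparison fire too early or far too late near ULONG_MAX.
 */
static int plus_one_check_done(unsigned long now, unsigned long snapshot)
{
	return snapshot + 1 < now;
}
```

With a snapshot of ULONG_MAX, the "+1" variant reports completion before any
context switch has happened, because the target wraps to 0. The
difference-based check reports completion only once the counter has advanced
by two.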


>However. What if this task spins in TASK_RUNNING waiting for tasklist_lock ?
>This is deadlockable even with CONFIG_PREEMPT, we take tasklist for reading
>in interrupt context.

That code is executed with interrupts enabled and without holding the
tasklist lock. That's why I added the ptrace_bts_exit_tracer() and
ptrace_bts_exit_tracee() calls: to be able to call ds_release_bts() with
interrupts enabled.


>Afaics, we can also deadlock if task_cpu(task) sends IPI to us (with wait = 1),
>the sender spins with preemption disabled.
>
>> >	if (task->state != TASK_RUNNING)
>> >		break;
>> >  }
>> >
>>
>> That is not quite right, as well. There's a race on the task state.
>> In my example, I got TASK_DEAD before the dying task could complete its
>> final schedule(), and the cpu continued tracing.
>
>But we still have the same problems.
>
>If the tracee doesn't call a blocking syscall, its ->state is always RUNNING.

Agreed.

Meanwhile, I added a function task_is_running() to sched.h that checks
whether the given task is currently running on any cpu.
I use that instead of checking ->state.

The function is essentially:
int task_is_running(struct task_struct *p)
{
	struct rq *rq;
	unsigned long flags;
	int running;

	rq = task_rq_lock(p, &flags);
	running = task_running(rq, p);
	task_rq_unlock(rq, &flags);

	return running;
}


thanks and regards,
markus.
