linux-kernel - RE: [rfc] x86, bts: fix crash


Message-ID: <928CFBE8E7CB0040959E56B4EA41A77E9266BB48@irsmsx504.ger.corp.intel.com>
Date:	Mon, 30 Mar 2009 12:29:32 +0100
From:	"Metzger, Markus T" <markus.t.metzger@...el.com>
To:	Oleg Nesterov <oleg@...hat.com>
CC:	"Kleen, Andi" <andi.kleen@...el.com>, Ingo Molnar <mingo@...e.hu>,
	Roland McGrath <roland@...hat.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Markus Metzger <markus.t.metzger@...glemail.com>
Subject: RE: [rfc] x86, bts: fix crash

>-----Original Message-----
>From: Metzger, Markus T
>Sent: Monday, March 30, 2009 9:24 AM
>To: Oleg Nesterov; Markus Metzger
>Cc: Kleen, Andi; Ingo Molnar; Roland McGrath; linux-kernel@...r.kernel.org


>But that's exactly the problem if the bts_tracer is released by another
>task (!= current).
>We first need to disable tracing by clearing bits in debugctlmsr.
>Then, we can set ->bts to NULL and clear TIF_DS_AREA_MSR.
>
>If the traced task is currently in a context switch, clearing debugctlmsr bits
>might come too late. We need to wait for the next context switch until we can
>clear TIF_DS_AREA_MSR.
>
>
>The benefit would be that I don't need to hook into do_exit() anymore.
>This would rid us of the nasty ->ptraced loop.
>I will give it a try.
>
>
>I use something like this to wait for the context switch:
>  nvcsw = task->nvcsw + 1;
>  nivcsw = task->nivcsw + 1;
>  for (;;) {
>	if (nvcsw < task->nvcsw)
>		break;
>	if (nivcsw < task->nivcsw)
>		break;
>	if (task->state != TASK_RUNNING)
>		break;
>  }
>
>I would need to check for overflows, but for my examples, I don't expect any.


That is not quite right either. There's a race on the task state.
In my example, I saw TASK_DEAD before the dying task had completed its
final schedule(), and the cpu continued tracing.

When I also use the task_running() test that wait_task_inactive() uses,
it seems to work. Together with checking the number of context switches,
it should be reasonably fast, too (I don't run the risk of being scheduled
perfectly synchronously with the task I'm waiting for).

I would need to add an extern function task_is_running(struct task_struct *)
to sched.h, though. I hope that's acceptable.
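
A rough userspace sketch of the combined check described above, not actual
kernel code. struct mock_task and its on_cpu flag are hypothetical stand-ins
for task_struct and for whatever task_running() reports; task_is_running()
is the proposed helper, modeled here as a plain flag read. The counter test
uses a wrap-safe comparison to cover the overflow case, and the thresholds
are taken as count + 1, as in the quoted loop. The task state is deliberately
not consulted, because of the TASK_DEAD race.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few task_struct fields used here. */
struct mock_task {
	unsigned long nvcsw;	/* voluntary context switches */
	unsigned long nivcsw;	/* involuntary context switches */
	bool on_cpu;		/* models the result of task_running() */
};

/* Models the proposed helper; an assumption, not the kernel's code. */
static bool task_is_running(const struct mock_task *task)
{
	return task->on_cpu;
}

/* True if the counter has moved past the threshold, even across a wrap. */
static bool counter_passed(unsigned long now, unsigned long threshold)
{
	return (long)(now - threshold) > 0;
}

/*
 * One poll step of the wait loop. The thresholds are expected to be
 * task->nvcsw + 1 and task->nivcsw + 1, sampled before polling starts.
 */
static bool task_switched_out(const struct mock_task *task,
			      unsigned long nvcsw_threshold,
			      unsigned long nivcsw_threshold)
{
	if (counter_passed(task->nvcsw, nvcsw_threshold))
		return true;
	if (counter_passed(task->nivcsw, nivcsw_threshold))
		return true;
	/* No task->state test: TASK_DEAD can be seen too early. */
	return !task_is_running(task);
}
```

The caller would sample both thresholds once and then poll
task_switched_out() until it returns true.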


regards,
markus.
