Message-ID: <CAHA+R7OVn+GFoY=_FnWNbKi0i1ubOm8f=ymCjweNrSu=2SDVCQ@mail.gmail.com>
Date:	Tue, 2 Sep 2014 15:15:10 -0700
From:	Cong Wang <cwang@...pensource.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Cong Wang <xiyou.wangcong@...il.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	stable <stable@...r.kernel.org>,
	Paul Mackerras <paulus@...ba.org>,
	Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>
Subject: Re: [Patch] perf_event: fix a race condition in perf_remove_from_context()

On Mon, Sep 1, 2014 at 1:38 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Thu, Aug 28, 2014 at 04:27:35PM -0700, Cong Wang wrote:
>> From: Cong Wang <cwang@...pensource.com>
>>
>> We saw a kernel soft lockup in perf_remove_from_context(): the `perf`
>> process, when exiting, could not get out of the retry loop. Meanwhile,
>> the target process was forking a child. So either the target process
>> should execute the smp function call to deactivate the event (if it was
>> running), or it should do a context switch, which deactivates the event.
>>
>> It seems we optimize out a context switch in perf_event_context_sched_out(),
>> and, more importantly, we still test an obsolete task pointer when
>> retrying, so no one would actually deactivate that event in this situation.
>> Fix it directly by reloading the task pointer in perf_remove_from_context().
>> This should fix the above soft lockup.
>
>> ---
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index f9c1ed0..c4141a0 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -1524,6 +1524,11 @@ retry:
>
> Please use either:
>
> .gitconfig:
>
> [diff "default"]
>         xfuncname = "^[[:alpha:]$_].*[^:]$"
>
> .quiltrc:
>
> QUILT_DIFF_OPTS="-F ^[[:alpha:]\$_].*[^:]\$"
>

OK, I didn't know this before.


>>        */
>>       if (ctx->is_active) {
>>               raw_spin_unlock_irq(&ctx->lock);
>> +             /*
>> +              * Reload the task pointer, it might have been changed by
>> +              * a concurrent perf_event_context_sched_out() without switching
>> +              */
>> +             task = ctx->task;
>>               goto retry;
>>       }
>
> You forgot to check if that same error happened in other places (it
> does), please fix all of them.

I think you mean perf_install_in_context()? So far I have only seen the soft
lockup in perf_remove_from_context(), but I can fix the other places if you want.
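
For readers following the thread, the stale-pointer retry problem can be
sketched in user space. This is only an illustration under stated assumptions,
not the kernel code: struct task, struct ctx, try_deactivate() and
remove_event() are hypothetical names, and the concurrent
perf_event_context_sched_out() is simulated by a plain assignment rather than
by a second CPU.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel structures discussed above. */
struct task { int id; };
struct ctx {
	struct task *task;	/* current owner of the context */
	bool is_active;		/* event still scheduled in */
};

/*
 * Hypothetical deactivation step: it only succeeds when it is aimed at
 * the task that currently owns the context.
 */
static bool try_deactivate(struct ctx *ctx, struct task *task)
{
	if (ctx->task != task)
		return false;
	ctx->is_active = false;
	return true;
}

/*
 * Returns the number of attempts needed to deactivate the event, or -1 if
 * max_tries attempts were not enough (the bounded analogue of the lockup).
 */
static int remove_event(struct ctx *ctx, struct task *new_owner,
			bool reload, int max_tries)
{
	struct task *task = ctx->task;	/* snapshot taken before the loop */
	int tries;

	/* Simulated concurrent ownership change after the snapshot. */
	ctx->task = new_owner;

	for (tries = 1; tries <= max_tries; tries++) {
		if (try_deactivate(ctx, task))
			return tries;
		if (reload)
			task = ctx->task;	/* the fix: refresh before retrying */
	}
	return -1;
}
```

Without the reload, every retry targets the old owner and the loop never makes
progress; with the reload, the second attempt targets the new owner and
succeeds.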

Thanks!