Date:	Thu, 13 Nov 2008 22:37:44 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Jiri Kosina <jkosina@...e.cz>
Cc:	Andi Kleen <andi@...stfloor.org>,
	Robert Richter <robert.richter@....com>,
	oprofile-list@...ts.sf.net, Jiri Benc <jbenc@...e.cz>,
	Vilem Marsik <vmarsik@...e.cz>,
	Eric Dumazet <dada1@...mosbay.com>,
	Pekka Enberg <penberg@...helsinki.fi>,
	linux-kernel@...r.kernel.org
Subject: Re: Oprofile [still] doesn't work on 2.6.28-rc4 on certain CPU


* Jiri Kosina <jkosina@...e.cz> wrote:

> On Thu, 13 Nov 2008, Ingo Molnar wrote:
> 
> > > I haven't yet found the time to start bisecting this.
> > Would be nice to identify a commit to revert - in case we run out of 
> > time fixing it.
> 
> Yup, I first wanted to make this known to the public in hope that it 
> will ring a bell somewhere.
> 
> If no one sees an obvious reason for this, I will do my best to bisect 
> this tomorrow.

We've got the one patch below pending, but that's not for AMD CPUs so 
it shouldn't impact your case.

But ... some change made it all much more fragile. I'm curious why 
things became more fragile.

	Ingo

--------------->
Subject: oprofile: un-mask APIC before resetting counter in ppro_check_ctrs()
From: Eric Dumazet <dada1@...mosbay.com>
Date: Tue, 11 Nov 2008 09:32:12 +0100

While using oprofile on my HP BL460c G1 (two quad-core Intel E5450 CPUs),
I noticed that one CPU after the other stopped receiving NMIs.

After a while, all cores were blocked (i.e. not generating events for oprofile).
I tried all major Linux versions and all were affected by this freeze.

I found that we have to un-mask the APIC LVTPC entry *before* writing to the
counter MSR when we get an event notification, because we use APIC_LVTPC in
edge-triggered mode.

Signed-off-by: Eric Dumazet <dada1@...mosbay.com>
Signed-off-by: Ingo Molnar <mingo@...e.hu>
---
 arch/x86/oprofile/op_model_ppro.c |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Index: tip/arch/x86/oprofile/op_model_ppro.c
===================================================================
--- tip.orig/arch/x86/oprofile/op_model_ppro.c
+++ tip/arch/x86/oprofile/op_model_ppro.c
@@ -126,6 +126,12 @@ static int ppro_check_ctrs(struct pt_reg
 	u64 val;
 	int i;
 
+	/*
+	 * We need to unmask the apic vector *before* writing reset_value
+	 * to msr counter, because we use edge trigger
+	 */
+	apic_write(APIC_LVTPC, apic_read(APIC_LVTPC) & ~APIC_LVT_MASKED);
+
 	for (i = 0 ; i < num_counters; ++i) {
 		if (!reset_value[i])
 			continue;
@@ -136,10 +142,6 @@ static int ppro_check_ctrs(struct pt_reg
 		}
 	}
 
-	/* Only P6 based Pentium M need to re-unmask the apic vector but it
-	 * doesn't hurt other P6 variant */
-	apic_write(APIC_LVTPC, apic_read(APIC_LVTPC) & ~APIC_LVT_MASKED);
-
 	/* We can't work out if we really handled an interrupt. We
 	 * might have caught a *second* counter just after overflowing
 	 * the interrupt for this counter then arrives
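
The ordering problem described in the changelog can be illustrated with a
small user-space toy model. This is only a sketch, not kernel code: the
names toy_lvt, toy_overflow() and toy_scenario() are hypothetical, and the
model assumes that delivering the interrupt leaves the LVT entry masked and
that an overflow edge arriving while the entry is masked is simply dropped.

```c
#include <stdbool.h>

/* Toy model of one edge-triggered LVT entry (hypothetical, not the kernel's). */
struct toy_lvt {
	bool masked;	/* stands in for the APIC_LVT_MASKED bit */
	int pending;	/* edges accepted and waiting for the handler */
	int lost;	/* edges that arrived while masked and were dropped */
};

/* A counter overflow raises a single edge. */
static void toy_overflow(struct toy_lvt *lvt)
{
	if (lvt->masked) {
		lvt->lost++;		/* assumption: a masked edge is gone for good */
		return;
	}
	lvt->pending++;
	lvt->masked = true;		/* assumption: delivery masks the entry */
}

/*
 * One initial overflow, then the handler runs. The first time the handler
 * re-arms the counter, the counter overflows again immediately. Returns how
 * many times the handler ran.
 */
static int toy_scenario(bool unmask_first)
{
	struct toy_lvt lvt = { .masked = false, .pending = 0, .lost = 0 };
	bool rearm_overflows = true;
	int handled = 0;

	toy_overflow(&lvt);
	while (lvt.pending > 0) {
		lvt.pending--;
		handled++;
		if (unmask_first)
			lvt.masked = false;	/* order after the patch */
		if (rearm_overflows) {
			rearm_overflows = false;
			toy_overflow(&lvt);	/* overflow right after re-arming */
		}
		if (!unmask_first)
			lvt.masked = false;	/* order before the patch */
	}
	return handled;
}
```

In this model toy_scenario(true) handles both overflows, while
toy_scenario(false) handles only the first one and drops the second. On real
hardware a dropped edge would presumably leave the counter past its overflow
point with no further notification, which would match the freeze described
above.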