lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 31 Jul 2008 13:23:36 +0200
From:	Sebastien Dugue <sebastien.dugue@...l.net>
To:	"John Kacur" <jkacur@...il.com>
Cc:	"Chirag Jog" <chirag@...ux.vnet.ibm.com>,
	"J?rgen Mell" <j.mell@...nline.de>,
	"Thomas Gleixner" <tglx@...utronix.de>,
	LKML <linux-kernel@...r.kernel.org>,
	rt-users <linux-rt-users@...r.kernel.org>,
	"Steven Rostedt" <rostedt@...dmis.org>,
	"Clark Williams" <williams@...hat.com>,
	"Peter Zijlstra" <peterz@...radead.org>,
	"Josh Triplett" <josht@...ux.vnet.ibm.com>,
	"Timothy R. Chavez" <tim.chavez@...ux.vnet.ibm.com>
Subject: Re: [PATCH] Fix Bug messages

  Hi John,

On Thu, 31 Jul 2008 12:13:24 +0200 "John Kacur" <jkacur@...il.com> wrote:

> On Thu, Jul 31, 2008 at 10:00 AM, Sebastien Dugue
> <sebastien.dugue@...l.net> wrote:
> > On Wed, 30 Jul 2008 22:48:42 +0530 Chirag Jog <chirag@...ux.vnet.ibm.com> wrote:
> >
> >> * J?rgen Mell <j.mell@...nline.de> [2008-07-30 11:01:32]:
> >>
> >> > Hello Thomas,
> >> >
> >> > On Wednesday, 30. July 2008, Thomas Gleixner wrote:
> >> > > We are pleased to announce the 2.6.26-rt1 tree, which can be
> >> > > downloaded from the location:
> >> >
> >> > I have tried the new kernel and have some good news and some bad news:
> >> >
> >> > The good news: The machine boots and seems to run without major problems.
> >> >
> >> > The bad news: It produces continuously lots of bug messages in the error
> >> > logs (cf. attached dmesg.tgz). The error at rtmutex.c:743 was already
> >> > present in 2.6.25-rt* when ACPI was enabled. The 'using smp_processor_id
> >> > ()  in preemptible code' is new here with 2.6.26.
> >> >
> >> > Machine is an old Athlon XP (single core) on an EPOX mainboard with VIA
> >> > chipset.
> >> >
> >> > If I can help with testing, please let me know.
> >> >
> >> > Bye,
> >> >          Jürgen
> >> >
> >> >
> >> This patch should solve some of the bug messages.
> >> It does two things:
> >> 1. Change rt_runtime_lock to be a raw spinlock as the comment above it
> >> says: it is nested inside the rq lock.
> >>
> >> 2. Change mnt_writers to be a per_cpu locked variable.
> >> This eliminates the need for the codepath to disable preemption and
> >> then potentially sleep, leading to the BUG messages
> >>
> >> Signed-Off-By: Chirag <chirag@...ux.vnet.ibm.com>
> >
> >  Neat, the only remaining BUGs I see are from sock_prot_inuse_add()
> >
> > BUG: using smp_processor_id() in preemptible [00000000] code: arping/1916
> > caller is .sock_prot_inuse_add+0x30/0x80
> > Call Trace:
> > [c0000000eed2f910] [c000000000010304] .show_stack+0x70/0x1bc (unreliable)
> > [c0000000eed2f9c0] [c0000000001a2340] .debug_smp_processor_id+0x138/0x168
> > [c0000000eed2fa70] [c0000000002181f4] .sock_prot_inuse_add+0x30/0x80
> > [c0000000eed2fb10] [c00000000026d96c] .udp_lib_get_port+0x2a8/0x320
> > [c0000000eed2fbc0] [c000000000275b30] .inet_bind+0x168/0x248
> > [c0000000eed2fc60] [c000000000215024] .sys_bind+0x98/0xdc
> > [c0000000eed2fd90] [c0000000002370bc] .compat_sys_socketcall+0xcc/0x214
> > [c0000000eed2fe30] [c0000000000086ac] syscall_exit+0x0/0x40
> > BUG: arping:1916 task might have lost a preemption check!
> > Call Trace:
> > [c0000000eed2f890] [c000000000010304] .show_stack+0x70/0x1bc (unreliable)
> > [c0000000eed2f940] [c00000000004e298] .preempt_enable_no_resched+0x60/0x78
> > [c0000000eed2f9c0] [c0000000001a2348] .debug_smp_processor_id+0x140/0x168
> > [c0000000eed2fa70] [c0000000002181f4] .sock_prot_inuse_add+0x30/0x80
> > [c0000000eed2fb10] [c00000000026d96c] .udp_lib_get_port+0x2a8/0x320
> > [c0000000eed2fbc0] [c000000000275b30] .inet_bind+0x168/0x248
> > [c0000000eed2fc60] [c000000000215024] .sys_bind+0x98/0xdc
> > [c0000000eed2fd90] [c0000000002370bc] .compat_sys_socketcall+0xcc/0x214
> > [c0000000eed2fe30] [c0000000000086ac] syscall_exit+0x0/0x40
> >
> 
> Does this simple fix do the trick for you?
>
>Signed-off-by: John Kacur <jkacur@...il.com>
>
>Index: linux-2.6.26-rt1/net/core/sock.c
>===================================================================
>--- linux-2.6.26-rt1.orig/net/core/sock.c
>+++ linux-2.6.26-rt1/net/core/sock.c
>@@ -1943,7 +1943,7 @@ static DECLARE_BITMAP(proto_inuse_idx, P
> #ifdef CONFIG_NET_NS
> void sock_prot_inuse_add(struct net *net, struct proto *prot, int val)
> {
>-	int cpu = smp_processor_id();
>+	int cpu = raw_smp_processor_id();
> 	per_cpu_ptr(net->core.inuse, cpu)->val[prot->inuse_idx] += val;
> }
> EXPORT_SYMBOL_GPL(sock_prot_inuse_add);

  Nope, still the same BUGs, I do not have the net namespaces configured, so the
version of sock_prot_inuse_add() which is used is defined a few lines below:

static DEFINE_PER_CPU(struct prot_inuse, prot_inuse);

void sock_prot_inuse_add(struct net *net, struct proto *prot, int val)
{
	__get_cpu_var(prot_inuse).val[prot->inuse_idx] += val;
}
EXPORT_SYMBOL_GPL(sock_prot_inuse_add);

  Looks like another case of percpu variables Chirag has benn fixing.

  Sebastien.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ