lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070508105528.GA86@tv-sign.ru>
Date:	Tue, 8 May 2007 14:55:28 +0400
From:	Oleg Nesterov <oleg@...sign.ru>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Jiri Slaby <jirislaby@...il.com>,
	"Rafael J. Wysocki" <rjw@...k.pl>, Pavel Machek <pavel@....cz>,
	linux-pm@...ts.linux-foundation.org,
	Linux kernel mailing list <linux-kernel@...r.kernel.org>
Subject: Re: 2.6.21-mm1 hwsusp: BUG at workqueue.c:106

On 05/08, Andrew Morton wrote:
>
> On Tue, 08 May 2007 10:57:35 +0200 Jiri Slaby <jirislaby@...il.com> wrote:
> 
> > 
> > this occured in dmesg during resuming from hwsusp in 2.6.21-mm1 (captured
> > through netconsole). Perfectly reproducible, it simply happens each time I
> > try it.
> 
> Let's cc Oleg.
> 
> > usb_endpoint usbdev5.1_ep00: PM: resume from 0, parent usb5 still 2
> > ------------[ cut here ]------------
> > kernel BUG at /home/l/latest/xxx/kernel/workqueue.c:106!
> > invalid opcode: 0000 [#1]
> > SMP
> > Modules linked in: ipv6 floppy ohci1394 ieee1394 parport_pc parport usbhid
> > ehci_hcd pata_acpi ff_memless sr_mod cdrom
> > CPU:    1
> > EIP:    0060:[<c0132161>]    Not tainted VLI
> > EFLAGS: 00010046   (2.6.21-mm1 #272)
> > EIP is at insert_work+0x6d/0x71
> > eax: c1c3b3c0   ebx: c1814aa0   ecx: 00000001   edx: c1814aa0
> > esi: c1c3b340   edi: 00000282   ebp: c04d2f68   esp: c04d2f50
> > ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068
> > Process swapper (pid: 0, ti=c04d2000 task=c1c26030 task.ti=c1c20000)
> > Stack: c4685f54 c18148ac c04d2f98 c013816f c1c3b340 c1814aa0 c04d2f88 c013256c
> >        00000066 c1814880 c1c5e000 c04d2fc4 c01325a1
> > c1c5e000 00000100 c012ba35 00000000 c04d2fb8 c01333f4 Call Trace:
> >  [<c0104f27>]  [<c0104fe2>] show_stack_log_lvl+0xa5/0xca
> > show_registers+0x1e2/0x2da
> >  [<c01053f5>]  [<c010559a>] do_trap+0x84/0xaa
> > do_invalid_op+0x88/0x92
> >  [<c0378662>]  [<c013256c>] __queue_work+0x22/0x33
> > delayed_work_timer_fn+0x24/0x2a
> >  [<c012ba35>]  [<c01288a9>] __do_softirq+0x75/0xe6
> > do_softirq+0x63/0xac
> >  [<c0128713>]  [<c0116d7e>] smp_apic_timer_interrupt+0x5c/0x88
> > apic_timer_interrupt+0x28/0x30

queue_delayed_work().

Probably, cancel_delayed_work(&delayed_work->work) was called with the ->timer
pending. This is wrong, cancel_delayed_work() clears _PENDING unconditionally,
that is why the comment says

	it is expected that, prior to calling cancel_work_sync(), the caller has
	arranged for the work to not be requeued.

(Just in case, after make-cancel_rearming_delayed_work-reliable.patch this is still
 wrong (as documented) to do cancel_delayed_work() before cancel_delayed_timer(),
 but should work correctly).

ata_port_flush_task() and ata_port_detach() do this, I sent the patch to fix this
twice. The last one is

	[PATCH -mm] libata-core: convert to use cancel_rearming_delayed_work()
	http://marc.info/?l=linux-kernel&m=117840349108505
	

Jiri, any chance you can re-test with the patch below?

--- OLD/kernel/workqueue.c~	2007-05-06 00:01:06.000000000 +0400
+++ OLD/kernel/workqueue.c	2007-05-08 14:50:39.000000000 +0400
@@ -103,7 +103,10 @@ static inline void set_wq_data(struct wo
 {
 	unsigned long new;
 
-	BUG_ON(!work_pending(work));
+	if (!work_pending(work)) {
+		printk(KERN_ERR "BUG: set_wq_data ");
+		print_symbol("%s\n", (unsigned long) work->func);
+	}
 
 	new = (unsigned long) cwq | (1UL << WORK_STRUCT_PENDING);
 	new |= WORK_STRUCT_FLAG_MASK & *work_data_bits(work);

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ