lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20080919081733.GC19491@elte.hu>
Date:	Fri, 19 Sep 2008 10:17:33 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Steven Noonan <steven@...inklabs.net>
Cc:	"Luis R. Rodriguez" <lrodriguez@...eros.com>,
	Luis Rodriguez <Luis.Rodriguez@...eros.com>,
	"ath9k-devel@...ts.ath9k.org" <ath9k-devel@...ts.ath9k.org>,
	linux-wireless <linux-wireless@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [ath9k-devel] ath9k: massive unexplained latency in 2.6.27
	(rc5, rc6, probably others)


* Steven Noonan <steven@...inklabs.net> wrote:

> On Thu, Sep 18, 2008 at 8:01 PM, Luis R. Rodriguez
> <lrodriguez@...eros.com> wrote:
> > Thanks for testing, and glad to see this is resolved.
> >
> 
> Damn. It's back. I was using wireless normally this evening. Browsing
> the web, that kind of thing, and then the wireless inexplicably
> dropped (even with the group rekeying patch), so I unloaded/reloaded
> the module. This popped up in dmesg:
> 
> [ 3834.375658] vendor=8086 device=27d2
> [ 3834.375666] ath9k 0000:03:00.0: PCI INT A disabled
> [ 3834.375716] ath9k: driver unloaded
> [ 3838.552419] ath9k: 0.1
> [ 3838.552502] vendor=8086 device=27d2
> [ 3838.552511] ath9k 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> [ 3838.552532] ath9k 0000:03:00.0: setting latency timer to 64
> [ 3838.688924] phy1: Selected rate control algorithm 'ath9k_rate_control'
> [ 3838.693652] phy1: Atheros 5416: mem=0xffffc20000060000, irq=17
> [ 3839.427125] irq 17: nobody cared (try booting with the "irqpoll" option)
> [ 3839.427136] Pid: 0, comm: swapper Tainted: P
> 2.6.27-rc6-tip-00478-g74f1a36 #1
> [ 3839.427141] Call Trace:
> [ 3839.427145]  <IRQ>  [<ffffffff802219c5>] ? read_hpet+0x9/0x1c
> [ 3839.427165]  [<ffffffff8026af71>] __report_bad_irq+0x3d/0x8c
> [ 3839.427172]  [<ffffffff8026b0c6>] note_interrupt+0x106/0x160
> [ 3839.427180]  [<ffffffff8026b801>] handle_fasteoi_irq+0xad/0xda
> [ 3839.427188]  [<ffffffff8020f7b0>] do_IRQ+0x10c/0x190
> [ 3839.427194]  [<ffffffff8020c551>] ret_from_intr+0x0/0xa
> [ 3839.427198]  <EOI>  [<ffffffff8026c6f0>] rcu_pending+0x62/0x6e
> [ 3839.427211]  [<ffffffff8025bc11>] ? tick_nohz_stop_sched_tick+0x2e4/0x2f3
> [ 3839.427218]  [<ffffffff8020ad94>] cpu_idle+0x7b/0xdb
> [ 3839.427226]  [<ffffffff8060c921>] rest_init+0x75/0x77
> [ 3839.427231] handlers:
> [ 3839.427234] [<ffffffffa0240238>] (ath_isr+0x0/0x170 [ath9k])
> [ 3839.427263] Disabling IRQ #17
> [ 3842.263699] ADDRCONF(NETDEV_UP): wlan0: link is not ready
> [ 3848.035003] ADDRCONF(NETDEV_UP): wlan0: link is not ready
> [ 3848.432701] ADDRCONF(NETDEV_UP): wlan0: link is not ready
> [ 3850.216947] wlan0: authenticate with AP 00:1e:52:79:4d:01
> [ 3850.217027] wlan0: authenticate with AP 00:1e:52:79:4d:01
> [ 3850.228326] wlan0: authenticated
> [ 3850.228336] wlan0: associate with AP 00:1e:52:79:4d:01
> [ 3850.428140] wlan0: associate with AP 00:1e:52:79:4d:01
> [ 3850.628151] wlan0: associate with AP 00:1e:52:79:4d:01
> [ 3850.728305] wlan0: RX AssocResp from 00:1e:52:79:4d:01 (capab=0x431
> status=0 aid=1)
> [ 3850.728314] wlan0: associated
> [ 3850.728655] wlan0 (WE) : Wireless Event too big (320)
> [ 3850.743377] ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
> [ 3860.855104] wlan0: no IPv6 routers present
> 
> I rebuilt the module with DBG_ATH_INTERRUPT, but it somehow stumbled 
> itself back into working order while I was compiling. I can't keep the 
> interrupt debugging on all the time because it's just -too verbose-, 
> and when I pop a debug version of the module in, then it's too late to 
> track the issue....

in tip/master there's an ftrace_printk() facility. You can just replace 
the verbose printk()s with ftrace_printk() and see the result in 
/debug/tracing/trace.

This sort of tracing has very low overhead and can be used in an easy 
ad-hoc manner with no extra infrastructure. Here's a few quick-start 
links about how to enable the scheduler tracer:

 http://redhat.com/~mingo/sched-devel.git/readme-tracer.txt
 http://redhat.com/~mingo/sched-devel.git/howto-trace-latencies.txt

any ftrace_printk() you add in the kernel will show up in that trace. 

(If the scheduling events are uninteresting and clutter the output then 
you might want to remove the scheduler tracing entries from kernel/*.c 
by removing the trace_sched_*() calls or use a less noisy tracer.)

to get enough of a trace history you might want to increase the number 
of trace entries in /debug/tracing/trace_entries from 16K to 128K or so.

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ