Date:	Tue, 29 Nov 2011 11:19:00 +0200
From:	Emmanuel Grumbach <egrumbach@...il.com>
To:	Johannes Berg <johannes@...solutions.net>
Cc:	Norbert Preining <preining@...ic.at>,
	"Guy, Wey-Yi" <wey-yi.w.guy@...el.com>,
	Pekka Enberg <penberg@...helsinki.fi>,
	"linux-wireless@...r.kernel.org" <linux-wireless@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Dave Jones <davej@...hat.com>,
	David Rientjes <rientjes@...gle.com>
Subject: Re: iwlagn is getting very shaky

On Tue, Nov 29, 2011 at 10:31, Johannes Berg <johannes@...solutions.net> wrote:
> I noticed that the logs are a bit odd wrt. timing.
>
>> > Interspersed I see some other messages that are new to me:
>> > [ 4019.443129] Open BA session requested for 00:0a:79:eb:56:10 tid 0
>> > [ 4019.500149] activated addBA response timer on tid 0
>> > [ 4020.500033] addBA response timer expired on tid 0
>
> I guess the delay here is due to the synchronize_net()? That can take a
> while, 57ms seems a lot but I suppose it's possible.
>
>> > [ 4020.501626] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
>> > [ 4023.740570] switched off addBA timer for tid 0
>> > [ 4023.740578] got addBA resp for tid 0 but we already gave up
>>
>> Here the AP is finally replying
>
> It's kinda hard to believe that the AP took 4 seconds (!) to reply to
> the frame. Where could the frame get stuck? I don't see any other work
> processing happening etc. either. It's also curious that in those 3
> seconds between these messages, we didn't actually get around to
> stopping the session, that only happens just after:

Yeah you are right, didn't look at the timestamps. Not sure you would
see work being processed though.

>
>> > [ 4023.740619] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
>
> (here)
>
>> > [ 4023.768544] Open BA session requested for 00:0a:79:eb:56:10 tid 0
>>
>> Here we are trying again
>>
>> > [ 4023.784292] activated addBA response timer on tid 0
>> > [ 4023.786294] switched off addBA timer for tid 0
>
> 20ms response time here, that's much more reasonable.
>
>
> Could something be hogging the workqueues?
>

Frankly, I am seeing issues that seem to point to workqueues too.
Sometimes mac80211 just seems unresponsive.
Sometimes I come back to mac80211 for the AGG callback (start or
stop), and it takes ages (5 seconds!) until it actually moves to the
operational / stopped state.

It might be that we are holding the mac80211 workqueue in the driver too...
I guess we could try to enable the MAC80211 debug flag with timestamps to check.
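
For what it's worth, the gaps discussed in this thread can be checked mechanically rather than by eye. Below is a minimal sketch, not anything from the driver or mac80211: the helper names (parse_log, gaps_us) and the 100 ms threshold are my own assumptions, and the log lines are simply the ones quoted above. It converts each printk timestamp to integer microseconds and reports the delay before each message.

```python
import re

# Log lines copied from the quoted messages in this thread.
LOG = """\
[ 4019.443129] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 4019.500149] activated addBA response timer on tid 0
[ 4020.500033] addBA response timer expired on tid 0
[ 4020.501626] Tx BA session stop requested for 00:0a:79:eb:56:10 tid 0
[ 4023.740570] switched off addBA timer for tid 0
[ 4023.740578] got addBA resp for tid 0 but we already gave up
[ 4023.740619] Stopping Tx BA session for 00:0a:79:eb:56:10 tid 0
[ 4023.768544] Open BA session requested for 00:0a:79:eb:56:10 tid 0
[ 4023.784292] activated addBA response timer on tid 0
[ 4023.786294] switched off addBA timer for tid 0
"""

# Assumes the usual printk format: "[ seconds.microseconds] message".
LINE_RE = re.compile(r"^\[\s*(\d+)\.(\d{6})\]\s+(.*)$")


def parse_log(text):
    """Return a list of (timestamp_in_microseconds, message) tuples."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            usec = int(m.group(1)) * 1_000_000 + int(m.group(2))
            entries.append((usec, m.group(3)))
    return entries


def gaps_us(entries):
    """Return (delay_in_microseconds, message) for each line after the first."""
    return [(t1 - t0, msg) for (t0, _), (t1, msg) in zip(entries, entries[1:])]


entries = parse_log(LOG)
gaps = gaps_us(entries)

# Hypothetical threshold: flag anything slower than 100 ms.
THRESHOLD_US = 100_000
for delay, msg in gaps:
    flag = "  <-- slow" if delay > THRESHOLD_US else ""
    print(f"{delay / 1000:10.3f} ms before: {msg}{flag}")
```

On these lines it flags the expected 1 s addBA timer expiry and the roughly 3.2 s gap before "switched off addBA timer", which is the delay in question. Running the same thing over a full dmesg capture with mac80211 debug enabled might show whether the stalls line up with other work items.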

> johannes
>
>
