Message-ID: <b950c974d921fe3b2b7b05eb416d313858b3ed11.camel@sipsolutions.net>
Date: Mon, 28 Oct 2019 11:59:10 +0100
From: Johannes Berg <johannes@...solutions.net>
To: Arnd Bergmann <arnd@...db.de>
Cc: David Miller <davem@...emloft.net>,
Networking <netdev@...r.kernel.org>,
linux-wireless <linux-wireless@...r.kernel.org>,
John Crispin <john@...ozen.org>
Subject: Re: pull-request: mac80211-next 2019-07-31
On Mon, 2019-10-28 at 11:53 +0100, Arnd Bergmann wrote:
> On Mon, Oct 28, 2019 at 10:52 AM Johannes Berg
> <johannes@...solutions.net> wrote:
> > > It looks like one of the last additions pushed the stack usage over
> > > the 1024 byte limit
> > > for 32-bit architectures:
> > >
> > > net/mac80211/mlme.c:4063:6: error: stack frame size of 1032 bytes in
> > > function 'ieee80211_sta_rx_queued_mgmt' [-Werror,-Wframe-larger-than=]
> > >
> > > struct ieee802_11_elems is fairly large, and just grew another two pointers.
> > > When ieee80211_rx_mgmt_assoc_resp() and ieee80211_assoc_success()
> > > are inlined into ieee80211_sta_rx_queued_mgmt(), there are three copies
> > > of this structure, which is slightly too much.
> >
> > Hmm. I guess that means the compiler isn't smart enough to make the
> > copies from the inlined sub-functions alias each other? I mean, if I
> > have
> >
> >   fn1(...) { struct ... elems1; ... }
> >   fn2(...) { struct ... elems2; ... }
> >
> >   fn(...)
> >   {
> >     fn1();
> >     fn2();
> >   }
> >
> > then it could reasonably use the same stack memory for elems1 and
> > elems2, at least theoretically, but you're saying it doesn't do that I
> > guess?
>
> No, that's not the problem here (it can happen if the compiler is
> unable to prove the object lifetimes are non-overlapping).
>
> What we have here are multiple functions that are in the same call chain:
>
> fn1()
> {
>      struct ieee802_11_elems e;
> }
>
> fn2()
> {
>      struct ieee802_11_elems e;
>      ...
>      fn1();
> }
>
> fn3()
> {
>      struct ieee802_11_elems e;
>      ...
>      fn2();
> }
>
> Here, the object lifetimes actually do overlap, so the compiler cannot easily
> optimize that away.
Ah, yes, you're right. I didn't look closely enough, sorry.
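
For illustration, the nested case can be sketched in standalone C roughly as below. This is not mac80211 code: `struct elems_sketch` and the bodies of `fn1()`/`fn2()`/`fn3()` are made-up stand-ins, and the pointer comparison is only there to show that all three objects are live at the same time, so they cannot share stack storage.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ieee802_11_elems: a bag of pointers. */
struct elems_sketch {
	const unsigned char *ie[64];
};

/* Innermost function: its copy is live together with both callers' copies. */
static int fn1(const struct elems_sketch *from_fn2,
	       const struct elems_sketch *from_fn3)
{
	struct elems_sketch e = { { NULL } };
	const struct elems_sketch *mine = &e;

	/* Three simultaneously live objects must have distinct addresses. */
	return mine != from_fn2 && mine != from_fn3 && from_fn2 != from_fn3;
}

static int fn2(const struct elems_sketch *from_fn3)
{
	struct elems_sketch e = { { NULL } };

	return fn1(&e, from_fn3);
}

/* Returns 1 when all three copies occupy distinct storage. */
int fn3(void)
{
	struct elems_sketch e = { { NULL } };

	return fn2(&e);
}
```

While fn1() runs, the frames of fn2() and fn3() are still active, so once everything is inlined the combined frame has to hold all three copies.
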
> > I don't think dynamic allocation would be nice - but we could manually
> > do this by passing the elems pointer into the
> > ieee80211_rx_mgmt_assoc_resp() and ieee80211_assoc_success() functions.
>
> Ah, so you mean we can reuse the objects manually? I think that would
> be great. I could not tell if that's possible when reading the source, but
> if you can show that this works, that would be an easy solution.
Now that I look more closely, I'm not even sure why we parse it again in
ieee80211_assoc_success() after already having done it in
_rx_mgmt_assoc_resp(), they're doing exactly the same thing:

	ieee802_11_parse_elems(pos, len - (pos - (u8 *)mgmt), false, &elems,
			       mgmt->bssid, assoc_data->bss->bssid);

(need to track the variables a bit more closely, but ...)

So I think we can even just avoid duplicate work.
And for the third copy - it's in a different switch case. Do you think
we could rely on the compiler being able to prove non-overlapping
lifetime? Or better to just pass a pointer down to
_rx_mgmt_assoc_resp()?
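
A rough standalone sketch of the "pass a pointer down and parse only once" idea follows. It is not the actual patch: every name ending in `_sketch`, the `parse_calls` counter and `parse_call_count()` are hypothetical, and the real functions take different arguments.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ieee802_11_elems. */
struct elems_sketch {
	const unsigned char *ssid;
	size_t ssid_len;
};

static int parse_calls; /* counts how often the parser runs */

/* Hypothetical parser: it only records the buffer and counts the call. */
static void parse_elems_sketch(const unsigned char *buf, size_t len,
			       struct elems_sketch *elems)
{
	parse_calls++;
	elems->ssid = buf;
	elems->ssid_len = len;
}

/* Consumes the caller's already-parsed elems instead of parsing again. */
static int assoc_success_sketch(const struct elems_sketch *elems)
{
	return elems->ssid != NULL && elems->ssid_len > 0;
}

/* Parses into caller-provided storage, then passes the same pointer down. */
static int rx_assoc_resp_sketch(const unsigned char *buf, size_t len,
				struct elems_sketch *elems)
{
	parse_elems_sketch(buf, len, elems);
	return assoc_success_sketch(elems);
}

/* Top-level handler: the only function with an elems object on its stack. */
int rx_queued_mgmt_sketch(const unsigned char *buf, size_t len)
{
	struct elems_sketch elems = { NULL, 0 };

	return rx_assoc_resp_sketch(buf, len, &elems);
}

/* Accessor so a caller can check the parser ran once per frame. */
int parse_call_count(void)
{
	return parse_calls;
}
```

With this shape only the outermost frame pays for the structure, whether or not the callees get inlined, and the second parse goes away as a side effect.
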
> > Why do you say 32-bit btw, it should be *bigger* on 64-bit, but I didn't
> > see this ... hmm.
>
> That is correct. For historic reasons, both the total amount of stack space
> per thread and the warning limit on 64 bit are twice the amount that we
> have on 32-bit kernels, so even though the problem is more serious on
> 64-bit architectures, we do not see a warning about it because we remain
> well under the warning limit.
Ah, ok, thanks.
johannes