Message-ID: <CABvG-CUUAQaWQuh1HqNmyM+wudBdAjZhSvHdsXEpwRyOwTg-fg@mail.gmail.com>
Date: Sun, 10 Jun 2018 19:10:32 +0200
From: Michał Kazior <kazikcz@...il.com>
To: Arend van Spriel <arend.vanspriel@...adcom.com>
Cc: Ben Greear <greearb@...delatech.com>,
Cong Wang <xiyou.wangcong@...il.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
"linux-wireless@...r.kernel.org" <linux-wireless@...r.kernel.org>
Subject: Re: [PATCH v2] net-fq: Add WARN_ON check for null flow.
Ben,
The patch only addresses a symptom. fq_tin_dequeue() already checks whether
the list is empty before it accesses the first entry, so I see no point in
using the _or_null() variant plus WARN_ON.
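To illustrate why the extra check is redundant, here is a minimal userspace sketch. It is not the kernel's <linux/list.h>; the list helpers are simplified stand-ins, and struct demo_flow, demo_first_flow() and demo_check() are hypothetical names made up for this example. It only shows that once list_empty() has returned false, and assuming nothing else modifies the list in between, list_first_entry_or_null() cannot return NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's list helpers. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_first_entry(head, type, member) \
	container_of((head)->next, type, member)

#define list_first_entry_or_null(head, type, member) \
	(list_empty(head) ? NULL : list_first_entry(head, type, member))

/* Hypothetical, cut-down flow; the real struct fq_flow has more fields. */
struct demo_flow {
	int id;
	struct list_head flowchain;
};

/* Returns the first flow, or NULL when the list is empty. */
static struct demo_flow *demo_first_flow(struct list_head *head)
{
	if (list_empty(head))
		return NULL;

	/* Past the check above, _or_null() can only yield NULL if the
	 * list was modified concurrently, which a lock would prevent. */
	return list_first_entry_or_null(head, struct demo_flow, flowchain);
}

/* Small self-check covering the empty and the non-empty case. */
static int demo_check(void)
{
	struct list_head head;
	struct demo_flow f = { .id = 7 };

	init_list_head(&head);
	assert(demo_first_flow(&head) == NULL);	/* empty list */

	list_add_tail(&f.flowchain, &head);
	assert(demo_first_flow(&head) == &f);	/* one entry */
	return 0;
}
```

In other words, a NULL 'flow' at that point would itself indicate broken locking or list corruption, which the WARN_ON would report but not fix.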
The 0x3c deref is likely an offset from a NULL base pointer. Did you
run gdb/addr2line on ieee80211_tx_dequeue+0xfb? Where did it
point to?
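As a rough sketch of what "an offset from a NULL base pointer" means: the faulting address is the base pointer plus the offset of the member being read. The struct below is purely hypothetical; its layout was chosen only so that one member lands at 0x3c, and it does not correspond to any real mac80211 or fq structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, chosen only so that 'field' lands at 0x3c. */
struct demo_obj {
	uint8_t pad[0x3c];
	uint32_t field;
};

static_assert(offsetof(struct demo_obj, field) == 0x3c,
	      "field must sit at offset 0x3c for this illustration");

/* Address that a load of obj->field would touch, computed from the
 * numeric base address without dereferencing anything. */
static uintptr_t demo_fault_addr(uintptr_t base)
{
	return base + offsetof(struct demo_obj, field);
}
```

With a base of 0 this gives 0x3c, which is why a fault address that small usually points at a member access through a NULL struct pointer rather than at a NULL list entry.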
I suspect there's not enough synchronization between quiescing the
device/ath10k after the fw crashes and performing mac80211's reconfig
procedure.
Michał
On 8 June 2018 at 23:40, Arend van Spriel <arend.vanspriel@...adcom.com> wrote:
> On 6/8/2018 5:17 PM, Ben Greear wrote:
>
> I recalled an email from Michał about leaving Tieto, so I am adding the
> alternate email he provided back then.
>
> Gr. AvS
>
>
>> On 06/07/2018 04:59 PM, Cong Wang wrote:
>>>
>>> On Thu, Jun 7, 2018 at 4:48 PM, <greearb@...delatech.com> wrote:
>>>>
>>>> diff --git a/include/net/fq_impl.h b/include/net/fq_impl.h
>>>> index be7c0fa..cb911f0 100644
>>>> --- a/include/net/fq_impl.h
>>>> +++ b/include/net/fq_impl.h
>>>> @@ -78,7 +78,10 @@ static struct sk_buff *fq_tin_dequeue(struct fq *fq,
>>>> return NULL;
>>>> }
>>>>
>>>> - flow = list_first_entry(head, struct fq_flow, flowchain);
>>>> + flow = list_first_entry_or_null(head, struct fq_flow, flowchain);
>>>> +
>>>> + if (WARN_ON_ONCE(!flow))
>>>> + return NULL;
>>>
>>>
>>> This does not make sense either. list_first_entry_or_null()
>>> returns NULL only when the list is empty, but we already check
>>> list_empty() right before this code, and it is protected by fq->lock.
>>>
>>
>> Hello Michal,
>>
>> git blame shows you as the author of the fq_impl.h code.
>>
>> I saw a crash when debugging funky ath10k firmware in a 4.16 + hacks
>> kernel. There was an apparent mostly-null deref in the fq_tin_dequeue
>> method. According to gdb, it was within 1 line of the dereference of
>> 'flow'.
>>
>> My hack above is probably not that useful. Cong thinks maybe the
>> locking is bad.
>>
>> If you get a chance, please review this thread and see if you have any
>> ideas for a better fix (or better debugging code).
>>
>> As always, if you would like me to generate you a buggy firmware that
>> will crash in the tx path and cause all sorts of mayhem in the ath10k
>> driver and wifi stack, I will be happy to do so.
>>
>> https://www.mail-archive.com/netdev@vger.kernel.org/msg239738.html
>>
>> Thanks,
>> Ben
>>
>