Message-ID: <92220030-e9f1-1963-00b6-05f37abb82ee@gmail.com>
Date: Wed, 20 Dec 2017 12:23:52 -0800
From: John Fastabend <john.fastabend@...il.com>
To: Jakub Kicinski <kubakici@...pl>
Cc: Jiri Pirko <jiri@...nulli.us>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Cong Wang <xiyou.wangcong@...il.com>
Subject: Re: RCU callback crashes
On 12/20/2017 12:17 PM, Jakub Kicinski wrote:
> On Wed, 20 Dec 2017 10:04:17 -0800, John Fastabend wrote:
>> On 12/19/2017 10:34 PM, Jakub Kicinski wrote:
>>> On Tue, 19 Dec 2017 22:22:27 -0800, Jakub Kicinski wrote:
>>>>>> I get this:
>>>>>
>>>>> Could you try to run it with kasan on?
>>>>
>>>> I didn't manage to reproduce it with KASAN on so far :( Even enabling
>>>> object debugging to get the second splat in my email (which is more
>>>> useful) actually makes the crash go away, I only see the warning...
>>>
>>> Ah, no object debug but KASAN on produces this:
>>>
>>
>> @Jakub, This is with mq and pfifo_fast I guess?
>
> Sorry for falling silent, I was convinced I saw this before your code
> went in, it just takes a lot longer to trigger... I've been running
> net-next from Dec 1st for an hour now and it didn't crash :/
>
> Trying KASAN now..
>
It's possible my patches just made it worse because the kfree on the skb
lists was exposed as well.

I'm trying to see how removing that RCU grace period was safe in the
first place. The datapath uses an RCU read-side critical section to
protect the qdisc, but the control path (a) doesn't use an RCU grace
period and (b) doesn't use the qdisc lock. Going to go get a coffee and
I'll think about it a bit more. Any ideas, Cong?
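
To make the concern concrete, here is a rough userspace sketch of the
pattern in question. It is not the qdisc code: every toy_* name is
hypothetical, and the model is single-threaded and simply waits for all
readers to drain, which is stricter than a real grace period. In the
kernel the reader side would be rcu_read_lock()/rcu_dereference() and
the deferred free would go through call_rcu() or synchronize_rcu().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: none of these names are kernel APIs. */
struct toy_qdisc {
    int id;
    struct toy_qdisc *next_deferred; /* link on the deferred-free list */
};

struct toy_dev {
    struct toy_qdisc *qdisc;    /* pointer the datapath dereferences */
    int readers;                /* readers inside a read-side section */
    struct toy_qdisc *deferred; /* old objects waiting to be freed */
    int freed;                  /* how many objects were really freed */
};

/* Free everything that was waiting for readers to go away. */
static void toy_flush(struct toy_dev *dev)
{
    while (dev->deferred) {
        struct toy_qdisc *q = dev->deferred;

        dev->deferred = q->next_deferred;
        free(q);
        dev->freed++;
    }
}

/* Datapath: enter the read-side section and pick up the current qdisc. */
static struct toy_qdisc *toy_read_lock(struct toy_dev *dev)
{
    dev->readers++;
    return dev->qdisc;
}

/* Datapath: leave the section; the last reader out ends the "grace period". */
static void toy_read_unlock(struct toy_dev *dev)
{
    assert(dev->readers > 0);
    if (--dev->readers == 0)
        toy_flush(dev);
}

/*
 * Control path: publish a new qdisc and defer freeing the old one.
 * Freeing 'old' right here, with no deferral, is the case that would
 * leave a concurrent reader holding a dangling pointer.
 */
static bool toy_replace(struct toy_dev *dev, int new_id)
{
    struct toy_qdisc *fresh = calloc(1, sizeof(*fresh));
    struct toy_qdisc *old = dev->qdisc;

    if (!fresh)
        return false;
    fresh->id = new_id;
    dev->qdisc = fresh;
    if (old) {
        old->next_deferred = dev->deferred;
        dev->deferred = old;
    }
    if (dev->readers == 0)
        toy_flush(dev);
    return true;
}
```

With the deferral in place, a reader that picked up the old qdisc before
toy_replace() ran can keep using it until it calls toy_read_unlock().
Without it, the old object goes away underneath the reader.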

Perhaps we need a patch for net (mine was against net-next), and
probably for stable as well.

Thanks,
John