Message-ID: <CAEA6p_D8Av0QAJ46Gzx=Bt3uP-NshTmz2QoBzYaO8OK9+CFQNA@mail.gmail.com>
Date: Wed, 20 Sep 2017 10:46:58 -0700
From: Wei Wang <weiwan@...gle.com>
To: Paweł Staszewski <pstaszewski@...are.pl>
Cc: Eric Dumazet <eric.dumazet@...il.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>
Subject: Re: Latest net-next from GIT panic
>> This is why I suggested to replace the BUG() in another mail
>>
>> So :
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index f535779d9dc1dfe36934c2abba4e43d053ac5d6f..220cd12456754876edf2d3ef13195e82d70d5c74 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -3331,7 +3331,15 @@ void netdev_run_todo(void);
>>   */
>>  static inline void dev_put(struct net_device *dev)
>>  {
>> -	this_cpu_dec(*dev->pcpu_refcnt);
>> +	int __percpu *pref = READ_ONCE(dev->pcpu_refcnt);
>> +
>> +	if (!pref) {
>> +		pr_err("no pcpu_refcnt on dev %p(%s) state %d dismantle %d\n",
>> +		       dev, dev->name, dev->reg_state, dev->dismantle);
>> +		for (;;)
>> +			cpu_relax();
>> +	}
>> +	this_cpu_dec(*pref);
>>  }
>> /**
>>
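For readers following along, the guard in the quoted patch can be modeled in user space. This is only a rough sketch: `struct fake_dev` and `fake_dev_put()` are hypothetical names, a plain `int *` stands in for the per-CPU counter, and the function returns an error where the kernel patch logs and then spins (presumably so the message stays visible instead of being lost in a crash).

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device: only what the guard needs. */
struct fake_dev {
	const char *name;
	int *refcnt;	/* models dev->pcpu_refcnt; NULL once it has been freed */
};

/* Drop one reference. Returns 0 on success, -1 if the counter is gone. */
static int fake_dev_put(struct fake_dev *dev)
{
	int *pref = dev->refcnt;	/* read the pointer once, then test it */

	if (!pref) {
		fprintf(stderr, "no refcnt on dev %p(%s)\n",
			(void *)dev, dev->name);
		return -1;	/* the kernel patch loops on cpu_relax() here */
	}
	(*pref)--;
	return 0;
}
```

Calling `fake_dev_put()` on a device whose `refcnt` is NULL reports the device and returns -1 instead of dereferencing the pointer.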
Thanks a lot Eric for the debug patch.
Pawel,
I want to confirm the last good commit from your bisection.
You mentioned:
> And the last one
>
> git bisect good
> Bisecting: 1 revision left to test after this (roughly 1 step)
> [1cfb71eeb12047bcdbd3e6730ffed66e810a0855] ipv6: take dst->__refcnt for
> insertion into fib6 tree
>
> With this have kernel panic same as always
>
> git bisect bad
> Bisecting: 0 revisions left to test after this (roughly 0 steps)
> [b838d5e1c5b6e57b10ec8af2268824041e3ea911] ipv4: mark DST_NOGC and
> remove the operation of dst_free()
So it breaks right at:
[b838d5e1c5b6e57b10ec8af2268824041e3ea911] ipv4: mark DST_NOGC and
remove the operation of dst_free()
Right?
If you sync the image to one commit before the above one:
[9df16efadd2a8a82731dc76ff656c771e261827f] ipv4: call dst_hold_safe() properly
Does it crash?
And could you confirm that your config does not have any IPv6
addresses or routes configured?
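One way to land on the commit just before b838d5e1c5b6 is to ask git for its first parent. A minimal sketch, assuming a net-next checkout; `parent_of` is a hypothetical helper name, not something from this thread:

```shell
#!/bin/sh
# Hypothetical helper: print the first parent of the given commit.
parent_of() {
    git rev-parse --verify "$1^"
}
```

In the kernel tree, `git checkout "$(parent_of b838d5e1c5b6e57b10ec8af2268824041e3ea911)"` should then leave you on 9df16efadd2a, provided that commit is indeed the direct parent as described above.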
Thanks.
Wei
6:03 +0200, Paweł Staszewski wrote:
>>>
>>> Not much more after adding this patch
>>>
>>> https://bugzilla.kernel.org/attachment.cgi?id=258529
>>>
>>
>>
>
> Full panic
>
> https://bugzilla.kernel.org/attachment.cgi?id=258531
>
>
> I will change the patch and apply it, but later today, because right now I
> can't use the backup router as a test lab. It is Internet rush hour, and if
> something happens it will be bad to have the second router on a buggy kernel :)
>
>