[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7d8731ee-6c91-bbfd-6313-c95f52d7b49a@itcare.pl>
Date: Wed, 20 Sep 2017 23:10:47 +0200
From: Paweł Staszewski <pstaszewski@...are.pl>
To: Cong Wang <xiyou.wangcong@...il.com>,
Eric Dumazet <eric.dumazet@...il.com>
Cc: Wei Wang <weiwan@...gle.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>
Subject: Re: Latest net-next from GIT panic
W dniu 2017-09-20 o 21:23, Paweł Staszewski pisze:
>
>
> W dniu 2017-09-20 o 21:13, Paweł Staszewski pisze:
>>
>>
>> W dniu 2017-09-20 o 20:36, Cong Wang pisze:
>>> On Wed, Sep 20, 2017 at 11:30 AM, Eric Dumazet
>>> <eric.dumazet@...il.com> wrote:
>>>> On Wed, 2017-09-20 at 11:22 -0700, Cong Wang wrote:
>>>>> but dmesg at this time shows nothing about interfaces or flaps.
>>>>>
>>>>> This is very odd.
>>>>>
>>>>> We only free netdevice in free_netdev() and it is only called when
>>>>> we unregister a netdevice. Otherwise pcpu_refcnt is impossible
>>>>> to be NULL.
>>>> If there is a missing dev_hold() or one dev_put() in excess,
>>>> this would allow the netdev to be freed too soon.
>>>>
>>>> -> Use after free.
>>>> memory holding netdev could be reallocated-cleared by some other
>>>> kernel
>>>> user.
>>>>
>>> Sure, but only unregister could trigger a free. If there is no
>>> unregister,
>>> like what Pawel claims, then there is no free, the refcnt just goes to
>>> 0 but the memory is still there.
>>>
>> About possible mistake from my side with bisect - i can judge too
>> early that some bisect was good
>> the road was:
>> git bisect start
>> # bad: [ac7b75966c9c86426b55fe1c50ae148aa4571075] Merge tag
>> 'pinctrl-v4.13-1' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
>> git bisect bad ac7b75966c9c86426b55fe1c50ae148aa4571075
>> # good: [e24dd9ee5399747b71c1d982a484fc7601795f31] Merge branch
>> 'next' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
>> git bisect good e24dd9ee5399747b71c1d982a484fc7601795f31
>> # bad: [9cc9a5cb176ccb4f2cda5ac34da5a659926f125f] datapath: Avoid
>> using stack larger than 1024.
>> git bisect bad 9cc9a5cb176ccb4f2cda5ac34da5a659926f125f
>> # good: [073cf9e20c333ab29744717a23f9e43ec7512a20] Merge branch
>> 'udp-reduce-cache-pressure'
>> git bisect good 073cf9e20c333ab29744717a23f9e43ec7512a20
>> # bad: [8abd5599a520e9f188a750f1bde9dde5fb856230] Merge branch
>> 's390-net-updates-part-2'
>> git bisect bad 8abd5599a520e9f188a750f1bde9dde5fb856230
>> # good: [2fae5d0e647c6470d206e72b5fc24972bb900f70] Merge branch
>> 'bpf-ctx-narrow'
>> git bisect good 2fae5d0e647c6470d206e72b5fc24972bb900f70
>> # good: [41500c3e2a19ffcf40a7158fce1774de08e26ba2] rds: tcp: remove
>> cp_outgoing
>> git bisect good 41500c3e2a19ffcf40a7158fce1774de08e26ba2
>> # bad: [8917a777be3ba566377be05117f71b93a5fd909d] tcp: md5: add
>> TCP_MD5SIG_EXT socket option to set a key address prefix
>> git bisect bad 8917a777be3ba566377be05117f71b93a5fd909d
>> # good: [4a6ce2b6f2ecabbddcfe47e7cf61dd0f00b10e36] net: introduce a
>> new function dst_dev_put()
>>
>> And currently have this running for about 4 hours without problems.
>>
>>
>>
>> git bisect good 4a6ce2b6f2ecabbddcfe47e7cf61dd0f00b10e36
>> # bad: [a4c2fd7f78915a0d7c5275e7612e7793157a01f2] net: remove
>> DST_NOCACHE flag
>>
>> Here for sure - panic
>>
>> git bisect bad a4c2fd7f78915a0d7c5275e7612e7793157a01f2
>> # bad: [ad65a2f05695aced349e308193c6e2a6b1d87112] ipv6: call
>> dst_hold_safe() properly
>> git bisect bad ad65a2f05695aced349e308193c6e2a6b1d87112
>> # good: [9df16efadd2a8a82731dc76ff656c771e261827f] ipv4: call
>> dst_hold_safe() properly
>> git bisect good 9df16efadd2a8a82731dc76ff656c771e261827f
>> # bad: [1cfb71eeb12047bcdbd3e6730ffed66e810a0855] ipv6: take
>> dst->__refcnt for insertion into fib6 tree
>>
>> im not 100% sure tor last two
>> Will test them again starting from
>> [95c47f9cf5e028d1ae77dc6c767c1edc8a18025b] ipv4: call dst_dev_put()
>> properly
>>
>>
>> git bisect bad 1cfb71eeb12047bcdbd3e6730ffed66e810a0855
>> # bad: [b838d5e1c5b6e57b10ec8af2268824041e3ea911] ipv4: mark DST_NOGC
>> and remove the operation of dst_free()
>>
>>
>>
>> git bisect bad b838d5e1c5b6e57b10ec8af2268824041e3ea911
>> # first bad commit: [b838d5e1c5b6e57b10ec8af2268824041e3ea911] ipv4:
>> mark DST_NOGC and remove the operation of dst_free()
>>
>>
>>
> What i can say more
> I can reproduce this on any server with similar configuration
> the difference can be teamd instead of bonding
> ixgbe or i40e and mlx5
> Same problems
>
> vlans - more or less prefixes learned from bgp -> zebra -> netlink ->
> kernel
> But normally in lab when using only plain routing no bgpd and about
> 128 vlans - with 128 routes - cant reproduce this - this apperas only
> with bgp - minimum where i can reproduce this was about 130k prefixes
> with about 286 nexthops
>
>
>
>
bisected again and same result:
b838d5e1c5b6e57b10ec8af2268824041e3ea911 is the first bad commit
commit b838d5e1c5b6e57b10ec8af2268824041e3ea911
Author: Wei Wang <weiwan@...gle.com>
Date: Sat Jun 17 10:42:32 2017 -0700
ipv4: mark DST_NOGC and remove the operation of dst_free()
With the previous preparation patches, we are ready to get rid of the
dst gc operation in ipv4 code and release dst based on refcnt only.
So this patch adds DST_NOGC flag for all IPv4 dst and remove the calls
to dst_free().
At this point, all dst created in ipv4 code do not use the dst gc
anymore and will be destroyed at the point when refcnt drops to 0.
Signed-off-by: Wei Wang <weiwan@...gle.com>
Acked-by: Martin KaFai Lau <kafai@...com>
Signed-off-by: David S. Miller <davem@...emloft.net>
:040000 040000 9b7e7fb641de6531fc7887473ca47ef7cb6a11da
831a73b71d3df1755f3e24c0d3c86d7a93fd55e2 M net
Will add now version 2 of patch from Eric and we will see
Powered by blists - more mailing lists