Message-ID: <ed916873-4d7d-43f3-07cf-028d3ef4177c@gmail.com>
Date: Sat, 4 May 2019 13:24:07 -0400
From: Eric Dumazet <eric.dumazet@...il.com>
To: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
"David S. Miller" <davem@...emloft.net>
Cc: David Ahern <dsahern@...il.com>, Julian Anastasov <ja@....bg>,
Cong Wang <xiyou.wangcong@...il.com>,
syzbot <syzbot+30209ea299c09d8785c9@...kaller.appspotmail.com>,
ddstreet@...e.org, dvyukov@...gle.com,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
syzkaller-bugs@...glegroups.com,
Linus Torvalds <torvalds@...ux-foundation.org>,
Mahesh Bandewar <maheshb@...gle.com>
Subject: Re: [PATCH] ipv4: Delete uncached routes upon unregistration of
loopback device.
On 5/4/19 1:09 PM, Tetsuo Handa wrote:
> On 2019/05/05 0:56, Eric Dumazet wrote:
>> Well, you have not fixed a bug, you simply made sure that whatever cpu is using the
>> routes you forcibly deleted is going to crash the host very soon (use-after-frees have
>> undefined behavior, but KASAN should crash most of the times)
>
> I confirmed that this patch survives "#syz test:" before submitting.
> But you know that this patch is deleting the route entry too early. OK.
>
>>
>> Please do not send patches like that with a huge CC list, keep networking patches
>> to netdev mailing list.
>
> If netdev people started working on this "minutely crashing bug" earlier,
> I would not have written a patch...
Just so you know, we are working on bug fixes, and this is best effort.
The fact that _you_ want to fix a particular bug (out of hundreds)
does not mean we need to stop everything and work full time on it.
The root cause of the problem here is elsewhere: a dst is leaking somewhere,
and that prevents the netns dismantle.
We have had many dst leaks in the past, and new bugs keep introducing more.
>
>>
>> Mahesh has an alternative patch, adding a fake device that can not be dismantled
>> to make sure we fully intercept skbs sent through a dead route, instead of relying
>> on loopback dropping them later at some point.
>
> So, the reason to temporarily move the refcount is to give enough period
> so that the route entry is no longer used. But moving the refcount to a
> loopback device in a namespace was wrong. Is this understanding correct?
I believe you need to spend more time studying the networking code by yourself,
and add tracing if you believe this could be useful to you and others.
>
> Compared to moving the refcount to the loopback device in the init namespace,
> the fake device can somehow drop the refcount moved via rt_flush_dev(), can't it?
>
The fake device won't ever disappear.
> Anyway, I'll wait for Mahesh.
>