Message-ID: <c11bca37-08fd-66bf-26b3-c6bf5edea6a0@oracle.com>
Date: Tue, 14 Nov 2017 10:02:59 -0800
From: Girish Moodalbail <girish.moodalbail@...cle.com>
To: Sowmini Varadhan <sowmini.varadhan@...cle.com>
Cc: syzbot
<bot+643ecad3f5bb49700e839363b608c4928f6db8f0@...kaller.appspotmail.com>,
davem@...emloft.net, netdev@...r.kernel.org,
rds-devel@....oracle.com, santosh.shilimkar@...cle.com,
syzkaller-bugs@...glegroups.com
Subject: Re: KASAN: use-after-free Read in rds_tcp_dev_event
On 11/14/17 5:22 AM, Sowmini Varadhan wrote:
>
>
> A few questions.
>
> - First off, why am I not seeing the original mail in this thread
> even when I search the mail archives, e.g.,
> https://lkml.org/lkml/2017/11/13/954
>
> - Girish Moodalbail writes:
>
>> The issue here is that we are trying to reference a network namespace
>> (struct net *) that is long gone (i.e., L532 below -- c_net is the culprit).
>
> The netns is not "long gone", we are still processing
> the NETDEV_UNREGISTER_FINAL for loopback.
Obviously, I was not talking about the current namespace.
Say there are two namespaces, ns1 and ns2, and both have RDS connections.
Deleting ns1 will be fine. However, when ns2 is being deleted, the
rds_tcp_dev_event() callback walks the global list, and some nodes in
that list will still refer to ns1 (which is "long gone"). If you read my earlier
email, I was talking about ns1, which is already gone, and which we are trying to
access while tearing down ns2.
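
To make the ns1/ns2 scenario concrete, here is a minimal userspace sketch of
it. This is not the RDS code: model_net, model_conn, conn_add() and
kill_sock() are hypothetical stand-ins, and an "alive" flag models freed
memory so that the example stays safe to run. It assumes, as argued above,
that conns of a deleted namespace can remain on the single global list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net. */
struct model_net {
	const char *name;
	bool alive;		/* false models "this memory was already freed" */
};

/* Hypothetical stand-in for a connection on the global list. */
struct model_conn {
	struct model_net *c_net;	/* owning namespace */
	struct model_conn *next;	/* link in the single global list */
};

static struct model_conn *global_list;

static void conn_add(struct model_conn *c, struct model_net *net)
{
	assert(net != NULL);
	c->c_net = net;
	c->next = global_list;
	global_list = c;
}

/*
 * Model of a teardown that walks the global list and compares each
 * node's c_net with the dying namespace.
 *
 * Returns how many nodes pointed at an already-dead namespace when
 * they were inspected. In the kernel, each such inspection would be a
 * read of freed memory.
 *
 * If unlink_conns is false, matching nodes are left on the list, which
 * models conns that are never destroyed.
 */
static int kill_sock(struct model_net *net, bool unlink_conns)
{
	struct model_conn **pp = &global_list;
	int stale = 0;

	while (*pp) {
		struct model_conn *c = *pp;

		if (!c->c_net->alive)
			stale++;
		if (c->c_net == net && unlink_conns) {
			*pp = c->next;	/* drop the conn from the list */
			continue;
		}
		pp = &c->next;
	}
	net->alive = false;	/* the namespace goes away after the walk */
	return stale;
}
```

With one conn in each namespace and no unlinking, deleting ns1 sees no stale
node, but the later walk for ns2 inspects the leftover ns1 node. If the conns
are unlinked on teardown, or if the list were per-namespace, the second walk
never sees a node that belongs to ns1.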
~Girish
> As I said in my
> earlier mail, the idea is to extract the list of unique conns
> that belong to the netns and then destroy both the conn, and
> all associated paths. Thus there can only be a single thread
> going through rds_tcp_kill_sock at any time (since we should
> only get the unregister_final/loopback one time for the netns).
> (See also the comment block in rds_tcp_dev_event about network activity
> quiescing). Thus there should be no concurrency issue.
>
> However, when I just checked this, there may be some rds connection
> refcounting bug. When I quickly tested this, I'm not seeing the
> expected calls to conn_path_destroy. I'll need some time to take
> a look, this has been known to work, so something got broken along
> the way.
>
>> I think we should move away from the global list to a per-namespace list. The
>> global list is used only in two places (both of which are per-namespace
>> operations):
>
> let's first understand the real root-cause before we start
> redesigning data-structures.
>
> --Sowmini
>
>
>