Message-ID: <CANn89iLcxMi=AnhyFTgAoiCznFPCoKdjKVZbHMZMQ9dgK6xXnw@mail.gmail.com>
Date: Tue, 1 Oct 2024 09:46:04 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Daniel Yang <danielyangkang@...il.com>
Cc: Kuniyuki Iwashima <kuniyu@...zon.com>, alibuda@...ux.alibaba.com, davem@...emloft.net,
guwen@...ux.alibaba.com, jaka@...ux.ibm.com, kuba@...nel.org,
linux-kernel@...r.kernel.org, linux-s390@...r.kernel.org,
netdev@...r.kernel.org, pabeni@...hat.com,
syzbot+e953a8f3071f5c0a28fd@...kaller.appspotmail.com,
tonylu@...ux.alibaba.com, wenjia@...ux.ibm.com
Subject: Re: [PATCH] fixed rtnl deadlock from gtp
On Tue, Oct 1, 2024 at 6:54 AM Daniel Yang <danielyangkang@...il.com> wrote:
>
> OK, I see the issue. Yes, it does seem to be a false positive. Do we already have lockdep classes and subclasses set up for lock_sock() to prevent other false positives like this one? If not, should I add one to resolve this?
>
Please do not top-post on Linux mailing lists.
About your question:
https://lore.kernel.org/netdev/CANn89iKcWmufo83xy-SwSrXYt6UpL2Pb+5pWuzyYjMva5F8bBQ@mail.gmail.com/
> On Mon, Sep 30, 2024 at 8:04 PM Kuniyuki Iwashima <kuniyu@...zon.com> wrote:
>>
>> From: Daniel Yang <danielyangkang@...il.com>
>> Date: Mon, 30 Sep 2024 18:55:54 -0700
>> > Fixes deadlock described in this bug:
>> > https://syzkaller.appspot.com/bug?extid=e953a8f3071f5c0a28fd.
>> > Specific crash report here:
>> > https://syzkaller.appspot.com/text?tag=CrashReport&x=14670e07980000.
>> >
>> > DESCRIPTION OF ISSUE
>> > Deadlock: sk_lock-AF_INET --> &smc->clcsock_release_lock --> rtnl_mutex
>> >
>> > rtnl_mutex->sk_lock-AF_INET
>> > rtnetlink_rcv_msg() acquires rtnl_lock() and calls rtnl_newlink(), which
>> > eventually calls gtp_newlink() which calls lock_sock() to attempt to
>> > acquire sk_lock.
>>
>> Is the deadlock real?
>>
>> From the lockdep splat, the gtp socket's sk_protocol is verified to be
>> IPPROTO_UDP before lock_sock() is taken, so it seems to be just a
>> labeling issue.
>> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/gtp.c?id=9410645520e9b820069761f3450ef6661418e279#n1674
>>
>>
>> >
>> > sk_lock-AF_INET->&smc->clcsock_release_lock
>> > smc_sendmsg() calls lock_sock() to acquire sk_lock, then calls
>> > smc_switch_to_fallback() which attempts to acquire mutex_lock(&smc->...).
>> >
>> > &smc->clcsock_release_lock->rtnl_mutex
>> > smc_setsockopt() calls mutex_lock(&smc->...). smc->...->setsockopt() is
>> > called, which calls nf_setsockopt() which attempts to acquire
>> > rtnl_lock() in some nested call in start_sync_thread() in ip_vs_sync.c.
>> >
>> > FIX:
>> > In smc_switch_to_fallback(), separate the logic into inline function
>> > __smc_switch_to_fallback(). In smc_sendmsg(), lock ordering can be
>> > modified and the functionality of smc_switch_to_fallback() is
>> > encapsulated in the __smc_switch_to_fallback() function.
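
The three-step chain in the quoted description can be modelled as a directed graph of lock classes, where an edge A -> B means "B was taken while A was held". The sketch below is a toy model only, not lockdep's implementation: the enum, the struct, and every function name are made up for illustration, and the two split sk_lock classes are hypothetical stand-ins for whatever separate lockdep keys the UDP and SMC sockets might get. It shows why the report forms a cycle when both sockets share the sk_lock-AF_INET label, and why the cycle goes away once they are labelled separately.

```c
#include <assert.h>
#include <stdbool.h>

/* Lock classes from the report, plus two HYPOTHETICAL split classes. */
enum lock_class {
	SK_LOCK_AF_INET,          /* shared label seen in the splat */
	SK_LOCK_UDP_HYPOTHETICAL, /* assumed separate class for the gtp UDP socket */
	SK_LOCK_SMC_HYPOTHETICAL, /* assumed separate class for the SMC socket */
	SMC_CLCSOCK_RELEASE_LOCK,
	RTNL_MUTEX,
	NR_LOCK_CLASSES
};

/* edge[a][b] is true when some path takes b while holding a. */
struct lock_graph {
	bool edge[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
};

static void record_order(struct lock_graph *g, enum lock_class held,
			 enum lock_class taken)
{
	assert(held < NR_LOCK_CLASSES && taken < NR_LOCK_CLASSES);
	g->edge[held][taken] = true;
}

/* Depth-first search; state: 0 = unvisited, 1 = on stack, 2 = done. */
static bool visit(const struct lock_graph *g, int node, int *state)
{
	state[node] = 1;
	for (int next = 0; next < NR_LOCK_CLASSES; next++) {
		if (!g->edge[node][next])
			continue;
		if (state[next] == 1)
			return true; /* back edge: lock-order cycle */
		if (state[next] == 0 && visit(g, next, state))
			return true;
	}
	state[node] = 2;
	return false;
}

static bool has_cycle(const struct lock_graph *g)
{
	int state[NR_LOCK_CLASSES] = { 0 };

	for (int n = 0; n < NR_LOCK_CLASSES; n++)
		if (state[n] == 0 && visit(g, n, state))
			return true;
	return false;
}

/* The chain as reported: both sockets share sk_lock-AF_INET. */
static struct lock_graph reported_graph(void)
{
	struct lock_graph g = { 0 };

	record_order(&g, RTNL_MUTEX, SK_LOCK_AF_INET);               /* gtp_newlink() */
	record_order(&g, SK_LOCK_AF_INET, SMC_CLCSOCK_RELEASE_LOCK); /* smc_sendmsg() */
	record_order(&g, SMC_CLCSOCK_RELEASE_LOCK, RTNL_MUTEX);      /* smc_setsockopt() */
	return g;
}

/* The same paths with the two sockets in separate (hypothetical) classes. */
static struct lock_graph split_class_graph(void)
{
	struct lock_graph g = { 0 };

	record_order(&g, RTNL_MUTEX, SK_LOCK_UDP_HYPOTHETICAL);
	record_order(&g, SK_LOCK_SMC_HYPOTHETICAL, SMC_CLCSOCK_RELEASE_LOCK);
	record_order(&g, SMC_CLCSOCK_RELEASE_LOCK, RTNL_MUTEX);
	return g;
}
```

Calling has_cycle() on reported_graph() finds the loop sk_lock-AF_INET -> clcsock_release_lock -> rtnl_mutex -> sk_lock-AF_INET, while split_class_graph() is a straight line with no loop. That is consistent with the "labeling issue" reading above: the dependency only closes when the UDP socket's lock and the SMC socket's lock are treated as one class.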