[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1451d316-a4ff-354b-e57b-00ef85fba43a@huawei.com>
Date: Mon, 22 Jan 2024 19:10:37 +0800
From: shaozhengchao <shaozhengchao@...wei.com>
To: Pablo Neira Ayuso <pablo@...filter.org>
CC: <netdev@...r.kernel.org>, <davem@...emloft.net>, <edumazet@...gle.com>,
<kuba@...nel.org>, <pabeni@...hat.com>, <horms@...nel.org>,
<anjali.k.kulkarni@...cle.com>, <kuniyu@...zon.com>, <fw@...len.de>,
<weiyongjun1@...wei.com>, <yuehaibing@...wei.com>
Subject: Re: [PATCH net,v4] netlink: fix potential sleeping issue in
mqueue_flush_file
On 2024/1/22 16:56, Pablo Neira Ayuso wrote:
> On Mon, Jan 22, 2024 at 09:18:07AM +0800, Zhengchao Shao wrote:
>> I analyze the potential sleeping issue of the following processes:
>> Thread A Thread B
>> ... netlink_create //ref = 1
>> do_mq_notify ...
>> sock = netlink_getsockbyfilp ... //ref = 2
>> info->notify_sock = sock; ...
>> ... netlink_sendmsg
>> ... skb = netlink_alloc_large_skb //skb->head is vmalloced
>> ... netlink_unicast
>> ... sk = netlink_getsockbyportid //ref = 3
>> ... netlink_sendskb
>> ... __netlink_sendskb
>> ... skb_queue_tail //put skb to sk_receive_queue
>> ... sock_put //ref = 2
>> ... ...
>> ... netlink_release
>> ... deferred_put_nlk_sk //ref = 1
>> mqueue_flush_file
>> spin_lock
>> remove_notification
>> netlink_sendskb
>> sock_put //ref = 0
>> sk_free
>> ...
>> __sk_destruct
>> netlink_sock_destruct
>> skb_queue_purge //get skb from sk_receive_queue
>> ...
>> __skb_queue_purge_reason
>> kfree_skb_reason
>> __kfree_skb
>> ...
>> skb_release_all
>> skb_release_head_state
>> netlink_skb_destructor
>> vfree(skb->head) //sleeping while holding spinlock
>>
>> In netlink_sendmsg, if the memory pointed to by skb->head is allocated by
>> vmalloc, and is put to sk_receive_queue queue, also the skb is not freed.
>> When the mqueue executes flush, the sleeping bug will occur. Use
>> vfree_atomic instead of vfree in netlink_skb_destructor to solve the issue.
>
> mqueue notification is of NOTIFY_COOKIE_LEN size:
>
> static int do_mq_notify(mqd_t mqdes, const struct sigevent *notification)
> {
> [...]
> if (notification->sigev_notify == SIGEV_THREAD) {
> long timeo;
>
> /* create the notify skb */
> nc = alloc_skb(NOTIFY_COOKIE_LEN, GFP_KERNEL);
> if (!nc)
> return -ENOMEM;
>
> Do you have a reproducer?
Hi Pablo:
I donot have reproducer. I found the issue when running syz on
the 5.10 stable branch, but it only happened once. Then I analyzed the
mainline code and found the same issue.
The sock can be obtained from the value of sigev_signo
transferred by the user in do_mq_notify. And the sock may be of type
netlink, and it is possible to allocate the head area using vmalloc.
Not only release the skb allocated in do_mq_notify, but also release
the skb allocated in netlink_sendmsg when put sock mqueue_flush_file.
What I missed?
Thank you.
Zhengchao Shao
Powered by blists - more mailing lists