netdev - Re: [PATCH net] virtio-net: fix overflow inside virtnet_rq

Message-ID: <546cc17a-dd57-8260-4737-c45d7b011631@oracle.com>
Date: Wed, 28 Aug 2024 12:57:56 -0700
From: Si-Wei Liu <si-wei.liu@...cle.com>
To: Darren Kenny <darren.kenny@...cle.com>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Cc: netdev@...r.kernel.org, Jason Wang <jasowang@...hat.com>,
        Eugenio Pérez <eperezma@...hat.com>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>, virtualization@...ts.linux.dev,
        "Linux regression tracking (Thorsten Leemhuis)" <regressions@...mhuis.info>
Subject: Re: [PATCH net] virtio-net: fix overflow inside virtnet_rq_alloc

Just in case Xuan missed the last email while his email server kept
rejecting incoming emails over the last week: the patch doesn't seem to
fix the regression.

Xuan, given that this is not very hard to reproduce and we have clearly
stated how to do so, could you try to get the patch verified in house
before posting it upstream? Or were you unable to reproduce it locally?

Thanks,
-Siwei

On 8/21/2024 9:47 AM, Darren Kenny wrote:
> Hi Michael,
>
> On Tuesday, 2024-08-20 at 12:50:39 -04, Michael S. Tsirkin wrote:
>> On Tue, Aug 20, 2024 at 03:19:13PM +0800, Xuan Zhuo wrote:
>>> leads to regression on VM with the sysctl value of:
>>>
>>> - net.core.high_order_alloc_disable=1
>>
>>
>>
>>> which could see reliable crashes or scp failure (scp a file 100M in size
>>> to VM):
>>>
>>> The issue is that the virtnet_rq_dma takes up 16 bytes at the beginning
>>> of a new frag. When the frag size is larger than PAGE_SIZE,
>>> everything is fine. However, if the frag is only one page and the
>>> total size of the buffer and virtnet_rq_dma is larger than one page, an
>>> overflow may occur. In this case, if an overflow is possible, I adjust
>>> the buffer size. If net.core.high_order_alloc_disable=1, the maximum
>>> buffer size is 4096 - 16. If net.core.high_order_alloc_disable=0, only
>>> the first buffer of the frag is affected.
>>>
>>> Fixes: f9dac92ba908 ("virtio_ring: enable premapped mode whatever use_dma_api")
>>> Reported-by: "Si-Wei Liu" <si-wei.liu@...cle.com>
>>> Closes: http://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com
>>> Signed-off-by: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
>>
>> Darren, could you pls test and confirm?
> Unfortunately with this change I seem to still get a panic as soon as I start a
> download using wget:
>
> [  144.055630] Kernel panic - not syncing: corrupted stack end detected inside scheduler
> [  144.056249] CPU: 8 PID: 37894 Comm: sleep Kdump: loaded Not tainted 6.10.0-1.el8uek.x86_64 #2
> [  144.056850] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-4.module+el8.9.0+90173+a3f3e83a 04/01/2014
> [  144.057585] Call Trace:
> [  144.057791]  <TASK>
> [  144.057973]  panic+0x347/0x370
> [  144.058223]  schedule_debug.isra.0+0xfb/0x100
> [  144.058565]  __schedule+0x58/0x6a0
> [  144.058838]  ? refill_stock+0x26/0x50
> [  144.059120]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  144.059473]  do_task_dead+0x42/0x50
> [  144.059752]  do_exit+0x31e/0x4b0
> [  144.060011]  ? __audit_syscall_entry+0xee/0x150
> [  144.060352]  do_group_exit+0x30/0x80
> [  144.060633]  __x64_sys_exit_group+0x18/0x20
> [  144.060946]  do_syscall_64+0x8c/0x1c0
> [  144.061228]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  144.061570]  ? __audit_filter_op+0xbe/0x140
> [  144.061873]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  144.062204]  ? audit_reset_context+0x232/0x310
> [  144.062514]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  144.062851]  ? syscall_exit_work+0x103/0x130
> [  144.063148]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  144.063473]  ? syscall_exit_to_user_mode+0x77/0x220
> [  144.063813]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  144.064142]  ? do_syscall_64+0xb9/0x1c0
> [  144.064411]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  144.064747]  ? do_syscall_64+0xb9/0x1c0
> [  144.065018]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  144.065345]  ? do_read_fault+0x109/0x1b0
> [  144.065628]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  144.065961]  ? do_fault+0x1aa/0x2f0
> [  144.066212]  ? handle_pte_fault+0x102/0x1a0
> [  144.066503]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  144.066836]  ? __handle_mm_fault+0x5ed/0x710
> [  144.067137]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  144.067464]  ? __count_memcg_events+0x72/0x110
> [  144.067779]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  144.068106]  ? count_memcg_events.constprop.0+0x26/0x50
> [  144.068457]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  144.068788]  ? handle_mm_fault+0xae/0x320
> [  144.069068]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  144.069395]  ? do_user_addr_fault+0x34a/0x6b0
> [  144.069708]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [  144.070049] RIP: 0033:0x7fc5524f9c66
> [  144.070307] Code: Unable to access opcode bytes at 0x7fc5524f9c3c.
> [  144.070720] RSP: 002b:00007ffee052beb8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
> [  144.071214] RAX: ffffffffffffffda RBX: 00007fc5527bb860 RCX: 00007fc5524f9c66
> [  144.071684] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
> [  144.072146] RBP: 0000000000000000 R08: 00000000000000e7 R09: ffffffffffffff78
> [  144.072608] R10: 00007ffee052bdef R11: 0000000000000246 R12: 00007fc5527bb860
> [  144.073076] R13: 0000000000000002 R14: 00007fc5527c4528 R15: 0000000000000000
> [  144.073543]  </TASK>
> [  144.074780] Kernel Offset: 0x37c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>
> Thanks,
>
> Darren.
>
>>> ---
>>>   drivers/net/virtio_net.c | 12 +++++++++---
>>>   1 file changed, 9 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>> index c6af18948092..e5286a6da863 100644
>>> --- a/drivers/net/virtio_net.c
>>> +++ b/drivers/net/virtio_net.c
>>> @@ -918,9 +918,6 @@ static void *virtnet_rq_alloc(struct receive_queue *rq, u32 size, gfp_t gfp)
>>>   	void *buf, *head;
>>>   	dma_addr_t addr;
>>>   
>>> -	if (unlikely(!skb_page_frag_refill(size, alloc_frag, gfp)))
>>> -		return NULL;
>>> -
>>>   	head = page_address(alloc_frag->page);
>>>   
>>>   	dma = head;
>>> @@ -2421,6 +2418,9 @@ static int add_recvbuf_small(struct virtnet_info *vi, struct receive_queue *rq,
>>>   	len = SKB_DATA_ALIGN(len) +
>>>   	      SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>>>   
>>> +	if (unlikely(!skb_page_frag_refill(len, &rq->alloc_frag, gfp)))
>>> +		return -ENOMEM;
>>> +
>>>   	buf = virtnet_rq_alloc(rq, len, gfp);
>>>   	if (unlikely(!buf))
>>>   		return -ENOMEM;
>>> @@ -2521,6 +2521,12 @@ static int add_recvbuf_mergeable(struct virtnet_info *vi,
>>>   	 */
>>>   	len = get_mergeable_buf_len(rq, &rq->mrg_avg_pkt_len, room);
>>>   
>>> +	if (unlikely(!skb_page_frag_refill(len + room, alloc_frag, gfp)))
>>> +		return -ENOMEM;
>>> +
>>> +	if (!alloc_frag->offset && len + room + sizeof(struct virtnet_rq_dma) > alloc_frag->size)
>>> +		len -= sizeof(struct virtnet_rq_dma);
>>> +
>>>   	buf = virtnet_rq_alloc(rq, len + room, gfp);
>>>   	if (unlikely(!buf))
>>>   		return -ENOMEM;
>>> -- 
>>> 2.32.0.3.g01195cf9f