[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b11693ec-5d08-62e7-7479-a631edd5b1ce@suse.com>
Date: Wed, 13 Jul 2022 11:31:06 +0200
From: Juergen Gross <jgross@...e.com>
To: Jan Beulich <jbeulich@...e.com>
Cc: Wei Liu <wei.liu@...nel.org>, Paul Durrant <paul@....org>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
xen-devel@...ts.xenproject.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] xen/netback: handle empty rx queue in
xenvif_rx_next_skb()
On 13.07.22 09:59, Jan Beulich wrote:
> On 13.07.2022 09:48, Juergen Gross wrote:
>> xenvif_rx_next_skb() is expecting the rx queue not being empty, but
>> in case the loop in xenvif_rx_action() is doing multiple iterations,
>> the availability of another skb in the rx queue is not being checked.
>>
>> This can lead to crashes:
>>
>> [40072.537261] BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
>> [40072.537407] IP: xenvif_rx_skb+0x23/0x590 [xen_netback]
>> [40072.537534] PGD 0 P4D 0
>> [40072.537644] Oops: 0000 [#1] SMP NOPTI
>> [40072.537749] CPU: 0 PID: 12505 Comm: v1-c40247-q2-gu Not tainted 4.12.14-122.121-default #1 SLE12-SP5
>> [40072.537867] Hardware name: HP ProLiant DL580 Gen9/ProLiant DL580 Gen9, BIOS U17 11/23/2021
>> [40072.537999] task: ffff880433b38100 task.stack: ffffc90043d40000
>> [40072.538112] RIP: e030:xenvif_rx_skb+0x23/0x590 [xen_netback]
>> [40072.538217] RSP: e02b:ffffc90043d43de0 EFLAGS: 00010246
>> [40072.538319] RAX: 0000000000000000 RBX: ffffc90043cd7cd0 RCX: 00000000000000f7
>> [40072.538430] RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffffc90043d43df8
>> [40072.538531] RBP: 000000000000003f R08: 000077ff80000000 R09: 0000000000000008
>> [40072.538644] R10: 0000000000007ff0 R11: 00000000000008f6 R12: ffffc90043ce2708
>> [40072.538745] R13: 0000000000000000 R14: ffffc90043d43ed0 R15: ffff88043ea748c0
>> [40072.538861] FS: 0000000000000000(0000) GS:ffff880484600000(0000) knlGS:0000000000000000
>> [40072.538988] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [40072.539088] CR2: 0000000000000080 CR3: 0000000407ac8000 CR4: 0000000000040660
>> [40072.539211] Call Trace:
>> [40072.539319] xenvif_rx_action+0x71/0x90 [xen_netback]
>> [40072.539429] xenvif_kthread_guest_rx+0x14a/0x29c [xen_netback]
>>
>> Fix that by stopping the loop in case the rx queue becomes empty.
>>
>> Signed-off-by: Juergen Gross <jgross@...e.com>
>
> Reviewed-by: Jan Beulich <jbeulich@...e.com>
>
> Does this want a Fixes: tag and Cc: to stable@ (not the least since as per
> above the issue was noticed with 4.12.x)?
Hmm, I _think_ the issue was introduced with eb1723a29b9a. Do you agree?
>
>> --- a/drivers/net/xen-netback/rx.c
>> +++ b/drivers/net/xen-netback/rx.c
>> @@ -495,6 +495,7 @@ void xenvif_rx_action(struct xenvif_queue *queue)
>> queue->rx_copy.completed = &completed_skbs;
>>
>> while (xenvif_rx_ring_slots_available(queue) &&
>> + !skb_queue_empty(&queue->rx_queue) &&
>> work_done < RX_BATCH_SIZE) {
>> xenvif_rx_skb(queue);
>> work_done++;
>
> I have to admit that I find the title a little misleading - you don't
> deal with the issue _in_ xenvif_rx_next_skb(); you instead avoid
> entering the function in such a case.
I'm handling the issue to avoid "an empty rx queue in xenvif_rx_next_skb()".
I can rephrase it to "avoid entering xenvif_rx_next_skb() with an empty rx
queue".
Juergen
Download attachment "OpenPGP_0xB0DE9DD628BF132F.asc" of type "application/pgp-keys" (3099 bytes)
Download attachment "OpenPGP_signature" of type "application/pgp-signature" (496 bytes)
Powered by blists - more mailing lists