[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220928151135.pvrlsylg6j3hzh74@sgarzare-redhat>
Date: Wed, 28 Sep 2022 17:11:35 +0200
From: Stefano Garzarella <sgarzare@...hat.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: Junichi Uekawa <uekawa@...omium.org>,
Stefan Hajnoczi <stefanha@...hat.com>,
Jason Wang <jasowang@...hat.com>,
Eric Dumazet <edumazet@...gle.com>, davem@...emloft.net,
netdev@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>,
virtualization@...ts.linux-foundation.org, kvm@...r.kernel.org,
Paolo Abeni <pabeni@...hat.com>, linux-kernel@...r.kernel.org,
Bobby Eshleman <bobby.eshleman@...il.com>
Subject: Re: [PATCH] vhost/vsock: Use kvmalloc/kvfree for larger packets.
On Wed, Sep 28, 2022 at 05:31:58AM -0400, Michael S. Tsirkin wrote:
>On Wed, Sep 28, 2022 at 10:28:23AM +0200, Stefano Garzarella wrote:
>> On Wed, Sep 28, 2022 at 03:45:38PM +0900, Junichi Uekawa wrote:
>> > When copying a large file over sftp over vsock, data size is usually 32kB,
>> > and kmalloc seems to fail to try to allocate 32 32kB regions.
>> >
>> > Call Trace:
>> > [<ffffffffb6a0df64>] dump_stack+0x97/0xdb
>> > [<ffffffffb68d6aed>] warn_alloc_failed+0x10f/0x138
>> > [<ffffffffb68d868a>] ? __alloc_pages_direct_compact+0x38/0xc8
>> > [<ffffffffb664619f>] __alloc_pages_nodemask+0x84c/0x90d
>> > [<ffffffffb6646e56>] alloc_kmem_pages+0x17/0x19
>> > [<ffffffffb6653a26>] kmalloc_order_trace+0x2b/0xdb
>> > [<ffffffffb66682f3>] __kmalloc+0x177/0x1f7
>> > [<ffffffffb66e0d94>] ? copy_from_iter+0x8d/0x31d
>> > [<ffffffffc0689ab7>] vhost_vsock_handle_tx_kick+0x1fa/0x301 [vhost_vsock]
>> > [<ffffffffc06828d9>] vhost_worker+0xf7/0x157 [vhost]
>> > [<ffffffffb683ddce>] kthread+0xfd/0x105
>> > [<ffffffffc06827e2>] ? vhost_dev_set_owner+0x22e/0x22e [vhost]
>> > [<ffffffffb683dcd1>] ? flush_kthread_worker+0xf3/0xf3
>> > [<ffffffffb6eb332e>] ret_from_fork+0x4e/0x80
>> > [<ffffffffb683dcd1>] ? flush_kthread_worker+0xf3/0xf3
>> >
>> > Work around by doing kvmalloc instead.
>> >
>> > Signed-off-by: Junichi Uekawa <uekawa@...omium.org>
>
>My worry here is that this in more of a work around.
>It would be better to not allocate memory so aggressively:
>if we are so short on memory we should probably process
>packets one at a time. Is that very hard to implement?
Currently the "virtio_vsock_pkt" is allocated in the "handle_kick"
callback of TX virtqueue. Then the packet is multiplexed on the right
socket queue, then the user space can de-queue it whenever they want.
So maybe we can stop processing the virtqueue if we are short on memory,
but when can we restart the TX virtqueue processing?
I think as long as the guest used only 4K buffers we had no problem, but
now that it can create larger buffers the host may not be able to
allocate it contiguously. Since there is no need to have them contiguous
here, I think this patch is okay.
However, if we switch to sk_buff (as Bobby is already doing), maybe we
don't have this problem because I think there is some kind of
pre-allocated pool.
>
>
>
>> > ---
>> >
>> > drivers/vhost/vsock.c | 2 +-
>> > net/vmw_vsock/virtio_transport_common.c | 2 +-
>> > 2 files changed, 2 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
>> > index 368330417bde..5703775af129 100644
>> > --- a/drivers/vhost/vsock.c
>> > +++ b/drivers/vhost/vsock.c
>> > @@ -393,7 +393,7 @@ vhost_vsock_alloc_pkt(struct vhost_virtqueue *vq,
>> > return NULL;
>> > }
>> >
>> > - pkt->buf = kmalloc(pkt->len, GFP_KERNEL);
>> > + pkt->buf = kvmalloc(pkt->len, GFP_KERNEL);
>> > if (!pkt->buf) {
>> > kfree(pkt);
>> > return NULL;
>> > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>> > index ec2c2afbf0d0..3a12aee33e92 100644
>> > --- a/net/vmw_vsock/virtio_transport_common.c
>> > +++ b/net/vmw_vsock/virtio_transport_common.c
>> > @@ -1342,7 +1342,7 @@ EXPORT_SYMBOL_GPL(virtio_transport_recv_pkt);
>> >
>> > void virtio_transport_free_pkt(struct virtio_vsock_pkt *pkt)
>> > {
>> > - kfree(pkt->buf);
>> > + kvfree(pkt->buf);
>>
>> virtio_transport_free_pkt() is used also in virtio_transport.c and
>> vsock_loopback.c where pkt->buf is allocated with kmalloc(), but IIUC
>> kvfree() can be used with that memory, so this should be fine.
>>
>> > kfree(pkt);
>> > }
>> > EXPORT_SYMBOL_GPL(virtio_transport_free_pkt);
>> > --
>> > 2.37.3.998.g577e59143f-goog
>> >
>>
>> This issue should go away with the Bobby's work about introducing sk_buff
>> [1], but we can queue this for now.
>>
>> I'm not sure if we should do the same also in the virtio-vsock driver
>> (virtio_transport.c). Here in vhost-vsock the buf allocated is only used in
>> the host, while in the virtio-vsock driver the buffer is exposed to the
>> device emulated in the host, so it should be physically contiguous (if not,
>> maybe we need to adjust virtio_vsock_rx_fill()).
>
>More importantly it needs to support DMA API which IIUC kvmalloc
>memory does not.
>
Right, good point!
Thanks,
Stefano
Powered by blists - more mailing lists