Date:   Mon, 15 Aug 2022 11:40:59 -0400
From:   "Michael S. Tsirkin" <mst@...hat.com>
To:     Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
        Jason Wang <jasowang@...hat.com>,
        Andres Freund <andres@...razel.de>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        virtualization@...ts.linux-foundation.org, netdev@...r.kernel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Jens Axboe <axboe@...nel.dk>,
        James Bottomley <James.Bottomley@...senpartnership.com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        Guenter Roeck <linux@...ck-us.net>,
        linux-kernel@...r.kernel.org, Greg KH <gregkh@...uxfoundation.org>,
        c@...hat.com
Subject: Re: upstream kernel crashes

On Mon, Aug 15, 2022 at 01:34:41AM -0700, Andres Freund wrote:
> Hi, 
> 
> On August 15, 2022 1:28:29 AM PDT, "Michael S. Tsirkin" <mst@...hat.com> wrote:
> >On Mon, Aug 15, 2022 at 01:15:27AM -0700, Andres Freund wrote:
> >> Hi,
> >> 
> >> On 2022-08-15 03:51:34 -0400, Michael S. Tsirkin wrote:
> >> > It is possible that GCP gets confused if ring size is smaller than the
> >> > device maximum simply because no one did it in the past.
> >> > 
> >> > So I pushed just the revert of 762faee5a267 to the test branch.
> >> > Could you give it a spin?
> >> 
> >> Seems to fix the issue, at least to the extent I can determine at 1am... :)
> >> 
> >> Greetings,
> >> 
> >> Andres Freund
> >
> >So you tested this:
> >
> >commit 13df5a7eaeb22561d39354b576bc98a7e2c389f9 (HEAD, kernel.org/test)
> >Author: Michael S. Tsirkin <mst@...hat.com>
> >Date:   Mon Aug 15 03:44:38 2022 -0400
> >
> >    Revert "virtio_net: set the default max ring size by find_vqs()"
> >    
> >    This reverts commit 762faee5a2678559d3dc09d95f8f2c54cd0466a7.
> >    
> >    Signed-off-by: Michael S. Tsirkin <mst@...hat.com>
> >
> >and it fixes both issues right? No crashes no networking issue?
> 
> Correct. I only did limited testing, but it's survived far longer / more reboots than anything since the commit.
> 
> Andres
> -- 
> Sent from my Android device with K-9 Mail. Please excuse my brevity.


OK, so this gives us a quick revert as a solution for now.
Next, I would appreciate it if you could try this simple hack.
If it crashes, we either have a long-standing problem in the virtio
code or, more likely, a GCP bug where it can't handle rings smaller
than what the device requests.
Thanks!

diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index f7965c5dd36b..bdd5f481570b 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -314,6 +314,9 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
 	if (!size || size > num)
 		size = num;
 
+	if (size > 1024)
+		size = 1024;
+
 	if (size & (size - 1)) {
 		dev_warn(&vp_dev->pci_dev->dev, "bad queue size %u", size);
 		return ERR_PTR(-EINVAL);
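
For anyone who wants to see what the hack does to the ring size without
booting a kernel, here is a rough userspace sketch of the same selection
logic. The names pick_ring_size(), is_power_of_two() and HACK_MAX_RING_SIZE
are made up for illustration and are not kernel identifiers, and returning 0
stands in for the ERR_PTR(-EINVAL) path. It only approximates the few lines
of setup_vq() visible in the hunk above, nothing more.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the cap hard-coded by the test hack (1024). */
#define HACK_MAX_RING_SIZE 1024u

/* True when n has exactly one bit set. */
static bool is_power_of_two(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * requested:  ring size asked for by the driver, 0 meaning "no preference"
 * device_max: maximum queue size reported by the device ("num" in the hunk)
 *
 * Returns the ring size that would be used, or 0 where this sketch
 * rejects the size (the kernel code returns ERR_PTR(-EINVAL) for a
 * size that is not a power of two).
 */
static uint32_t pick_ring_size(uint32_t requested, uint32_t device_max)
{
	uint32_t size = requested;

	/* Existing behaviour: fall back to the device maximum. */
	if (!size || size > device_max)
		size = device_max;

	/* The hack: never use a ring larger than 1024 entries. */
	if (size > HACK_MAX_RING_SIZE)
		size = HACK_MAX_RING_SIZE;

	return is_power_of_two(size) ? size : 0;
}
```

So a device advertising 4096 entries ends up with a 1024-entry ring, which
is the "ring smaller than the device maximum" case we want to exercise,
while a device advertising 256 is left alone.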


-- 
MST
