netdev - RE: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails


Message-ID: <34EFBCA9F01B0748BEB6B629CE643AE60DB8D985@DGGEMM533-MBX.china.huawei.com>
Date:   Wed, 23 Dec 2020 02:46:20 +0000
From:   wangyunjian <wangyunjian@...wei.com>
To:     Jason Wang <jasowang@...hat.com>,
        Willem de Bruijn <willemdebruijn.kernel@...il.com>
CC:     Network Development <netdev@...r.kernel.org>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        "virtualization@...ts.linux-foundation.org" 
        <virtualization@...ts.linux-foundation.org>,
        "Lilijun (Jerry)" <jerry.lilijun@...wei.com>,
        chenchanghu <chenchanghu@...wei.com>,
        xudingke <xudingke@...wei.com>,
        "huangbin (J)" <brian.huangbin@...wei.com>
Subject: RE: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg
 fails



> -----Original Message-----
> From: Jason Wang [mailto:jasowang@...hat.com]
> Sent: Tuesday, December 22, 2020 12:41 PM
> To: Willem de Bruijn <willemdebruijn.kernel@...il.com>; wangyunjian
> <wangyunjian@...wei.com>
> Cc: Network Development <netdev@...r.kernel.org>; Michael S. Tsirkin
> <mst@...hat.com>; virtualization@...ts.linux-foundation.org; Lilijun (Jerry)
> <jerry.lilijun@...wei.com>; chenchanghu <chenchanghu@...wei.com>;
> xudingke <xudingke@...wei.com>; huangbin (J)
> <brian.huangbin@...wei.com>
> Subject: Re: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails
> 
> 
> On 2020/12/22 7:07 AM, Willem de Bruijn wrote:
> > On Wed, Dec 16, 2020 at 3:20 AM wangyunjian<wangyunjian@...wei.com>
> wrote:
> >> From: Yunjian Wang<wangyunjian@...wei.com>
> >>
> >> Currently we break the loop and wake up the vhost_worker when sendmsg
> >> fails. When the worker wakes up again, we'll meet the same error.
> > The patch is based on the assumption that such error cases always
> > return EAGAIN. Can it not also be ENOMEM, such as from tun_build_skb?
> >
> >> This will cause high CPU load. To fix this issue, we can skip this
> >> description by ignoring the error. When we exceeds sndbuf, the return
> >> value of sendmsg is -EAGAIN. In the case we don't skip the
> >> description and don't drop packet.
> > the -> that
> >
> > here and above: description -> descriptor
> >
> > Perhaps slightly revise to more explicitly state that
> >
> > 1. in the case of persistent failure (i.e., bad packet), the driver
> >    drops the packet
> > 2. in the case of transient failure (e.g., memory pressure), the
> >    driver schedules the worker to try again later
> 
> 
> If we want to go this way, we need a better time to wake up the worker.
> Otherwise it just puts more stress on the CPU, which is what this patch
> tries to avoid.

The problem was initially discovered when a VM sent an abnormal packet,
which left the VM unable to send any more packets. After commit
feb8892cb441c7 ("vhost_net: conditionally enable tx polling"), there have
also been high CPU consumption issues.

The first problem is the one I am actually more concerned with and want
to solve.
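
For illustration only, the persistent/transient split suggested in the
quoted review could be sketched roughly as below. This is a standalone
user-space sketch, not the vhost_net code: the names tx_action and
classify_tx_error are hypothetical, and treating -EAGAIN, -ENOMEM and
-ENOBUFS as the transient cases is an assumption.

```c
#include <errno.h>

/* Hypothetical action names, for illustration only. */
enum tx_action {
	TX_DONE,	/* sendmsg succeeded, descriptor is consumed */
	TX_RETRY,	/* transient failure: keep descriptor, retry later */
	TX_DROP		/* persistent failure: consume descriptor, drop packet */
};

/*
 * Classify a sendmsg()-style return value (negative errno on failure).
 * Which errors count as transient is an assumption of this sketch.
 */
static enum tx_action classify_tx_error(int err)
{
	if (err >= 0)
		return TX_DONE;

	switch (err) {
	case -EAGAIN:	/* e.g. sndbuf exceeded */
	case -ENOMEM:	/* e.g. memory pressure */
	case -ENOBUFS:
		return TX_RETRY;
	default:	/* e.g. a malformed packet */
		return TX_DROP;
	}
}
```

With such a split, an abnormal packet would be dropped so that the queue
can make progress, while the retry path would still need a sensible
point at which to wake the worker again.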

Thanks

> 
> Thanks