netdev - Re: Re: [RFC v3 01/11] eventfd: track eventfd_signal() recursion depth separately in different cases

Message-ID: <CACycT3ukfPjnD+o0_xkq9Y9cwDxQUj1dmuuwuVdQvKywjQhRjA@mail.gmail.com>
Date:   Thu, 28 Jan 2021 14:08:58 +0800
From:   Yongji Xie <xieyongji@...edance.com>
To:     Jason Wang <jasowang@...hat.com>
Cc:     "Michael S. Tsirkin" <mst@...hat.com>,
        Stefan Hajnoczi <stefanha@...hat.com>,
        Stefano Garzarella <sgarzare@...hat.com>,
        Parav Pandit <parav@...dia.com>, Bob Liu <bob.liu@...cle.com>,
        Christoph Hellwig <hch@...radead.org>,
        Randy Dunlap <rdunlap@...radead.org>,
        Matthew Wilcox <willy@...radead.org>, viro@...iv.linux.org.uk,
        Jens Axboe <axboe@...nel.dk>, bcrl@...ck.org,
        Jonathan Corbet <corbet@....net>,
        virtualization@...ts.linux-foundation.org, netdev@...r.kernel.org,
        kvm@...r.kernel.org, linux-aio@...ck.org,
        linux-fsdevel@...r.kernel.org
Subject: Re: Re: [RFC v3 01/11] eventfd: track eventfd_signal() recursion
 depth separately in different cases

On Thu, Jan 28, 2021 at 12:31 PM Jason Wang <jasowang@...hat.com> wrote:
>
>
> On 2021/1/28 11:52 AM, Yongji Xie wrote:
> > On Thu, Jan 28, 2021 at 11:05 AM Jason Wang <jasowang@...hat.com> wrote:
> >>
> >> On 2021/1/27 5:11 PM, Yongji Xie wrote:
> >>> On Wed, Jan 27, 2021 at 11:38 AM Jason Wang <jasowang@...hat.com> wrote:
> >>>> On 2021/1/20 2:52 PM, Yongji Xie wrote:
> >>>>> On Wed, Jan 20, 2021 at 12:24 PM Jason Wang <jasowang@...hat.com> wrote:
> >>>>>> On 2021/1/19 12:59 PM, Xie Yongji wrote:
> >>>>>>> Now we have a global percpu counter to limit the recursion depth
> >>>>>>> of eventfd_signal(). This can avoid deadlock or stack overflow.
> >>>>>>> But in the stack overflow case, it should be OK to increase the
> >>>>>>> recursion depth if needed. So we add a percpu counter in eventfd_ctx
> >>>>>>> to limit the recursion depth for the deadlock case. Then it could be
> >>>>>>> fine to raise the limit on the global percpu counter later.
> >>>>>> I wonder whether it's worth introducing a percpu counter for each
> >>>>>> eventfd.
> >>>>>>
> >>>>>> How about simply checking whether eventfd_signal_count() is greater
> >>>>>> than 2?
> >>>>>>
> >>>>> That would not avoid the deadlock.
> >>>> I may be missing something, but the count is there to avoid recursive
> >>>> eventfd calls. So for VDUSE, what we suffer from is, e.g., the
> >>>> interrupt injection path:
> >>>>
> >>>> userspace write IRQFD -> vq->cb() -> another IRQFD.
> >>>>
> >>>> It looks like increasing EVENTFD_WAKEUP_DEPTH should be sufficient?
> >>>>
> >>> Actually, I mean the deadlock described in commit f0b493e ("io_uring:
> >>> prevent potential eventfd recursion on poll"). Simply increasing
> >>> EVENTFD_WAKEUP_DEPTH would break that bug fix.
> >>
> >> OK, so can we do something similar to that commit (using async
> >> mechanisms such as a wq)?
> >>
> > We can do that, but it would reduce performance, because eventfd
> > recursion is triggered every time KVM kicks the eventfd in
> > vhost-vdpa cases:
> >
> > KVM write KICKFD -> ops->kick_vq -> VDUSE write KICKFD
> >
> > Thanks,
> > Yongji
>
>
> Right, I think in the future we need to find a way to let KVM wake up
> VDUSE directly.
>

Yes, this would be better.

Thanks,
Yongji
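
For readers following the thread, the two-counter idea from the patch
description can be sketched in plain user-space C. This is only an
illustration, not the patch itself: the names (sketch_ctx, sketch_signal),
the limit values, and the use of ordinary ints in place of per-CPU counters
are all assumptions made for the example. The global counter bounds total
nesting (the stack overflow case), while the per-context counter refuses
re-entry into the same context (the deadlock case).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limits, chosen only for illustration. */
#define GLOBAL_MAX_DEPTH 2  /* bounds total nesting (stack overflow case) */
#define CTX_MAX_DEPTH    1  /* bounds re-entry into one context (deadlock case) */

/* Stand-in for an eventfd context; not the kernel's struct eventfd_ctx. */
struct sketch_ctx {
    int depth;               /* models a per-CPU counter kept in the context */
    struct sketch_ctx *next; /* context signalled from the wakeup path, or NULL */
    int delivered;           /* number of signals actually delivered */
};

/* Models the global per-CPU counter (single-threaded sketch). */
static int global_depth;

/*
 * Signal ctx, then signal whatever its wakeup path chains to.
 * Returns false if any level of the chain was refused by a depth check.
 */
static bool sketch_signal(struct sketch_ctx *ctx)
{
    bool ok = true;

    /* Refuse if overall nesting is too deep or this context is re-entered. */
    if (global_depth >= GLOBAL_MAX_DEPTH || ctx->depth >= CTX_MAX_DEPTH)
        return false;

    global_depth++;
    ctx->depth++;
    ctx->delivered++;

    /* The wakeup callback may signal another (or the same) context. */
    if (ctx->next)
        ok = sketch_signal(ctx->next);

    ctx->depth--;
    global_depth--;
    return ok;
}
```

With these assumed limits, a chain through two different contexts (like
"KVM write KICKFD -> ops->kick_vq -> VDUSE write KICKFD") is delivered at
both levels, while a context whose wakeup path signals itself is refused on
the second entry.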