Date:   Fri, 7 Feb 2020 17:51:36 +0100
From:   Stefano Garzarella <sgarzare@...hat.com>
To:     Jens Axboe <axboe@...nel.dk>
Cc:     linux-kernel@...r.kernel.org, io-uring@...r.kernel.org
Subject: Re: [PATCH liburing v2 0/1] test: add epoll test case

On Fri, Jan 31, 2020 at 08:39:46AM -0700, Jens Axboe wrote:
> On 1/31/20 7:29 AM, Stefano Garzarella wrote:
> > Hi Jens,
> > this is a v2 of the epoll test.
> > 
> > v1 -> v2:
> >     - if IORING_FEAT_NODROP is not available, avoid to overflow the CQ
> >     - add 2 new tests to test epoll with IORING_FEAT_NODROP
> >     - cleanups
> > 
> > There are 4 sub-tests:
> >     1. test_epoll
> >     2. test_epoll_sqpoll
> >     3. test_epoll_nodrop
> >     4. test_epoll_sqpoll_nodrop
> > 
> > In the first 2 tests, I try to avoid to queue more requests than we have room
> > for in the CQ ring. These work fine, I have no faults.
> 
> Thanks!
> 
> > In the tests 3 and 4, if IORING_FEAT_NODROP is supported, I try to submit as
> > much as I can until I get a -EBUSY, but they often fail in this way:
> > the submitter manages to submit everything, the receiver receives all the
> > submitted bytes, but the cleaner loses completion events (I also tried to put a
> > timeout to epoll_wait() in the cleaner to be sure that it is not related to the
> > patch that I send some weeks ago, but the situation doesn't change, it's like
> > there is still overflow in the CQ).
> > 
> > Next week I'll try to investigate better which is the problem.
> 
> Does it change if you have an io_uring_enter() with GETEVENTS set? I wonder if
> you just pruned the CQ ring but didn't flush the internal side.
> 

Just an update: with the "io_uring: flush overflowed CQ events in the
io_uring_poll()" patch applied, test 3 works well.
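
For readers following the thread, the "pruned the CQ ring but didn't flush
the internal side" situation can be illustrated with a toy model. This is
only a sketch under stated assumptions, not kernel or liburing code: the
struct, the model_* helpers and the ring size of 4 are all hypothetical, and
the overflow list is reduced to a counter.

```c
#include <assert.h>
#include <errno.h>

#define CQ_ENTRIES 4u   /* hypothetical CQ ring size for this model */

/* Toy model, not kernel code: counters standing in for the CQ ring
 * and for the kernel-internal overflow list used with NODROP. */
struct model_ring {
    unsigned cq_count;  /* completions currently visible in the CQ ring */
    unsigned backlog;   /* completions parked on the internal overflow list */
};

/* "Kernel" posts one completion: into the ring if there is room,
 * otherwise onto the internal backlog instead of dropping it. */
static void model_post_cqe(struct model_ring *r)
{
    if (r->cq_count < CQ_ENTRIES)
        r->cq_count++;
    else
        r->backlog++;
}

/* Move backlog entries into the ring while there is room. */
static void model_flush(struct model_ring *r)
{
    while (r->backlog > 0 && r->cq_count < CQ_ENTRIES) {
        r->backlog--;
        r->cq_count++;
    }
}

/* Application consumes up to n completions from the ring only;
 * returns how many it got. It does not touch the backlog. */
static unsigned model_reap(struct model_ring *r, unsigned n)
{
    unsigned got = n < r->cq_count ? n : r->cq_count;
    r->cq_count -= got;
    assert(r->cq_count <= CQ_ENTRIES);
    return got;
}

/* Submission attempt: tries a flush first and reports -EBUSY
 * while completions are still stuck on the backlog. */
static int model_submit(struct model_ring *r)
{
    model_flush(r);
    return r->backlog ? -EBUSY : 0;
}
```

In this model, posting 6 completions into a 4-entry ring leaves 2 on the
backlog. Reaping the 4 visible ones makes the ring look empty while 2
completions are still pending, and they only appear after model_flush()
runs. A waiter that only looks at the ring can therefore appear to lose
events unless something triggers the flush.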

Now the problem is test 4 (with sqpoll). It works in most cases, but it
occasionally fails in this way:
- the submitter freezes after submitting X requests
- the cleaner and the consumer see X-2 requests (2 are the entries in
  the queue)

I tried adding a timeout to the submitter's epoll_wait() and calling
io_uring_submit() to wake up the kthread (in case we lose some notifications),
but the problem seems to be somewhere else. I suspect there is a race somewhere.

Any suggestions on how to debug this case?
I'll try with tracing.

Thanks,
Stefano
