Message-Id: <D74C05BD-EC64-4A24-B7D8-E126056E831A@gmail.com>
Date: Wed, 30 Apr 2025 12:12:48 -0700
From: Thomas Haynes <loghyr@...il.com>
To: linux-kernel@...r.kernel.org
Subject: [QUESTION] io_uring: Handling -EAGAIN and potential duplicate
 submissions

Hi LKML,

I am using kernel version 6.14.4-300.fc42.x86_64 and handling NFSv3
RPC requests in a userland server.

I'm working with io_uring and have a question about the correct way
to handle -EAGAIN from io_uring_submit(), specifically to avoid
potential duplicate submissions.

I have a submission loop that looks like this:

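    for (int i = 0; i < MAX_RETRIES; i++) {
        ret = io_uring_submit(ring);
        if (ret >= 0)
            break;      /* success: ret is the number of SQEs submitted */
        if (ret != -EAGAIN)
            break;      /* any other error: do not retry */

        /* -EAGAIN: back off briefly, then retry the submit */
        TRACE(write_fragment_trace,
              "Context=%p resubmission %d", (void *)ic, i);
        usleep(IO_URING_WAIT_US);
    }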

My understanding is that -EAGAIN from io_uring_submit() indicates
that the kernel's submission queue was temporarily full and the
submission should be retried. However, I'm observing a behavior
that suggests a potential for duplicate operations:

  * I submit a request.

  * io_uring_submit() returns -EAGAIN. The SQE remains in the 
  submission queue.

  * I retry the io_uring_submit().

  * Eventually, io_uring_submit() returns a positive value.

It appears that both the original SQE (from the -EAGAIN case) and
the SQE submitted in the successful call are processed, so the
operation is performed twice. This also causes a heap-use-after-free,
because I release the associated memory after processing the first
CQE.
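
To confirm that one request really does complete twice, a small
bookkeeping table could flag any CQE whose request has already been
completed before the memory is freed. This is only a rough sketch: it
assumes each SQE's user_data is set to a small integer id (for example
with io_uring_sqe_set_data64()), and the names req_table, req_alloc,
req_complete, req_release, and MAX_INFLIGHT are hypothetical helpers,
not part of liburing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_INFLIGHT 64            /* hypothetical table size */

enum req_state { REQ_FREE = 0, REQ_SUBMITTED, REQ_COMPLETED };

struct req_table {
    enum req_state state[MAX_INFLIGHT];
    unsigned duplicates;           /* CQEs seen for an id that was not in flight */
};

/* Reserve an id before preparing the SQE; the id would be stored in
 * the SQE's user_data. Returns -1 when the table is full. */
static int req_alloc(struct req_table *t)
{
    for (int i = 0; i < MAX_INFLIGHT; i++) {
        if (t->state[i] == REQ_FREE) {
            t->state[i] = REQ_SUBMITTED;
            return i;
        }
    }
    return -1;
}

/* Call once per CQE with id taken from cqe->user_data. Returns true
 * only for the first completion of an in-flight id, which is the only
 * time the caller should free the request's memory. */
static bool req_complete(struct req_table *t, uint64_t id)
{
    assert(id < MAX_INFLIGHT);
    if (t->state[id] != REQ_SUBMITTED) {
        t->duplicates++;           /* duplicate or unexpected completion */
        return false;
    }
    t->state[id] = REQ_COMPLETED;
    return true;
}

/* Recycle an id once no further completions for it are expected. */
static void req_release(struct req_table *t, uint64_t id)
{
    assert(id < MAX_INFLIGHT);
    t->state[id] = REQ_FREE;
}
```

With this in place, a non-zero duplicates counter after a run that hit
-EAGAIN would show that the same id was completed more than once.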

This raises a few questions:

  * Is this behavior expected? Does -EAGAIN from io_uring_submit()
  imply that the SQE may already have been partially processed
  or queued for processing, even though the submit call itself
  failed?

  * If this is expected, what is the recommended way to handle 
  -EAGAIN to guarantee that each SQE is submitted and processed
  exactly once, even under temporary queue pressure? Should I be
  modifying the SQE or the submission queue in some way before
  retrying?

  * Are there any specific io_uring setup flags or other considerations 
  that might influence this behavior?

I'm concerned about the potential for data corruption or other 
issues if operations are performed multiple times.

Any insights or best practices on handling -EAGAIN in this context
would be greatly appreciated.

Thanks,
Tom Haynes


