linux-kernel - [QUESTION] io_uring: Handling -EAGAIN and potential duplicate submissions

Message-Id: <D74C05BD-EC64-4A24-B7D8-E126056E831A@gmail.com>
Date: Wed, 30 Apr 2025 12:12:48 -0700
From: Thomas Haynes <loghyr@...il.com>
To: linux-kernel@...r.kernel.org
Subject: [QUESTION] io_uring: Handling -EAGAIN and potential duplicate
 submissions

Hi LKML,

I am using kernel version 6.14.4-300.fc42.x86_64 and handling NFSv3
RPC requests in a userland server.

I'm working with io_uring and have a question about the correct way
to handle -EAGAIN from io_uring_submit(), specifically to avoid
potential duplicate submissions.

I have a submission loop that looks like this:

    for (int i = 0; i < MAX_RETRIES; i++) {
        ret = io_uring_submit(ring);
        if (ret >= 0)
            break;
        if (ret == -EAGAIN) {
            TRACE(write_fragment_trace,
                  "Context=%p resubmission %d", (void *)ic, i);
            usleep(IO_URING_WAIT_US);
        } else
            break;
    }

My understanding is that -EAGAIN from io_uring_submit() indicates
that the kernel's submission queue was temporarily full and the
submission should be retried. However, I'm observing a behavior
that suggests a potential for duplicate operations:

  * I submit a request.

  * io_uring_submit() returns -EAGAIN. The SQE remains in the 
  submission queue.

  * I retry the io_uring_submit().

  * Eventually, io_uring_submit() returns a positive value.

It appears that both the original SQE (from the -EAGAIN case) and
the SQE submitted in the successful call are processed, so the
operation is performed twice. This also causes a heap-use-after-free:
I release the associated memory after processing the first CQE, and
the second CQE still refers to it.
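
To make the double completion visible without touching freed memory,
a per-request record could track how many CQEs have been seen. This is
only a debugging sketch: struct req_ctx and both helpers are
hypothetical names, and it assumes the record lives in a pool that is
not freed on the first completion and that its address is carried in
the SQE's user_data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-request record; its address would be stored in the
 * SQE's user_data and read back from each CQE. */
enum req_state { REQ_FREE = 0, REQ_IN_FLIGHT, REQ_COMPLETED };

struct req_ctx {
    uint64_t id;            /* hypothetical request identifier */
    enum req_state state;   /* where the request is in its lifetime */
    unsigned completions;   /* number of CQEs seen for this request */
};

/* Call once, right after the SQE for this request has been prepared. */
static void req_mark_queued(struct req_ctx *rc, uint64_t id)
{
    assert(rc != NULL);
    rc->id = id;
    rc->state = REQ_IN_FLIGHT;
    rc->completions = 0;
}

/* Call for every CQE. Returns true only for the first completion.
 * A further CQE for the same request is logged and returns false, so
 * the caller can skip releasing the buffers a second time. */
static bool req_on_completion(struct req_ctx *rc)
{
    assert(rc != NULL);
    rc->completions++;
    if (rc->state != REQ_IN_FLIGHT) {
        fprintf(stderr, "duplicate CQE for request %llu (seen %u)\n",
                (unsigned long long)rc->id, rc->completions);
        return false;
    }
    rc->state = REQ_COMPLETED;
    return true;
}
```

The caller would release the request's buffers only when
req_on_completion() returns true, and return the record itself to the
pool later. This detects the duplicate; it does not explain it.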

This raises a few questions:

  * Is this behavior expected? Does -EAGAIN in io_uring_submit() 
  imply that the SQE may or may not have been partially processed
  or queued for processing, even though the submit call itself
  failed?

  * If this is expected, what is the recommended way to handle 
  -EAGAIN to guarantee that each SQE is submitted and processed
  exactly once, even under temporary queue pressure? Should I be
  modifying the SQE or the submission queue in some way before
  retrying?

  * Are there any specific io_uring setup flags or other considerations 
  that might influence this behavior?
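
For discussion, the exactly-once invariant asked about above can be
written down as a small model. Everything here is a stand-in:
struct mock_ring, mock_queue_one() and mock_submit() are hypothetical
and do not call liburing. The mock leaves queued entries in place on
-EAGAIN, which mirrors the observation in this mail rather than any
documented kernel behavior. The point is only that the retry loop
repeats the submit call and never queues the entry again.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a ring. */
struct mock_ring {
    unsigned pending;    /* entries queued but not yet consumed */
    unsigned accepted;   /* total entries consumed so far */
    unsigned fail_next;  /* upcoming submit calls that return -EAGAIN */
};

/* Stand-in for preparing one SQE: done exactly once per request. */
static void mock_queue_one(struct mock_ring *r)
{
    r->pending++;
}

/* Stand-in for io_uring_submit(). On a simulated -EAGAIN the queued
 * entries are left in place (an assumption of this model). */
static int mock_submit(struct mock_ring *r)
{
    if (r->fail_next > 0) {
        r->fail_next--;
        return -EAGAIN;
    }
    unsigned n = r->pending;
    r->accepted += n;
    r->pending = 0;
    return (int)n;
}

typedef int (*submit_fn)(struct mock_ring *);

/* Retry only the submit call; nothing is re-queued inside the loop.
 * Returns the last result from submit (or -EAGAIN if never called). */
static int submit_with_retry(struct mock_ring *r, submit_fn submit,
                             int max_retries)
{
    int ret = -EAGAIN;

    assert(r != NULL && submit != NULL);
    for (int i = 0; i < max_retries; i++) {
        ret = submit(r);
        if (ret != -EAGAIN)
            break;
        /* real code would back off here, e.g. with usleep() */
    }
    return ret;
}
```

In this model, one queued entry followed by two simulated -EAGAIN
results and a successful submit ends with accepted == 1. If the real
ring ends up processing two entries in the same sequence, either the
entry is being queued twice on my side or the model's assumption about
-EAGAIN is wrong, and that is what I would like to understand.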

I'm concerned about the potential for data corruption or other 
issues if operations are performed multiple times.

Any insights or best practices on handling -EAGAIN in this context
would be greatly appreciated.

Thanks,
Tom Haynes