Message-ID: <CAHk-=wh0foAi-kPgNOq6qSHPgsfekT8N9_--usjiTynpQbqvRA@mail.gmail.com>
Date:   Thu, 16 Mar 2023 11:15:06 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     richard clark <richard.xnu.clark@...il.com>
Cc:     linux-kernel@...r.kernel.org
Subject: Re: Question about select and poll system call

On Mon, Mar 13, 2023 at 7:28 PM richard clark
<richard.xnu.clark@...il.com> wrote:
>
> There are two questions about these system calls:
> 1. According to https://pubs.opengroup.org/onlinepubs/7908799/xsh/select.html:
> ERRORS
> [EINVAL]
>       The nfds argument is less than 0 or greater than FD_SETSIZE.
> But the current implementation in Linux does this:
>        if (nfds > FD_SETSIZE)
>                nfds = FD_SETSIZE;
> What's the rationale behind this?

Basically, the value of FD_SETSIZE has changed, and different pieces
of the system have used different values over the years.

The exact value of FD_SETSIZE ends up actually depending on the
compile-time size of the "fd_set" type, and both the kernel and
glibc (and presumably other C library implementations) have changed
that size over time.

Just to give you a flavor of that history, 'select()' was implemented
back in early '92 in linux-0.12 (one of the greatest Linux releases of
all time - 0.12 was when Linux actually became *useful* to some
people).

And back then, we had this:

  typedef unsigned long fd_set;

which may seem a bit limiting today ("Only 32 bits??!?"), but to put
that in perspective, back then we also had this:

  #define NR_OPEN 20

and Linux-0.12 also made the *radical* change of raising NR_INODE from
32 to 64. Whee..

It was a very different time, in other words.

Now, imagine what happens when you increase those kinds of limits (as
we obviously did), and you do the library and kernel maintenance
separately. Some people might use a newer library with an older
kernel, and vice versa.

Doing that

         if (nfds > FD_SETSIZE)
                 nfds = FD_SETSIZE;

basically allows you to at least limp along in that situation, where
maybe the library uses a 'fd_set' with thousands of bits, but the
kernel has a smaller limit.
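
As a rough illustration of the difference, here is a user-space sketch
(not kernel code) that contrasts the behavior the quoted POSIX text
describes with the clamping shown above. The function names and the
kernel_max parameter are made up for this example; kernel_max simply
stands in for whatever limit the kernel side was built with.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of the behavior the quoted POSIX text describes:
 * an nfds outside the supported range is an error. */
int nfds_posix_style(int nfds, int kernel_max)
{
    if (nfds < 0 || nfds > kernel_max)
        return -EINVAL;
    return nfds;
}

/* Hypothetical model of the clamping approach: a negative nfds is still
 * an error, but an oversized one is silently cut down to the limit. */
int nfds_clamp_style(int nfds, int kernel_max)
{
    if (nfds < 0)
        return -EINVAL;
    if (nfds > kernel_max)
        nfds = kernel_max;
    return nfds;
}
```

With a library that believes in 4096 descriptors and a kernel_max of
1024, the first variant fails the call outright, while the second one
carries on and simply looks at the first 1024 bits.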

Because you *will* find user programs that basically do

          select(FD_SETSIZE, ...)

even if they don't actually use all those bits. Returning an error
because the C library had a different idea of how big the fd_set was
compared to the kernel would be bad.
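
A minimal sketch of that pattern, assuming a POSIX system: the helper
name wait_readable_lazy and the 100 ms timeout are invented for this
example. It watches exactly one descriptor but still passes FD_SETSIZE
as nfds instead of fd + 1.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd becomes readable within 100 ms,
 * 0 on timeout, -1 on error or if fd does not fit in an fd_set. */
int wait_readable_lazy(int fd)
{
    fd_set rfds;
    struct timeval tv = { 0, 100000 };   /* 100 ms */
    int n;

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    /* The "lazy" part: nfds is FD_SETSIZE, not fd + 1, even though
     * only a single bit in the set is ever used. */
    n = select(FD_SETSIZE, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &rfds)) ? 1 : 0;
}
```

A program like this only keeps working across library/kernel
mismatches if an oversized nfds is tolerated rather than rejected.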

Now, the above is the *historical* reason for this all. The kernel
hasn't actually changed FD_SETSIZE in decades. We could say "by now,
if you use FD_SETSIZE larger than 1024, we'll return an error instead
of just truncating it".

But at the same time, while time has passed and we could make those
kinds of decisions, by now the POSIX spec is almost immaterial, and
compatibility with older versions of Linux is more important than
POSIX paper compatibility.

So there just isn't any reason to change any more.

> 2. Can we unify the two different system calls? For example, using
> poll(...) to implement the select(...) frontend; is there
> something I'm missing about the current implementation?

No. select() and poll() are completely different animals. Trying to
unify them means having to convert from an array of file descriptors to
several arrays of bits. They are just very different interfaces.
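
To give a feel for the conversion involved, here is a user-space
sketch of one direction only: turning select()-style bit sets into a
poll()-style array. The helper name fdsets_to_pollfds is made up, and
the event mapping (read to POLLIN, write to POLLOUT, except to POLLPRI)
is a simplifying assumption rather than a description of what the
kernel does.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper: builds one pollfd entry for every descriptor
 * below nfds that is set in at least one of the three sets. Any set
 * may be NULL. 'out' must have room for nfds entries.
 * Returns the number of entries written. */
int fdsets_to_pollfds(int nfds, fd_set *rd, fd_set *wr, fd_set *ex,
                      struct pollfd *out)
{
    int count = 0;

    for (int fd = 0; fd < nfds && fd < FD_SETSIZE; fd++) {
        short events = 0;

        /* Assumed mapping from select() sets to poll() event bits. */
        if (rd && FD_ISSET(fd, rd))
            events |= POLLIN;
        if (wr && FD_ISSET(fd, wr))
            events |= POLLOUT;
        if (ex && FD_ISSET(fd, ex))
            events |= POLLPRI;

        if (events == 0)
            continue;

        out[count].fd = fd;
        out[count].events = events;
        out[count].revents = 0;
        count++;
    }
    return count;
}
```

A select() built on top of poll() would have to do this walk on entry
and then the reverse walk on exit, writing the results back into the
three bit sets in place.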

Inside the kernel, the low-level implementation as far as individual
file descriptors are concerned is all unified already. Once you just
deal with one single file descriptor, we internally use a "->poll()"
thing. But to *get* to that individual file descriptor, select() and
poll() walk very different data structures.
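
The shape of that split can be shown with a toy user-space model. This
is not kernel code: every name here (toy_file, toy_select, toy_poll,
TOY_IN, TOY_OUT) is invented for the example. Both front ends end up
calling the same per-file hook, but one gets there by walking a bitmap
and the other by walking an array.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TOY_IN  0x1u
#define TOY_OUT 0x2u

/* Toy stand-in for a file with a per-file readiness hook. */
struct toy_file {
    unsigned (*poll)(const struct toy_file *f);
    unsigned ready;                 /* readiness bits the hook reports */
};

unsigned toy_generic_poll(const struct toy_file *f)
{
    return f->ready;
}

struct toy_pollfd {
    int fd;
    unsigned events;
    unsigned revents;
};

/* select-style front end: walks a bitmap of descriptors and clears the
 * bits that are not readable. Returns the number of readable ones. */
int toy_select(struct toy_file *const *table, int nfds,
               unsigned long *in_bits)
{
    int max_bits = (int)(sizeof(unsigned long) * CHAR_BIT);
    int hits = 0;

    if (nfds > max_bits)
        nfds = max_bits;            /* clamp to what the bitmap can hold */

    for (int fd = 0; fd < nfds; fd++) {
        unsigned long bit = 1UL << fd;

        if (!(*in_bits & bit))
            continue;
        if (table[fd] && (table[fd]->poll(table[fd]) & TOY_IN))
            hits++;
        else
            *in_bits &= ~bit;
    }
    return hits;
}

/* poll-style front end: walks an array of (fd, events) entries and
 * fills in revents. Returns the number of entries with any event. */
int toy_poll(struct toy_file *const *table, int table_len,
             struct toy_pollfd *fds, int n)
{
    int hits = 0;

    for (int i = 0; i < n; i++) {
        int fd = fds[i].fd;

        fds[i].revents = 0;
        if (fd < 0 || fd >= table_len || !table[fd])
            continue;
        fds[i].revents = table[fd]->poll(table[fd]) & fds[i].events;
        if (fds[i].revents)
            hits++;
    }
    return hits;
}
```

The only shared piece is the call through the poll hook; everything
around it is specific to the data structure each interface hands in.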

                  Linus
