Message-Id: <20241124094813.1021293-1-alexjlzheng@tencent.com>
Date: Sun, 24 Nov 2024 17:48:13 +0800
From: Jinliang Zheng <alexjlzheng@...il.com>
To: viro@...iv.linux.org.uk
Cc: adobriyan@...il.com,
alexjlzheng@...il.com,
alexjlzheng@...cent.com,
brauner@...nel.org,
flyingpeng@...cent.com,
jack@...e.cz,
joel.granados@...nel.org,
kees@...nel.org,
linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org,
mcgrof@...nel.org
Subject: Re: [PATCH 0/6] Maintain the relative size of fs.file-max and fs.nr_open
On Sat, 23 Nov 2024 19:32:27 +0000, Al Viro wrote:
> On Sat, Nov 23, 2024 at 06:27:30PM +0000, Al Viro wrote:
> > On Sun, Nov 24, 2024 at 02:08:55AM +0800, Jinliang Zheng wrote:
> > > According to Documentation/admin-guide/sysctl/fs.rst, fs.nr_open and
> > > fs.file-max represent the number of file-handles that can be opened
> > > by each process and the entire system, respectively.
> > >
> > > Therefore, it's necessary to maintain a relative size between them,
> > > meaning we should ensure that files_stat.max_files is not less than
> > > sysctl_nr_open.
> >
> > NAK.
> >
> > You are confusing descriptors (nr_open) and open IO channels (max_files).
> >
> > We very well _CAN_ have more of the former. For further details,
> > RTFM dup(2) or any introductory Unix textbook.
>
> Short version: there are 3 different notions -
> 1) file as a collection of data kept by filesystem. Such things as
> contents, ownership, permissions, timestamps belong there.
> 2) IO channel used to access one of (1). open(2) creates such;
> things like current position in file, whether it's read-only or read-write
> open, etc. belong there. It does not belong to a process - after fork(),
> child has access to all open channels parent had when it had spawned
> a child. If you open a file in parent, read 10 bytes from it, then spawn
> a child that reads 10 more bytes and exits, then have parent read another
> 5 bytes, the first read by parent will have read bytes 0 to 9, read by
> child - bytes 10 to 19 and the second read by parent - bytes 20 to 24.
> Position is a property of IO channel; it belongs neither to underlying
> file (otherwise another process opening the file and reading from it
> would play havoc on your process) nor to process (otherwise reads done
> by child would not have affected the parent and the second read from
> parent would have gotten bytes 10 to 14). Same goes for access mode -
> it belongs to IO channel.
I'm sorry, I don't know much about how UNIX is implemented, but in
Linux specifically, struct file looks more like a combination of your
1) and 2).
But I see your point: I missed the dup() case. dup() takes up a slot in
the fdtable->fd array, but it does not create a new struct file.
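For reference, a minimal C sketch of the dup() case. The scratch file under /tmp, its 26-byte contents and the helper names are assumptions made for illustration; they are not from this thread. It reads 10 bytes through one descriptor and then asks where a second descriptor is positioned:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a scratch file holding 26 known bytes.
 * Returns a descriptor positioned at offset 0, or -1 on error.
 * path_template must end in "XXXXXX"; mkstemp() fills in the real name. */
static int make_scratch_file(char *path_template)
{
    int fd = mkstemp(path_template);

    if (fd < 0)
        return -1;
    if (write(fd, "abcdefghijklmnopqrstuvwxyz", 26) != 26 ||
        lseek(fd, 0, SEEK_SET) != 0) {
        close(fd);
        unlink(path_template);
        return -1;
    }
    return fd;
}

/* Read 10 bytes through fd1, then report the current offset of fd2.
 * use_dup != 0: fd2 = dup(fd1), two descriptors sharing one IO channel.
 * use_dup == 0: fd2 = open(path), a second, independent IO channel.
 * Returns the offset seen through fd2, or -1 on error. */
long offset_seen_by_second_fd(int use_dup)
{
    char path[] = "/tmp/fd-demo-XXXXXX";
    char buf[10];
    long result = -1;
    int fd1, fd2;

    fd1 = make_scratch_file(path);
    if (fd1 < 0)
        return -1;

    fd2 = use_dup ? dup(fd1) : open(path, O_RDONLY);
    if (fd2 >= 0) {
        if (read(fd1, buf, sizeof buf) == (ssize_t)sizeof buf)
            result = (long)lseek(fd2, 0, SEEK_CUR);
        close(fd2);
    }
    close(fd1);
    unlink(path);
    return result;
}
```

With use_dup set, the offset seen through fd2 is 10, because both descriptors refer to the same open file. With a second open() it stays at 0. In the dup() case two descriptor slots are in use but only one open file exists, which is why the per-process descriptor limit can legitimately exceed the system-wide count of open files.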
Thank you.
Jinliang Zheng
> 3) file descriptor - a number that has a meaning only in context
> of a process and refers to IO channel. That's what system calls use
> to identify the IO channel to operate upon; open() picks a descriptor
> unused by the calling process, associates the new channel with it and
> returns that descriptor (a number) to caller. Multiple descriptors can
> refer to the same IO channel; e.g. dup(fd) grabs a new descriptor and
> associates it with the same IO channel fd currently refers to.
>
> IO channels are not directly exposed to userland, but they are
> very much present in Unix-style IO API. Note that results of e.g.
> int fd1 = open("/etc/issue", 0);
> int fd2 = open("/etc/issue", 0);
> and
> int fd1 = open("/etc/issue", 0);
> int fd2 = dup(fd1);
> are not identical, even though in both cases fd1 and fd2 are opened
> descriptors and reading from them will access the contents of the
> /etc/issue; in the former case the positions being accessed by read from
> fd1 and fd2 will be independent, in the latter they will be shared.
>
> It's really quite basic - Unix Programming 101 stuff. It's not
> just that POSIX requires that and that any Unix behaves that way,
> anything even remotely Unix-like will be like that.
>
> You won't find the words 'IO channel' in POSIX, but I refuse
> to use the term they have chosen instead - 'file description'. Yes,
> alongside with 'file descriptor', in the contexts where the distinction
> between these notions is quite important. I would rather not say what
> I really think of those unsung geniuses, lest CoC gets overexcited...
>
> Anyway, in casual conversations the expression 'opened file'
> usually refers to that thing. Which is somewhat clumsy (sounds like
> 'file on filesystem that happens to be opened'), but usually it's
> good enough. If you need to be pedantic (e.g. when explaining that
> material in aforementioned Unix Programming 101 class), 'IO channel'
> works well enough, IME.
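The fork() scenario from point 2) of the quoted text can be sketched in C as well. Again, the scratch file, its contents and the function name are assumptions for illustration only. The parent reads 10 bytes, a child reads 10 more through the inherited descriptor and exits, and the function reports where the parent's next read begins:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the offset at which the parent's second read starts,
 * or -1 on error. */
long parent_second_read_start(void)
{
    char path[] = "/tmp/fork-demo-XXXXXX";
    char buf[10];
    long start = -1;
    int status;
    pid_t pid;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path); /* the file stays alive until the last descriptor closes */

    if (write(fd, "abcdefghijklmnopqrstuvwxyz", 26) != 26 ||
        lseek(fd, 0, SEEK_SET) != 0)
        goto out;

    if (read(fd, buf, 10) != 10)          /* parent: bytes 0 to 9 */
        goto out;

    pid = fork();
    if (pid < 0)
        goto out;
    if (pid == 0) {
        /* child: same IO channel as the parent, so this reads bytes 10 to 19 */
        _exit(read(fd, buf, 10) == 10 ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) != pid ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        goto out;

    start = (long)lseek(fd, 0, SEEK_CUR); /* position left behind by the child */
    if (read(fd, buf, 5) != 5)            /* parent: bytes 20 to 24 */
        start = -1;
out:
    close(fd);
    return start;
}
```

The function returns 20 rather than 10: the child's read moved the position that the parent also sees, which matches the description of the position as a property of the IO channel and not of the process.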