Message-ID: <53284233.3050800@gmail.com>
Date: Tue, 18 Mar 2014 13:55:15 +0100
From: "Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
"linux-man@...r.kernel.org" <linux-man@...r.kernel.org>,
Linux-Fsdevel <linux-fsdevel@...r.kernel.org>,
lkml <linux-kernel@...r.kernel.org>
CC: mtk.manpages@...il.com, Andreas Dilger <adilger@...ger.ca>,
NeilBrown <neilb@...e.de>, Christoph Hellwig <hch@...radead.org>
Subject: For review: open_by_name_at(2) man page [v2]
Hi Aneesh, (and others)
After integrating review comments from NeilBrown and Christoph Hellwig,
here is draft 2 of a man page I've written for name_to_handle_at(2) and
open_by_handle_at(2). Especially thanks to Neil's comments, several parts
of the page underwent a substantial rewrite. Would you be willing to
review it please, and let me know of any corrections/improvements?
There are some FIXMEs in the page that I would especially like some
help with.
Thanks,
Michael
'\" t -*- coding: UTF-8 -*-
.\" Copyright (c) 2014 by Michael Kerrisk <mtk.manpages@...il.com>
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date. The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein. The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.TH OPEN_BY_HANDLE_AT 2 2014-03-24 "Linux" "Linux Programmer's Manual"
.SH NAME
name_to_handle_at, open_by_handle_at \- obtain handle
for a pathname and open file via a handle
.SH SYNOPSIS
.nf
.B #define _GNU_SOURCE
.B #include <sys/types.h>
.B #include <sys/stat.h>
.B #include <fcntl.h>
.BI "int name_to_handle_at(int " dirfd ", const char *" pathname ,
.BI " struct file_handle *" handle ,
.BI " int *" mount_id ", int " flags );
.BI "int open_by_handle_at(int " mount_fd ", struct file_handle *" handle ,
.BI " int " flags );
.fi
.SH DESCRIPTION
The
.BR name_to_handle_at ()
and
.BR open_by_handle_at ()
system calls split the functionality of
.BR openat (2)
into two parts:
.BR name_to_handle_at ()
returns an opaque handle that corresponds to a specified file;
.BR open_by_handle_at ()
opens the file corresponding to a handle returned by a previous call to
.BR name_to_handle_at ()
and returns an open file descriptor.
.\"
.\"
.SS name_to_handle_at()
The
.BR name_to_handle_at ()
system call returns a file handle and a mount ID corresponding to
the file specified by the
.IR dirfd
and
.IR pathname
arguments.
The file handle is returned via the argument
.IR handle ,
which is a pointer to a structure of the following form:
.in +4n
.nf
struct file_handle {
    unsigned int  handle_bytes;   /* Size of f_handle [in, out] */
    int           handle_type;    /* Handle type [out] */
    unsigned char f_handle[0];    /* File identifier (sized by
                                     caller) [out] */
};
.fi
.in
.PP
It is the caller's responsibility to allocate the structure
with a size large enough to hold the handle returned in
.IR f_handle .
Before the call, the
.IR handle_bytes
field should be initialized to contain the allocated size for
.IR f_handle .
(The constant
.BR MAX_HANDLE_SZ ,
defined in
.IR <fcntl.h> ,
specifies the maximum possible size for a file handle.)
Upon successful return, the
.IR handle_bytes
field is updated to contain the number of bytes actually written to
.IR f_handle .
The caller can discover the required size for the
.I file_handle
structure by making a call in which
.IR handle->handle_bytes
is zero;
in this case, the call fails with the error
.BR EOVERFLOW
and
.IR handle->handle_bytes
is set to indicate the required size;
the caller can then use this information to allocate a structure
of the correct size (see EXAMPLE below).
Other than the use of the
.IR handle_bytes
field, the caller should treat the
.IR file_handle
structure as an opaque data type: the
.IR handle_type
and
.IR f_handle
fields are needed only by a subsequent call to
.BR open_by_handle_at ().
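.PP
As an alternative to probing for the required size,
a caller can allocate room for the largest possible handle up front.
The following is only a minimal sketch, assuming a C library that declares
name_to_handle_at() and MAX_HANDLE_SZ in <fcntl.h>;
the helper name get_handle() is hypothetical and not part of any API.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: return a malloc()ed handle for 'path'
   (interpreted relative to the current working directory), or NULL
   on error with errno set. The mount ID is stored in *mount_id. */
struct file_handle *
get_handle(const char *path, int *mount_id)
{
    struct file_handle *fhp;
    int saved;

    /* Fixed header plus the largest possible f_handle */
    fhp = malloc(sizeof(*fhp) + MAX_HANDLE_SZ);
    if (fhp == NULL)
        return NULL;

    fhp->handle_bytes = MAX_HANDLE_SZ;      /* Space available [in] */

    if (name_to_handle_at(AT_FDCWD, path, fhp, mount_id, 0) == -1) {
        saved = errno;                      /* Keep errno across free() */
        free(fhp);
        errno = saved;
        return NULL;
    }

    return fhp;     /* handle_bytes now holds the bytes actually used */
}
```

The caller releases the result with free().
This trades a slightly larger allocation for one system call fewer.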
Together, the
.I pathname
and
.I dirfd
arguments identify the file for which a handle is to be obtained.
There are four distinct cases:
.IP * 3
If
.I pathname
is a nonempty string containing an absolute pathname,
then a handle is returned for the file referred to by that pathname.
In this case,
.IR dirfd
is ignored.
.IP *
If
.I pathname
is a nonempty string containing a relative pathname and
.IR dirfd
has the special value
.BR AT_FDCWD ,
then
.I pathname
is interpreted relative to the current working directory of the caller,
and a handle is returned for the file to which it refers.
.IP *
If
.I pathname
is a nonempty string containing a relative pathname and
.IR dirfd
is a file descriptor referring to a directory, then
.I pathname
is interpreted relative to the directory referred to by
.IR dirfd ,
and a handle is returned for the file to which it refers.
(See
.BR openat (2)
for an explanation of why "directory file descriptors" are useful.)
.IP *
If
.I pathname
is an empty string and
.I flags
specifies the value
.BR AT_EMPTY_PATH ,
then
.IR dirfd
can be an open file descriptor referring to any type of file,
or
.BR AT_FDCWD ,
meaning the current working directory,
and a handle is returned for the file to which it refers.
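.PP
The last two cases can be sketched as follows.
This is an illustration under stated assumptions, not a reference implementation:
the helper names handle_at() and handles_two_ways() are hypothetical,
and the buffer is sized with MAX_HANDLE_SZ to keep the code short.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: obtain a handle for 'pathname' interpreted
   relative to 'dirfd'. Returns a malloc()ed handle or NULL (errno set). */
static struct file_handle *
handle_at(int dirfd, const char *pathname, int flags)
{
    struct file_handle *fhp;
    int mount_id, saved;

    fhp = malloc(sizeof(*fhp) + MAX_HANDLE_SZ);
    if (fhp == NULL)
        return NULL;
    fhp->handle_bytes = MAX_HANDLE_SZ;

    if (name_to_handle_at(dirfd, pathname, fhp, &mount_id, flags) == -1) {
        saved = errno;
        free(fhp);
        errno = saved;
        return NULL;
    }
    return fhp;
}

/* Obtain a handle for 'name' inside directory 'dir' in two ways.
   Returns 0 if both calls succeed, or -1 on error (errno set). */
int
handles_two_ways(const char *dir, const char *name)
{
    struct file_handle *h1 = NULL, *h2 = NULL;
    int dirfd, fd = -1, saved, ret = -1;

    dirfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dirfd == -1)
        return -1;

    /* Relative pathname plus a directory file descriptor */
    h1 = handle_at(dirfd, name, 0);

    /* Empty pathname plus AT_EMPTY_PATH: 'fd' may refer to any type
       of file, and here is opened with O_PATH */
    if (h1 != NULL)
        fd = openat(dirfd, name, O_PATH);
    if (fd != -1)
        h2 = handle_at(fd, "", AT_EMPTY_PATH);
    if (h2 != NULL)
        ret = 0;

    saved = errno;
    free(h1);
    free(h2);
    if (fd != -1)
        close(fd);
    close(dirfd);
    errno = saved;
    return ret;
}
```

For instance, handles_two_ways("/", "etc") exercises both forms on the same directory entry.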
.PP
The
.I mount_id
argument returns an identifier for the filesystem
mount that corresponds to
.IR pathname .
This corresponds to the first field in one of the records in
.IR /proc/self/mountinfo .
Opening the pathname in the fifth field of that record yields a file
descriptor for the mount point;
that file descriptor can be used in a subsequent call to
.BR open_by_handle_at ().
The
.I flags
argument is a bit mask constructed by ORing together
zero or more of the following values:
.TP
.B AT_EMPTY_PATH
Allow
.I pathname
to be an empty string.
See above.
(In that case,
.I dirfd
may have been obtained using the
.BR open (2)
.B O_PATH
flag.)
.TP
.B AT_SYMLINK_FOLLOW
By default,
.BR name_to_handle_at ()
does not dereference
.I pathname
if it is a symbolic link.
The flag
.B AT_SYMLINK_FOLLOW
can be specified in
.I flags
to cause
.I pathname
to be dereferenced if it is a symbolic link.
.SS open_by_handle_at()
The
.BR open_by_handle_at ()
system call opens the file referred to by
.IR handle ,
a file handle returned by a previous call to
.BR name_to_handle_at ().
The
.IR mount_fd
argument is a file descriptor for any object (file, directory, etc.)
in the mounted filesystem with respect to which
.IR handle
should be interpreted.
The special value
.B AT_FDCWD
can be specified, meaning the current working directory of the caller.
The
.I flags
argument
is as for
.BR open (2).
.\" FIXME: Confirm that the following is intended behavior.
.\" (It certainly seems to be the behavior, from experimenting.)
If
.I handle
refers to a symbolic link, the caller must specify the
.B O_PATH
flag, and the symbolic link is not dereferenced (the
.B O_NOFOLLOW
flag, if specified, is ignored).
The caller must have the
.B CAP_DAC_READ_SEARCH
capability to invoke
.BR open_by_handle_at ().
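.PP
A minimal sketch of the complete round trip within one process is shown here.
The helper name reopen_via_handle() is hypothetical,
and the sketch assumes that the caller already knows a directory
on the same mounted filesystem as the file.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: reopen 'path' by way of a file handle.
   'mount_dir' names any directory on the same mounted filesystem.
   Returns a read-only file descriptor, or -1 with errno set
   (EPERM if the caller lacks CAP_DAC_READ_SEARCH). */
int
reopen_via_handle(const char *path, const char *mount_dir)
{
    struct file_handle *fhp;
    int mount_id, mount_fd = -1, fd = -1, saved;

    fhp = malloc(sizeof(*fhp) + MAX_HANDLE_SZ);
    if (fhp == NULL)
        return -1;
    fhp->handle_bytes = MAX_HANDLE_SZ;

    /* Step 1: pathname to handle */
    if (name_to_handle_at(AT_FDCWD, path, fhp, &mount_id, 0) != -1)
        mount_fd = open(mount_dir, O_RDONLY | O_DIRECTORY);

    /* Step 2: handle to open file descriptor; 'flags' is as for open(2) */
    if (mount_fd != -1)
        fd = open_by_handle_at(mount_fd, fhp, O_RDONLY);

    saved = errno;
    if (mount_fd != -1)
        close(mount_fd);
    free(fhp);
    errno = saved;
    return fd;
}
```

An unprivileged caller should expect the second step to fail with EPERM.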
.SH RETURN VALUE
On success,
.BR name_to_handle_at ()
returns 0,
and
.BR open_by_handle_at ()
returns a nonnegative file descriptor.
In the event of an error, both system calls return \-1 and set
.I errno
to indicate the cause of the error.
.SH ERRORS
.BR name_to_handle_at ()
and
.BR open_by_handle_at ()
can fail for the same errors as
.BR openat (2).
In addition, they can fail with the errors noted below.
.BR name_to_handle_at ()
can fail with the following errors:
.TP
.B EINVAL
.I flags
includes an invalid bit value.
.TP
.B EINVAL
.IR handle\->handle_bytes
is greater than
.BR MAX_HANDLE_SZ .
.TP
.B ENOENT
.I pathname
is an empty string, but
.BR AT_EMPTY_PATH
was not specified in
.IR flags .
.TP
.B ENOTDIR
The file descriptor supplied in
.I dirfd
does not refer to a directory,
and it is not the case that both
.I flags
includes
.BR AT_EMPTY_PATH
and
.I pathname
is an empty string.
.TP
.B EOPNOTSUPP
The filesystem does not support decoding of a pathname to a file handle.
.TP
.B EOVERFLOW
The
.I handle->handle_bytes
value passed into the call was too small.
When this error occurs,
.I handle->handle_bytes
is updated to indicate the required size for the handle.
.\"
.\"
.PP
.BR open_by_handle_at ()
can fail with the following errors:
.TP
.B EBADF
.IR mount_fd
is not an open file descriptor.
.TP
.B EINVAL
.I handle->handle_bytes
is greater than
.BR MAX_HANDLE_SZ
or is equal to zero.
.TP
.B ELOOP
.\" FIXME (see earlier FIXME). Is this the intended behavior?
.I handle
refers to a symbolic link, but
.B O_PATH
was not specified in
.IR flags .
.TP
.B EPERM
The caller does not have the
.BR CAP_DAC_READ_SEARCH
capability.
.TP
.B ESTALE
The specified
.I handle
is no longer valid.
.SH VERSIONS
These system calls first appeared in Linux 2.6.39.
.SH CONFORMING TO
These system calls are nonstandard Linux extensions.
.SH NOTES
A file handle can be generated in one process using
.BR name_to_handle_at ()
and later used in a different process that calls
.BR open_by_handle_at ().
Not all filesystem types support the translation of pathnames to
file handles.
.\" FIXME NeilBrown noted:
.\" ESTALE is also returned if the filesystem does not support
.\" file-handle -> file mappings.
.\" On filesystems which don't provide export_operations (/sys /proc
.\" ubifs romfs cramfs nfs coda ... several others) name_to_handle_at
.\" will produce a generic handle using the 32 bit inode and 32 bit
.\" i_generation. open_by_name_at given this (or any) filehandle
.\" will fail with ESTALE.
.\" However, on /proc and /sys, at least, name_to_handle_at() fails with
.\" EOPNOTSUPP. Are there really filesystems that can deliver ESTALE (the
.\" same error as for an invalid file handle) in the above circumstances?
A file handle may become invalid ("stale") if a file is deleted,
or for other filesystem-specific reasons.
An invalid handle is reported by an
.B ESTALE
error from
.BR open_by_handle_at ().
These system calls are designed for use by user-space file servers.
For example, a user-space NFS server might generate a file handle
and pass it to an NFS client.
Later, when the client wants to open the file,
it could pass the handle back to the server.
.\" https://lwn.net/Articles/375888/
.\" "Open by handle" - Jonathan Corbet, 2010-02-23
This sort of functionality allows a user-space file server to operate in
a stateless fashion with respect to the files it serves.
If
.I pathname
refers to a symbolic link and
.IR flags
does not specify
.BR AT_SYMLINK_FOLLOW ,
then
.BR name_to_handle_at ()
returns a handle for the link (rather than the file to which it refers).
.\" commit bcda76524cd1fa32af748536f27f674a13e56700
The process receiving the handle can later perform operations
on the symbolic link by converting the handle to a file descriptor using
.BR open_by_handle_at ()
with the
.BR O_PATH
flag, and then passing the file descriptor as the
.IR dirfd
argument in system calls such as
.BR readlinkat (2)
and
.BR fchownat (2).
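.PP
The symbolic-link case can be sketched as follows.
This is an illustration only: the helper name readlink_via_handle() is hypothetical,
and the sketch relies on readlinkat(2) accepting an empty pathname
when its dirfd refers to a symbolic link opened with O_PATH.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: read the target of the symbolic link 'linkpath'
   by way of a file handle. 'mount_dir' names any directory on the
   filesystem containing the link. On success, stores a null-terminated
   target in 'buf' and returns its length; on error, returns -1 with
   errno set. */
ssize_t
readlink_via_handle(const char *linkpath, const char *mount_dir,
                    char *buf, size_t bufsiz)
{
    struct file_handle *fhp;
    int mount_id, mount_fd = -1, fd = -1, saved;
    ssize_t n = -1;

    if (bufsiz == 0) {
        errno = EINVAL;
        return -1;
    }

    fhp = malloc(sizeof(*fhp) + MAX_HANDLE_SZ);
    if (fhp == NULL)
        return -1;
    fhp->handle_bytes = MAX_HANDLE_SZ;

    /* No AT_SYMLINK_FOLLOW, so the handle refers to the link itself */
    if (name_to_handle_at(AT_FDCWD, linkpath, fhp, &mount_id, 0) != -1)
        mount_fd = open(mount_dir, O_RDONLY | O_DIRECTORY);

    /* O_PATH is needed when the handle refers to a symbolic link */
    if (mount_fd != -1)
        fd = open_by_handle_at(mount_fd, fhp, O_PATH);

    /* Empty pathname: operate on the link that 'fd' refers to */
    if (fd != -1)
        n = readlinkat(fd, "", buf, bufsiz - 1);
    if (n != -1)
        buf[n] = '\0';

    saved = errno;
    if (fd != -1)
        close(fd);
    if (mount_fd != -1)
        close(mount_fd);
    free(fhp);
    errno = saved;
    return n;
}
```

As with any use of open_by_handle_at(), the caller needs CAP_DAC_READ_SEARCH.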
.SS Obtaining a persistent filesystem ID
The mount IDs in
.IR /proc/self/mountinfo
can be reused as filesystems are unmounted and mounted.
Therefore, the mount ID returned by
.BR name_to_handle_at ()
(in
.IR *mount_id )
should not be treated as a persistent identifier
for the corresponding mounted filesystem.
However, an application can use the information in the
.I mountinfo
record that corresponds to the mount ID
to derive a persistent identifier.
For example, one can use the device name in the mount source field
(the second field after the hyphen separator) of the
.I mountinfo
record to search for the corresponding device UUID via the symbolic links in
.IR /dev/disk/by-uuid .
(A more comfortable way of obtaining the UUID is to use the
.\" e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition
.BR libblkid (3)
library.)
That process can then be reversed,
using the UUID to look up the device name,
and then obtaining the corresponding mount point,
in order to produce the
.IR mount_fd
argument used by
.BR open_by_handle_at ().
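.PP
Extracting the relevant fields from a mountinfo record might look like the following.
This is a sketch under stated assumptions:
the helper name parse_mountinfo() is hypothetical,
the field layout is the one described in proc(5)
(mount ID first, mount point fifth, mount source second after the " - " separator),
and octal escapes such as \040 in paths are left undecoded.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse one line of /proc/self/mountinfo.
   Stores the mount ID in *id, the mount point in 'mnt', and the
   mount source (device name) in 'src'. Both buffers must hold at
   least 256 bytes. Returns 0 on success, -1 on a malformed line. */
int
parse_mountinfo(const char *line, int *id, char *mnt, char *src)
{
    const char *sep;

    /* Fields 1 to 5: mount ID, parent ID, major:minor, root, mount point */
    if (sscanf(line, "%d %*d %*s %*s %255s", id, mnt) != 2)
        return -1;

    /* The number of optional fields varies, so locate the separator;
       after it come filesystem type, mount source, and super options */
    sep = strstr(line, " - ");
    if (sep == NULL)
        return -1;

    if (sscanf(sep + 3, "%*s %255s", src) != 1)
        return -1;

    return 0;
}
```

The mount source obtained this way is the name to match against the symbolic links in /dev/disk/by-uuid.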
.SH EXAMPLE
The two programs below demonstrate the use of
.BR name_to_handle_at ()
and
.BR open_by_handle_at ().
The first program
.RI ( t_name_to_handle_at.c )
uses
.BR name_to_handle_at ()
to obtain the file handle and mount ID
for the file specified in its command-line argument;
the handle and ID are written to standard output.
The second program
.RI ( t_open_by_handle_at.c )
reads a mount ID and file handle from standard input.
The program then employs
.BR open_by_handle_at ()
to open the file using that handle.
If an optional command-line argument is supplied, then the
.IR mount_fd
argument for
.BR open_by_handle_at ()
is obtained by opening the directory named in that argument.
Otherwise,
.IR mount_fd
is obtained by scanning
.IR /proc/self/mountinfo
to find a record whose mount ID matches the mount ID
read from standard input,
and the mount directory specified in that record is opened.
(These programs do not deal with the fact that mount IDs are not persistent.)
The following shell session demonstrates the use of these two programs:
.in +4n
.nf
$ \fBecho 'Kannst du bitte überlegen?' > cecilia.txt\fP
$ \fB./t_name_to_handle_at cecilia.txt > fh\fP
$ \fB./t_open_by_handle_at < fh\fP
open_by_handle_at: Operation not permitted
$ \fBsudo ./t_open_by_handle_at < fh\fP # Need CAP_DAC_READ_SEARCH
Read 28 bytes
.fi
.in
Now we delete and (quickly) re-create the file so that
it has the same content and (by chance) the same inode.
Nevertheless,
.BR open_by_handle_at ()
recognizes that the original file referred to by the file handle
no longer exists.
.in +4n
.nf
$ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Display inode number
4072121
$ \fBrm cecilia.txt\fP
$ \fBecho 'Kannst du bitte überlegen?' > cecilia.txt\fP
$ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Check inode number
4072121
$ \fBsudo ./t_open_by_handle_at < fh\fP
open_by_handle_at: Stale NFS file handle
.fi
.in
.SS Program source: t_name_to_handle_at.c
\&
.nf
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \\
                        } while (0)

int
main(int argc, char *argv[])
{
    struct file_handle *fhp;
    int mount_id, fhsize, s;

    if (argc < 2 || strcmp(argv[1], "\-\-help") == 0) {
        fprintf(stderr, "Usage: %s pathname\\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    /* Allocate file_handle structure (header only, no room yet
       for f_handle) */

    fhsize = sizeof(*fhp);
    fhp = malloc(fhsize);
    if (fhp == NULL)
        errExit("malloc");

    /* Make an initial call to name_to_handle_at() to discover
       the size required for file handle */

    fhp\->handle_bytes = 0;
    s = name_to_handle_at(AT_FDCWD, argv[1], fhp, &mount_id, 0);
    if (s != \-1 || errno != EOVERFLOW) {
        fprintf(stderr, "Unexpected result from name_to_handle_at()\\n");
        exit(EXIT_FAILURE);
    }

    /* Reallocate file_handle structure with correct size */

    fhsize = sizeof(struct file_handle) + fhp\->handle_bytes;
    fhp = realloc(fhp, fhsize);         /* Copies fhp\->handle_bytes */
    if (fhp == NULL)
        errExit("realloc");

    /* Get file handle from pathname supplied on command line */

    if (name_to_handle_at(AT_FDCWD, argv[1], fhp, &mount_id, 0) == \-1)
        errExit("name_to_handle_at");

    /* Write mount ID, file handle size, and file handle to stdout,
       for later reuse by t_open_by_handle_at.c */

    if (write(STDOUT_FILENO, &mount_id, sizeof(int)) != sizeof(int) ||
            write(STDOUT_FILENO, &fhsize, sizeof(int)) != sizeof(int) ||
            write(STDOUT_FILENO, fhp, fhsize) != fhsize) {
        fprintf(stderr, "Write failure\\n");
        exit(EXIT_FAILURE);
    }

    exit(EXIT_SUCCESS);
}
.fi
.SS Program source: t_open_by_handle_at.c
\&
.nf
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \\
                        } while (0)

/* Scan /proc/self/mountinfo to find the line whose mount ID matches
   \(aqmount_id\(aq. (An easier way to do this is to install and use the
   \(aqlibmount\(aq library provided by the \(aqutil\-linux\(aq project.)
   Open the corresponding mount path and return the resulting file
   descriptor. */

static int
open_mount_path_by_id(int mount_id)
{
    char *linep;
    size_t lsize;
    char mount_path[PATH_MAX];
    int fmnt_id, fnd, nread;
    FILE *fp;

    fp = fopen("/proc/self/mountinfo", "r");
    if (fp == NULL)
        errExit("fopen");

    for (fnd = 0; !fnd ; ) {
        linep = NULL;
        nread = getline(&linep, &lsize, fp);
        if (nread == \-1)
            break;

        nread = sscanf(linep, "%d %*d %*s %*s %s", &fmnt_id, mount_path);
        if (nread != 2) {
            fprintf(stderr, "Bad sscanf()\\n");
            exit(EXIT_FAILURE);
        }

        free(linep);

        if (fmnt_id == mount_id)
            fnd = 1;
    }

    fclose(fp);

    if (!fnd) {
        fprintf(stderr, "Could not find mount point\\n");
        exit(EXIT_FAILURE);
    }

    return open(mount_path, O_RDONLY | O_DIRECTORY);
}

int
main(int argc, char *argv[])
{
    struct file_handle *fhp;
    int mount_id, fd, mount_fd, fhsize;
    ssize_t nread;
#define BSIZE 1000
    char buf[BSIZE];

    if (argc > 1 && strcmp(argv[1], "\-\-help") == 0) {
        fprintf(stderr, "Usage: %s [mount\-dir]\\n",
                argv[0]);
        exit(EXIT_FAILURE);
    }

    /* Read data produced by t_name_to_handle_at.c */

    if (read(STDIN_FILENO, &mount_id, sizeof(int)) != sizeof(int))
        errExit("read");

    if (read(STDIN_FILENO, &fhsize, sizeof(int)) != sizeof(int))
        errExit("read");

    fhp = malloc(fhsize);
    if (fhp == NULL)
        errExit("malloc");

    if (read(STDIN_FILENO, fhp, fhsize) != fhsize)
        errExit("read");

    /* Obtain file descriptor for mount point, either by opening
       the pathname specified on the command line, or by scanning
       /proc/self/mountinfo to find a mount that matches the
       \(aqmount_id\(aq obtained by name_to_handle_at()
       (in t_name_to_handle_at.c) */

    if (argc > 1)
        mount_fd = open(argv[1], O_RDONLY | O_DIRECTORY);
    else
        mount_fd = open_mount_path_by_id(mount_id);

    if (mount_fd == \-1)
        errExit("opening mount fd");

    /* Open name using handle and mount point */

    fd = open_by_handle_at(mount_fd, fhp, O_RDONLY);
    if (fd == \-1)
        errExit("open_by_handle_at");

    /* Try reading a few bytes from the file */

    nread = read(fd, buf, BSIZE);
    if (nread == \-1)
        errExit("read");
    printf("Read %ld bytes\\n", (long) nread);

    exit(EXIT_SUCCESS);
}
.fi
.SH SEE ALSO
.BR blkid (1),
.BR findfs (1),
.BR open (2),
.BR libblkid (3),
.BR mount (8)
The
.I libblkid
and
.I libmount
documentation under the latest
.I util-linux
release at
.UR https://www.kernel.org/pub/linux/utils/util-linux/
.UE