Message-ID: <878rt7h6qs.fsf@brahms.olymp>
Date: Fri, 18 Mar 2022 10:53:15 +0000
From: Luís Henriques <lhenriques@...e.de>
To: Xiubo Li <xiubli@...hat.com>
Cc: Jeff Layton <jlayton@...nel.org>,
Ilya Dryomov <idryomov@...il.com>, ceph-devel@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v3 2/4] ceph: handle encrypted snapshot names in
subdirectories
Xiubo Li <xiubli@...hat.com> writes:
> On 3/17/22 11:45 PM, Luís Henriques wrote:
>> When creating a snapshot, the .snap directories for every subdirectory will
>> show the snapshot name in the "long format":
>>
>> # mkdir .snap/my-snap
>> # ls my-dir/.snap/
>> _my-snap_1099511627782
>>
>> Encrypted snapshots will need to be able to handle these snapshot names by
>> encrypting/decrypting only the snapshot part of the string ('my-snap').
>>
>> Also, since the MDS prevents snapshot names from being longer than 240
>> characters, it is necessary to adapt CEPH_NOHASH_NAME_MAX to accommodate this
>> extra limitation.
>>
>> Signed-off-by: Luís Henriques <lhenriques@...e.de>
>> ---
>> fs/ceph/crypto.c | 189 ++++++++++++++++++++++++++++++++++++++++-------
>> fs/ceph/crypto.h | 11 ++-
>> 2 files changed, 169 insertions(+), 31 deletions(-)
>>
>> diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c
>> index beb73bbdd868..caa9863dee93 100644
>> --- a/fs/ceph/crypto.c
>> +++ b/fs/ceph/crypto.c
>> @@ -128,16 +128,100 @@ void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_se
>> swap(req->r_fscrypt_auth, as->fscrypt_auth);
>> }
>> -int ceph_encode_encrypted_dname(const struct inode *parent, struct qstr *d_name, char *buf)
>> +/*
>> + * User-created snapshots can't start with '_'. Snapshots that start with this
>> + * character are special (hint: these aren't real snapshots) and use the
>> + * following format:
>> + *
>> + * _<SNAPSHOT-NAME>_<INODE-NUMBER>
>> + *
>> + * where:
>> + * - <SNAPSHOT-NAME> - the real snapshot name that may need to be decrypted,
>> + * - <INODE-NUMBER> - the inode number for the actual snapshot
>> + *
>> + * This function parses these snapshot names and returns the inode
>> + * <INODE-NUMBER>. 'name_len' will also be set with the <SNAPSHOT-NAME>
>> + * length.
>> + */
>> +static struct inode *parse_longname(const struct inode *parent, const char *name,
>> + int *name_len)
>> {
>> + struct inode *dir = NULL;
>> + struct ceph_vino vino = { .snap = CEPH_NOSNAP };
>> + char *inode_number;
>> + char *name_end;
>> + int orig_len = *name_len;
>> + int ret = -EIO;
>> +
>> + /* Skip initial '_' */
>> + name++;
>> + name_end = strrchr(name, '_');
>> + if (!name_end) {
>> + dout("Failed to parse long snapshot name: %s\n", name);
>> + return ERR_PTR(-EIO);
>> + }
>> + *name_len = (name_end - name);
>> + if (*name_len <= 0) {
>> + pr_err("Failed to parse long snapshot name\n");
>> + return ERR_PTR(-EIO);
>> + }
>> +
>> + /* Get the inode number */
>> + inode_number = kmemdup_nul(name_end + 1,
>> + orig_len - *name_len - 2,
>> + GFP_KERNEL);
>> + if (!inode_number)
>> + return ERR_PTR(-ENOMEM);
>> + ret = kstrtou64(inode_number, 0, &vino.ino);
>> + if (ret) {
>> + dout("Failed to parse inode number: %s\n", name);
>> + dir = ERR_PTR(ret);
>> + goto out;
>> + }
>> +
>> + /* And finally the inode */
>> + dir = ceph_find_inode(parent->i_sb, vino);
>> + if (!dir) {
>> + /* This can happen if we're not mounting cephfs on the root */
>> + dir = ceph_get_inode(parent->i_sb, vino, NULL);
>
> In this case IMO you should look up the inode from the MDS instead of creating
> it in the cache, which won't set up the encryption info needed.
>
> So later, when you try to use this to decrypt the snapshot names, you will hit
> errors? And also the case Jeff mentioned in the previous thread could happen.
No, I don't see any errors. The reason is that if we get an I_NEW inode,
we do not have the keys to even decrypt the names. If you mount a
filesystem using, as its root, a directory that is inside an encrypted
directory, you'll see the encrypted snapshot name:
# mkdir mydir
# fscrypt encrypt mydir
# mkdir -p mydir/a/b/c/d
# mkdir mydir/a/.snap/myspan
# umount ...
# mount <mon>:<port>:/a
# ls .snap
And we simply can't decrypt it because for that we'd need to have access
to the .fscrypt in the original filesystem mount root.
I haven't tested NFS over ceph (I don't currently have a test environment
for doing that), but I *think* the same thing will happen. (I can try to
set up this test environment in the next couple of days.)
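
For readers following along, the long-format name handled by the patch ('_<SNAPSHOT-NAME>_<INODE-NUMBER>') can be split as in the following userspace sketch. This is only an illustration and not the kernel code: split_longname() is a hypothetical helper, and it uses strtoull() where the patch uses kstrtou64(). It does not look up any inode; it only separates the snapshot part (the part that may need decrypting) from the inode number.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace helper (not the kernel's parse_longname()):
 * split "_<SNAPSHOT-NAME>_<INODE-NUMBER>" into its two parts.
 *
 * On success returns 0, points *snap at the snapshot name inside
 * 'longname' (not NUL-terminated), and sets *snap_len and *ino.
 * Returns -EINVAL if the string does not follow the long format.
 */
static int split_longname(const char *longname, const char **snap,
                          size_t *snap_len, uint64_t *ino)
{
    const char *start, *sep;
    char *end;
    unsigned long long val;

    assert(longname && snap && snap_len && ino);

    if (longname[0] != '_')
        return -EINVAL;
    start = longname + 1;           /* skip the initial '_' */
    sep = strrchr(start, '_');      /* the last '_' precedes the inode number */
    if (!sep || sep == start)
        return -EINVAL;             /* no separator, or empty snapshot name */
    if (sep[1] < '0' || sep[1] > '9')
        return -EINVAL;             /* inode number must start with a digit */

    errno = 0;
    val = strtoull(sep + 1, &end, 0);
    if (errno || *end != '\0')
        return -EINVAL;

    *snap = start;
    *snap_len = (size_t)(sep - start);
    *ino = (uint64_t)val;
    return 0;
}
```

With the name from the commit message, "_my-snap_1099511627782", this yields the 7-byte snapshot part "my-snap" and the inode number 1099511627782. Using the last '_' as the separator means the snapshot part itself may contain underscores.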
> I figured out another approach that could resolve this more gracefully:
I took a quick look at the PR and the client patch but I suspect that Jeff
is right: this approach may greatly reduce security, which is definitely
not desirable.
Cheers,
--
Luís
> For all the subdirs, just let them inherit the encryption info from the same
> ancestor, which is the one initially encrypted. Then in ceph_new_inode() you
> can skip setting up the encryption info for all the subdirs, and the MDS side
> will send back the parent's encryption info, to be filled in by
> handle_reply(). This is just what .snap does.
>
> Then here you can use the current inode to do the decryption for all the
> snapshots, including the long snapshot names.
>
> I have raised one PR and sent a kclient patch for the above basic framework
> [1][2]. But there is still a little more work you need to do based on them:
>
> When doing lssnap you need to add one flag in LeaseStat to tell the kclient
> whether the long snap names are encrypted; this is very easy on the MDS side.
> Then on the kclient side you can just skip decrypting the long snap names
> which are from non-encrypted parents, and for all the others just use the
> current inode to do the decryption. No need to search the parent inodes for
> long snaps.
>
> And when looking up a long snap name, which may or may not be encrypted,
> you need to parse the inode out and look up the inode from the MDS if it does
> not exist in the cache.
>
>
> [1] https://github.com/ceph/ceph/pull/45516
>
> [2] https://patchwork.kernel.org/project/ceph-devel/list/?series=624492
>
>
>> + if (!dir)
>> + dir = ERR_PTR(-ENOENT);
>> + }
>> + if (IS_ERR(dir))
>> + dout("Can't find inode %s (%s)\n", inode_number, name);
>> +
>> +out:
>> + kfree(inode_number);
>> + return dir;
>> +}
>> +
>> +int ceph_encode_encrypted_dname(struct inode *parent, struct qstr *d_name, char *buf)
>> +{
>> + struct inode *dir = parent;
>> + struct qstr iname;
>> u32 len;
>> + int name_len;
>> int elen;
>> int ret;
>> - u8 *cryptbuf;
>> + u8 *cryptbuf = NULL;
>> +
>> + iname.name = d_name->name;
>> + name_len = d_name->len;
>> +
>> + /* Handle the special case of snapshot names that start with '_' */
>> + if ((ceph_snap(dir) == CEPH_SNAPDIR) && (name_len > 0) &&
>> + (iname.name[0] == '_')) {
>> + dir = parse_longname(parent, iname.name, &name_len);
>> + if (IS_ERR(dir))
>> + return PTR_ERR(dir);
>> + iname.name++; /* skip initial '_' */
>> + }
>> + iname.len = name_len;
>> - if (!fscrypt_has_encryption_key(parent)) {
>> + if (!fscrypt_has_encryption_key(dir)) {
>> memcpy(buf, d_name->name, d_name->len);
>> - return d_name->len;
>> + elen = d_name->len;
>> + goto out;
>> }
>> /*
>> @@ -146,18 +230,22 @@ int ceph_encode_encrypted_dname(const struct inode *parent, struct qstr *d_name,
>> *
>> * See: fscrypt_setup_filename
>> */
>> - if (!fscrypt_fname_encrypted_size(parent, d_name->len, NAME_MAX, &len))
>> - return -ENAMETOOLONG;
>> + if (!fscrypt_fname_encrypted_size(dir, iname.len, NAME_MAX, &len)) {
>> + elen = -ENAMETOOLONG;
>> + goto out;
>> + }
>> /* Allocate a buffer appropriate to hold the result */
>> cryptbuf = kmalloc(len > CEPH_NOHASH_NAME_MAX ? NAME_MAX : len, GFP_KERNEL);
>> - if (!cryptbuf)
>> - return -ENOMEM;
>> + if (!cryptbuf) {
>> + elen = -ENOMEM;
>> + goto out;
>> + }
>> - ret = fscrypt_fname_encrypt(parent, d_name, cryptbuf, len);
>> + ret = fscrypt_fname_encrypt(dir, &iname, cryptbuf, len);
>> if (ret) {
>> - kfree(cryptbuf);
>> - return ret;
>> + elen = ret;
>> + goto out;
>> }
>> /* hash the end if the name is long enough */
>> @@ -173,12 +261,29 @@ int ceph_encode_encrypted_dname(const struct inode *parent, struct qstr *d_name,
>> /* base64 encode the encrypted name */
>> elen = fscrypt_base64url_encode(cryptbuf, len, buf);
>> - kfree(cryptbuf);
>> dout("base64-encoded ciphertext name = %.*s\n", elen, buf);
>> +
>> + WARN_ON(elen > (CEPH_NOHASH_NAME_MAX + SHA256_DIGEST_SIZE));
>> + if ((elen > 0) && (dir != parent)) {
>> + char tmp_buf[NAME_MAX];
>> +
>> + elen = snprintf(tmp_buf, sizeof(tmp_buf), "_%.*s_%ld",
>> + elen, buf, dir->i_ino);
>> + memcpy(buf, tmp_buf, elen);
>> + }
>> +
>> +out:
>> + kfree(cryptbuf);
>> + if (dir != parent) {
>> + if ((dir->i_state & I_NEW))
>> + discard_new_inode(dir);
>> + else
>> + iput(dir);
>> + }
>> return elen;
>> }
>> -int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf)
>> +int ceph_encode_encrypted_fname(struct inode *parent, struct dentry *dentry, char *buf)
>> {
>> WARN_ON_ONCE(!fscrypt_has_encryption_key(parent));
>> @@ -203,29 +308,42 @@ int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentr
>> int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname,
>> struct fscrypt_str *oname, bool *is_nokey)
>> {
>> - int ret;
>> + struct inode *dir = fname->dir;
>> struct fscrypt_str _tname = FSTR_INIT(NULL, 0);
>> struct fscrypt_str iname;
>> -
>> - if (!IS_ENCRYPTED(fname->dir)) {
>> - oname->name = fname->name;
>> - oname->len = fname->name_len;
>> - return 0;
>> - }
>> + char *name = fname->name;
>> + int name_len = fname->name_len;
>> + int ret;
>> /* Sanity check that the resulting name will fit in the buffer */
>> if (fname->name_len > NAME_MAX || fname->ctext_len > NAME_MAX)
>> return -EIO;
>> - ret = __fscrypt_prepare_readdir(fname->dir);
>> + /* Handle the special case of snapshot names that start with '_' */
>> + if ((ceph_snap(dir) == CEPH_SNAPDIR) && (name_len > 0) &&
>> + (name[0] == '_')) {
>> + dir = parse_longname(dir, name, &name_len);
>> + if (IS_ERR(dir))
>> + return PTR_ERR(dir);
>> + name++; /* skip initial '_' */
>> + }
>> +
>> + if (!IS_ENCRYPTED(dir)) {
>> + oname->name = fname->name;
>> + oname->len = fname->name_len;
>> + ret = 0;
>> + goto out_inode;
>> + }
>> +
>> + ret = __fscrypt_prepare_readdir(dir);
>> if (ret)
>> - return ret;
>> + goto out_inode;
>> /*
>> * Use the raw dentry name as sent by the MDS instead of
>> * generating a nokey name via fscrypt.
>> */
>> - if (!fscrypt_has_encryption_key(fname->dir)) {
>> + if (!fscrypt_has_encryption_key(dir)) {
>> if (fname->no_copy)
>> oname->name = fname->name;
>> else
>> @@ -233,7 +351,8 @@ int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname,
>> oname->len = fname->name_len;
>> if (is_nokey)
>> *is_nokey = true;
>> - return 0;
>> + ret = 0;
>> + goto out_inode;
>> }
>> if (fname->ctext_len == 0) {
>> @@ -242,11 +361,11 @@ int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname,
>> if (!tname) {
>> ret = fscrypt_fname_alloc_buffer(NAME_MAX, &_tname);
>> if (ret)
>> - return ret;
>> + goto out_inode;
>> tname = &_tname;
>> }
>> - declen = fscrypt_base64url_decode(fname->name, fname->name_len, tname->name);
>> + declen = fscrypt_base64url_decode(name, name_len, tname->name);
>> if (declen <= 0) {
>> ret = -EIO;
>> goto out;
>> @@ -258,9 +377,25 @@ int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname,
>> iname.len = fname->ctext_len;
>> }
>> - ret = fscrypt_fname_disk_to_usr(fname->dir, 0, 0, &iname, oname);
>> + ret = fscrypt_fname_disk_to_usr(dir, 0, 0, &iname, oname);
>> + if (!ret && (dir != fname->dir)) {
>> + char tmp_buf[FSCRYPT_BASE64URL_CHARS(NAME_MAX)];
>> +
>> + name_len = snprintf(tmp_buf, sizeof(tmp_buf), "_%.*s_%ld",
>> + oname->len, oname->name, dir->i_ino);
>> + memcpy(oname->name, tmp_buf, name_len);
>> + oname->len = name_len;
>> + }
>> +
>> out:
>> fscrypt_fname_free_buffer(&_tname);
>> +out_inode:
>> + if ((dir != fname->dir) && !IS_ERR(dir)) {
>> + if ((dir->i_state & I_NEW))
>> + discard_new_inode(dir);
>> + else
>> + iput(dir);
>> + }
>> return ret;
>> }
>> diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h
>> index 62f0ddd30dee..3273d076a9e5 100644
>> --- a/fs/ceph/crypto.h
>> +++ b/fs/ceph/crypto.h
>> @@ -82,13 +82,16 @@ static inline u32 ceph_fscrypt_auth_len(struct ceph_fscrypt_auth *fa)
>> * struct fscrypt_ceph_nokey_name {
>> * u8 bytes[157];
>> * u8 sha256[SHA256_DIGEST_SIZE];
>> - * }; // 189 bytes => 252 bytes base64-encoded, which is <= NAME_MAX (255)
>> + * }; // 180 bytes => 240 bytes base64-encoded, which is <= NAME_MAX (255)
>> + *
>> + * (240 bytes is the maximum size allowed for snapshot names to take into
>> + * account the format: '_<SNAPSHOT-NAME>_<INODE-NUMBER>'.)
>> *
>> * Note that for long names that end up having their tail portion hashed, we
>> * must also store the full encrypted name (in the dentry's alternate_name
>> * field).
>> */
>> -#define CEPH_NOHASH_NAME_MAX (189 - SHA256_DIGEST_SIZE)
>> +#define CEPH_NOHASH_NAME_MAX (180 - SHA256_DIGEST_SIZE)
>> void ceph_fscrypt_set_ops(struct super_block *sb);
>> @@ -97,8 +100,8 @@ void ceph_fscrypt_free_dummy_policy(struct ceph_fs_client *fsc);
>> int ceph_fscrypt_prepare_context(struct inode *dir, struct inode *inode,
>> struct ceph_acl_sec_ctx *as);
>> void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_sec_ctx *as);
>> -int ceph_encode_encrypted_dname(const struct inode *parent, struct qstr *d_name, char *buf);
>> -int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf);
>> +int ceph_encode_encrypted_dname(struct inode *parent, struct qstr *d_name, char *buf);
>> +int ceph_encode_encrypted_fname(struct inode *parent, struct dentry *dentry, char *buf);
>> static inline int ceph_fname_alloc_buffer(struct inode *parent, struct fscrypt_str *fname)
>> {
>>
>
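
As a side note on the size comment in crypto.h quoted above, the byte counts can be checked with a little arithmetic. The sketch below assumes unpadded base64, i.e. 4 output characters for every 3 input bytes, rounded up; base64url_chars() and longname_len() are hypothetical names used only for this illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Unpadded base64 length: 4 output characters per 3 input bytes, rounded up. */
static size_t base64url_chars(size_t nbytes)
{
    return (nbytes * 4 + 2) / 3;
}

/*
 * Length of a long-format name, '_' + encoded name + '_' + inode digits,
 * for 'raw_bytes' of ciphertext and an inode number printed with
 * 'ino_digits' decimal digits.
 */
static size_t longname_len(size_t raw_bytes, size_t ino_digits)
{
    assert(ino_digits > 0);
    return 1 + base64url_chars(raw_bytes) + 1 + ino_digits;
}
```

Under that assumption, 189 bytes encode to 252 characters and 180 bytes encode to 240 characters, matching the old and new comments. For a 13-digit inode number such as 1099511627782 from the commit message, longname_len(180, 13) is 255.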