[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <a7db2285-8886-34c3-6b4b-028fa134c8c7@redhat.com>
Date: Mon, 23 May 2022 20:11:14 +0800
From: Xiubo Li <xiubli@...hat.com>
To: Jeff Layton <jlayton@...nel.org>,
Luís Henriques <lhenriques@...e.de>,
Ilya Dryomov <idryomov@...il.com>
Cc: ceph-devel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] ceph: try to prevent exceeding the MDS maximum xattr
size
On 5/23/22 6:43 PM, Jeff Layton wrote:
> On Mon, 2022-05-23 at 09:47 +0800, Xiubo Li wrote:
>> On 5/20/22 7:54 PM, Luís Henriques wrote:
>>> The MDS tries to enforce a limit on the total key/values in extended
>>> attributes. However, this limit is enforced only if doing a synchronous
>>> operation (MDS_OP_SETXATTR) -- if we're buffering the xattrs, the MDS
>>> doesn't have a chance to enforce these limits.
>>>
>>> This patch forces the usage of the synchronous operation if xattrs size hits
>>> the maximum size that is set on the MDS by default (64k).
>>>
>>> While there, fix a dout() that would trigger a printk warning:
>>>
>>> [ 98.718078] ------------[ cut here ]------------
>>> [ 98.719012] precision 65536 too large
>>> [ 98.719039] WARNING: CPU: 1 PID: 3755 at lib/vsprintf.c:2703 vsnprintf+0x5e3/0x600
>>> ...
>>>
>>> URL: https://tracker.ceph.com/issues/55725
>>> Signed-off-by: Luís Henriques <lhenriques@...e.de>
>>> ---
>>> fs/ceph/xattr.c | 17 +++++++++++++----
>>> 1 file changed, 13 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c
>>> index afec84088471..09751a5f028c 100644
>>> --- a/fs/ceph/xattr.c
>>> +++ b/fs/ceph/xattr.c
>>> @@ -15,6 +15,12 @@
>>> #define XATTR_CEPH_PREFIX "ceph."
>>> #define XATTR_CEPH_PREFIX_LEN (sizeof (XATTR_CEPH_PREFIX) - 1)
>>>
>>> +/*
>>> + * Maximum size of xattrs the MDS can handle per inode by default. This
>>> + * includes the attribute name and 4+4 bytes for the key/value sizes.
>>> + */
>>> +#define MDS_MAX_XATTR_PAIRS_SIZE (1<<16) /* 64K */
>> The max size is changeable in MDS side. Possibly we should do something
>> as mentioned in your ceph PR [1].
>>
>> @Jeff, any better idea ?
>>
>>
>> [1]
>> https://github.com/ceph/ceph/pull/46357/commits/741f8ba36f14774834c0d5618519425ccf1ccc85#r878966753
>>
>> Thanks.
>>
>> -- Xiubo
>>
>>
> Not really.
>
> The idea in the github comment makes sense. Basically, make it so that
> the setting isn't changeable at runtime and then have the client query
> for the limit at appropriate times.
>
> You can probably still defeat that by changing it and rebooting the MDS,
> but I don't see that you can do much else.
Yeah, sounds good.
>
>>> +
>>> static int __remove_xattr(struct ceph_inode_info *ci,
>>> struct ceph_inode_xattr *xattr);
>>>
>>> @@ -1078,7 +1084,7 @@ static int ceph_sync_setxattr(struct inode *inode, const char *name,
>>> flags |= CEPH_XATTR_REMOVE;
>>> }
>>>
>>> - dout("setxattr value=%.*s\n", (int)size, value);
>>> + dout("setxattr value size: ld\n", size);
>>>
>>> /* do request */
>>> req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
>>> @@ -1176,8 +1182,13 @@ int __ceph_setxattr(struct inode *inode, const char *name,
>>> spin_lock(&ci->i_ceph_lock);
>>> retry:
>>> issued = __ceph_caps_issued(ci, NULL);
>>> - if (ci->i_xattrs.version == 0 || !(issued & CEPH_CAP_XATTR_EXCL))
>>> + required_blob_size = __get_required_blob_size(ci, name_len, val_len);
>>> + if ((ci->i_xattrs.version == 0) || !(issued & CEPH_CAP_XATTR_EXCL) ||
>>> + (required_blob_size >= MDS_MAX_XATTR_PAIRS_SIZE)) {
>>> + dout("%s do sync setxattr: version: %llu blob size: %d\n",
>>> + __func__, ci->i_xattrs.version, required_blob_size);
>>> goto do_sync;
>>> + }
>>>
>>> if (!lock_snap_rwsem && !ci->i_head_snapc) {
>>> lock_snap_rwsem = true;
>>> @@ -1193,8 +1204,6 @@ int __ceph_setxattr(struct inode *inode, const char *name,
>>> ceph_cap_string(issued));
>>> __build_xattrs(inode);
>>>
>>> - required_blob_size = __get_required_blob_size(ci, name_len, val_len);
>>> -
>>> if (!ci->i_xattrs.prealloc_blob ||
>>> required_blob_size > ci->i_xattrs.prealloc_blob->alloc_len) {
>>> struct ceph_buffer *blob;
>>>
Powered by blists - more mailing lists