Message-ID: <d271bdc1-f95a-01e2-7a0f-b6511aee2d58@redhat.com>
Date: Thu, 2 Jun 2022 18:57:28 +0800
From: Xiubo Li <xiubli@...hat.com>
To: Luís Henriques <lhenriques@...e.de>
Cc: Jeff Layton <jlayton@...nel.org>,
Ilya Dryomov <idryomov@...il.com>,
Gregory Farnum <gfarnum@...hat.com>,
ceph-devel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v3] ceph: prevent a client from exceeding the MDS
maximum xattr size
On 6/2/22 6:28 PM, Luís Henriques wrote:
> Xiubo Li <xiubli@...hat.com> writes:
>
>> On 6/2/22 5:26 PM, Luís Henriques wrote:
>>> Xiubo Li <xiubli@...hat.com> writes:
>>>
>>>> On 6/2/22 12:29 AM, Luís Henriques wrote:
>>>>> The MDS tries to enforce a limit on the total key/values in extended
>>>>> attributes. However, this limit is enforced only if doing a synchronous
>>>>> operation (MDS_OP_SETXATTR) -- if we're buffering the xattrs, the MDS
>>>>> doesn't have a chance to enforce these limits.
>>>>>
>>>>> This patch adds support for decoding the xattrs maximum size setting that is
>>>>> distributed in the mdsmap. Then, when setting an xattr, the kernel client
>>>>> will revert to do a synchronous operation if that maximum size is exceeded.
>>>>>
>>>>> While there, fix a dout() that would trigger a printk warning:
>>>>>
>>>>> [ 98.718078] ------------[ cut here ]------------
>>>>> [ 98.719012] precision 65536 too large
>>>>> [ 98.719039] WARNING: CPU: 1 PID: 3755 at lib/vsprintf.c:2703 vsnprintf+0x5e3/0x600
>>>>> ...
>>>>>
>>>>> URL: https://tracker.ceph.com/issues/55725
>>>>> Signed-off-by: Luís Henriques <lhenriques@...e.de>
>>>>> ---
>>>>> fs/ceph/mdsmap.c | 27 +++++++++++++++++++++++----
>>>>> fs/ceph/xattr.c | 12 ++++++++----
>>>>> include/linux/ceph/mdsmap.h | 1 +
>>>>> 3 files changed, 32 insertions(+), 8 deletions(-)
>>>>>
>>>>> * Changes since v2
>>>>>
>>>>> Well, a lot has changed since v2! Now the xattr max value setting is
>>>>> obtained through the mdsmap, which needs to be decoded, and the feature
>>>>> that was used in the previous revision was dropped. The drawback is that
>>>>> the MDS is unable to know in advance if a client is aware of this xattr
>>>>> max value.
>>>>>
>>>>> * Changes since v1
>>>>>
>>>>> Added support for new feature bit to get the MDS max_xattr_pairs_size
>>>>> setting.
>>>>>
>>>>> Also note that this patch relies on a patch that hasn't been merged yet
>>>>> ("ceph: use correct index when encoding client supported features"),
>>>>> otherwise the new feature bit won't be correctly encoded.
>>>>>
>>>>> diff --git a/fs/ceph/mdsmap.c b/fs/ceph/mdsmap.c
>>>>> index 30387733765d..36b2bc18ca2a 100644
>>>>> --- a/fs/ceph/mdsmap.c
>>>>> +++ b/fs/ceph/mdsmap.c
>>>>> @@ -13,6 +13,12 @@
>>>>> #include "super.h"
>>>>> +/*
>>>>> + * Maximum size of xattrs the MDS can handle per inode by default. This
>>>>> + * includes the attribute name and 4+4 bytes for the key/value sizes.
>>>>> + */
>>>>> +#define MDS_MAX_XATTR_SIZE (1<<16) /* 64K */
>>>>> +
>>>>> #define CEPH_MDS_IS_READY(i, ignore_laggy) \
>>>>> (m->m_info[i].state > 0 && ignore_laggy ? true : !m->m_info[i].laggy)
>>>>> @@ -352,12 +358,10 @@ struct ceph_mdsmap *ceph_mdsmap_decode(void **p, void *end, bool msgr2)
>>>>> __decode_and_drop_type(p, end, u8, bad_ext);
>>>>> }
>>>>> if (mdsmap_ev >= 8) {
>>>>> - u32 name_len;
>>>>> /* enabled */
>>>>> ceph_decode_8_safe(p, end, m->m_enabled, bad_ext);
>>>>> - ceph_decode_32_safe(p, end, name_len, bad_ext);
>>>>> - ceph_decode_need(p, end, name_len, bad_ext);
>>>>> - *p += name_len;
>>>>> + /* fs_name */
>>>>> + ceph_decode_skip_string(p, end, bad_ext);
>>>>> }
>>>>> /* damaged */
>>>>> if (mdsmap_ev >= 9) {
>>>>> @@ -370,6 +374,21 @@ struct ceph_mdsmap *ceph_mdsmap_decode(void **p, void *end, bool msgr2)
>>>>> } else {
>>>>> m->m_damaged = false;
>>>>> }
>>>>> + if (mdsmap_ev >= 17) {
>>>>> + /* balancer */
>>>>> + ceph_decode_skip_string(p, end, bad_ext);
>>>>> + /* standby_count_wanted */
>>>>> + ceph_decode_skip_32(p, end, bad_ext);
>>>>> + /* old_max_mds */
>>>>> + ceph_decode_skip_32(p, end, bad_ext);
>>>>> + /* min_compat_client */
>>>>> + ceph_decode_skip_8(p, end, bad_ext);
>>>> This is incorrect.
>>>>
>>>> If mdsmap_ev == 15 the min_compat_client will be a feature_bitset_t instead of
>>>> int8_t.
>>> Hmm... can you point me at where that's done in the code? As usual, I'm
>>> confused with that code and simply can't see that.
>>>
>>> Also, if that happens only when mdsmap_ev == 15, then there's no problem
>>> because that branch is only taken if it's >= 17.
>> Yeah, so you should skip 32 or 32+64 bits here instead, just like:
>>
>> 	/* version >= 3, feature bits */
>> 	ceph_decode_32_safe(&p, end, len, bad);
>> 	if (len) {
>> 		ceph_decode_64_safe(&p, end, features, bad);
>> 		p += len - sizeof(features);
>> 	}
>>
>> For the ceph code please see:
>>
>> https://github.com/ceph/ceph/blob/main/src/mds/MDSMap.cc#L925
> I still don't see what you're saying. From what I understand, with <= 15 we
> used to have 'min_compat_client', which is of type 'ceph_release_t',
> defined in src/common/ceph_releases.h:
>
> enum class ceph_release_t : std::uint8_t {
> ...
> }
Okay, you are right.
I misread that code.
-- Xiubo
> Then, starting with >= 16 the MDS ignores this 'min_compat_client' field
> (but still encodes/decodes it), and it *adds* 'required_client_features',
> which is a 'feature_bitset_t' and that is decoded immediately after (see
> below, the ceph_decode_skip_set() call).
>
> Cheers,
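
For reference, the layout described above (a one-byte 'min_compat_client' followed by 'required_client_features') can be sketched in plain user-space C. This is only an illustration, not the kernel code: 'struct cursor', skip_bytes(), get_le32() and skip_compat_fields() are hypothetical names, and the bitset layout (a little-endian 32-bit byte length followed by that many bytes) is an assumption taken from the feature-bits snippet quoted earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over an encoded buffer; not a kernel type. */
struct cursor {
	const uint8_t *p;
	const uint8_t *end;
};

/* Advance by n bytes, or fail if fewer than n bytes remain. */
static int skip_bytes(struct cursor *c, size_t n)
{
	assert(c->p <= c->end);
	if ((size_t)(c->end - c->p) < n)
		return -1;
	c->p += n;
	return 0;
}

/* Read a little-endian u32 with a bounds check. */
static int get_le32(struct cursor *c, uint32_t *v)
{
	assert(c->p <= c->end);
	if ((size_t)(c->end - c->p) < 4)
		return -1;
	*v = (uint32_t)c->p[0] |
	     ((uint32_t)c->p[1] << 8) |
	     ((uint32_t)c->p[2] << 16) |
	     ((uint32_t)c->p[3] << 24);
	c->p += 4;
	return 0;
}

/*
 * Skip min_compat_client (one byte), then required_client_features.
 * Assumed bitset layout: u32 byte length followed by the payload.
 */
static int skip_compat_fields(struct cursor *c)
{
	uint32_t len;

	if (skip_bytes(c, 1))		/* min_compat_client */
		return -1;
	if (get_le32(c, &len))		/* bitset length in bytes */
		return -1;
	return skip_bytes(c, len);	/* bitset payload */
}
```

The point of the sketch is that both fields are consumed back to back, so skipping a single byte for 'min_compat_client' is consistent as long as the bitset is skipped right after it. Every step is bounds-checked, so a truncated buffer yields an error instead of reading past the end.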