netdev - Re: [RFC PATCH v3 01/19] landlock: Support socket access-control

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cd78c2c8-feb3-b7f1-90be-3f6ab3becc09@huawei-partners.com>
Date: Fri, 10 Jan 2025 19:55:10 +0300
From: Mikhail Ivanov <ivanov.mikhail1@...wei-partners.com>
To: Günther Noack <gnoack3000@...il.com>
CC: Mickaël Salaün <mic@...ikod.net>,
	Günther Noack <gnoack@...gle.com>,
	<willemdebruijn.kernel@...il.com>, <linux-security-module@...r.kernel.org>,
	<netdev@...r.kernel.org>, <netfilter-devel@...r.kernel.org>,
	<yusongping@...wei.com>, <artem.kuzin@...wei.com>,
	<konstantin.meskhidze@...wei.com>
Subject: Re: [RFC PATCH v3 01/19] landlock: Support socket access-control

On 1/10/2025 7:27 PM, Günther Noack wrote:
> On Fri, Jan 10, 2025 at 04:02:42PM +0300, Mikhail Ivanov wrote:
>> Correct, but we also agreed to use bitmasks for "family" field as well:
>>
>> https://lore.kernel.org/all/af72be74-50c7-d251-5df3-a2c63c73296a@huawei-partners.com/
> 
> OK
> 
> 
>> On 1/10/2025 2:12 PM, Günther Noack wrote:
>>> I do not understand why this convenience feature in the UAPI layer
>>> requires a change to the data structures that Landlock uses
>>> internally?  As far as I can tell, struct landlock_socket_attr is only
>>> used in syscalls.c and converted to other data structures already.  I
>>> would have imagined that we'd "unroll" the specified bitmasks into the
>>> possible combinations in the add_rule_socket() function and then call
>>> landlock_append_socket_rule() multiple times with each of these?
>>
>> I thought about unrolling bitmask into multiple entries in rbtree, and
>> came up with following possible issue:
>>
>> Imagine that a user creates a rule that allows to create sockets of all
>> possible families and types (with protocol=0 for example):
>> {
>> 	.allowed_access = LANDLOCK_ACCESS_SOCKET_CREATE,
>> 	.families = INT64_MAX, /* 64 set bits */
>> 	.types = INT16_MAX, /* 16 set bits */
>> 	.protocol = 0,
>> },
>>
>> This will add 64 x 16 = 1024 entries to the rbtree. Currently, the
>> struct landlock_rule, which is used to store rules, weighs 40B. So, it
>> will be 40kB by a single rule. Even if we allow rules with only
>> "existing" families and types, it will be 46 x 7 = 322 entries and ~12kB
>> by a single rule.
>>
>> I understand that this may be degenerate case and most common rule will
>> result in less then 8 (or 4) entries in rbtree, but I think API should
>> be as intuitive as possible. User can expect to see the same
>> memory usage regardless of the content of the rule.
>>
>> I'm not really sure if this case is really an issue, so I'd be glad
>> to hear your opinion on it.
> 
> I think there are two separate questions here:
> 
> (a) I think it is OK that it is *possible* to allocate 40kB of kernel
>      memory.  At least, this is already possible today by calling
>      landlock_add_rule() repeatedly.
> 
>      I assume that the GFP_KERNEL_ACCOUNT flag is limiting the
>      potential damage to the caller here?  That flag was added in the
>      Landlock v19 patch set [1] ("Account objects to kmemcg.").
> 
> (b) I agree it might be counterintuitive when a single
>      landlock_add_rule() call allocates more space than expected.
> 
> Mickaël, since it is already possible today (but harder), I assume
> that you have thought about this problem before -- is it a problem
> when users allocate more kernel memory through Landlock than they
> expected?
> 
> 
> Naive proposal:
> 
> If this is an issue, how about we set a low limit to the number of
> families and the number of types that can be used in a single
> landlock_add_rule() invocation?  (It tends to be easier to start with
> a restrictive API and open it up later than the other way around.)

Looks like a good approach! Better to return with an error (which almost
always won't happen) and let the user to refactor the code than to
allow ruleset to eat an unexpected amount of memory.

> 
> For instance,
> 
> * In the families field, at most 2 bits may be set.
> * In the types field, at most 2 bits may be set.

Or we can say that rule can contain not more than 4 combinations of
family and type.

BTW, 4 seems to be sufficient, at least for IP protocols.

> 
> In my understanding, the main use case of the bit vectors is that
> there is a short way to say "all IPv4+v6 stream+dgram sockets", but we
> do not know scenarios where much more than that is needed?  With that,
> we would still keep people from accidentally allocating larger amounts
> of memory, while permitting the main use case.

Agreed

> 
> Having independent limits for the family and type fields is a bit
> easier to understand and document than imposing a limit on the
> multiplication result.
> 
>>> That being said, I am not a big fan of red-black trees for such simple
>>> integer lookups either, and I also think there should be something
>>> better if we make more use of the properties of the input ranges. The
>>> question is though whether you want to couple that to this socket type
>>> patch set, or rather do it in a follow up?  (So far we have been doing
>>> fine with the red black trees, and we are already contemplating the
>>> possibility of changing these internal structures in [2].  We have
>>> also used RB trees for the "port" rules with a similar reasoning,
>>> IIRC.)
>>
>> I think it'll be better to have a separate series for [2] if the socket
>> restriction can be implemented without rbtree refactoring.
> 
> Sounds good to me. 👍
> 
> –Günther
> 
> [1] https://lore.kernel.org/all/20200707180955.53024-2-mic@digikod.net/