[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ba4ddf27-91e7-0ecc-95d5-c139f6978812@igalia.com>
Date: Mon, 30 Jun 2025 13:28:25 +0530
From: Bhupesh Sharma <bhsharma@...lia.com>
To: Kees Cook <kees@...nel.org>
Cc: Bhupesh <bhupesh@...lia.com>, akpm@...ux-foundation.org,
kernel-dev@...lia.com, linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
linux-perf-users@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, oliver.sang@...el.com, lkp@...el.com,
laoar.shao@...il.com, pmladek@...e.com, rostedt@...dmis.org,
mathieu.desnoyers@...icios.com, arnaldo.melo@...il.com,
alexei.starovoitov@...il.com, andrii.nakryiko@...il.com,
mirq-linux@...e.qmqm.pl, peterz@...radead.org, willy@...radead.org,
david@...hat.com, viro@...iv.linux.org.uk, ebiederm@...ssion.com,
brauner@...nel.org, jack@...e.cz, mingo@...hat.com, juri.lelli@...hat.com,
bsegall@...gle.com, mgorman@...e.de, vschneid@...hat.com,
linux-trace-kernel@...r.kernel.org, torvalds@...ux-foundation.org,
akpm@...ux-foundation.org
Subject: Re: [PATCH v4 3/3] exec: Add support for 64 byte 'tsk->comm_ext'
On 5/26/25 4:43 PM, Bhupesh Sharma wrote:
> Hi Kees,
>
> On 5/24/25 2:25 AM, Kees Cook wrote:
>> On Fri, May 23, 2025 at 06:01:41PM +0530, Bhupesh Sharma wrote:
>>> 2. %s usage: I checked this at multiple places and can confirm that
>>> %s usage
>>> to print out 'tsk->comm' (as a string), get the longer
>>> new "extended comm".
>> As an example of why I don't like this union is that this is now lying
>> to the compiler. e.g. a %s of an object with a known size (sizeof(comm))
>> may now run off the end of comm without finding a %NUL character... this
>> is "safe" in the sense that the "extended comm" is %NUL terminated, but
>> it makes the string length ambiguous for the compiler (and any
>> associated security hardening).
>
> Right.
>
>>
>>> 3. users who do 'sizeof(->comm)' will continue to get the old value
>>> because
>>> of the union.
>> Right -- this is exactly where I think it can get very very wrong,
>> leaving things unterminated.
>>
>>> The problem with having two separate comms: tsk->comm and
>>> tsk->ext_comm,
>>> instead of a union is two fold:
>>> (a). If we keep two separate statically allocated comms: tsk->comm and
>>> tsk->ext_comm in struct task_struct, we need to basically keep
>>> supporting
>>> backward compatibility / ABI via tsk->comm and ask new user-land
>>> users to
>>> move to tsk->ext_comm.
>>>
>>> (b). If we keep one statically allocated comm: tsk->comm and one
>>> dynamically allocated tsk->ext_comm in struct task_struct, then we
>>> have the problem of allocating the tsk->ext_comm which _may_ be in
>>> the exec() hot path.
>>>
>>> I think the discussion between Linus and Yafang (see [1]), was more
>>> towards avoiding the approach in 3(a).
>>>
>>> Also we discussed the 3(b) approach, during the review of v2 of this
>>> series, where there was a apprehensions around: adding another field
>>> to store the task name and allocating tsk->ext_comm dynamically in
>>> the exec() hot path (see [2]).
>> Right -- I agree we need them statically allocated. But I think a union
>> is going to be really error-prone.
>>
>> How about this: rename task->comm to something else (task->comm_str?),
>> increase its size and then add ABI-keeping wrappers for everything that
>> _must_ have the old length.
>>
>> Doing this guarantees we won't miss anything (since "comm" got renamed),
>> and during the refactoring all the places where the old length is
>> required
>> will be glaringly obvious. (i.e. it will be harder to make mistakes
>> about leaving things unterminated.)
>>
>
> Ok, I got your point. Let me explore then how best a ABI-keeping
> wrapper can be introduced.
> I am thinking of something like:
>
> abi_wrapper_get_task_comm {
>
> if (requested_comm_length <= 16)
> return 16byte comm with NUL terminator; // old comm (16-bytes)
> else
> return 64byte comm with NUL terminator; // extended comm
> (64-bytes)
> ....
> }
>
> Please let me know if this looks better. Accordingly I will start with
> v5 changes.
Hi Everyone, sorry for the delay but I wanted the revive this discussion
after the -rc1 and my PTO.
I am looking for suggestions on how to implement v5 for this series.
Here is some background of the version (and related discussions so far):
In the v4, the implementation for tsk->comm handling (for supporting
long 64byte task names) looked at handling the possible use-cases as
follows:
1. memcpy() users: Handled by [PATCH 2/3] of this series, where we
identify existing users using the following search
pattern:
$ git grep 'memcpy.*->comm\>'
2. %s usage: I checked this at multiple places and can confirm that %s
usage to print out 'tsk->comm' (as a string), get the longer
new "extended comm".
3. users who do 'sizeof(->comm)' will continue to get the old value
because of the union.
The above points were taken to address the points discussed earlier
between Linus and Yafang (see [1])
As Kees, suggested in the v4 review (see [2]):
1. Let's rename task->comm to something else (task->comm_str?) and
increase its size, and
2. Then add ABI-keeping wrappers for everything that _must_ have the
old length.
I am thinking of implementing it with something like:
abi_wrapper_get_task_comm {
if (requested_comm_length <= 16)
return 16byte comm with NUL terminator; // old comm (16-bytes)
else
return 64byte comm with NUL terminator; // extended comm
(64-bytes)
....
}
Kindly let me know your views on the above approach(es).
[1].
https://lore.kernel.org/all/CAHk-=wjAmmHUg6vho1KjzQi2=psR30+CogFd4aXrThr2gsiS4g@mail.gmail.com/
[2]. https://lore.kernel.org/all/202505231346.52F291C54@keescook/
Thanks.
Powered by blists - more mailing lists