[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4BzaJiCLH8nwWa5eM4D+n1nyCn3X-v0+W4-CwLg7hB2Wthg@mail.gmail.com>
Date: Wed, 16 Jul 2025 13:47:14 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Bhupesh Sharma <bhsharma@...lia.com>
Cc: Bhupesh <bhupesh@...lia.com>, akpm@...ux-foundation.org, kernel-dev@...lia.com,
linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
linux-perf-users@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, oliver.sang@...el.com, lkp@...el.com,
laoar.shao@...il.com, pmladek@...e.com, rostedt@...dmis.org,
mathieu.desnoyers@...icios.com, arnaldo.melo@...il.com,
alexei.starovoitov@...il.com, mirq-linux@...e.qmqm.pl, peterz@...radead.org,
willy@...radead.org, david@...hat.com, viro@...iv.linux.org.uk,
keescook@...omium.org, ebiederm@...ssion.com, brauner@...nel.org,
jack@...e.cz, mingo@...hat.com, juri.lelli@...hat.com, bsegall@...gle.com,
mgorman@...e.de, vschneid@...hat.com, linux-trace-kernel@...r.kernel.org,
kees@...nel.org, torvalds@...ux-foundation.org
Subject: Re: [PATCH v5 3/3] treewide: Switch from tsk->comm to tsk->comm_str
which is 64 bytes long
On Wed, Jul 16, 2025 at 1:18 PM Bhupesh Sharma <bhsharma@...lia.com> wrote:
>
> On 7/16/25 11:40 PM, Andrii Nakryiko wrote:
> > On Wed, Jul 16, 2025 at 5:40 AM Bhupesh <bhupesh@...lia.com> wrote:
> >> Historically due to the 16-byte length of TASK_COMM_LEN, the
> >> users of 'tsk->comm' are restricted to use a fixed-size target
> >> buffer also of TASK_COMM_LEN for 'memcpy()' like use-cases.
> >>
> >> To fix the same, Kees suggested in [1] that we can replace tsk->comm,
> >> with tsk->comm_str, inside 'task_struct':
> >> union {
> >> char comm_str[TASK_COMM_EXT_LEN];
> >> };
> >>
> >> where TASK_COMM_EXT_LEN is 64-bytes.
> > Do we absolutely have to rename task->comm field? I understand as an
> > intermediate step to not miss any existing users in the kernel
> > sources, but once that all is done, we can rename that back to comm,
> > right?
> >
> > The reason I'm asking is because accessing task->comm is *very common*
> > with all sorts of BPF programs and scripts. Yes, we have way to deal
> > with that with BPF CO-RE, but every single use case would need to be
> > updated now to work both with task->comm name on old kernels and
> > task->comm_str on new kernels (because BPF programs are written in
> > more or less kernel version agnostic way, so they have to handle many
> > kernel releases).
> >
> > So, unless absolutely necessary, can we please keep the field name the
> > same? Changing the size of that field is not really a problem for BPF,
> > so no objections against that.
>
> So, as a background we have had several previous discussions regarding
> renaming the 'tsk->comm' to 'task->comm_str' on the list (please see [1]
> and [2] for details), and as per Kees's recommendations we have taken
> this approach in the v5 patchset (may be Kees can add further details if
> I have missed adding something in the log message above).
>
> That being said, ideally one would not like to break any exiting ABI
kernel-internal data structures are not ABI to BPF programs, even
though they do look into kernel internals routinely for observability
and other reasons, so there is no ABI/API stability issue here.
Having said that, unless absolutely necessary, renaming something so
commonly used as task->comm is just an unnecessary widespread
disruption, which ideally we just avoid, unless absolutely necessary.
> (which includes existing / older BPF programs). I was having a look at
> the BPF CO_RE reference guide (see [3]), and was able to make out that
> BPF CO_RE has a definition of |s||truct task_struct|which doesn't need
> to match the kernel's struct task_struct definition exactly (as [3]
> mentions -> only a necessary subset of fields have to be present and
> compatible):
>
> |struct task_struct { intpid; charcomm[16]; struct
> task_struct*group_leader; } __attribute__((preserve_access_index)); |
>
> So, if we add a check here to add '|charcomm[16]' or||charcomm_str[16]'
> to BPF CO RE's internal 'struct task_struct' on basis of the underlying
> kernel version being used (something like using 'KERNEL_VERSION(x, y,
> 0)' for example), will that suffice? I have ||used and seen these checks
> being used in the user-space applications (for example, see [4]) at
> several occasions.
>
It's simpler on BPF side, we can check the existence of task_struct's
comm field, and handle either task->comm or task->comm_str
accordingly. This has been done many times for various things that
evolved in the kernel.
But given how frequently task->comm is referenced (pretty much any
profiler or tracer will capture this), it's just the widespread nature
of accessing task->comm in BPF programs/scripts that will cause a lot
of adaptation churn. And given the reason for renaming was to catch
missing cases during refactoring, my ask was to do this renaming
locally, validate all kernel code was modified, and then switch the
field name back to "comm" (which you already did, so the remaining
part would be just to rename comm_str back to comm). Unless I'm
missing something else, of course.
> Please let me know your views.
>
> |[1]. https://lore.kernel.org/all/202505222041.B639D482FB@keescook/
> [2].
> https://lore.kernel.org/all/ba4ddf27-91e7-0ecc-95d5-c139f6978812@igalia.com/
> [3]. https://nakryiko.com/posts/bpf-core-reference-guide/
> [4].
> https://github.com/crash-utility/crash/blob/master/memory_driver/crash.c#L41C25-L41C49
>
> Thanks.
>
> >> And then modify 'get_task_comm()' to pass 'tsk->comm_str'
> >> to the existing users.
> >>
> >> This would mean that ABI is maintained while ensuring that:
> >>
> >> - Existing users of 'get_task_comm'/ 'set_task_comm' will get 'tsk->comm_str'
> >> truncated to a maximum of 'TASK_COMM_LEN' (16-bytes) to maintain ABI,
> >> - New / Modified users of 'get_task_comm'/ 'set_task_comm' will get
> >> 'tsk->comm_str' supported for a maximum of 'TASK_COMM_EXTLEN' (64-bytes).
> >>
> >> Note, that the existing users have not been modified to migrate to
> >> 'TASK_COMM_EXT_LEN', in case they have hard-coded expectations of
> >> dealing with only a 'TASK_COMM_LEN' long 'tsk->comm_str'.
> >>
> >> [1]. https://lore.kernel.org/all/202505231346.52F291C54@keescook/
> >>
> >> Signed-off-by: Bhupesh <bhupesh@...lia.com>
> >> ---
> >> arch/arm64/kernel/traps.c | 2 +-
> >> arch/arm64/kvm/mmu.c | 2 +-
> >> block/blk-core.c | 2 +-
> >> block/bsg.c | 2 +-
> >> drivers/char/random.c | 2 +-
> >> drivers/hid/hid-core.c | 6 +++---
> >> drivers/mmc/host/tmio_mmc_core.c | 6 +++---
> >> drivers/pci/pci-sysfs.c | 2 +-
> >> drivers/scsi/scsi_ioctl.c | 2 +-
> >> drivers/tty/serial/serial_core.c | 2 +-
> >> drivers/tty/tty_io.c | 8 ++++----
> >> drivers/usb/core/devio.c | 16 ++++++++--------
> >> drivers/usb/core/message.c | 2 +-
> >> drivers/vfio/group.c | 2 +-
> >> drivers/vfio/vfio_iommu_type1.c | 2 +-
> >> drivers/vfio/vfio_main.c | 2 +-
> >> drivers/xen/evtchn.c | 2 +-
> >> drivers/xen/grant-table.c | 2 +-
> >> fs/binfmt_elf.c | 2 +-
> >> fs/coredump.c | 4 ++--
> >> fs/drop_caches.c | 2 +-
> >> fs/exec.c | 8 ++++----
> >> fs/ext4/dir.c | 2 +-
> >> fs/ext4/inode.c | 2 +-
> >> fs/ext4/namei.c | 2 +-
> >> fs/ext4/super.c | 12 ++++++------
> >> fs/hugetlbfs/inode.c | 2 +-
> >> fs/ioctl.c | 2 +-
> >> fs/iomap/direct-io.c | 2 +-
> >> fs/jbd2/transaction.c | 2 +-
> >> fs/locks.c | 2 +-
> >> fs/netfs/internal.h | 2 +-
> >> fs/proc/base.c | 2 +-
> >> fs/read_write.c | 2 +-
> >> fs/splice.c | 2 +-
> >> include/linux/coredump.h | 2 +-
> >> include/linux/filter.h | 2 +-
> >> include/linux/ratelimit.h | 2 +-
> >> include/linux/sched.h | 11 ++++++++---
> >> init/init_task.c | 2 +-
> >> ipc/sem.c | 2 +-
> >> kernel/acct.c | 2 +-
> >> kernel/audit.c | 4 ++--
> >> kernel/auditsc.c | 10 +++++-----
> >> kernel/bpf/helpers.c | 2 +-
> >> kernel/capability.c | 4 ++--
> >> kernel/cgroup/cgroup-v1.c | 2 +-
> >> kernel/cred.c | 4 ++--
> >> kernel/events/core.c | 2 +-
> >> kernel/exit.c | 6 +++---
> >> kernel/fork.c | 9 +++++++--
> >> kernel/freezer.c | 4 ++--
> >> kernel/futex/waitwake.c | 2 +-
> >> kernel/hung_task.c | 10 +++++-----
> >> kernel/irq/manage.c | 2 +-
> >> kernel/kthread.c | 2 +-
> >> kernel/locking/rtmutex.c | 2 +-
> >> kernel/printk/printk.c | 2 +-
> >> kernel/sched/core.c | 22 +++++++++++-----------
> >> kernel/sched/debug.c | 4 ++--
> >> kernel/signal.c | 6 +++---
> >> kernel/sys.c | 6 +++---
> >> kernel/sysctl.c | 2 +-
> >> kernel/time/itimer.c | 4 ++--
> >> kernel/time/posix-cpu-timers.c | 2 +-
> >> kernel/tsacct.c | 2 +-
> >> kernel/workqueue.c | 6 +++---
> >> lib/dump_stack.c | 2 +-
> >> lib/nlattr.c | 6 +++---
> >> mm/compaction.c | 2 +-
> >> mm/filemap.c | 4 ++--
> >> mm/gup.c | 2 +-
> >> mm/memfd.c | 2 +-
> >> mm/memory-failure.c | 10 +++++-----
> >> mm/memory.c | 2 +-
> >> mm/mmap.c | 4 ++--
> >> mm/oom_kill.c | 18 +++++++++---------
> >> mm/page_alloc.c | 4 ++--
> >> mm/util.c | 2 +-
> >> net/core/sock.c | 2 +-
> >> net/dns_resolver/internal.h | 2 +-
> >> net/ipv4/raw.c | 2 +-
> >> net/ipv4/tcp.c | 2 +-
> >> net/socket.c | 2 +-
> >> security/lsm_audit.c | 4 ++--
> >> 85 files changed, 171 insertions(+), 161 deletions(-)
> >>
> > [...]
>
Powered by blists - more mailing lists