lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z3fFkHCPl_68hN4H@krava>
Date: Fri, 3 Jan 2025 12:10:08 +0100
From: Jiri Olsa <olsajiri@...il.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
	linux-kbuild@...r.kernel.org, bpf <bpf@...r.kernel.org>,
	Masami Hiramatsu <mhiramat@...nel.org>,
	Mark Rutland <mark.rutland@....com>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Masahiro Yamada <masahiroy@...nel.org>,
	Nathan Chancellor <nathan@...nel.org>,
	Nicolas Schier <nicolas@...sle.eu>,
	Zheng Yejian <zhengyejian1@...wei.com>,
	Martin Kelly <martin.kelly@...wdstrike.com>,
	Christophe Leroy <christophe.leroy@...roup.eu>,
	Josh Poimboeuf <jpoimboe@...hat.com>
Subject: Re: [PATCH 14/14] scripts/sorttable: ftrace: Do not add weak
 functions to available_filter_functions

On Thu, Jan 02, 2025 at 01:58:59PM -0500, Steven Rostedt wrote:
> From: Steven Rostedt <rostedt@...dmis.org>
> 
> When a function is annotated as "weak" and is overridden, the code is not
> removed. If it is traced, the fentry/mcount location in the weak function
> will be referenced by the "__mcount_loc" section. This will then be added
> to the available_filter_functions list. Since only the address of the
> functions are listed, to find the name to show, a search of kallsyms is
> used.
> 
> Since kallsyms will return the function by simply finding the function
> that the address is after but before the next function, an address of a
> weak function will show up as the function before it. This is because
> kallsyms does not save names of weak functions. This has caused issues in
> the past, as now the traced weak function will be listed in
> available_filter_functions with the name of the function before it.
> 
> At best, this will cause the previous function's name to be listed twice.
> At worse, if the previous function was marked notrace, it will now show up
> as a function that can be traced. Note that it only shows up that it can
> be traced but will not be if enabled, which causes confusion.
> 
>  https://lore.kernel.org/all/20220412094923.0abe90955e5db486b7bca279@kernel.org/
> 
> The commit b39181f7c6907 ("ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid
> adding weak function") was a workaround to this by checking the function
> address before printing its name. If the address was too far from the
> function given by the name then instead of printing the name it would
> print: __ftrace_invalid_address___<invalid-offset>
> 
> The real issue is that these invalid addresses are listed in the ftrace
> table look up which available_filter_functions is derived from. A place
> holder must be listed in that file because set_ftrace_filter may take a
> series of indexes into that file instead of names to be able to do O(1)
> lookups to enable filtering (many tools use this method).
> 
> Even if kallsyms saved the size of the function, it does not remove the
> need of having these place holders. The real solution is to not add a weak
> function into the ftrace table in the first place.
> 
> To solve this, the sorttable.c code that sorts the mcount regions during
> the build is modified to take a "nm -S vmlinux" input, sort it, and any
> function listed in the mcount_loc section that is not within a boundary of
> the function list given by nm is considered a weak function and is zeroed
> out. Note, this does not mean they will remain zero when booting as KASLR
> will still shift those addresses.

hi,
fyi this seems to remove several functions from available_filter_functions,
that bpf relay on.. like update_socket_protocol or bpf_rstat_flush:

	__bpf_hook_start();

	__weak noinline int update_socket_protocol(int family, int type, int protocol)
	{
		return protocol;
	}

	__bpf_hook_end();


	[root@...u-1 tracing]# cat available_filter_functions | grep update_socket_protocol
	[root@...u-1 tracing]# cat /proc/kallsyms | grep update_socket_protocol
	ffffffff821d58b0 W __pfx_update_socket_protocol
	ffffffff821d58c0 W update_socket_protocol

not sure why that fits the condition above for removal

jirka


> 
> On boot up, when the ftrace table is created from the mcount_loc section,
> it will skip any address that matches kaslr_offset(). This stops the weak
> functions from ever being added to the ftrace table and also keeps from
> needing place holders in available_filter_functions.
> 
> Before:
> 
>  ~# grep __ftrace_invalid_address___ /sys/kernel/tracing/available_filter_functions | wc -l
>  556
> 
> After:
> 
>  ~# grep __ftrace_invalid_address___ /sys/kernel/tracing/available_filter_functions | wc -l
>  0
> 
> Signed-off-by: Steven Rostedt (Google) <rostedt@...dmis.org>
> ---
>  kernel/trace/ftrace.c   |  14 +++++
>  scripts/link-vmlinux.sh |   4 +-
>  scripts/sorttable.c     | 131 +++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 146 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 9b17efb1a87d..5963ae76b31a 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -7077,6 +7077,20 @@ static int ftrace_process_locs(struct module *mod,
>  			continue;
>  		}
>  
> +		/*
> +		 * At build time, a check is made against: nm -S vmlinux
> +		 * to make sure all functions are found within the
> +		 * size range of symbols listed by nm. If not, it's likely
> +		 * a weak function that was overridden. We do not want those.
> +		 * The script will zero them out, but kaslr will still
> +		 * update them. If the address is the same as the kaslr_offset()
> +		 * then skip the record.
> +		 */
> +		if (addr == kaslr_offset()) {
> +			skipped++;
> +			continue;
> +		}
> +
>  		end_offset = (pg->index+1) * sizeof(pg->records[0]);
>  		if (end_offset > PAGE_SIZE << pg->order) {
>  			/* We should have allocated enough */
> diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> index d853ddb3b28c..976808c46665 100755
> --- a/scripts/link-vmlinux.sh
> +++ b/scripts/link-vmlinux.sh
> @@ -177,12 +177,14 @@ mksysmap()
>  
>  sorttable()
>  {
> -	${objtree}/scripts/sorttable ${1}
> +	${NM} -S ${1} > .tmp_vmlinux.nm-sort
> +	${objtree}/scripts/sorttable -s .tmp_vmlinux.nm-sort ${1}
>  }
>  
>  cleanup()
>  {
>  	rm -f .btf.*
> +	rm -f .tmp_vmlinux.nm-sort
>  	rm -f System.map
>  	rm -f vmlinux
>  	rm -f vmlinux.map
> diff --git a/scripts/sorttable.c b/scripts/sorttable.c
> index da9e1a82e886..d1d52bd12adb 100644
> --- a/scripts/sorttable.c
> +++ b/scripts/sorttable.c
> @@ -446,6 +446,98 @@ static void *sort_orctable(void *arg)
>  #endif
>  
>  #ifdef MCOUNT_SORT_ENABLED
> +struct func_info {
> +	uint64_t	addr;
> +	uint64_t	size;
> +};
> +
> +/* List of functions created by: nm -S vmlinux */
> +static struct func_info *function_list;
> +static int function_list_size;
> +
> +/* Allocate functions in 1k blocks */
> +#define FUNC_BLK_SIZE	1024
> +#define FUNC_BLK_MASK	(FUNC_BLK_SIZE - 1)
> +
> +static int add_field(uint64_t addr, uint64_t size)
> +{
> +	struct func_info *fi;
> +	int fsize = function_list_size;
> +
> +	if (!(fsize & FUNC_BLK_MASK)) {
> +		fsize += FUNC_BLK_SIZE;
> +		fi = realloc(function_list, fsize * sizeof(struct func_info));
> +		if (!fi)
> +			return -1;
> +		function_list = fi;
> +	}
> +	fi = &function_list[function_list_size++];
> +	fi->addr = addr;
> +	fi->size = size;
> +	return 0;
> +}
> +
> +/* Only return match if the address lies inside the function size */
> +static int cmp_func_addr(const void *K, const void *A)
> +{
> +	uint64_t key = *(const uint64_t *)K;
> +	const struct func_info *a = A;
> +
> +	if (key < a->addr)
> +		return -1;
> +	return key >= a->addr + a->size;
> +}
> +
> +/* Find the function in function list that is bounded by the function size */
> +static int find_func(uint64_t key)
> +{
> +	return bsearch(&key, function_list, function_list_size,
> +		       sizeof(struct func_info), cmp_func_addr) != NULL;
> +}
> +
> +static int cmp_funcs(const void *A, const void *B)
> +{
> +	const struct func_info *a = A;
> +	const struct func_info *b = B;
> +
> +	if (a->addr < b->addr)
> +		return -1;
> +	return a->addr > b->addr;
> +}
> +
> +static int parse_symbols(const char *fname)
> +{
> +	FILE *fp;
> +	char addr_str[20]; /* Only need 17, but round up to next int size */
> +	char size_str[20];
> +	char type;
> +
> +	fp = fopen(fname, "r");
> +	if (!fp) {
> +		perror(fname);
> +		return -1;
> +	}
> +
> +	while (fscanf(fp, "%16s %16s %c %*s\n", addr_str, size_str, &type) == 3) {
> +		uint64_t addr;
> +		uint64_t size;
> +
> +		/* Only care about functions */
> +		if (type != 't' && type != 'T')
> +			continue;
> +
> +		addr = strtoull(addr_str, NULL, 16);
> +		size = strtoull(size_str, NULL, 16);
> +		if (add_field(addr, size) < 0)
> +			return -1;
> +	}
> +	fclose(fp);
> +
> +	qsort(function_list, function_list_size, sizeof(struct func_info), cmp_funcs);
> +
> +	return 0;
> +}
> +
>  static pthread_t mcount_sort_thread;
>  
>  struct elf_mcount_loc {
> @@ -464,6 +556,23 @@ static void *sort_mcount_loc(void *arg)
>  	uint64_t count = emloc->stop_mcount_loc - emloc->start_mcount_loc;
>  	unsigned char *start_loc = (void *)emloc->ehdr + offset;
>  
> +	/* zero out any locations not found by function list */
> +	if (function_list_size) {
> +		void *end_loc = start_loc + count;
> +
> +		for (void *ptr = start_loc; ptr < end_loc; ptr += long_size) {
> +			uint64_t key;
> +
> +			key = long_size == 4 ? r((uint32_t *)ptr) : r8((uint64_t *)ptr);
> +			if (!find_func(key)) {
> +				if (long_size == 4)
> +					*(uint32_t *)ptr = 0;
> +				else
> +					*(uint64_t *)ptr = 0;
> +			}
> +		}
> +	}
> +
>  	qsort(start_loc, count/long_size, long_size, compare_extable);
>  	return NULL;
>  }
> @@ -504,7 +613,10 @@ static void get_mcount_loc(uint64_t *_start, uint64_t *_stop)
>  	pclose(file_start);
>  	pclose(file_stop);
>  }
> +#else /* MCOUNT_SORT_ENABLED */
> +static inline int parse_symbols(const char *fname) { return 0; }
>  #endif
> +
>  static int do_sort(Elf_Ehdr *ehdr,
>  		   char const *const fname,
>  		   table_sort_t custom_sort)
> @@ -936,14 +1048,29 @@ int main(int argc, char *argv[])
>  	int i, n_error = 0;  /* gcc-4.3.0 false positive complaint */
>  	size_t size = 0;
>  	void *addr = NULL;
> +	int c;
> +
> +	while ((c = getopt(argc, argv, "s:")) >= 0) {
> +		switch (c) {
> +		case 's':
> +			if (parse_symbols(optarg) < 0) {
> +				fprintf(stderr, "Could not parse %s\n", optarg);
> +				return -1;
> +			}
> +			break;
> +		default:
> +			fprintf(stderr, "usage: sorttable [-s nm-file] vmlinux...\n");
> +			return 0;
> +		}
> +	}
>  
> -	if (argc < 2) {
> +	if ((argc - optind) < 1) {
>  		fprintf(stderr, "usage: sorttable vmlinux...\n");
>  		return 0;
>  	}
>  
>  	/* Process each file in turn, allowing deep failure. */
> -	for (i = 1; i < argc; i++) {
> +	for (i = optind; i < argc; i++) {
>  		addr = mmap_file(argv[i], &size);
>  		if (!addr) {
>  			++n_error;
> -- 
> 2.45.2
> 
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ