linux-kernel - Re: [PATCH v4 04/19] selftests/resctrl: Close perf value read fd on errors

Message-ID: <8b904781-c5d-8164-b8dc-903d412330fd@linux.intel.com>
Date:   Mon, 17 Jul 2023 16:05:43 +0300 (EEST)
From:   Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To:     Reinette Chatre <reinette.chatre@...el.com>
cc:     linux-kselftest@...r.kernel.org, Shuah Khan <shuah@...nel.org>,
        Shaopeng Tan <tan.shaopeng@...fujitsu.com>,
        Fenghua Yu <fenghua.yu@...el.com>,
        Babu Moger <babu.moger@....com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 04/19] selftests/resctrl: Close perf value read fd on
 errors

On Fri, 14 Jul 2023, Reinette Chatre wrote:
> On 7/14/2023 3:35 AM, Ilpo Järvinen wrote:
> > On Thu, 13 Jul 2023, Reinette Chatre wrote:
> >> On 7/13/2023 6:19 AM, Ilpo Järvinen wrote:
> >>> Perf event fd (fd_lm) is not closed on some error paths.
> >>>
> >>> Always close fd_lm in get_llc_perf() and add close into an error
> >>> handling block in cat_val().
> >>>
> >>> Fixes: 790bf585b0ee ("selftests/resctrl: Add Cache Allocation Technology (CAT) selftest")
> >>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
> >>> ---
> >>>  tools/testing/selftests/resctrl/cache.c | 10 +++++-----
> >>>  1 file changed, 5 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
> >>> index 8a4fe8693be6..ced47b445d1e 100644
> >>> --- a/tools/testing/selftests/resctrl/cache.c
> >>> +++ b/tools/testing/selftests/resctrl/cache.c
> >>> @@ -87,21 +87,20 @@ static int reset_enable_llc_perf(pid_t pid, int cpu_no)
> >>>  static int get_llc_perf(unsigned long *llc_perf_miss)
> >>>  {
> >>>  	__u64 total_misses;
> >>> +	int ret;
> >>>  
> >>>  	/* Stop counters after one span to get miss rate */
> >>>  
> >>>  	ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
> >>>  
> >>> -	if (read(fd_lm, &rf_cqm, sizeof(struct read_format)) == -1) {
> >>> +	ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
> >>> +	close(fd_lm);
> >>> +	if (ret == -1) {
> >>>  		perror("Could not get llc misses through perf");
> >>> -
> >>>  		return -1;
> >>>  	}
> >>>  
> >>>  	total_misses = rf_cqm.values[0].value;
> >>> -
> >>> -	close(fd_lm);
> >>> -
> >>>  	*llc_perf_miss = total_misses;
> >>>  
> >>>  	return 0;
> >>> @@ -253,6 +252,7 @@ int cat_val(struct resctrl_val_param *param)
> >>>  					 memflush, operation, resctrl_val)) {
> >>>  				fprintf(stderr, "Error-running fill buffer\n");
> >>>  				ret = -1;
> >>> +				close(fd_lm);
> >>>  				break;
> >>>  			}
> >>>  
> >>
> >> Instead of fixing these existing patterns I think it would make the code
> >> easier to understand and maintain if it is made symmetrical.
> >> Having the perf event fd opened in one place but its close()
> >> scattered elsewhere has the potential for confusion and making later
> >> mistakes easy to miss.
> >>
> >> What if perf event fd is closed in a new "disable_llc_perf()" that
> >> is matched with "reset_enable_llc_perf()" and called
> >> from cat_val()?
> >>
> >> I think this raises another issue with the test trickery where
> >> measure_cache_vals() has some assumptions about state based on the
> >> test name.
> > 
> > I very much agree on the principle here, and thus I already have created 
> > patches which will do a major cleanup on this area. The cleaned-up code 
> > has pe_fd local var to cat_val() and handles closing it in cat_val() with 
> > the usual patterns.
> > 
> > However, the patch currently resides after the L3 CAT test rewrite.
> > Backporting the cleanups/refactors into this series would require
> > considerable effort due to how convoluted all those n-step cleanup patches
> > and the L3 CAT test rewrite are in this area. There's just so much to
> > clean up here, and the L3 rewrite will touch the same areas, so it's a net
> > full of conflicts.
> > 
> > Do you want me to spend the effort to backport them into this series
> > (I expect it will take some time)?
> 
> Considering the "Fixes" tag, having a smaller fix that can easily
> be backported would be ideal so I am ok with deferring a bigger
> rework.
> 
> I do think this fix can be made more robust with a couple of small
> changes that should not introduce significant conflicts:
> * initialize fd_lm to -1 

> * do not close() fd_lm in get_llc_perf() but instead move its
>   close() to the exit of cat_val().

I changed the test to only close the fd in cat_val(), which is the
direction the later refactor/cleanup changes (not in this series) were
moving anyway.

> * add check in get_llc_perf() that it does not attempt ioctl()
>   on "fd_lm == -1" (later addition would be error checking of
>   the ioctl())

The other two suggestions seem unnecessary, so I've not implemented
them. I don't think fd_lm can be -1 at ioctl(). Given this code is going
to be replaced soonish, putting any extra "safety" effort into it now
seems a waste of time.
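
For illustration only, the "open and close in one owner" shape discussed
above could look roughly like the sketch below. This is not the actual
patch: demo_fd, demo_read_value() and demo_run() are hypothetical names,
and an ordinary file stands in for the perf event fd so the sketch does
not depend on perf_event_open() being permitted.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for fd_lm; -1 means "not open". */
static int demo_fd = -1;

/* Reads one value. Deliberately does not close demo_fd on any path. */
static int demo_read_value(unsigned char *val)
{
	ssize_t ret = read(demo_fd, val, sizeof(*val));

	if (ret != (ssize_t)sizeof(*val)) {
		fprintf(stderr, "Could not read value\n");
		return -1;
	}

	return 0;
}

/* Owner of the fd: opens it and closes it at its single exit point. */
static int demo_run(const char *path, int iterations)
{
	unsigned char val;
	int ret = 0;
	int i;

	/* A previous run must not have leaked its fd. */
	assert(demo_fd == -1);

	demo_fd = open(path, O_RDONLY);
	if (demo_fd == -1) {
		perror("open");
		return -1;
	}

	for (i = 0; i < iterations; i++) {
		if (demo_read_value(&val)) {
			ret = -1;
			break;	/* falls through to the common close() */
		}
	}

	close(demo_fd);
	demo_fd = -1;

	return ret;
}
```

With this shape, an error in the helper only needs to break out of the
loop. The close() is reached on both the success and the failure path, so
no error branch has to remember it.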

-- 
 i.