linux-kernel - Re: [PATCH v4 04/19] selftests/resctrl: Close perf value read fd on errors

Message-ID: <4531fca1-2f3b-0c08-351b-f8e06c5f9f5c@intel.com>
Date:   Fri, 14 Jul 2023 10:36:14 -0700
From:   Reinette Chatre <reinette.chatre@...el.com>
To:     Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
CC:     <linux-kselftest@...r.kernel.org>, Shuah Khan <shuah@...nel.org>,
        Shaopeng Tan <tan.shaopeng@...fujitsu.com>,
        Fenghua Yu <fenghua.yu@...el.com>,
        Babu Moger <babu.moger@....com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 04/19] selftests/resctrl: Close perf value read fd on
 errors

Hi Ilpo,

On 7/14/2023 3:35 AM, Ilpo Järvinen wrote:
> On Thu, 13 Jul 2023, Reinette Chatre wrote:
>> On 7/13/2023 6:19 AM, Ilpo Järvinen wrote:
>>> Perf event fd (fd_lm) is not closed on some error paths.
>>>
>>> Always close fd_lm in get_llc_perf() and add close into an error
>>> handling block in cat_val().
>>>
>>> Fixes: 790bf585b0ee ("selftests/resctrl: Add Cache Allocation Technology (CAT) selftest")
>>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
>>> ---
>>>  tools/testing/selftests/resctrl/cache.c | 10 +++++-----
>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
>>> index 8a4fe8693be6..ced47b445d1e 100644
>>> --- a/tools/testing/selftests/resctrl/cache.c
>>> +++ b/tools/testing/selftests/resctrl/cache.c
>>> @@ -87,21 +87,20 @@ static int reset_enable_llc_perf(pid_t pid, int cpu_no)
>>>  static int get_llc_perf(unsigned long *llc_perf_miss)
>>>  {
>>>  	__u64 total_misses;
>>> +	int ret;
>>>  
>>>  	/* Stop counters after one span to get miss rate */
>>>  
>>>  	ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
>>>  
>>> -	if (read(fd_lm, &rf_cqm, sizeof(struct read_format)) == -1) {
>>> +	ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
>>> +	close(fd_lm);
>>> +	if (ret == -1) {
>>>  		perror("Could not get llc misses through perf");
>>> -
>>>  		return -1;
>>>  	}
>>>  
>>>  	total_misses = rf_cqm.values[0].value;
>>> -
>>> -	close(fd_lm);
>>> -
>>>  	*llc_perf_miss = total_misses;
>>>  
>>>  	return 0;
>>> @@ -253,6 +252,7 @@ int cat_val(struct resctrl_val_param *param)
>>>  					 memflush, operation, resctrl_val)) {
>>>  				fprintf(stderr, "Error-running fill buffer\n");
>>>  				ret = -1;
>>> +				close(fd_lm);
>>>  				break;
>>>  			}
>>>  
>>
>> Instead of fixing these existing patterns I think it would make the code
>> easier to understand and maintain if it is made symmetrical.
>> Having the perf event fd opened in one place but its close()
>> scattered elsewhere has the potential for confusion and making later
>> mistakes easy to miss.
>>
>> What if perf event fd is closed in a new "disable_llc_perf()" that
>> is matched with "reset_enable_llc_perf()" and called
>> from cat_val()?
>>
>> I think this raises another issue with the test trickery where
>> measure_cache_vals() has some assumptions about state based on the
>> test name.
> 
> I very much agree on the principle here, and thus I already have created 
> patches which will do a major cleanup on this area. The cleaned-up code 
> has pe_fd local var to cat_val() and handles closing it in cat_val() with 
> the usual patterns.
> 
> However, the patch currently resides after the L3 CAT test rewrite.
> Backporting the cleanups/refactors into this series would require
> considerable effort because the n-step cleanup patches and the L3 CAT
> test rewrite are heavily intertwined in this area. There is a lot to
> clean up here, and the L3 rewrite touches the same areas, so it's a net
> full of conflicts.
> 
> Do you want me to spend the effort to backport them into this series
> (I expect it will take some time)?

Considering the "Fixes" tag, having a smaller fix that can easily
be backported would be ideal so I am ok with deferring a bigger
rework.

I do think this fix can be made more robust with a couple of small
changes that should not introduce significant conflicts:
* initialize fd_lm to -1
* do not close() fd_lm in get_llc_perf(); instead move its
  close() to the exit of cat_val()
* add a check in get_llc_perf() so that it does not attempt ioctl()
  on "fd_lm == -1" (a later addition would be error checking of
  the ioctl())
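
A minimal sketch of the three points above, not the actual selftest code. The names counter_fd, open_counter(), read_counter() and measure() are hypothetical stand-ins for fd_lm, reset_enable_llc_perf(), get_llc_perf() and cat_val(), and /dev/zero is assumed in place of perf_event_open() so that the example runs without perf privileges:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for the global perf event fd (fd_lm). */
static int counter_fd = -1;	/* -1 means "not open" */

/* Stand-in for reset_enable_llc_perf(): the only place the fd is opened. */
static int open_counter(void)
{
	/* Assumption: /dev/zero replaces perf_event_open() for illustration. */
	counter_fd = open("/dev/zero", O_RDONLY);
	if (counter_fd == -1) {
		perror("open");
		return -1;
	}
	return 0;
}

/* Stand-in for get_llc_perf(): reads the value but never closes the fd. */
static int read_counter(unsigned long *value)
{
	unsigned long raw;

	/* Refuse to touch an fd that was never opened. */
	if (counter_fd == -1)
		return -1;

	if (read(counter_fd, &raw, sizeof(raw)) != (ssize_t)sizeof(raw)) {
		perror("read");
		return -1;
	}
	*value = raw;
	return 0;
}

/* Stand-in for cat_val(): every path leaves through the single close. */
static int measure(unsigned long *value)
{
	int ret;

	ret = open_counter();
	if (ret)
		goto out;

	ret = read_counter(value);

out:
	if (counter_fd != -1) {
		close(counter_fd);
		counter_fd = -1;	/* a later stray use is now detectable */
	}
	return ret;
}
```

With this shape the read helper has no cleanup duty, so an error in it cannot leak the fd, and the caller closes it in one place regardless of which step failed.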

> I currently have these items pending besides this series (in order):
> - L3 CAT test rewrite and its preparatory patches
> - More cleanups (including the pe_fd cleanup)
> - New generalized test framework
> - L2 CAT test

Thank you very much for taking this on.

Reinette