[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <TYAPR01MB63300E29EB4C3AF501E28D5F8B3F9@TYAPR01MB6330.jpnprd01.prod.outlook.com>
Date: Tue, 8 Nov 2022 08:32:10 +0000
From: "Shaopeng Tan (Fujitsu)" <tan.shaopeng@...itsu.com>
To: 'Reinette Chatre' <reinette.chatre@...el.com>,
Fenghua Yu <fenghua.yu@...el.com>,
Shuah Khan <shuah@...nel.org>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>
Subject: RE: [PATCH v3 4/5] selftests/resctrl: Cleanup properly when an error
occurs in CAT test
Hi Reinette and Shuah,
> On 11/2/2022 2:41 AM, Shuah Khan wrote:
> > On 11/1/22 03:43, Shaopeng Tan wrote:
> >> After creating a child process with fork() in CAT test, if there is
> >> an error occurs or such as a SIGINT signal is received, the parent
> >> process will be terminated immediately, but the child process will
> >> not be killed and also umount_resctrlfs() will not be called.
> >>
> >> Add a signal handler like other tests to kill child process, umount
> >> resctrlfs, cleanup result files, etc. when an error occurs.
> >>
> >> Signed-off-by: Shaopeng Tan <tan.shaopeng@...fujitsu.com>
> >> ---
> >> tools/testing/selftests/resctrl/cat_test.c | 28
> >> +++++++++++++++-------
> >> 1 file changed, 19 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/tools/testing/selftests/resctrl/cat_test.c
> >> b/tools/testing/selftests/resctrl/cat_test.c
> >> index 6a8306b0a109..5f81817f4366 100644
> >> --- a/tools/testing/selftests/resctrl/cat_test.c
> >> +++ b/tools/testing/selftests/resctrl/cat_test.c
> >> @@ -98,12 +98,21 @@ void cat_test_cleanup(void)
> >> remove(RESULT_FILE_NAME2);
> >> }
> >> +static void ctrl_handler(int signo)
> >> +{
> >> + kill(bm_pid, SIGKILL);
> >> + umount_resctrlfs();
> >> + tests_cleanup();
> >> + ksft_print_msg("Ending\n\n");
> >
> > Is there a reason to print this message? Remove it unless it serves a
> > purpose.
>
> This function appears to be a duplicate of existing resctrl_val.c:ctrlc_handler().
> Could the duplication be avoided instead of refining the copy?
Yes, it's a duplicate of existing resctrl_val.c:ctrlc_handler().
I will try to use resctrl_val.c:ctrlc_handler() in next version patch series.
> >> +
> >> + exit(EXIT_SUCCESS);
> >> +}
> >> +
> >> int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> >> {
> >> unsigned long l_mask, l_mask_1;
> >> int ret, pipefd[2], sibling_cpu_no;
> >> char pipe_message;
> >> - pid_t bm_pid;
> >
> > Odd. bm_pid is used below - why remove it here?
>
> There is a global bm_pid in resctrl_val.c that is made available via extern in
> resctrl.h. This is what causes this code to still compile but I would also like to
> learn why moving to that is desired as a change here. I expect such a big
> change to get a mention in the commit message.
Yes. I used global bm_pid, since bm_pid is used in ctrl_handler() before this function cat_perf_miss_val().
I will add a description to commit message.
> >> cache_size = 0;
> >> @@ -181,17 +190,19 @@ int cat_perf_miss_val(int cpu_no, int n, char
> >> *cache_type)
> >> strcpy(param.filename, RESULT_FILE_NAME1);
> >> param.num_of_runs = 0;
> >> param.cpu_no = sibling_cpu_no;
> >> + } else {
> >> + /* set up ctrl-c handler */
> >> + if (signal(SIGINT, ctrl_handler) == SIG_ERR ||
> >> + signal(SIGHUP, ctrl_handler) == SIG_ERR ||
> >> + signal(SIGTERM, ctrl_handler) == SIG_ERR)
> >> + printf("Failed to catch SIGNAL!\n");
> >
> > Is perror() more appropriate here?
>
> Should we be using signal() at all? "man signal" reads:
> "WARNING: the behavior of signal() varies across UNIX versions, and has also
> varied historically across different versions of Linux.
> Avoid its use: use sigaction(2) instead."
>
> "Failed to catch SIGNAL" also seems unclear to me. This is where a signal
> handler is set up, the signal for which the handler is installed has not arrived.
I will use sigaction(2) and perror().
> >> }
> >> remove(param.filename);
> >> ret = cat_val(¶m);
> >> - if (ret)
> >> - return ret;
> >> -
> >> - ret = check_results(¶m);
> >> - if (ret)
> >> - return ret;
> >> + if (ret == 0)
> >> + ret = check_results(¶m);
> >
> > Why not use a goto in error case to do umount_resctrlfs() instead of
> > changing the conditionals?
>
> My understanding is the code that follows is needed to synchronize the exits
> between the parent and child. It is the parent that will run umount_resctrlfs()
> and it should only do so after the child is done. A goto by the parent may thus
> cause
> umount_resctrlfs() to be run while the child still relies on it while a goto by the
> child may cause the parent not to receive the message that the child is
> complete.
Yes, the code that follows is needed to synchronize the exits between the parent and child.
> > @@ -201,7 +212,6 @@ int cat_perf_miss_val(int cpu_no, int n, char
> *cache_type)
> > sizeof(pipe_message)) {
> > close(pipefd[1]);
> > perror("# failed signaling parent process");
> > - return errno;
>
> It looks like pipefd[1] will be closed twice if the write() failed.
This close(pipefd[1]); should also be removed.
> It does look strange to let the child continue to its infinite loop after the write()
> failed. I assume that it is because the parent will also be stuck and the new
> ctrl_handler() is expected to unblock both?
If a SIGNAL is received, ctrl_handler() will be called to kill the child process and exit parent process.
If no SIGNAL is received, the parent process will kill the child process. (by else{kill(bm_pid, SIGKILL);})
> Could you please add a comment to the code to clarify this flow?
I will add comments here.
Best regards,
Shaopeng Tan
Powered by blists - more mailing lists