Message-ID: <YxXFGLSmRri2T1yb@kernel.org>
Date: Mon, 5 Sep 2022 12:44:56 +0300
From: "jarkko@...nel.org" <jarkko@...nel.org>
To: "Huang, Kai" <kai.huang@...el.com>
Cc: "linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>,
"pmenzel@...gen.mpg.de" <pmenzel@...gen.mpg.de>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"bp@...en8.de" <bp@...en8.de>,
"Dhanraj, Vijay" <vijay.dhanraj@...el.com>,
"Chatre, Reinette" <reinette.chatre@...el.com>,
"mingo@...hat.com" <mingo@...hat.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"x86@...nel.org" <x86@...nel.org>,
"haitao.huang@...ux.intel.com" <haitao.huang@...ux.intel.com>,
"stable@...r.kernel.org" <stable@...r.kernel.org>,
"hpa@...or.com" <hpa@...or.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] x86/sgx: Do not fail on incomplete sanitization on
premature stop of ksgxd
On Mon, Sep 05, 2022 at 07:50:33AM +0000, Huang, Kai wrote:
> On Sat, 2022-09-03 at 13:26 +0300, Jarkko Sakkinen wrote:
> > > static int ksgxd(void *p)
> > > {
> > > + unsigned long left_dirty;
> > > +
> > > set_freezable();
> > >
> > > /*
> > > * Sanitize pages in order to recover from kexec(). The 2nd pass is
> > > * required for SECS pages, whose child pages blocked EREMOVE.
> > > */
> > > - __sgx_sanitize_pages(&sgx_dirty_page_list);
> > > - __sgx_sanitize_pages(&sgx_dirty_page_list);
> > > + left_dirty = __sgx_sanitize_pages(&sgx_dirty_page_list);
> > > + pr_debug("%ld unsanitized pages\n", left_dirty);
> > %lu
> >
>
> I assume the intention is to print out the unsanitized SECS pages, but what is
> the value of printing it? To me it doesn't provide any useful information, even
> for debug.

How do you measure "useful"?

If for some reason there were unsanitized pages, I would at least
want to know how many were left after the first pass.

Plus it does zero harm unless you explicitly turn it on.
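
To make the two-pass point concrete, here is a minimal userspace sketch in C.
It is only a model under stated assumptions, not the kernel code: struct
model_page, model_sanitize_pass() and model_ksgxd_sanitize() are hypothetical
names, a page is assumed removable only once none of its children are dirty,
and fprintf() stands in for pr_debug()/pr_err(). The count is an unsigned long,
hence %lu.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for an EPC page; not the kernel's struct. */
struct model_page {
	bool dirty;	/* still needs sanitizing */
	int parent;	/* index of the owning SECS page, or -1 if none */
};

/* A page cannot be removed while a dirty page still names it as parent. */
static bool has_dirty_child(const struct model_page *pages, size_t n, size_t idx)
{
	for (size_t i = 0; i < n; i++)
		if (pages[i].dirty && pages[i].parent == (int)idx)
			return true;
	return false;
}

/* One pass over the list; returns how many pages are still dirty. */
static unsigned long model_sanitize_pass(struct model_page *pages, size_t n)
{
	unsigned long left_dirty = 0;

	for (size_t i = 0; i < n; i++) {
		if (!pages[i].dirty)
			continue;
		if (has_dirty_child(pages, n, i)) {
			left_dirty++;	/* blocked, retry on the next pass */
			continue;
		}
		pages[i].dirty = false;
	}
	return left_dirty;
}

/* Two passes, reporting the count after each one. */
static unsigned long model_ksgxd_sanitize(struct model_page *pages, size_t n)
{
	unsigned long left_dirty;

	left_dirty = model_sanitize_pass(pages, n);
	fprintf(stderr, "debug: %lu unsanitized pages\n", left_dirty);

	left_dirty = model_sanitize_pass(pages, n);
	if (left_dirty)
		fprintf(stderr, "error: %lu unsanitized pages\n", left_dirty);

	return left_dirty;
}
```

With a parent page listed ahead of its two children, the first pass leaves one
page dirty and the second pass leaves none, so the intermediate count shows
where the first pass ended.
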
> Besides, the first call of __sgx_sanitize_pages() can return 0, either because
> kthread_should_stop() is true or because all EPC pages were EREMOVEd
> successfully. So in this case the kernel will print "0 unsanitized pages\n",
> which doesn't make a lot of sense?
>
> > >
> > > - /* sanity check: */
> > > - WARN_ON(!list_empty(&sgx_dirty_page_list));
> > > + left_dirty = __sgx_sanitize_pages(&sgx_dirty_page_list);
> > > + /*
> > > > + * Never expected to happen in a working driver. If it happens the bug
> > > > + * is expected to be in the sanitization process, but successfully
> > > > + * sanitized pages are still valid and driver can be used and most
> > > > + * importantly debugged without issues. To put short, the global state
> > > > + * of kernel is not corrupted so no reason to do any more complicated
> > > > + * rollback.
> > > + */
> > > + if (left_dirty)
> > > + pr_err("%ld unsanitized pages\n", left_dirty);
> > %lu
>
> No strong opinion, but IMHO we can still just WARN() when it is a driver bug:
>
> 1) There's no guarantee the driver can continue to work if it has a bug;
>
> 2) It is fine that WARN() can panic() the kernel if
> /proc/sys/kernel/panic_on_warn is set. It's expected behaviour. If I
> understand correctly, there are many places in the kernel that use WARN() to
> catch bugs.
>
> In fact, we can even view WARN() as an advantage. For instance, if the existing
> code had only printed "xx unsanitized pages", people might not even have
> noticed this bug.
>
> From this perspective, if you want to print a message, I think you may want to
> make it more visible, so that people know it's a driver bug. Perhaps
> something like "The driver has bug, please report to kernel community..", etc.
>
> 3) Changing WARN() to pr_err() conceptually isn't mandatory to fix this
> particular bug. So, it's kinda mixing things together.
>
> But again, no strong opinion here.
>
> --
> Thanks,
> -Kai
>
>
BR, Jarkko