lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YxEof73XWYc8pYdZ@kernel.org>
Date:   Fri, 2 Sep 2022 00:47:43 +0300
From:   "jarkko@...nel.org" <jarkko@...nel.org>
To:     "Huang, Kai" <kai.huang@...el.com>
Cc:     "linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>,
        "Chatre, Reinette" <reinette.chatre@...el.com>,
        "pmenzel@...gen.mpg.de" <pmenzel@...gen.mpg.de>,
        "bp@...en8.de" <bp@...en8.de>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "Dhanraj, Vijay" <vijay.dhanraj@...el.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "hpa@...or.com" <hpa@...or.com>,
        "haitao.huang@...ux.intel.com" <haitao.huang@...ux.intel.com>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 2/6] x86/sgx: Do not consider unsanitized pages an
 error

On Thu, Sep 01, 2022 at 10:50:07AM +0000, Huang, Kai wrote:
> On Wed, 2022-08-31 at 13:39 -0700, Reinette Chatre wrote:
> > >   static int ksgxd(void *p)
> > >   {
> > > +	long ret;
> > > +
> > >   	set_freezable();
> > >   
> > >   	/*
> > >   	 * Sanitize pages in order to recover from kexec(). The 2nd pass is
> > >   	 * required for SECS pages, whose child pages blocked EREMOVE.
> > >   	 */
> > > -	__sgx_sanitize_pages(&sgx_dirty_page_list);
> > > -	__sgx_sanitize_pages(&sgx_dirty_page_list);
> > > +	ret = __sgx_sanitize_pages(&sgx_dirty_page_list);
> > > +	if (ret == -ECANCELED)
> > > +		/* kthread stopped */
> > > +		return 0;
> > >   
> > > -	/* sanity check: */
> > > -	WARN_ON(!list_empty(&sgx_dirty_page_list));
> > > +	ret = __sgx_sanitize_pages(&sgx_dirty_page_list);
> > > +	switch (ret) {
> > > +	case 0:
> > > +		/* success, no unsanitized pages */
> > > +		break;
> > > +
> > > +	case -ECANCELED:
> > > +		/* kthread stopped */
> > > +		return 0;
> > > +
> > > +	default:
> > > +		/*
> > > +		 * Never expected to happen in a working driver. If it
> > > happens
> > > +		 * the bug is expected to be in the sanitization process,
> > > but
> > > +		 * successfully sanitized pages are still valid and driver
> > > can
> > > +		 * be used and most importantly debugged without issues. To
> > > put
> > > +		 * short, the global state of kernel is not corrupted so no
> > > +		 * reason to do any more complicated rollback.
> > > +		 */
> > > +		pr_err("%ld unsanitized pages\n", ret);
> > > +	}
> > >   
> > >   	while (!kthread_should_stop()) {
> > >   		if (try_to_freeze())
> > 
> > 
> > I think I am missing something here. A lot of logic is added here but I
> > do not see why it is necessary.  ksgxd() knows via kthread_should_stop() if
> > the reclaimer was canceled. I am thus wondering, could the above not be
> > simplified to something similar to V1:
> > 
> > @@ -388,6 +393,8 @@ void sgx_reclaim_direct(void)
> >  
> >  static int ksgxd(void *p)
> >  {
> > +	unsigned long left_dirty;
> > +
> >  	set_freezable();
> >  
> >  	/*
> > @@ -395,10 +402,10 @@ static int ksgxd(void *p)
> >  	 * required for SECS pages, whose child pages blocked EREMOVE.
> >  	 */
> >  	__sgx_sanitize_pages(&sgx_dirty_page_list);
> > -	__sgx_sanitize_pages(&sgx_dirty_page_list);
> >  
> > -	/* sanity check: */
> > -	WARN_ON(!list_empty(&sgx_dirty_page_list));
> > +	left_dirty = __sgx_sanitize_pages(&sgx_dirty_page_list);
> > +	if (left_dirty && !kthread_should_stop())
> > +		pr_err("%lu unsanitized pages\n", left_dirty);
> >  
> 
> This basically means driver bug if I understand correctly.  To be consistent
> with the behaviour of existing code, how about just WARN()?
> 	
> 	...
> 	left_dirty = __sgx_sanitize_pages(&sgx_dirty_page_list);
> 	WARN_ON(left_dirty && !kthread_should_stop());
> 
> It seems there's little value to print out the unsanitized pages here.  The
> existing code doesn't print it anyway.

Using WARN IMHO here is too strong measure, given that
it tear down the whole kernel, if panic_on_warn is enabled.

For debugging, any information is useful information, so
would not make sense not print the number of pages, if 
that is available. That could very well point out the
issue why all pages are not sanitized if there was a bug.

BR, Jarkko

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ