Message-ID: <20080618210947.2110a541@tleilax.poochiereds.net>
Date: Wed, 18 Jun 2008 21:09:47 -0400
From: Jeff Layton <jlayton@...hat.com>
To: NeilBrown <neilb@...e.de>
Cc: "J. Bruce Fields" <bfields@...ldses.org>,
linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH - take 2] knfsd: nfsd: Handle ERESTARTSYS from syscalls.
On Thu, 19 Jun 2008 10:11:09 +1000
NeilBrown <neilb@...e.de> wrote:
>
> OCFS2 can return -ERESTARTSYS from write requests (and possibly
> elsewhere) if there is a signal pending.
>
> If nfsd is shutdown (by sending a signal to each thread) while there
> is still an IO load from the client, each thread could handle one last
> request with a signal pending. This can result in -ERESTARTSYS
> which is not understood by nfserrno() and so is reflected back to
> the client as nfserr_io aka -EIO. This is wrong.
>
> Instead, interpret ERESTARTSYS to mean "try again later" by returning
> nfserr_jukebox. The client will resend and - if the server is
> restarted - the write will (hopefully) be successful and everyone will
> be happy.
>
> The symptom that I narrowed down to this was:
> copy a large file via NFS to an OCFS2 filesystem, and restart
> the nfs server during the copy.
> The 'cp' might get an -EIO, and the file will be corrupted -
> presumably holes in the middle where writes appeared to fail.
>
>
> Signed-off-by: Neil Brown <neilb@...e.de>
>
> ### Diffstat output
> ./fs/nfsd/nfsproc.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff .prev/fs/nfsd/nfsproc.c ./fs/nfsd/nfsproc.c
> --- .prev/fs/nfsd/nfsproc.c 2008-06-19 10:06:36.000000000 +1000
> +++ ./fs/nfsd/nfsproc.c 2008-06-19 10:07:58.000000000 +1000
> @@ -614,6 +614,7 @@ nfserrno (int errno)
> #endif
> { nfserr_stale, -ESTALE },
> { nfserr_jukebox, -ETIMEDOUT },
> + { nfserr_jukebox, -ERESTARTSYS },
> { nfserr_dropit, -EAGAIN },
> { nfserr_dropit, -ENOMEM },
> { nfserr_badname, -ESRCH },
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
No objection to the patch, but what signal was being sent to nfsd when
you saw this? If it's anything other than SIGKILL, then I wonder if we
have a race that we need to deal with. My understanding is that nfsd
flips between two sigmasks to prevent anything but SIGKILL from being
delivered while we're handling the local filesystem operation.
From nfsd():
----------[snip]-----------
	sigprocmask(SIG_SETMASK, &shutdown_mask, NULL);

	/*
	 * Find a socket with data available and call its
	 * recvfrom routine.
	 */
	while ((err = svc_recv(rqstp, 60*60*HZ)) == -EAGAIN)
		;
	if (err < 0)
		break;
	update_thread_usage(atomic_read(&nfsd_busy));
	atomic_inc(&nfsd_busy);

	/* Lock the export hash tables for reading. */
	exp_readlock();

	/* Process request with signals blocked. */
	sigprocmask(SIG_SETMASK, &allowed_mask, NULL);
	svc_process(rqstp);
----------[snip]-----------
What happens if this catches a SIGINT after the err<0 check, but before
the mask is set to allowed_mask? Does svc_process() then get called with
a signal pending?
--
Jeff Layton <jlayton@...hat.com>