Message-ID: <CAJfpegsr3fqcFuNekLwf69v3mpNJyze741=L5KUJjvH758eE6g@mail.gmail.com>
Date: Mon, 25 Apr 2022 10:37:58 +0200
From: Miklos Szeredi <miklos@...redi.hu>
To: Bernd Schubert <bernd.schubert@...tmail.fm>
Cc: Bernd Schubert <bschubert@....com>,
Linux-FSDevel <linux-fsdevel@...r.kernel.org>,
Linux Kernel <linux-kernel@...r.kernel.org>,
Dharmendra Singh <dsingh@....com>
Subject: Re: RFC fuse waitq latency
On Fri, 22 Apr 2022 at 17:46, Bernd Schubert <bernd.schubert@...tmail.fm> wrote:
>
> [I removed the failing netapp/zufs CCs]
>
> On 4/22/22 14:25, Miklos Szeredi wrote:
> > On Mon, 28 Mar 2022 at 15:21, Bernd Schubert <bschubert@....com> wrote:
> >>
> >> I would like to discuss the user thread wake up latency in
> >> fuse_dev_do_read(). Profiling fuse shows there is room for improvement
> >> regarding memory copies and splice. The basic profiling with flame graphs
> >> didn't reveal, though, why fuse is so much
> >> slower (with an overlay file system) than just accessing the underlying
> >> file system directly and also didn't reveal why a single threaded fuse
> >> uses less than 100% cpu, with the application on top of fuse also using
> >> less than 100% cpu (simple bonnie++ runs with 1B files).
> >> So I started to suspect the wait queues and indeed, keeping the thread
> >> that reads the fuse device for work running for some time gives quite
> >> some improvements.
> >
> > Might be related: I experimented with wake_up_sync() that didn't meet
> > my expectations. See this thread:
> >
> > https://lore.kernel.org/all/1638780405-38026-1-git-send-email-quic_pragalla@quicinc.com/#r
> >
> > Possibly fuse needs some wake up tweaks due to its special scheduling
> > requirements.
>
> Thanks, I will look at that as well. I have a patch with spinning and
> avoidance of thread wake-ups that is almost complete. In my (still
> limited) testing it takes almost no extra CPU and improves metadata /
> bonnie performance by a factor of between ~1.9 and 3, depending on which
> performance mode the CPU is in.
>
> https://github.com/aakefbs/linux/commits/v5.17-fuse-scheduling3
>
> Missing is just another option for wake-queue-size trigger and handling
> of signals. Should be ready once I'm done with my other work.
Trying to understand what is being optimized here... does the
following correctly describe your use case?
- an I/O thread is submitting synchronous requests (direct I/O?)
- the fuse thread always goes to sleep, because the request queue is
empty (there's always a single request on the queue)
- with this change the fuse thread spins for a jiffy before going to
sleep, and by that time the I/O thread will submit a new sync request.
- the I/O thread does not spin while the fuse thread is processing
the request, so it still goes to sleep.
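
To make the pattern in the list above concrete, here is a rough userspace
sketch of "spin briefly, then sleep" on a request queue. It is not taken from
the patch and is not fuse kernel code: struct req_queue, queue_submit(),
queue_wait() and the spin_ns budget are all hypothetical names, and pthreads
stand in for the kernel wait queue. The spin budget plays the role of the
jiffy mentioned above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the fuse request queue. */
struct req_queue {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             pending;   /* number of queued requests */
};

/* Nanoseconds elapsed since *start on the monotonic clock. */
static long long elapsed_ns(const struct timespec *start)
{
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    return (long long)(now.tv_sec - start->tv_sec) * 1000000000LL +
           (now.tv_nsec - start->tv_nsec);
}

/* Submitter side: queue one request and wake a sleeping consumer. */
static void queue_submit(struct req_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->pending++;
    pthread_cond_signal(&q->cond);
    pthread_mutex_unlock(&q->lock);
}

/*
 * Consumer side: poll the queue for up to spin_ns nanoseconds, and only
 * block on the condition variable if nothing arrived in that window.
 * Returns true if the request was picked up while spinning (no sleep),
 * false if the caller had to sleep and be woken.
 */
static bool queue_wait(struct req_queue *q, long long spin_ns)
{
    struct timespec start;

    assert(spin_ns >= 0);
    clock_gettime(CLOCK_MONOTONIC, &start);

    /* Spin phase: the queue is checked at least once. */
    do {
        pthread_mutex_lock(&q->lock);
        if (q->pending > 0) {
            q->pending--;
            pthread_mutex_unlock(&q->lock);
            return true;
        }
        pthread_mutex_unlock(&q->lock);
    } while (elapsed_ns(&start) < spin_ns);

    /* Sleep phase: the usual blocking wait. */
    pthread_mutex_lock(&q->lock);
    while (q->pending == 0)
        pthread_cond_wait(&q->cond, &q->lock);
    q->pending--;
    pthread_mutex_unlock(&q->lock);
    return false;
}
```

If the submitter queues its next request within the spin window,
queue_wait() returns true and the consumer never sleeps; otherwise it falls
back to the normal blocking wait. The submitter in this sketch never spins,
which matches the last point in the list.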
Thanks,
Miklos