Date:	Fri, 18 Dec 2009 20:44:03 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Steve Rago <sar@...-labs.com>
Cc:	linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org,
	Trond.Myklebust@...app.com, Wu Fengguang <fengguang.wu@...el.com>,
	"jens.axboe" <jens.axboe@...cle.com>
Subject: Re: [PATCH] improve the performance of large sequential write NFS
 workloads

On Fri, 2009-12-18 at 14:33 -0500, Steve Rago wrote:
> > > +void nfs_wait_woutstanding(struct inode *inode)
> > > +{
> > > +   if (nfs_max_woutstanding != 0) {
> > > +           unsigned long background_thresh;
> > > +           unsigned long dirty_thresh;
> > > +           long npages;
> > > +           struct nfs_inode *nfsi = NFS_I(inode);
> > > +
> > > +           get_dirty_limits(&background_thresh, &dirty_thresh, NULL, NULL);
> > > +           npages = global_page_state(NR_FILE_DIRTY) +
> > > +                    global_page_state(NR_UNSTABLE_NFS) +
> > > +                    global_page_state(NR_WRITEBACK);
> > > +           if (npages >= background_thresh)
> > > +                   wait_event(nfsi->writes_wq,
> > > +                      atomic_read(&nfsi->writes) < nfs_max_woutstanding);
> > > +   }
> > > +}
> > 
> > This looks utterly busted too, why the global state and not the nfs
> > client's bdi state?
> > 
> > Also, why create this extra workqueue and not simply use the congestion
> > interface that is present? If the congestion stuff doesn't work for you,
> > fix that, don't add extra muck like this.
> 
> Pages are a global resource.  Once we hit the dirty_threshold, the
> system is going to work harder to flush the pages out.  This code
> prevents the caller from creating more dirty pages in this state,
> thereby making matters worse, when eager writeback is enabled.

You misunderstand: dirty limits are per BDI, and all those npages
might be for !NFS traffic, in which case forcing NFS into sync mode
might be the wrong thing to do.

The dirty pages are no longer a global resource in the current Linux
tree.
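
A per-BDI variant of that check would look roughly like the below.
Untested sketch only: it reuses the writes/writes_wq/nfs_max_woutstanding
bits from your patch, assumes the server's bdi is reachable through
NFS_SERVER(inode)->backing_dev_info, and comparing against bdi_thresh is
just one way of expressing "this bdi is over its share":

/* needs linux/writeback.h, linux/backing-dev.h, linux/nfs_fs.h */
void nfs_wait_woutstanding(struct inode *inode)
{
	unsigned long background_thresh;
	unsigned long dirty_thresh;
	unsigned long bdi_thresh;
	struct nfs_inode *nfsi = NFS_I(inode);
	struct backing_dev_info *bdi = &NFS_SERVER(inode)->backing_dev_info;
	long npages;

	if (nfs_max_woutstanding == 0)
		return;

	/* per-BDI limits instead of the global ones */
	get_dirty_limits(&background_thresh, &dirty_thresh,
			 &bdi_thresh, bdi);

	/* dirty, unstable and writeback pages backed by this bdi only */
	npages = bdi_stat(bdi, BDI_RECLAIMABLE) +
		 bdi_stat(bdi, BDI_WRITEBACK);

	if (npages >= bdi_thresh)
		wait_event(nfsi->writes_wq,
			   atomic_read(&nfsi->writes) < nfs_max_woutstanding);
}

That way pages dirtied against other devices don't throttle NFS writers.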

> This wait queue is used for different purposes than the congestion_wait
> interface.  Here we are preventing the caller from proceeding if there
> are too many NFS writes outstanding for this thread and we are in a
> memory pressure state.  It has nothing to do with the state of the bdi
> congestion. 

I'm thinking it ought to; congestion is exactly that: the device
gets backed up and needs to get moving.
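
The NFS write path already marks the bdi congested once more than
nfs_congestion_kb worth of writes are in flight, so the throttle could
be as simple as the below (untested sketch, helper name made up):

/* needs linux/backing-dev.h, linux/nfs_fs.h */
static void nfs_throttle_on_congestion(struct inode *inode)
{
	struct backing_dev_info *bdi = &NFS_SERVER(inode)->backing_dev_info;

	/*
	 * nfs_set_page_writeback() sets this bdi congested once too many
	 * writes are outstanding and nfs_end_page_writeback() clears it
	 * again; just back off on that instead of a private wait queue.
	 */
	while (bdi_write_congested(bdi))
		congestion_wait(BLK_RW_ASYNC, HZ/10);
}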

