Date:	Thu, 29 Sep 2011 17:58:07 -0700
From:	Simon Kirby <sim@...tway.ca>
To:	Trond Myklebust <Trond.Myklebust@...app.com>
Cc:	linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: NFS client growing system CPU

On Wed, Sep 28, 2011 at 12:58:35PM -0700, Simon Kirby wrote:

> On Tue, Sep 27, 2011 at 01:04:15PM -0400, Trond Myklebust wrote:
> 
> > On Tue, 2011-09-27 at 09:49 -0700, Simon Kirby wrote: 
> > > On Tue, Sep 27, 2011 at 07:42:53AM -0400, Trond Myklebust wrote:
> > > 
> > > > On Mon, 2011-09-26 at 17:39 -0700, Simon Kirby wrote: 
> > > > > Hello!
> > > > > 
> > > > > Following up on "System CPU increasing on idle 2.6.36", this issue is
> > > > > still happening even on 3.1-rc7. So, since it has been 9 months since I
> > > > > reported this, I figured I'd bisect this issue. The first bisection ended
> > > > > in an IPMI regression that looked like the problem, so I had to start
> > > > > again. Eventually, I got commit b80c3cb628f0ebc241b02e38dd028969fb8026a2
> > > > > which made it into 2.6.34-rc4.
> > > > > 
> > > > > With this commit, system CPU keeps rising as the log crunch box runs
> > > > > (reads log files via NFS and spews out HTML files into NFS-mounted report
> > > > > directories). When it finishes the daily run, the system time stays
> > > > > non-zero and continues to be higher and higher after each run, until the
> > > > > box never completes a run within a day due to all of the wasted cycles.
> > > > 
> > > > So reverting that commit fixes the problem on 3.1-rc7?
> > > > 
> > > > As far as I can see, doing so should be safe thanks to commit
> > > > 5547e8aac6f71505d621a612de2fca0dd988b439 (writeback: Update dirty flags
> > > > in two steps) which fixes the original problem at the VFS level.
> > > 
> > > Hmm, I went to git revert b80c3cb628f0ebc241b02e38dd028969fb8026a2, but
> > > for some reason git left the nfs_mark_request_dirty(req); line in
> > > nfs_writepage_setup(), even though the original commit had that. Is that
> > > OK or should I remove that as well?
> > > 
> > > Once that is sorted, I'll build it and let it run for a day and let you
> > > know. Thanks!
> > 
> > It shouldn't make any difference whether you leave it or remove it. The
> > resulting second call to __set_page_dirty_nobuffers() will always be a
> > no-op since the page will already be marked as dirty.
> 
> Ok, confirmed, git revert b80c3cb628f0ebc241b02e38dd028969fb8026a2 on
> 3.1-rc7 fixes the problem for me. Does this make sense, then, or do we
> need further investigation and/or testing?

Just to clarify what I said before: on plain 3.1-rc8, I am actually able
to clear the endless CPU use in nfs_writepages just by running "sync". I
am not sure when this changed, but I'm pretty sure that in some versions
between 2.6.34 and 3.1-rc, "sync" alone had no effect unless it was
paired with drop_caches. Maybe this makes the problem more obvious...
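
A minimal sketch of one way to check the effect of "sync" on system CPU,
assuming a Linux box with the usual /proc/stat layout (aggregate "cpu" line,
third value is system time). The helper names parse_cpu_line, system_share
and sample are hypothetical and only for illustration; nothing here was run
as part of this thread.

```python
import os
import time


def parse_cpu_line(stat_text):
    """Return (system_jiffies, total_jiffies) from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            # Fields: user nice system idle iowait irq softirq steal ...
            # Guest time is already counted in user, so stop at 8 fields.
            return values[2], sum(values[:8])
    raise ValueError("no aggregate cpu line found")


def system_share(before, after):
    """Fraction of elapsed CPU time spent in system mode between two samples."""
    d_sys = after[0] - before[0]
    d_total = after[1] - before[1]
    return d_sys / d_total if d_total > 0 else 0.0


def sample(interval=5.0):
    """Measure the system-time share over `interval` seconds."""
    with open("/proc/stat") as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open("/proc/stat") as f:
        after = parse_cpu_line(f.read())
    return system_share(before, after)


if __name__ == "__main__":
    print("system share before sync: %.3f" % sample())
    os.sync()  # same effect as running "sync" from a shell
    print("system share after sync:  %.3f" % sample())
```

If the share drops back toward zero after the sync on an otherwise idle
box, that would match the behaviour described above.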

Simon-