Date:	Mon, 14 Jul 2008 17:27:41 +1000
From:	Alex Samad <alex@...ad.com.au>
To:	netdev@...r.kernel.org
Subject: Re: [alex@...ad.com.au: Re: Page swap allocation failure 2.6.25]

Hi


Sorry if I am breaking any rules by replying to my own email from my
sent items; I haven't received my copy from the list yet.


I think I have tracked down the problem. The key word in the original
reply was MTU.

So I changed the sysctl values back to their defaults and fixed up the
2 boxes with the forcedeth driver.

I went over the nas box with a fine-tooth comb. What I found was that
I was setting it up for jumbo frames, but because I have a mixed
environment (some machines can accept large frames, some can't) I have
a script that sets an individual MTU for each machine.

The MTU for the NIC was being set to 6144 (the driver had limited it;
the original request was for 9100), but the routes were still in the
routing table with a 9100 MTU. Since fixing that up I don't seem to be
having the problem. I could reproduce the problem by scp'ing a 2G file
between the boxes; I have now done that 5 times with no crash.
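
The mismatch described above (the NIC clamped to 6144 while routes still
carried an MTU of 9100) can be checked mechanically. What follows is a
minimal sketch, not the script used on these boxes. It assumes route
entries in the text form printed by `ip route show`, where a per-route
MTU appears as "mtu N" or "mtu lock N". The function name, the sample
addresses and the interface name eth0 are all hypothetical.

```python
def routes_exceeding_mtu(route_lines, iface_mtu):
    """Return (destination, device, route_mtu) for every route whose
    per-route MTU is larger than the MTU of the device it uses.

    route_lines: lines in the style of `ip route show` output (assumed format)
    iface_mtu:   dict mapping interface name -> MTU actually set on the NIC
    """
    mismatches = []
    for line in route_lines:
        fields = line.split()
        # Only routes that name a device and carry an explicit MTU matter here.
        if "dev" not in fields or "mtu" not in fields:
            continue
        dev_pos = fields.index("dev") + 1
        mtu_pos = fields.index("mtu") + 1
        # A locked route MTU is printed as "mtu lock N".
        if mtu_pos < len(fields) and fields[mtu_pos] == "lock":
            mtu_pos += 1
        if dev_pos >= len(fields) or mtu_pos >= len(fields):
            continue
        if not fields[mtu_pos].isdigit():
            continue
        dev = fields[dev_pos]
        route_mtu = int(fields[mtu_pos])
        if dev in iface_mtu and route_mtu > iface_mtu[dev]:
            mismatches.append((fields[0], dev, route_mtu))
    return mismatches


# Hypothetical sample data: eth0 was limited to 6144 by the driver,
# but one host route still asks for 9100.
sample_routes = [
    "192.168.1.20 dev eth0 scope link mtu 9100",
    "192.168.1.30 dev eth0 scope link mtu 1500",
    "default via 192.168.1.1 dev eth0",
]
print(routes_exceeding_mtu(sample_routes, {"eth0": 6144}))
```

Any route this reports is one where the stack may be told to build
packets larger than the interface was actually configured for, which is
the situation that was fixed by hand here.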

If anyone wants to follow this up, please email me directly, as I will
unsubscribe from the list now.

Alex

On Mon, Jul 14, 2008 at 01:56:06PM +1000, Alex Samad wrote:
> Hi
> 
> I am forwarding this again to the netdev mailing list, seems like my
> original mail did not make it to the list (checked www archive).
> 
> Alex
> ----- Forwarded message from Alex Samad <alex@...ad.com.au> -----
> 
> From: Alex Samad <alex@...ad.com.au>
> To: Francois Romieu <romieu@...zoreil.com>, linux-kernel@...r.kernel.org,
> 	Edward Hsu <edward_hsu@...ltek.com.tw>
> Cc: netdev@...r.kernel.org
> Subject: Re: Page swap allocation failure 2.6.25
> Mail-Followup-To: Francois Romieu <romieu@...zoreil.com>,
> 	linux-kernel@...r.kernel.org,
> 	Edward Hsu <edward_hsu@...ltek.com.tw>, netdev@...r.kernel.org
> List-ID: <linux-kernel.vger.kernel.org>
> 
> On Sun, Jul 13, 2008 at 09:49:44PM +1000, Alex Samad wrote:
> > On Sun, Jul 13, 2008 at 01:02:22PM +0200, Francois Romieu wrote:
> > > Alex Samad <alex@...ad.com.au> :
> > > [...]
> > > > For a while now I have been receiving page swap allocation failures
> > > > 
> > > > 
> > > > Similar to http://lkml.org/lkml/2008/6/10/3 and
> > > 
> > > > Order 0 failure. Yours is an order 2 one.
> > > 
> > > > http://lkml.org/lkml/2008/2/19/298
> > > 
> > > Order 3 failure which was fixed with the e1000e driver.
> > 
> > 
> > not sure about these, I will take your word for it.
> > 
> > > 
> > > > and I have filed a bug with debian (Bug#486300)
> > > > 
> > > > 
> > > > > It seems like any time I put the system under load, transferring large
> > > > > files across the network  (1G nic, a r8168 and forcedeth and a
> > > > > broadcom), I keep getting these errors
> > > 
> > > > May I assume that you are working with an MTU greater than 1500 bytes on
> > > > each interface ? If so please add netdev@...r.kernel.org to the Cc: and
> > > > remove linux-kernel@ from the Cc:.
> > 
> > > I have 3 boxes; 2 are set up with > 1500 mtu and 1 isn't (the one with
> > > the r8168 driver). I have tested with >1500 mtu and with mtu = 1500 with
> > > the same result.
> > 
> > > 
> > > [...]
> > > > Jul 13 13:28:30 nas kernel: [  648.120756]  [<ffffffff881b525f>]
> > > > :r8168:rtl8168_rx_fill+0x64/0x106
> > > 
> > > It looks more like Realtek's out-of-tree driver than like the in-kernel
> > > one. Is it a customised kernel ?
> > The kernel is a stock debian amd64 kernel, not customised by me.
> > 
> > I did build the r8168 from the realtek site.
> > 
> > bit more info on the setup
> > 
> > I have 2 laptops (both HP's), 1(A) running Vista 1(B) running Debian lenny/sid
> > (2.6.25). I have three servers 2 shuttles (forcedeth) (multimedia & hufpuf ) 1 gigabyte
> > (realtek) (nas).
> > 
> > > The nas box is the one whose syslog I copied the error from. It is
> > > primarily an nfs nas.  Hufpuf is the samba box; it used to be the nas
> > > box. It currently mounts a few (large) shares from nas.  Multimedia is a
> > > backup server.
> > 
> > A & B & NAS have 1500 MTU
> > 
> > multimedia and hufpuf can run with 9100 mtu
> > 
> > I have tried
> > > i) copying files from A to hufpuf (smb) which then sends it on to nas via
> > > nfs
> > ii) copy files from B to nas (nfs)
> > iii) scp from B to hufpuf and then on to nas via nfs
> > iv) scp from B to nas
> > v) scp from hufpuf to nas
> > vi) scp from hufpuf to multimedia
> > vii) scp from multimedia to nas
> > viii) hufpuf nfs to nas
> > ix) multimedia nfs to nas
> > 
> > all of these have caused these errors.
> > 
> > > When I was testing again today, I noticed while copying from A to
> > > hufpuf and then on to nas that smaller files, say < 200M, would go okay;
> > > with anything greater (or if the total of the files was greater) I would
> > > start to get the errors.
> >  
> 
> > I have done some more testing. I found that I had this line in my
> > sysctl.conf (a holdover from long ago)
> 
> net.ipv4.tcp_rmem = 4096        87380   2097152
> 
> this was in my 2 servers multimedia and hufpuf (forcedeth), I have
> removed these and gone back to defaults.
> 
> Running a quick test scp'ing from the nas box to multimedia and to
> hufpuf, doesn't cause any page faults, but scp to the nas box causes
> more page faults. I tried scping between multimedia and hufpuf with
> jumbo frames and that went all okay.
> 
> > So it looks like it might be the r8168 driver; that being the case I
> > will cc netdev@...r.kernel.org. I will leave linux-kernel still here for
> > a trial
> 
> thanks
> 
> 
> 
> > 
> > > 
> > > [...]
> > > > Help
> > > 
> > > Don't panic.
> > > not panicking yet but I am a bit concerned. The data seems to be okay
> > > even after these errors
> 
> thanks
> 
> > 
> > 
> > > 
> > > -- 
> > > Ueimor
> > > 
> > 
> > -- 
> > "You see, the Senate wants to take away some of the powers of the administrative branch."
> > 
> > 	- George W. Bush
> > 09/19/2002
> > Washington, DC
> 
> 
> 
> -- 
> "See, free nations are peaceful nations. Free nations don't attack each other. Free nations don't develop weapons of mass destruction. "
> 
> 	- George W. Bush
> 10/03/2003
> Milwaukee, WI
> 
> 
> 
> ----- End forwarded message -----
> 
> -- 
> "Border relations between Canada and Mexico have never been better."
> 
> 	- George W. Bush
> 09/24/2001
> in a press conference with Canadian Prime Minister Jean Chretien



-- 
<Culus> there is 150 meg in the /tmp dir! DEAR LORD
