lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20111201160130.0803e3e4@tlielax.poochiereds.net>
Date:	Thu, 1 Dec 2011 16:01:30 -0500
From:	Jeff Layton <jlayton@...hat.com>
To:	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
Cc:	Woody Suwalski <terraluna977@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	"Rafael J. Wysocki" <rjw@...k.pl>,
	Linux PM mailing list <linux-pm@...r.kernel.org>,
	Tejun Heo <tj@...nel.org>,
	Belisko Marek <marek.belisko@...il.com>,
	linux-cifs@...r.kernel.org, awilliam@...hat.com
Subject: Re: CIFS mount: 3.2.0-rc3 suspend crash

On Fri, 02 Dec 2011 01:40:48 +0530
"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com> wrote:

> On 12/02/2011 12:29 AM, Jeff Layton wrote:
> 
> > On Thu, 01 Dec 2011 23:47:01 +0530
> > "Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com> wrote:
> > 
> >> On 12/01/2011 11:00 PM, Srivatsa S. Bhat wrote:
> >>
> >>> On 12/01/2011 10:41 PM, Srivatsa S. Bhat wrote:
> >>>
> >>>> On 12/01/2011 05:03 PM, Jeff Layton wrote:
> >>>>
> >>>>> On Wed, 30 Nov 2011 17:34:35 -0500
> >>>>> Woody Suwalski <terraluna977@...il.com> wrote:
> >>>>>
> >>>>>> Jeff Layton wrote:
> >>>>>>> On Wed, 30 Nov 2011 11:03:59 +0530
> >>>>>>> "Srivatsa S. Bhat"<srivatsa.bhat@...ux.vnet.ibm.com>  wrote:
> >>>>>>>
> >>>>>>>> Adding linux-pm, Jeff Layton and others to CC.
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Srivatsa S. Bhat
> >>>>>>>>
> >>>>>>>> On 11/30/2011 05:04 AM, Woody Suwalski wrote:
> >>>>>>>>
> >>>>>>>>> With CIFS mount present, suspend2ram seems to be failing on 3.2.0-rc3.
> >>>>>>>>> It may be similar to recent NFS suspend issues?
> >>>>>>>>>
> >>>>>>>>> I have done couple of suspends, then did mount.cifs (at 555 seconds),
> >>>>>>>>> and then tried suspend again.  The problem is easy to duplicate.
> >>>>>>>>>
> >>>>>>>>> The hardware is a netbook Asus EeePC.
> >>>>>>>>> I guess will find out if:
> >>>>>>>>> - cifs for real 3.2 fixes the problem
> >>>>>>>>> - NFS fix for -rc4 work for CIFS as well...
> >>>>>>>>>
> >>>>>>>>> Thanks, Woody
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> [  153.974532] uhci_hcd 0000:00:1d.1: PCI INT B disabled
> >>>>>>>>> [  153.974578] uhci_hcd 0000:00:1d.0: PCI INT A disabled
> >>>>>>>>> [  153.975774] i915 0000:00:02.0: PCI INT A disabled
> >>>>>>>>> [  153.977210] ACPI handle has no context!
> >>>>>>>>> [  153.977302] atl2 0000:03:00.0: PCI INT A disabled
> >>>>>>>>> [  153.977316] ACPI handle has no context!
> >>>>>>>>> [  154.277181] snd_hda_intel 0000:00:1b.0: PCI INT A disabled
> >>>>>>>>> [  154.288119] PM: suspend of devices complete after 414.210 msecs
> >>>>>>>>> [  154.299212] ehci_hcd 0000:00:1d.7: PME# enabled
> >>>>>>>>> [  154.299257] ehci_hcd 0000:00:1d.7: wake-up capability enabled by ACPI
> >>>>>>>>> [  154.310114] uhci_hcd 0000:00:1d.3: wake-up capability enabled by ACPI
> >>>>>>>>> [  154.310178] uhci_hcd 0000:00:1d.2: wake-up capability enabled by ACPI
> >>>>>>>>> [  154.310235] uhci_hcd 0000:00:1d.1: wake-up capability enabled by ACPI
> >>>>>>>>> [  154.310293] uhci_hcd 0000:00:1d.0: wake-up capability enabled by ACPI
> >>>>>>>>> [  154.310418] PM: late suspend of devices complete after 22.283 msecs
> >>>>>>>>> [  154.310451] ACPI: Preparing to enter system sleep state S3
> >>>>>>>>> [  154.311671] PM: Saving platform NVS memory
> >>>>>>>>> [  154.312968] Disabling non-boot CPUs ...
> >>>>>>>>> [  154.312968] ACPI: Low-level resume complete
> >>>>>>>>> [  154.312968] PM: Restoring platform NVS memory
> >>>>>>>>> [  154.312968] Force enabled HPET at resume
> >>>>>>>>> [  154.313366] ACPI: Waking up from system sleep state S3
> >>>>>>>>> [  154.341485] snd_hda_intel 0000:00:1b.0: restoring config space at
> >>>>>>>>> offset 0x1 (was 0x100006, writing 0x100002)
> >>>>>>>>> [  154.341550] pcieport 0000:00:1c.0: restoring config space at offset
> >>>>>>>>> 0x9 (was 0x1fff1, writing 0x10511041)
> >>>>>>>>> [  154.341566] pcieport 0000:00:1c.0: restoring config space at offset
> >>>>>>>>> 0x8 (was 0xfff0, writing 0x10301020)
> >>>>>>>>> [  154.341580] pcieport 0000:00:1c.0: restoring config space at offset
> >>>>>>>>> 0x7 (was 0xf0, writing 0x3030)
> >>>>>>>>> [  154.341603] pcieport 0000:00:1c.0: restoring config space at offset
> >>>>>>>>> 0x1 (was 0x100104, writing 0x100507)
> >>>>>>>>> [  154.341689] pcieport 0000:00:1c.1: restoring config space at offset
> >>>>>>>>> 0x9 (was 0x1fff1, writing 0x10111001)
> >>>>>>>>> [  154.341705] pcieport 0000:00:1c.1: restoring config space at offset
> >>>>>>>>> 0x7 (was 0xf0, writing 0x2020)
> >>>>>>>>> [  154.341728] pcieport 0000:00:1c.1: restoring config space at offset
> >>>>>>>>> 0x1 (was 0x100106, writing 0x100507)
> >>>>>>>>> [  154.341812] pcieport 0000:00:1c.2: restoring config space at offset
> >>>>>>>>> 0x7 (was 0xf0, writing 0x1010)
> >>>>>>>>> [  154.341834] pcieport 0000:00:1c.2: restoring config space at offset
> >>>>>>>>> 0x1 (was 0x100106, writing 0x100507)
> >>>>>>>>> [  154.341920] uhci_hcd 0000:00:1d.0: restoring config space at offset
> >>>>>>>>> 0x1 (was 0x2800005, writing 0x2800001)
> >>>>>>>>> [  154.341958] uhci_hcd 0000:00:1d.0: wake-up capability disabled by ACPI
> >>>>>>>>> [  154.342004] uhci_hcd 0000:00:1d.1: restoring config space at offset
> >>>>>>>>> 0x1 (was 0x2800005, writing 0x2800001)
> >>>>>>>>> [  154.342086] uhci_hcd 0000:00:1d.1: wake-up capability disabled by ACPI
> >>>>>>>>> [  154.342137] uhci_hcd 0000:00:1d.2: restoring config space at offset
> >>>>>>>>> 0x1 (was 0x2800005, writing 0x2800001)
> >>>>>>>>> [  154.342166] uhci_hcd 0000:00:1d.2: wake-up capability disabled by ACPI
> >>>>>>>>> [  154.342211] uhci_hcd 0000:00:1d.3: restoring config space at offset
> >>>>>>>>> 0x1 (was 0x2800005, writing 0x2800001)
> >>>>>>>>> [  154.342239] uhci_hcd 0000:00:1d.3: wake-up capability disabled by ACPI
> >>>>>>>>> [  154.342296] ehci_hcd 0000:00:1d.7: restoring config space at offset
> >>>>>>>>> 0x1 (was 0x2900006, writing 0x2900002)
> >>>>>>>>> [  154.342335] ehci_hcd 0000:00:1d.7: wake-up capability disabled by ACPI
> >>>>>>>>> [  154.342349] ehci_hcd 0000:00:1d.7: PME# disabled
> >>>>>>>>> [  154.342374] pci 0000:00:1e.0: restoring config space at offset 0xf
> >>>>>>>>> (was 0x60000, writing 0x600ff)
> >>>>>>>>> [  154.342606] atl2 0000:03:00.0: restoring config space at offset 0x1
> >>>>>>>>> (was 0x100006, writing 0x100406)
> >>>>>>>>> [  154.343209] PM: early resume of devices complete after 1.941 msecs
> >>>>>>>>> [  154.343454] i915 0000:00:02.0: PCI INT A ->  GSI 16 (level, low) ->
> >>>>>>>>> IRQ 16
> >>>>>>>>> [  154.343470] i915 0000:00:02.0: setting latency timer to 64
> >>>>>>>>> [  154.347098] snd_hda_intel 0000:00:1b.0: PCI INT A ->  GSI 16 (level,
> >>>>>>>>> low) ->  IRQ 16
> >>>>>>>>> [  154.347117] snd_hda_intel 0000:00:1b.0: setting latency timer to 64
> >>>>>>>>> [  154.347218] snd_hda_intel 0000:00:1b.0: irq 43 for MSI/MSI-X
> >>>>>>>>> [  154.347424] uhci_hcd 0000:00:1d.0: PCI INT A ->  GSI 23 (level, low)
> >>>>>>>>> ->  IRQ 23
> >>>>>>>>> [  154.347442] uhci_hcd 0000:00:1d.0: setting latency timer to 64
> >>>>>>>>> [  154.347479] usb usb2: root hub lost power or was reset
> >>>>>>>>> [  154.347543] uhci_hcd 0000:00:1d.1: PCI INT B ->  GSI 19 (level, low)
> >>>>>>>>> ->  IRQ 19
> >>>>>>>>> [  154.347559] uhci_hcd 0000:00:1d.1: setting latency timer to 64
> >>>>>>>>> [  154.347593] usb usb3: root hub lost power or was reset
> >>>>>>>>> [  154.347653] uhci_hcd 0000:00:1d.2: PCI INT C ->  GSI 18 (level, low)
> >>>>>>>>> ->  IRQ 18
> >>>>>>>>> [  154.347670] uhci_hcd 0000:00:1d.2: setting latency timer to 64
> >>>>>>>>> [  154.347702] usb usb4: root hub lost power or was reset
> >>>>>>>>> [  154.347760] uhci_hcd 0000:00:1d.3: PCI INT D ->  GSI 16 (level, low)
> >>>>>>>>> ->  IRQ 16
> >>>>>>>>> [  154.347776] uhci_hcd 0000:00:1d.3: setting latency timer to 64
> >>>>>>>>> [  154.347808] usb usb5: root hub lost power or was reset
> >>>>>>>>> [  154.347867] ehci_hcd 0000:00:1d.7: PCI INT A ->  GSI 23 (level, low)
> >>>>>>>>> ->  IRQ 23
> >>>>>>>>> [  154.347884] ehci_hcd 0000:00:1d.7: setting latency timer to 64
> >>>>>>>>> [  154.347988] pci 0000:00:1e.0: setting latency timer to 64
> >>>>>>>>> [  154.348125] ata_piix 0000:00:1f.2: PCI INT B ->  GSI 19 (level, low)
> >>>>>>>>> ->  IRQ 19
> >>>>>>>>> [  154.348140] ata_piix 0000:00:1f.2: setting latency timer to 64
> >>>>>>>>> [  154.354172] atl2 0000:03:00.0: PCI INT A ->  GSI 17 (level, low) ->
> >>>>>>>>> IRQ 17
> >>>>>>>>> [  154.354192] atl2 0000:03:00.0: setting latency timer to 64
> >>>>>>>>> [  154.354309] atl2 0000:03:00.0: irq 44 for MSI/MSI-X
> >>>>>>>>> [  154.354722] sd 1:0:0:0: [sda] Starting disk
> >>>>>>>>> [  154.525453] ata2.00: ACPI cmd ef/03:44:00:00:00:a0 (SET FEATURES)
> >>>>>>>>> filtered out
> >>>>>>>>> [  154.525470] ata2.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES)
> >>>>>>>>> filtered out
> >>>>>>>>> [  154.531325] ata2.00: configured for UDMA/66
> >>>>>>>>> [  154.591089] usb 1-5: reset high-speed USB device number 2 using ehci_hcd
> >>>>>>>>> [  154.805066] PM: resume of devices complete after 461.739 msecs
> >>>>>>>>> [  154.805801] Restarting tasks ... done.
> >>>>>>>>> [  154.832376] video LNXVIDEO:00: Restoring backlight state
> >>>>>>>>> [  155.964388] atl2: eth0 NIC Link is Up<100 Mbps Full Duplex>
> >>>>>>>>> [  265.377852] PM: Syncing filesystems ... done.
> >>>>>>>>> [  265.427397] Freezing user space processes ... (elapsed 0.01 seconds)
> >>>>>>>>> done.
> >>>>>>>>> [  265.438118] Freezing remaining freezable tasks ... (elapsed 0.01
> >>>>>>>>> seconds) done.
> >>>>>>>>> [  265.449211] Suspending console(s) (use no_console_suspend to debug)
> >>>>>>>>> [  265.451744] sd 1:0:0:0: [sda] Stopping disk
> >>>>>>>>> [  265.555169] ata_piix 0000:00:1f.2: PCI INT B disabled
> >>>>>>>>> [  265.555281] ehci_hcd 0000:00:1d.7: PCI INT A disabled
> >>>>>>>>> [  265.555336] uhci_hcd 0000:00:1d.3: PCI INT D disabled
> >>>>>>>>> [  265.555384] uhci_hcd 0000:00:1d.2: PCI INT C disabled
> >>>>>>>>> [  265.555432] uhci_hcd 0000:00:1d.1: PCI INT B disabled
> >>>>>>>>> [  265.555479] uhci_hcd 0000:00:1d.0: PCI INT A disabled
> >>>>>>>>> [  265.555639] snd_hda_intel 0000:00:1b.0: PCI INT A disabled
> >>>>>>>>> [  265.556302] i915 0000:00:02.0: PCI INT A disabled
> >>>>>>>>> [  265.558207] ACPI handle has no context!
> >>>>>>>>> [  265.558302] atl2 0000:03:00.0: PCI INT A disabled
> >>>>>>>>> [  265.558316] ACPI handle has no context!
> >>>>>>>>> [  265.569117] PM: suspend of devices complete after 119.214 msecs
> >>>>>>>>> [  265.580216] ehci_hcd 0000:00:1d.7: PME# enabled
> >>>>>>>>> [  265.580261] ehci_hcd 0000:00:1d.7: wake-up capability enabled by ACPI
> >>>>>>>>> [  265.591121] uhci_hcd 0000:00:1d.3: wake-up capability enabled by ACPI
> >>>>>>>>> [  265.591186] uhci_hcd 0000:00:1d.2: wake-up capability enabled by ACPI
> >>>>>>>>> [  265.591244] uhci_hcd 0000:00:1d.1: wake-up capability enabled by ACPI
> >>>>>>>>> [  265.591302] uhci_hcd 0000:00:1d.0: wake-up capability enabled by ACPI
> >>>>>>>>> [  265.591581] PM: late suspend of devices complete after 22.446 msecs
> >>>>>>>>> [  265.591615] ACPI: Preparing to enter system sleep state S3
> >>>>>>>>> [  265.592832] PM: Saving platform NVS memory
> >>>>>>>>> [  265.594133] Disabling non-boot CPUs ...
> >>>>>>>>> [  265.594133] ACPI: Low-level resume complete
> >>>>>>>>> [  265.594133] PM: Restoring platform NVS memory
> >>>>>>>>> [  265.594133] Force enabled HPET at resume
> >>>>>>>>> [  265.594133] ACPI: Waking up from system sleep state S3
> >>>>>>>>> [  265.621463] snd_hda_intel 0000:00:1b.0: restoring config space at
> >>>>>>>>> offset 0x1 (was 0x100006, writing 0x100002)
> >>>>>>>>> [  265.621528] pcieport 0000:00:1c.0: restoring config space at offset
> >>>>>>>>> 0x9 (was 0x1fff1, writing 0x10511041)
> >>>>>>>>> [  265.621544] pcieport 0000:00:1c.0: restoring config space at offset
> >>>>>>>>> 0x8 (was 0xfff0, writing 0x10301020)
> >>>>>>>>> [  265.621559] pcieport 0000:00:1c.0: restoring config space at offset
> >>>>>>>>> 0x7 (was 0xf0, writing 0x3030)
> >>>>>>>>> [  265.621582] pcieport 0000:00:1c.0: restoring config space at offset
> >>>>>>>>> 0x1 (was 0x100104, writing 0x100507)
> >>>>>>>>> [  265.621667] pcieport 0000:00:1c.1: restoring config space at offset
> >>>>>>>>> 0x9 (was 0x1fff1, writing 0x10111001)
> >>>>>>>>> [  265.621684] pcieport 0000:00:1c.1: restoring config space at offset
> >>>>>>>>> 0x7 (was 0xf0, writing 0x2020)
> >>>>>>>>> [  265.621706] pcieport 0000:00:1c.1: restoring config space at offset
> >>>>>>>>> 0x1 (was 0x100106, writing 0x100507)
> >>>>>>>>> [  265.621791] pcieport 0000:00:1c.2: restoring config space at offset
> >>>>>>>>> 0x7 (was 0xf0, writing 0x1010)
> >>>>>>>>> [  265.621813] pcieport 0000:00:1c.2: restoring config space at offset
> >>>>>>>>> 0x1 (was 0x100106, writing 0x100507)
> >>>>>>>>> [  265.621899] uhci_hcd 0000:00:1d.0: restoring config space at offset
> >>>>>>>>> 0x1 (was 0x2800005, writing 0x2800001)
> >>>>>>>>> [  265.621938] uhci_hcd 0000:00:1d.0: wake-up capability disabled by ACPI
> >>>>>>>>> [  265.621983] uhci_hcd 0000:00:1d.1: restoring config space at offset
> >>>>>>>>> 0x1 (was 0x2800005, writing 0x2800001)
> >>>>>>>>> [  265.622057] uhci_hcd 0000:00:1d.1: wake-up capability disabled by ACPI
> >>>>>>>>> [  265.622111] uhci_hcd 0000:00:1d.2: restoring config space at offset
> >>>>>>>>> 0x1 (was 0x2800005, writing 0x2800001)
> >>>>>>>>> [  265.622139] uhci_hcd 0000:00:1d.2: wake-up capability disabled by ACPI
> >>>>>>>>> [  265.622183] uhci_hcd 0000:00:1d.3: restoring config space at offset
> >>>>>>>>> 0x1 (was 0x2800005, writing 0x2800001)
> >>>>>>>>> [  265.622210] uhci_hcd 0000:00:1d.3: wake-up capability disabled by ACPI
> >>>>>>>>> [  265.622267] ehci_hcd 0000:00:1d.7: restoring config space at offset
> >>>>>>>>> 0x1 (was 0x2900006, writing 0x2900002)
> >>>>>>>>> [  265.622305] ehci_hcd 0000:00:1d.7: wake-up capability disabled by ACPI
> >>>>>>>>> [  265.622320] ehci_hcd 0000:00:1d.7: PME# disabled
> >>>>>>>>> [  265.622344] pci 0000:00:1e.0: restoring config space at offset 0xf
> >>>>>>>>> (was 0x60000, writing 0x600ff)
> >>>>>>>>> [  265.622575] atl2 0000:03:00.0: restoring config space at offset 0x1
> >>>>>>>>> (was 0x100006, writing 0x100406)
> >>>>>>>>> [  265.623172] PM: early resume of devices complete after 1.925 msecs
> >>>>>>>>> [  265.623416] i915 0000:00:02.0: PCI INT A ->  GSI 16 (level, low) ->
> >>>>>>>>> IRQ 16
> >>>>>>>>> [  265.623432] i915 0000:00:02.0: setting latency timer to 64
> >>>>>>>>> [  265.627096] snd_hda_intel 0000:00:1b.0: PCI INT A ->  GSI 16 (level,
> >>>>>>>>> low) ->  IRQ 16
> >>>>>>>>> [  265.627115] snd_hda_intel 0000:00:1b.0: setting latency timer to 64
> >>>>>>>>> [  265.627215] snd_hda_intel 0000:00:1b.0: irq 43 for MSI/MSI-X
> >>>>>>>>> [  265.627429] uhci_hcd 0000:00:1d.0: PCI INT A ->  GSI 23 (level, low)
> >>>>>>>>> ->  IRQ 23
> >>>>>>>>> [  265.627448] uhci_hcd 0000:00:1d.0: setting latency timer to 64
> >>>>>>>>> [  265.627485] usb usb2: root hub lost power or was reset
> >>>>>>>>> [  265.627548] uhci_hcd 0000:00:1d.1: PCI INT B ->  GSI 19 (level, low)
> >>>>>>>>> ->  IRQ 19
> >>>>>>>>> [  265.627564] uhci_hcd 0000:00:1d.1: setting latency timer to 64
> >>>>>>>>> [  265.627597] usb usb3: root hub lost power or was reset
> >>>>>>>>> [  265.627659] uhci_hcd 0000:00:1d.2: PCI INT C ->  GSI 18 (level, low)
> >>>>>>>>> ->  IRQ 18
> >>>>>>>>> [  265.627675] uhci_hcd 0000:00:1d.2: setting latency timer to 64
> >>>>>>>>> [  265.627707] usb usb4: root hub lost power or was reset
> >>>>>>>>> [  265.627766] uhci_hcd 0000:00:1d.3: PCI INT D ->  GSI 16 (level, low)
> >>>>>>>>> ->  IRQ 16
> >>>>>>>>> [  265.627781] uhci_hcd 0000:00:1d.3: setting latency timer to 64
> >>>>>>>>> [  265.627814] usb usb5: root hub lost power or was reset
> >>>>>>>>> [  265.627873] ehci_hcd 0000:00:1d.7: PCI INT A ->  GSI 23 (level, low)
> >>>>>>>>> ->  IRQ 23
> >>>>>>>>> [  265.627889] ehci_hcd 0000:00:1d.7: setting latency timer to 64
> >>>>>>>>> [  265.627993] pci 0000:00:1e.0: setting latency timer to 64
> >>>>>>>>> [  265.628130] ata_piix 0000:00:1f.2: PCI INT B ->  GSI 19 (level, low)
> >>>>>>>>> ->  IRQ 19
> >>>>>>>>> [  265.628145] ata_piix 0000:00:1f.2: setting latency timer to 64
> >>>>>>>>> [  265.634153] atl2 0000:03:00.0: PCI INT A ->  GSI 17 (level, low) ->
> >>>>>>>>> IRQ 17
> >>>>>>>>> [  265.634173] atl2 0000:03:00.0: setting latency timer to 64
> >>>>>>>>> [  265.634290] atl2 0000:03:00.0: irq 44 for MSI/MSI-X
> >>>>>>>>> [  265.634712] sd 1:0:0:0: [sda] Starting disk
> >>>>>>>>> [  265.796709] ata2.00: ACPI cmd ef/03:44:00:00:00:a0 (SET FEATURES)
> >>>>>>>>> filtered out
> >>>>>>>>> [  265.796726] ata2.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES)
> >>>>>>>>> filtered out
> >>>>>>>>> [  265.803326] ata2.00: configured for UDMA/66
> >>>>>>>>> [  265.880091] usb 1-5: reset high-speed USB device number 2 using ehci_hcd
> >>>>>>>>> [  266.091960] PM: resume of devices complete after 468.673 msecs
> >>>>>>>>> [  266.092731] Restarting tasks ... done.
> >>>>>>>>> [  266.112098] video LNXVIDEO:00: Restoring backlight state
> >>>>>>>>> [  267.219459] atl2: eth0 NIC Link is Up<100 Mbps Full Duplex>
> >>>>>>>>> [  554.912659] FS-Cache: Loaded
> >>>>>>>>> [  554.950749] FS-Cache: Netfs 'cifs' registered for caching
> >>>>>>>>> [  555.047550] CIFS VFS: default security mechanism requested.  The
> >>>>>>>>> default security mechanism will be upgraded from ntlm to ntlmv2 in
> >>>>>>>>> kernel release 3.2
> >>>>>>>>> [ 2094.090461] PM: Syncing filesystems ... done.
> >>>>>>>>> [ 2094.155385] Freezing user space processes ... (elapsed 0.01 seconds)
> >>>>>>>>> done.
> >>>>>>>>> [ 2094.166113] Freezing remaining freezable tasks ...
> >>>>>>>>> [ 2114.175108] Freezing of tasks failed after 20.00 seconds (1 tasks
> >>>>>>>>> refusing to freeze, wq_busy=0):
> >>>>>>>>> [ 2114.175586] cifsd           S ccb50b40     0  2219      2 0x00800000
> >>>>>>>>> [ 2114.175611]  cc6cc000 00000046 00000000 ccb50b40 cc776b40 cc41d000
> >>>>>>>>> cc776780 cc61d960
> >>>>>>>>> [ 2114.175635]  cc4d81c0 c178ec40 c178ec40 ccb50980 00000000 ccb50980
> >>>>>>>>> c1031e88 00000001
> >>>>>>>>> [ 2114.175660]  c1031f9f c10230f5 c1559ee1 c17a7140 c1031e88 00000001
> >>>>>>>>> c1031f9f c10230f5
> >>>>>>>>> [ 2114.175684] Call Trace:
> >>>>>>>>> [ 2114.175710]  [<c1031e88>] ? get_parent_ip+0x8/0x20
> >>>>>>>>> [ 2114.175723]  [<c1031f9f>] ? add_preempt_count+0x5f/0xa0
> >>>>>>>>> [ 2114.175747]  [<c10230f5>] ? default_spin_lock_flags+0x5/0x10
> >>>>>>>>> [ 2114.175770]  [<c1559ee1>] ? _raw_spin_lock_irqsave+0x31/0x50
> >>>>>>>>> [ 2114.175782]  [<c1031e88>] ? get_parent_ip+0x8/0x20
> >>>>>>>>> [ 2114.175794]  [<c1031f9f>] ? add_preempt_count+0x5f/0xa0
> >>>>>>>>> [ 2114.175807]  [<c10230f5>] ? default_spin_lock_flags+0x5/0x10
> >>>>>>>>> [ 2114.175820]  [<c1559ee1>] ? _raw_spin_lock_irqsave+0x31/0x50
> >>>>>>>>> [ 2114.175832]  [<c1031e88>] ? get_parent_ip+0x8/0x20
> >>>>>>>>> [ 2114.175846]  [<c15588c1>] ? schedule_timeout+0x111/0x1d0
> >>>>>>>>> [ 2114.175860]  [<c1512500>] ? tcp_rcv_established+0x170/0x760
> >>>>>>>>> [ 2114.175878]  [<c1041fa0>] ? usleep_range+0x40/0x40
> >>>>>>>>> [ 2114.175897]  [<c14cb6e7>] ? sk_wait_data+0xa7/0xb0
> >>>>>>>>> [ 2114.175920]  [<c104ffc0>] ? abort_exclusive_wait+0x90/0x90
> >>>>>>>>> [ 2114.175943]  [<c1509948>] ? tcp_recvmsg+0x588/0xb80
> >>>>>>>>> [ 2114.175956]  [<c1001c0a>] ? __switch_to+0xea/0x1d0
> >>>>>>>>> [ 2114.175976]  [<c1008e97>] ? native_sched_clock+0x27/0x90
> >>>>>>>>> [ 2114.175993]  [<c152685f>] ? inet_recvmsg+0x6f/0xb0
> >>>>>>>>> [ 2114.176063]  [<c14c7f70>] ? sock_recvmsg+0xe0/0x120
> >>>>>>>>> [ 2114.176086]  [<c1242df8>] ? timerqueue_add+0x58/0xb0
> >>>>>>>>> [ 2114.176098]  [<c1031e88>] ? get_parent_ip+0x8/0x20
> >>>>>>>>> [ 2114.176112]  [<c1031e88>] ? get_parent_ip+0x8/0x20
> >>>>>>>>> [ 2114.176123]  [<c1031f07>] ? sub_preempt_count+0x67/0xa0
> >>>>>>>>> [ 2114.176145]  [<c1559d1b>] ? _raw_spin_unlock_irqrestore+0x1b/0x40
> >>>>>>>>> [ 2114.176165]  [<c1053a43>] ? hrtimer_try_to_cancel+0x43/0xa0
> >>>>>>>>> [ 2114.176178]  [<c1053ab1>] ? hrtimer_cancel+0x11/0x20
> >>>>>>>>> [ 2114.176191]  [<c155934c>] ? schedule_hrtimeout_range_clock+0xcc/0x170
> >>>>>>>>> [ 2114.176206]  [<c10536c0>] ? update_rmtp+0x80/0x80
> >>>>>>>>> [ 2114.176219]  [<c14c7fe7>] ? kernel_recvmsg+0x37/0x50
> >>>>>>>>> [ 2114.176268]  [<d0611228>] ? cifs_readv_from_socket+0x148/0x2b0 [cifs]
> >>>>>>>>> [ 2114.176283]  [<c1031e88>] ? get_parent_ip+0x8/0x20
> >>>>>>>>> [ 2114.176309]  [<d06113ac>] ? cifs_read_from_socket+0x1c/0x20 [cifs]
> >>>>>>>>> [ 2114.176333]  [<d0611522>] ? cifs_demultiplex_thread+0x112/0xc20 [cifs]
> >>>>>>>>> [ 2114.176348]  [<c1031e88>] ? get_parent_ip+0x8/0x20
> >>>>>>>>> [ 2114.176370]  [<d0611410>] ? dequeue_mid+0x60/0x60 [cifs]
> >>>>>>>>> [ 2114.176391]  [<c104f9c9>] ? kthread+0x69/0x70
> >>>>>>>>> [ 2114.176406]  [<c104f960>] ? kthread_worker_fn+0x150/0x150
> >>>>>>>>> [ 2114.176420]  [<c155af76>] ? kernel_thread_helper+0x6/0xd
> >>>>>>>>> [ 2114.176466]
> >>>>>>>>> [ 2114.176473] Restarting tasks ... done.
> >>>>>>>>> [ 2114.201577] video LNXVIDEO:00: Restoring backlight state
> >>>>>>>>> [ 2114.210975] lxpanel[1452]: segfault at 0 ip b6ead703 sp bfea8924
> >>>>>>>>> error 4 in libc-2.13.so[b6e39000+13e000]
> >>>>>>>>> -- 
> >>>>>>> Thanks for reporting this. Does the following patch fix it?
> >>>>>>>
> >>>>>>> -----------------------[snip]------------------------
> >>>>>>>
> >>>>>>> cifs: attempt to freeze while looping on a receive attempt
> >>>>>>>
> >>>>>>> In the recent overhaul of the demultiplex thread receive path, we
> >>>>>>> neglected to ensure that we attempt to freeze on each pass through the
> >>>>>>> loop.
> >>>>>>>
> >>>>>>> Reported-by: Woody Suwalski<terraluna977@...il.com>
> >>>>>>> Signed-off-by: Jeff Layton<jlayton@...hat.com>
> >>>>>>> ---
> >>>>>>>   fs/cifs/connect.c |    2 ++
> >>>>>>>   1 files changed, 2 insertions(+), 0 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> >>>>>>> index bf82f88..d166e3a 100644
> >>>>>>> --- a/fs/cifs/connect.c
> >>>>>>> +++ b/fs/cifs/connect.c
> >>>>>>> @@ -441,6 +441,8 @@ cifs_readv_from_socket(struct TCP_Server_Info *server, struct kvec *iov_orig,
> >>>>>>>   	smb_msg.msg_controllen = 0;
> >>>>>>>
> >>>>>>>   	for (total_read = 0; to_read; total_read += length, to_read -= length) {
> >>>>>>> +		try_to_freeze();
> >>>>>>> +
> >>>>>>>   		if (server_unresponsive(server)) {
> >>>>>>>   			total_read = -EAGAIN;
> >>>>>>>   			break;
> >>>>>> Jeff, I have tested on a similar EeePC box - the same cifsd crash.
> >>>>>> The patch did not help...
> >>>>>>
> >>>>>> Woody
> >>>>>>
> >>>>>
> >>>>> Well, that is odd then...
> >>>>>
> >>>>> First, we should clarify here: This is not a crash. The above stack
> >>>>> trace is just telling us that the freezer wasn't able to freeze the
> >>>>> cifs kthread that handles the receive codepath. It's possible that the
> >>>>> userspace code wasn't able to deal with that and you had to reboot your
> >>>>> machine after this, but that's really a separate problem if so.
> >>>>>
> >>>>> cifsd does blocking receives. When it creates a socket, it does this:
> >>>>>
> >>>>>         socket->sk->sk_rcvtimeo = 7 * HZ;
> >>>>>
> >>>>> ...so the maximum amount of time that this code ought to sit in
> >>>>> kernel_recvmsg is ~7s. If kernel_recvmsg returns certain error codes
> >>>>> (like -EAGAIN), it'll sleep for another second or two before going
> >>>>> around the loop again. So the max time that a loop iteration takes
> >>>>> should always be less than ~10s.
> >>>>>
> >>>>> With the above patch, cifsd will try_to_freeze() on each pass through
> >>>>> the loop. If that didn't fix the problem, and you got the same stack
> >>>>> trace, then that suggests that the time being spent in kernel_recvmsg
> >>>>> is more than 20s. Perhaps this is due to some lower-level networking
> >>>>> problem?
> >>>>>
> >>>>
> >>>>
> >>>> Hey, looking at the error logs again, I see that cifsd is in "S" state,
> >>>> meaning it is sleeping in TASK_INTERRUPTIBLE state. And, in fs/cifs/connect.c,
> >>>> in cifs_demultiplex_thread() function, I see set_freezable(), which is
> >>>> more than enough to allow the freezer to freeze this kernel thread if it
> >>>> is sleeping interruptibly.
> >>>> In other words, we should not be getting freezing failures here at all!
> >>>>
> >>>> Not sure what could be going wrong here...
> >>>>
> >>>
> >>>
> >>> Sorry, interruptible sleep + set_freezable() only ensures that the freezer
> >>> can wake the kernel thread from sleep. Calling try_to_freeze() after waking
> >>> up is still the kernel thread's job, and without doing that soon enough the
> >>> kernel thread won't get frozen and freezing failures will occur.
> >>>
> >>> So Jeff's patch should have solved the issue IMO, but ...
> >>>
> >>
> >>
> >> I see usleep_range() in the call stack. And that sleeps uninterruptibly (though
> >> usually it is used to sleep for small intervals)..
> >>
> > 
> > I'm not sure how much we can trust that stack trace. It's hard to see
> > what's actually happening there. Maybe it's worth turning on
> > CONFIG_FRAME_POINTER?
> > 
> > FWIW, There is a call to usleep_range in cifs_readv_from_socket:
> > 
> >                         usleep_range(1000, 2000);
> > 
> > ...which should only sleep for 1-2ms though. 
> 
> 
> Yeah, I saw that. The addition of try_to_freeze() after kernel_recvmsg() must
> take care of that too, since a fresh 1-2ms sleep (i.e., after a try_to_freeze)
> definitely wont hurt the freezer. However I saw a number of usleep_range()
> calls in the networking code, so I felt the problem could really be in
> kernel_recvmsg and hence added the try_to_freeze() immediately after it.
> But again, its not much more than a guess.
> 
> > A pass through the
> > entire loop shouldn't take more than 8s or so.
> > 
> >> To see if we can solve this issue without touching the networking code, I have
> >> written the patch below. Woody, could you please give it a shot?
> >> If it does fix your issue, I'll write it up as a formal patch.
> >>
> >> The idea here is kind of obvious: if the freezer was unlucky and tried to freeze
> >> when the task was in uninterruptible sleep... cooperate with the freezer ASAP.
> >> So add the try_to_freeze() immediately after kernel_recvmsg, since we have a good
> >> hunch that kernel_recvmsg is the culprit.
> >>
> >> ---
> >>
> >>  fs/cifs/connect.c |    2 ++
> >>  1 files changed, 2 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> >> index d6a972d..773bad0 100644
> >> --- a/fs/cifs/connect.c
> >> +++ b/fs/cifs/connect.c
> >> @@ -451,6 +451,8 @@ cifs_readv_from_socket(struct TCP_Server_Info *server, struct kvec *iov_orig,
> >>  		length = kernel_recvmsg(server->ssocket, &smb_msg,
> >>  					iov, segs, to_read, 0);
> >>  
> >> +		try_to_freeze();
> >> +
> >>  		if (server->tcpStatus == CifsExiting) {
> >>  			total_read = -ESHUTDOWN;
> >>  			break;
> >>
> > 
> > That's probably worth a shot since it should tell us something either
> > way. If it doesn't work, what may be interesting is a debug patch that
> > prints timestamps before and after the kernel_recvmsg call, so we can
> > see whether this is spending more time in that call than it should.
> > 
> 
> 

FWIW, Adam Williamson hit the same problem and his stack trace is a bit
clearer:

https://bugzilla.redhat.com/show_bug.cgi?id=717735#c31

...I asked him to test the same patch I sent earlier to see if it helps
him. If not I'll plan to roll up a debug patch for you guys to play
with.

-- 
Jeff Layton <jlayton@...hat.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ