Message-ID: <4F21A80D.6080200@linux.vnet.ibm.com>
Date:	Fri, 27 Jan 2012 00:52:53 +0530
From:	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
To:	"Rafael J. Wysocki" <rjw@...k.pl>
CC:	Jiri Slaby <jirislaby@...il.com>, Tejun Heo <tj@...nel.org>,
	Jiri Slaby <jslaby@...e.cz>,
	LKML <linux-kernel@...r.kernel.org>, Baohua.Song@....com,
	"pavel@....cz" <pavel@....cz>,
	Linux PM mailing list <linux-pm@...r.kernel.org>
Subject: Re: [linux-pm] PM: cannot hibernate -- BUG at kernel/workqueue.c:3659

On 01/26/2012 05:21 AM, Rafael J. Wysocki wrote:

> Hi,
>>
>> SNAPSHOT_CREATE_IMAGE has a check for data->ready such as:
>>
>>         if (data->mode != O_RDONLY || !data->frozen  || data->ready) {
>>                 error = -EPERM;
>>                 break;
>>         }
>>
>> data->ready would be set to 1 only under SNAPSHOT_CREATE_IMAGE. However,
>> SNAPSHOT_FREE (invoked at the place shown above) will reset the value to 0.
>> This makes it possible for hibernation_snapshot() and hence
>> freeze_workqueues_begin() to be called a second time, which is unfortunate.
> 
> Yes, I obviously forgot about that code path when I was working on the commit
> that introduced the problem. :-(
> 
> Thanks a lot for the great analysis, it's really helpful!
>


Welcome :-) It was fun!
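
To make the sequence above concrete, here is a small userspace model of the two ioctls. It is only a sketch: struct snapshot_model, model_create_image(), model_free() and the repeated_freezes counter are made-up names for illustration, the data->mode check is left out, and the counter merely stands in for the second freeze attempt that this thread associates with the reported BUG.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state discussed above; this is not kernel code. */
struct snapshot_model {
	bool frozen;          /* user space has been frozen */
	bool ready;           /* an image exists (set by CREATE_IMAGE) */
	bool kthreads_frozen; /* kernel threads and workqueues are frozen */
	int repeated_freezes; /* freeze attempts made while already frozen */
};

/* Models SNAPSHOT_CREATE_IMAGE with the quoted guard (mode check omitted). */
int model_create_image(struct snapshot_model *m)
{
	if (!m->frozen || m->ready)
		return -EPERM;

	/* hibernation_snapshot() freezes kernel threads and workqueues. */
	if (m->kthreads_frozen)
		m->repeated_freezes++; /* the situation that must not happen */
	m->kthreads_frozen = true;
	m->ready = true;
	return 0;
}

/* Models SNAPSHOT_FREE; thaw_kthreads selects the behaviour with the fix. */
void model_free(struct snapshot_model *m, bool thaw_kthreads)
{
	m->ready = false;
	if (thaw_kthreads)
		m->kthreads_frozen = false;
}

/* Walks through the problematic sequence without the fix. */
void model_demo(void)
{
	struct snapshot_model m = { .frozen = true };

	assert(model_create_image(&m) == 0);
	assert(model_create_image(&m) == -EPERM); /* guarded by ready */
	model_free(&m, false);                    /* ready reset, no thaw */
	assert(model_create_image(&m) == 0);      /* guard no longer helps */
	assert(m.repeated_freezes == 1);
}
```

Passing true to model_free() corresponds to thawing kernel threads in SNAPSHOT_FREE, in which case the second model_create_image() call finds them thawed and the counter stays at zero.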

 
>> And actually, the patch I posted in my previous mail is not really the right
>> long-term fix, though it might fix the particular issue that Jiri is facing..
>>
>> Because, allowing hibernation_snapshot() to get called a second time while
>> kernel threads are still frozen brings us to the same situation that commit
>> 2aede851 (PM / Hibernate: Freeze kernel threads after preallocating memory)
>> tried to prevent! IOW, a call to hibernate_preallocate_memory() would be
>> done inside hibernation_snapshot(), when kernel threads are frozen.. which
>> is known to break XFS, to give one example as mentioned in the changelog
>> of the above commit.
> 
> That's exactly right.
> 
>> So, the right way to fix this IMHO, would be to split up thaw_processes()
>> just like freezing phase:
>>
>> /* freezes or thaws user space processes */
>> freeze_processes() - thaw_processes()
>>
>> /* freezes or thaws kernel threads */
>> freeze_kernel_threads() - thaw_kernel_threads()
>>
>> We have to insert this thaw_kernel_threads() at appropriate places in such a
>> way as to not require another ioctl if possible... Then things would be
>> more symmetric (and hence easier to understand) and we can avoid getting
>> into strange situations as discussed here.
>>
>> But before we venture into that, it would be good to know if the patch posted
>> in the previous mail fixes the particular problem reported in this thread,
>> at least just to see if there are other problems lurking that we aren't aware
>> of yet..
> 
> Jiri has already said that the patch works.
> 
> I think we could avoid the issue entirely by introducing thaw_kernel_threads
> and making SNAPSHOT_FREE call it.  No other changes should be necessary.
> 
> IOW, Jiri, does the patch below help?
> 
> [BTW, the freeze_tasks()'s kerneldoc seems to be outdated.  Tejun?]
> 
> ---


This is exactly the kind of fix I was suggesting.. Thanks Rafael!

I have a small request for a comment. Please see below.
I have a question too, but for that I'll have to reply to my earlier
thread so that I can comment on the userspace code.
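
For readers following along, the symmetric freeze/thaw pairing can be modelled in userspace roughly as below. This is a sketch under assumptions: struct model_task, the model_* functions and the MODEL_PF_* values are invented for illustration (the real flags are defined in the kernel's sched.h), and model_thaw_processes() is assumed to thaw every task, kernel threads included.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values only; not the kernel's PF_* definitions. */
#define MODEL_PF_KTHREAD   0x1u
#define MODEL_PF_WQ_WORKER 0x2u

struct model_task {
	unsigned int flags;
	bool frozen;
};

static bool is_kernel_task(const struct model_task *t)
{
	return (t->flags & (MODEL_PF_KTHREAD | MODEL_PF_WQ_WORKER)) != 0;
}

/* Freezes user space tasks only. */
void model_freeze_processes(struct model_task *t, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!is_kernel_task(&t[i]))
			t[i].frozen = true;
}

/* Freezes kernel threads and workqueue workers only. */
void model_freeze_kernel_threads(struct model_task *t, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (is_kernel_task(&t[i]))
			t[i].frozen = true;
}

/* Counterpart of the above: thaws kernel tasks, leaves user space frozen. */
void model_thaw_kernel_threads(struct model_task *t, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (is_kernel_task(&t[i]))
			t[i].frozen = false;
}

/* Assumed to thaw everything, user space and kernel tasks alike. */
void model_thaw_processes(struct model_task *t, size_t n)
{
	for (size_t i = 0; i < n; i++)
		t[i].frozen = false;
}

size_t model_count_frozen(const struct model_task *t, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (t[i].frozen)
			count++;
	return count;
}

/* Freeze both classes, then undo them in the reverse order. */
void model_pairing_demo(void)
{
	struct model_task tasks[3] = {
		{ 0, false },                  /* user space task */
		{ MODEL_PF_KTHREAD, false },   /* kernel thread */
		{ MODEL_PF_WQ_WORKER, false }, /* workqueue worker */
	};

	model_freeze_processes(tasks, 3);
	model_freeze_kernel_threads(tasks, 3);
	assert(model_count_frozen(tasks, 3) == 3);
	model_thaw_kernel_threads(tasks, 3);
	assert(model_count_frozen(tasks, 3) == 1);
	model_thaw_processes(tasks, 3);
	assert(model_count_frozen(tasks, 3) == 0);
}
```

The point of the model is only that thawing kernel tasks leaves user space frozen, so the freezing step for kernel threads can be repeated later without starting from an already-frozen state.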

>  include/linux/freezer.h |    2 ++
>  kernel/power/process.c  |   19 +++++++++++++++++++
>  kernel/power/user.c     |    1 +
>  3 files changed, 22 insertions(+)
> 
> Index: linux/include/linux/freezer.h
> ===================================================================
> --- linux.orig/include/linux/freezer.h
> +++ linux/include/linux/freezer.h
> @@ -39,6 +39,7 @@ extern bool __refrigerator(bool check_kt
>  extern int freeze_processes(void);
>  extern int freeze_kernel_threads(void);
>  extern void thaw_processes(void);
> +extern void thaw_kernel_threads(void);
> 
>  static inline bool try_to_freeze(void)
>  {
> @@ -174,6 +175,7 @@ static inline bool __refrigerator(bool c
>  static inline int freeze_processes(void) { return -ENOSYS; }
>  static inline int freeze_kernel_threads(void) { return -ENOSYS; }
>  static inline void thaw_processes(void) {}
> +static inline void thaw_kernel_threads(void) {}
> 
>  static inline bool try_to_freeze(void) { return false; }
> 
> Index: linux/kernel/power/process.c
> ===================================================================
> --- linux.orig/kernel/power/process.c
> +++ linux/kernel/power/process.c
> @@ -188,3 +188,22 @@ void thaw_processes(void)
>  	printk("done.\n");
>  }
> 
> +void thaw_kernel_threads(void)
> +{
> +	struct task_struct *g, *p;
> +
> +	pm_nosig_freezing = false;
> +	printk("Restarting kernel threads ... ");
> +
> +	thaw_workqueues();
> +
> +	read_lock(&tasklist_lock);
> +	do_each_thread(g, p) {
> +		if (p->flags & (PF_KTHREAD | PF_WQ_WORKER))
> +			__thaw_task(p);
> +	} while_each_thread(g, p);
> +	read_unlock(&tasklist_lock);
> +
> +	schedule();
> +	printk("done.\n");
> +}
> Index: linux/kernel/power/user.c
> ===================================================================
> --- linux.orig/kernel/power/user.c
> +++ linux/kernel/power/user.c
> @@ -274,6 +274,7 @@ static long snapshot_ioctl(struct file *
>  		swsusp_free();
>  		memset(&data->handle, 0, sizeof(struct snapshot_handle));
>  		data->ready = 0;


It would be nice to have a comment here explaining why thaw_kernel_threads()
is called. Without one, people reading SNAPSHOT_CREATE_IMAGE and SNAPSHOT_FREE
may wonder why there is thawing involved while the corresponding freezing is
nowhere in sight. The freezing is hidden inside hibernation_snapshot(), which
might not be immediately apparent to everyone.
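
Something along the following lines is what I have in mind. Treat it as a rough sketch: the comment wording is only a suggestion, and the struct definitions, stub functions and the thaw_calls counter are hypothetical stand-ins so that the fragment compiles outside the kernel.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel types and helpers. */
struct snapshot_handle { int cur; };
struct snapshot_data { struct snapshot_handle handle; int ready; };

static int thaw_calls; /* counts calls to the stub below */

static void swsusp_free(void) { }
static void thaw_kernel_threads(void) { thaw_calls++; }

/* Body of the SNAPSHOT_FREE case, with the suggested comment added. */
void snapshot_free_sketch(struct snapshot_data *data)
{
	swsusp_free();
	memset(&data->handle, 0, sizeof(struct snapshot_handle));
	data->ready = 0;
	/*
	 * Kernel threads are frozen inside hibernation_snapshot(), which
	 * is called from SNAPSHOT_CREATE_IMAGE. Since data->ready is reset
	 * above, user space may issue SNAPSHOT_CREATE_IMAGE again, so thaw
	 * the kernel threads here to keep them from being frozen twice.
	 */
	thaw_kernel_threads();
}

/* Small check that the sketch resets state and thaws exactly once. */
void snapshot_free_sketch_check(void)
{
	struct snapshot_data d = { { 5 }, 1 };
	int before = thaw_calls;

	snapshot_free_sketch(&d);
	assert(d.ready == 0);
	assert(d.handle.cur == 0);
	assert(thaw_calls == before + 1);
}
```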

> +		thaw_kernel_threads();
>  		break;
> 
>  	case SNAPSHOT_PREF_IMAGE_SIZE:


Regards,
Srivatsa S. Bhat

