Message-ID: <alpine.LRH.2.02.1310040923320.8473@file01.intranet.prod.int.rdu2.redhat.com>
Date: Fri, 4 Oct 2013 09:38:50 -0400 (EDT)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Akira Hayakawa <ruby.wktk@...il.com>
cc: dm-devel@...hat.com, devel@...verdev.osuosl.org,
thornber@...hat.com, snitzer@...hat.com,
gregkh@...uxfoundation.org, david@...morbit.com,
linux-kernel@...r.kernel.org, dan.carpenter@...cle.com,
joe@...ches.com, akpm@...ux-foundation.org, m.chehab@...sung.com,
ejt@...hat.com, agk@...hat.com, cesarb@...arb.net, tj@...nel.org
Subject: Re: [dm-devel] dm-writeboost testing
On Fri, 4 Oct 2013, Akira Hayakawa wrote:
> Hi, Mikulas,
>
> I am sorry to say that
> I don't have such machines to reproduce the problem.
>
> But I agree that I am dealing with the workqueue subsystem
> in a somewhat odd way.
> I should clean that up.
>
> For example,
> the free_cache() routine below is
> the destructor of the cache metadata,
> including all the workqueues.
>
> void free_cache(struct wb_cache *cache)
> {
> 	cache->on_terminate = true;
>
> 	/* Kill in-kernel daemons */
> 	cancel_work_sync(&cache->sync_work);
> 	cancel_work_sync(&cache->recorder_work);
> 	cancel_work_sync(&cache->modulator_work);
>
> 	cancel_work_sync(&cache->flush_work);
> 	destroy_workqueue(cache->flush_wq);
>
> 	cancel_work_sync(&cache->barrier_deadline_work);
>
> 	cancel_work_sync(&cache->migrate_work);
> 	destroy_workqueue(cache->migrate_wq);
> 	free_migration_buffer(cache);
>
> 	/* Destroy in-core structures */
> 	free_ht(cache);
> 	free_segment_header_array(cache);
>
> 	free_rambuf_pool(cache);
> }
>
> The cancel_work_sync() calls before destroy_workqueue()
> can probably be removed, because destroy_workqueue()
> first flushes all pending work items.
>
> Although I prepare an independent workqueue
> for each of flush_work and migrate_work,
> the other four works are queued onto system_wq
> through the schedule_work() routine.
> This asymmetry is not welcome in
> architecture-portable code;
> dependencies on the workqueue subsystem should be minimized.
> In particular, the workqueue subsystem keeps changing
> its concurrency support, so
> trusting only a single-threaded workqueue
> would be a good idea for stability.

The problem is that you are using workqueues the wrong way. You submit a
work item to a workqueue and the work item stays active until the device
is unloaded.

If you submit a work item to a workqueue, it is required that the work
item finishes in finite time. Otherwise, it may stall other tasks.

The deadlock when I terminate the X server is caused by this - the nvidia
driver tries to flush the system workqueue and it waits for all work items
to terminate - but your work items don't terminate.
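
To illustrate (a minimal sketch with hypothetical names, not your actual
code): a work function that loops until a terminate flag is set never
returns, so anything that later flushes the system workqueue has to wait
until the device is unloaded.

/* Hypothetical sketch of the problematic pattern, not dm-writeboost code. */
#include <linux/workqueue.h>
#include <linux/delay.h>

static bool on_terminate;		/* set only by the destructor */
static struct work_struct modulator_work;

static void modulator_fn(struct work_struct *work)
{
	/* This work item returns only when the device is torn down ... */
	while (!on_terminate)
		msleep(1000);		/* ... so it pins a worker indefinitely. */
}

/*
 * schedule_work(&modulator_work);
 *
 * A later flush_scheduled_work() - here, from the nvidia driver -
 * now blocks until the device is unloaded.
 */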

If you need a thread that runs for a long time, you should use
kthread_create(), not workqueues (see this
http://people.redhat.com/~mpatocka/patches/kernel/dm-crypt-paralelizace/old-3/dm-crypt-encryption-threads.patch
or this
http://people.redhat.com/~mpatocka/patches/kernel/dm-crypt-paralelizace/old-3/dm-crypt-offload-writes-to-thread.patch
as examples of how to use kthreads).
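
Roughly, each daemon becomes a kthread that loops until kthread_stop() is
called from the destructor. A minimal sketch with hypothetical names (see
the patches above for real examples):

/* Hypothetical sketch of the kthread approach; names are made up. */
#include <linux/kthread.h>
#include <linux/delay.h>
#include <linux/err.h>

static int daemon_fn(void *data)
{
	/* struct wb_cache *cache = data; */

	while (!kthread_should_stop()) {
		/* do one round of the periodic work here */
		msleep_interruptible(1000);
	}
	return 0;
}

/*
 * Creation, e.g. in resume_cache():
 *
 *	cache->daemon = kthread_create(daemon_fn, cache, "wbdaemon");
 *	if (IS_ERR(cache->daemon))
 *		return PTR_ERR(cache->daemon);
 *	wake_up_process(cache->daemon);
 *
 * Teardown, in free_cache():
 *
 *	kthread_stop(cache->daemon);
 */

The msleep_interruptible() could also be an interruptible wait on a
waitqueue, so the thread reacts to kthread_stop() immediately instead of
after up to a second.
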
Mikulas
> To begin with,
> these works are never taken out of the queue
> until the destructor is called;
> they just keep running and sleeping.
> Queuing this kind of work to system_wq
> may be unsupported.
>
> So,
> my strategy is to clean them up in the following way:
> 1. every daemon has its own workqueue, and
> 2. never call cancel_work_sync(); only call destroy_workqueue()
> in the destructor free_cache() and in the error handling of resume_cache().
>
> Could you please run the same test again
> once I have fixed these points,
> to see whether the problem is still reproducible?
>
>
> > On 3.11.3 on PA-RISC without preemption, the device unloads (although it
> > takes many seconds and vmstat shows that the machine is idle during this
> > time)
> This behavior is benign but should probably be improved.
> In said free_cache(), it first sets the `on_terminate` flag to true
> to notify all the daemons that we are shutting down.
> Since `update_interval` and `sync_interval` are 60 seconds by default,
> we must wait a while for them to finish.
>
> Akira
>