Message-ID: <4542FF94.4090005@adaptec.com>
Date:	Sat, 28 Oct 2006 12:28:28 +0530
From:	Ravi Krishnamurthy <Ravi_Krishnamurthy@...ptec.com>
To:	linux-kernel@...r.kernel.org
CC:	axboe@...e.de
Subject: Block driver freezes when using CFQ

Hi all,

I have written a block driver that registers a virtual device and
routes requests to the appropriate real devices after remapping them.
I am testing the driver by creating a filesystem on the virtual device
and copying a large number of files onto it. After some time, the test
causes the device to become unresponsive. After some debugging, I
noticed that this happens only when the I/O scheduler is CFQ; I have
had no trouble with noop, anticipatory or deadline. The problem occurs
on every kernel I have tested: 2.6.18-rc2, 2.6.18-rc4 and 2.6.19-rc3.

Below are some details about the driver and what I have observed during
testing:

The request function registered by my driver is a simple loop:

   while ((req = elv_next_request(q))) {
         blkdev_dequeue_request(req);

         /*
          Add request to an internal queue for further processing
          Wake up thread to start processing the queue
          Update some variables for book-keeping
          */
   }

Completed requests are handled in a different thread:

   while (work to be done) {
       /*
         Dequeue completed requests from internal queue
         Call end_that_request_first() and end_that_request_last()
         Update some variables for book-keeping
       */
   }
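
For readers who want to experiment with this structure outside the
kernel, here is a minimal userspace sketch of the two loops above. It
is only a model: the names (struct fifo, struct model,
model_request_fn, model_completion_worker, QCAP) are hypothetical and
not taken from the driver, the elevator is replaced by a plain FIFO,
and there is no locking or threading. It shows the bookkeeping only:
every request moved to the internal queue counts as in flight until
the completion loop retires it.

```c
#include <assert.h>
#include <stddef.h>

#define QCAP 64 /* hypothetical capacity; a power of two so indices wrap cleanly */

/* Fixed-size FIFO of request ids. */
struct fifo {
    int items[QCAP];
    size_t head; /* index of the next item to pop */
    size_t tail; /* index of the next free slot */
};

static int fifo_empty(const struct fifo *f) { return f->head == f->tail; }
static int fifo_full(const struct fifo *f) { return f->tail - f->head == QCAP; }

static int fifo_push(struct fifo *f, int id)
{
    if (fifo_full(f))
        return -1;
    f->items[f->tail++ % QCAP] = id;
    return 0;
}

static int fifo_pop(struct fifo *f, int *id)
{
    if (fifo_empty(f))
        return -1;
    *id = f->items[f->head++ % QCAP];
    return 0;
}

struct model {
    struct fifo elevator; /* stands in for the elevator queue */
    struct fifo internal; /* the driver's internal queue */
    int in_flight;        /* dequeued from the elevator, not yet completed */
    int completed;        /* total requests retired */
};

/* Models the request function: drain the elevator into the internal queue. */
static int model_request_fn(struct model *m)
{
    int id, moved = 0;

    while (!fifo_full(&m->internal) && fifo_pop(&m->elevator, &id) == 0) {
        fifo_push(&m->internal, id);
        m->in_flight++;
        moved++;
    }
    return moved;
}

/* Models the completion thread: retire everything on the internal queue. */
static int model_completion_worker(struct model *m)
{
    int id, done = 0;

    while (fifo_pop(&m->internal, &id) == 0) {
        assert(m->in_flight > 0); /* every completion must match a dequeue */
        m->in_flight--;
        m->completed++;
        done++;
    }
    return done;
}
```

Pushing ten ids onto the elevator FIFO, calling model_request_fn() and
then model_completion_worker() should leave in_flight at 0 and
completed at 10.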

Several times during the test run, the while() loop in the request
function exits without dequeuing any request even though the elevator
queue is not empty. (I confirmed this by printing the return value of
elv_queue_empty() and the values of q->rq.count[] outside the loop.)
After one such occurrence, the request function is no longer called at
all and the device becomes unresponsive.

I added some code that lets me trigger the request function from
userspace. If I nudge the driver this way, I/O continues for a short
while and then stops again.
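
The observable behaviour can be mimicked with a toy userspace model.
This is not CFQ code, and the "budget" mechanism is purely an
assumption made for illustration; it says nothing about why CFQ
actually withholds requests. All names (toy_sched, toy_next_request,
toy_request_fn, toy_nudge) are hypothetical. The sketch only shows
that a loop of the shape above can exit while the queue is still
non-empty, and that an external nudge then makes progress for a while
before stalling again.

```c
#include <assert.h>

/* Toy scheduler: holds requests and hands out at most `budget` of them. */
struct toy_sched {
    int queued; /* requests currently held by the scheduler */
    int budget; /* requests it is still willing to dispatch (assumption) */
};

/* Analogue of a "queue empty" check: looks only at what is held. */
static int toy_queue_empty(const struct toy_sched *s)
{
    return s->queued == 0;
}

/*
 * Analogue of "give me the next request": returns 1 and removes one
 * request if the scheduler is willing to dispatch, 0 otherwise. It can
 * return 0 even though the queue is not empty.
 */
static int toy_next_request(struct toy_sched *s)
{
    if (s->queued == 0 || s->budget == 0)
        return 0;
    s->queued--;
    s->budget--;
    return 1;
}

/* Request function: loop until the scheduler returns nothing. */
static int toy_request_fn(struct toy_sched *s)
{
    int dispatched = 0;

    while (toy_next_request(s))
        dispatched++;
    return dispatched;
}

/* External nudge: re-arm the scheduler and run the request function again. */
static int toy_nudge(struct toy_sched *s, int budget)
{
    s->budget = budget;
    return toy_request_fn(s);
}
```

With 10 queued requests and a budget of 4, the first toy_request_fn()
call dispatches 4 and later calls dispatch nothing, although
toy_queue_empty() still reports a non-empty queue. A toy_nudge() with
a budget of 3 dispatches 3 more and the model stalls again.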

Since CFQ is the default I/O scheduler in current kernels, it has been
widely used and tested, so I suspect I am doing something wrong in my
driver. Given that the driver works well with the other schedulers, is
there something CFQ-specific that I should take care of?


Please Cc me on the responses since I am not subscribed to lkml.

Thanks,
Ravi.