Message-ID: <4B4C8918.2080604@redhat.com>
Date: Tue, 12 Jan 2010 15:37:12 +0100
From: Michal Novotny <minovotn@...hat.com>
To: Chris Lee <cslee-list@...ericom.co.uk>
CC: Ric Wheeler <rwheeler@...hat.com>,
Christoph Hellwig <hch@...radead.org>,
linux-ext4@...r.kernel.org
Subject: Re: [PATCH] extend e2fsprogs functionality to add EXT2_FLAG_DIRECT option
On 01/12/2010 03:33 PM, Chris Lee wrote:
>
>
> Michal Novotny wrote:
>> On 01/12/2010 02:29 PM, Ric Wheeler wrote:
>>> On 01/12/2010 08:23 AM, Michal Novotny wrote:
>>>> On 01/12/2010 02:12 PM, Michal Novotny wrote:
>>>>> On 01/12/2010 02:04 PM, Ric Wheeler wrote:
>>>>>> On 01/12/2010 08:01 AM, Michal Novotny wrote:
>>>>>>> On 01/12/2010 01:46 PM, Christoph Hellwig wrote:
>>>>>>>> On Tue, Jan 12, 2010 at 01:30:40PM +0100, Michal Novotny wrote:
>>>>>>>>> Not really, pygrub doesn't do any manipulation with the file system
>>>>>>>>> and also, it's not working on a live file system. It's called before
>>>>>>>>> the guest boots up to read information about grub.conf/initrd and
>>>>>>>>> kernel for the PV guest, and after this is read and selected in pygrub
>>>>>>>>> the guest is booted using the kernel and initrd extracted from the
>>>>>>>>> image (after which the file is closed). Once again, nothing uses write
>>>>>>>>> support; it was added just to make it use O_DIRECT for both read and
>>>>>>>>> write operations, but pygrub uses only read support, and O_DIRECT
>>>>>>>>> passed here is the only way to make it use non-cached data.
>>>>>>>> So what caches get in the way? From the above it seems the situation
>>>>>>>> is the following:
>>>>>>>>
>>>>>>>> - filesystem N is a guest filesystem. It's not usually mounted on the
>>>>>>>>   host, except for initial setup a long time ago
>>>>>>>
>>>>>>> Yes, it is really a guest file system. It is not mounted in the host,
>>>>>>> and the reason is to get the current version of grub.conf, initrd and
>>>>>>> kernel to be booted...
>>>>>>>
>>>>>>>> - before booting a guest your "pygrub" tool needs to read files on
>>>>>>>>   it, and it's doing so using e2fsprogs
>>>>>>>
>>>>>>> Correct.
>>>>>>>
>>>>>>>> - once the guest is live it uses the extN kernel driver to access the
>>>>>>>>   filesystem
>>>>>>>
>>>>>>> That's right. So this is no longer pygrub's responsibility...
>>>>>>>
>>>>>>>> Nowhere in this cycle should you have any stale cached data. The kernel
>>>>>>>> always makes sure to write back data on umount/reboot, as does e2fsprogs
>>>>>>>> if actually used to write data (which you said is not the case anyway).
>>>>>>>
>>>>>>> In fact I was unable to run into those problems myself, but the
>>>>>>> reporter/customer did.
>>>>>>>
>>>>>>>> The only data that may be in the cache are unmodified data from reads
>>>>>>>> on the block device from either e2fsprogs or a suboptimal virtual block
>>>>>>>> device implementation, but these can't cause any problems.
>>>>>>> Michal
>>>>>>
>>>>>> If the guest is the only one (when running) that installs a new
>>>>>> grub.conf file and kernel and it shuts down properly, you should be
>>>>>> good. If it does not shut down cleanly, it could have a stale
>>>>>> grub.conf file (or worse, a partially written one), but using
>>>>>> O_DIRECT to bypass the file system cache should not help.
>>>>>>
>>>>>> If we cannot reproduce this failure, sounds like we need to go back
>>>>>> and get a better understanding of what the customer saw?
>>>>>>
>>>>>> ric
>>>>>>
>>>>> That's right. I am going to write an e-mail regarding this information
>>>>> to the reproducer of this bug and tell him that I need more information
>>>>> about what's happening on the customer side.
>>>>>
>>>> One more thing to point out, let's have a look at:
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=466681#c15 . This is about a
>>>> workaround, dropping caches, to be added to pygrub on the host machine
>>>> using this command:
>>>>
>>>> echo 1 > /proc/sys/vm/drop_caches
>>>>
>>>> So this really looks like a caching issue if it's working fine after
>>>> dropping the caches. That may be the reason why this could be fine with
>>>> this patch present in e2fsprogs.
>>>>
>>>> Michal
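
For reference, the drop_caches workaround quoted above could be sketched
from Python (pygrub's language) roughly as follows. This is only an
illustration, not what the bugzilla comment or pygrub actually contains:
the function name drop_page_cache and its path parameter are made up
here, and the write needs root on a Linux host. Note that the value and
the redirection have to be separate ("echo 1 >"), otherwise the shell
treats "1>" as a redirection of stdout and nothing useful is written.

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"  # Linux-only; writing requires root


def drop_page_cache(path=DROP_CACHES):
    """Ask the kernel to free clean page-cache pages (hypothetical helper).

    `path` is a parameter only so the helper can be exercised against an
    ordinary file; on a real host it stays at the default.
    """
    os.sync()                      # flush dirty pages first; only clean pages get dropped
    with open(path, "w") as ctl:
        ctl.write("1\n")           # 1 = page cache only (2 = dentries/inodes, 3 = both)
```

This affects the whole host, not just one image, which is one reason a
per-file approach may be preferable.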
>>>
>>> That BZ has a pretty long and twisted history, but after a quick
>>> read, I still don't see why a cleanly shut down guest would have
>>> issues with caching that using O_DIRECT on read would help.
>>>
>>> We will need to dig into it a bit more...
>>>
>>> ric
>>>
>> I am not saying we don't need to dig a little bit more, we surely do,
>> but unfortunately I am waiting for information from the reporter. But I
>> am also thinking that this O_DIRECT functionality to bypass caches
>> could be useful...
>>
>> Thanks,
>> Michal
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>> the body of a message to majordomo@...r.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
> I cannot see where the cache could cause this problem, but is it
> possible that it is in the host file system rather than the guest
> where it is causing a problem?
That may be right, because dropping caches on the host is a working
workaround. Also, I have some new information about it: Scott wrote
that he was able to reproduce it, but with my patches applied it
works fine. I am waiting for more information about that and for the
customer's test results...
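
To make clear what "reading with O_DIRECT" involves, here is a minimal
sketch in Python. It is not the e2fsprogs patch (that is C, in the unix
I/O manager); the names read_uncached and BLOCK are invented for this
example, and the 4096-byte alignment is an assumption rather than
something queried from the device. O_DIRECT wants the buffer, the file
offset and the length aligned, so the sketch reads whole aligned blocks
into a page-aligned buffer and slices out the requested bytes. Where
O_DIRECT is missing or rejected, it falls back to a normal cached read.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; real code should ask the device for its block size


def _read_blocks(fd, start, span):
    """Read up to `span` bytes at `start` through a page-aligned buffer."""
    buf = mmap.mmap(-1, span)          # anonymous mappings are page-aligned
    try:
        os.lseek(fd, start, os.SEEK_SET)
        got = os.readv(fd, [buf])      # a short read is treated as end of file
        return buf[:got]
    finally:
        buf.close()


def read_uncached(path, length, offset=0):
    """Return up to `length` bytes at `offset`, bypassing the page cache if possible."""
    if length <= 0:
        return b""
    start = offset - offset % BLOCK                        # round down to a block
    span = -(-(offset + length - start) // BLOCK) * BLOCK  # round up to whole blocks
    direct = getattr(os, "O_DIRECT", 0)                    # absent on some platforms
    candidates = ([os.O_RDONLY | direct] if direct else []) + [os.O_RDONLY]
    error = None
    for flags in candidates:
        try:
            fd = os.open(path, flags)
            try:
                data = _read_blocks(fd, start, span)
            finally:
                os.close(fd)
        except OSError as exc:
            error = exc                # e.g. EINVAL where O_DIRECT is unsupported
            continue
        skip = offset - start
        return data[skip:skip + length]
    raise error
```

For example, read_uncached("/path/to/guest.img", 1024, 1024) would fetch
the bytes at offset 1024, where an ext2/3/4 superblock sits when the
image holds a bare file system (the path is a placeholder).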
Michal