lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 27 Jul 2016 17:16:36 +0200
From:	Stefan Bader <stefan.bader@...onical.com>
To:	Kent Overstreet <kent.overstreet@...il.com>
Cc:	linux-bcache@...r.kernel.org, dm-devel@...hat.com,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	liuzhengyuang521@...il.com, bcache@...ux.ewheeler.net,
	apw@...onical.com, Stefan Bader <stefan.bader@...onical.com>
Subject: Re: bcache super block corruption with non 4k pages

On 26.07.2016 14:49, Kent Overstreet wrote:
> On Tue, Jul 26, 2016 at 02:32:31PM +0200, Stefan Bader wrote:
>> On 26.07.2016 12:21, Kent Overstreet wrote:
>>> On Tue, Jul 26, 2016 at 11:51:25AM +0200, Stefan Bader wrote:
>>>> On 21.07.2016 10:58, Stefan Bader wrote:
>>>>> I was pointed at the thread which seems to address the same after
>>>>> I wrote most of below text. Did not want to re-write this so please
>>>>> bear with the odd layout.
>>>>>
>>>>> https://www.redhat.com/archives/dm-devel/2016-June/msg00015.html
>>>>>
>>>>> Zhengyuan tries to fix the problem by relocating the superblock on
>>>>> disk. But I am not sure whether there is really any guarantee about
>>>>> how __bread fills data into the buffer_head. What if there is the next
>>>>> odd arch with 128K pages?
>>>>>
>>>>> So below is an attempt to be more generic. Still I don't feel completely
>>>>> happy with the way that a page moves (or is shared) between buffer_head
>>>>> and biovec. What I tried to outline below is to let the register functions
>>>>> allocate bio+biovec memory and use the in-memory sb_cache data to initialize
>>>>> the biovec buffer.
>>>>
>>>> Any opinions here? Also adding LKML as I don't seem to get through moderation on
>>>> dm-devel.
>>>
>>> The correct solution is to rip out the __bread() and just read the superblock by
>>> issuing a bio, the same way all the other IO in bcache is done.
>>>
>>> This is the way it's done in the bcache-dev branch - unfortunately, the patch
>>> that does that in bcache-dev is big and invasive and probably not worth the
>>> hassle to backport:
>>>
>>> https://evilpiepirate.org/git/linux-bcache.git/commit/?h=bcache-dev&id=303eb67bffad57b4d9e71523e7df04bf258e66d1
>>
>> I agree that this looks better and also rather large.
>>>
>>> Probably best to just do something small and localized.
>>>
>> So what did you think about the change I did? It seemed to be ok for 4K and 64K
>> at least and is rather small. And I believe that, compared to Zhengyuan's
>> approach this would have the benefit of not changing the superblock sector. So
>> it would be compatible with previously created superblocks.
> 
> Too ugly to live. Just kmalloc() 4k, allocate a bio on the stack, set it up, and
> submit it with submit_bio_wait(). Use virt_to_page(), don't bother with raw
> pages - you want 4k, not whatever the page size is.
> 

So here is another attempt which does half the proposed changes. And before you
point out that it looks still ugly, let me explain some reasons. The goal for me
would be to have something as small as possible to be acceptable as stable change.
And the part about putting a bio on the stack and using submit_bio_wait: I
believe you meant in read_super to replace the __bread. I was thinking about
that but in the end it seemed to make the change unnecessary big. Whether using
__bread or submit_bio_wait, in both cases and without needing to make more
changes on the write side, read_super has to return the in-memory and on-disk
variant of the superblock. So as long as nothing that is related to __bread is
leaked out of read_super, it is much better than what is there now. And I remain
as small as possible for the delta.

So there is one part of the patch which I find hard to do in a better manner but
is a bit ugly: and that is to sanely free the sb_disk_data blob on all error
paths but not on success when it is assigned to either cache or cached_dev.
Could possibly pass the address of the pointer and then clear it inside the
called functions. Not sure that would be much less strange...


-Stefan



View attachment "0001-bcache-read_super-handle-architectures-with-more-tha.patch" of type "text/x-diff" (7690 bytes)

Download attachment "signature.asc" of type "application/pgp-signature" (837 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ