linux-kernel - Re: [RFC PATCH] mmc: core: Disable REQ_FUA if the eMMC supports an internal cache

Message-ID: <fe407f8f-6676-12b8-3e45-2511219bfe28@intel.com>
Date:   Tue, 7 Mar 2023 15:15:50 +0200
From:   Adrian Hunter <adrian.hunter@...el.com>
To:     Ulf Hansson <ulf.hansson@...aro.org>
Cc:     "linux-mmc@...r.kernel.org" <linux-mmc@...r.kernel.org>,
        Jens Axboe <axboe@...nel.dk>,
        Wenchao Chen <wenchao.chen666@...il.com>,
        Avri Altman <avri.altman@....com>,
        "linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Christian Löhle <CLoehle@...erstone.com>,
        Bean Huo <huobean@...il.com>
Subject: Re: [RFC PATCH] mmc: core: Disable REQ_FUA if the eMMC supports an
 internal cache

On 6/03/23 18:09, Ulf Hansson wrote:
> On Fri, 3 Mar 2023 at 15:41, Adrian Hunter <adrian.hunter@...el.com> wrote:
>>
>> On 3/03/23 14:01, Ulf Hansson wrote:
>>> On Fri, 3 Mar 2023 at 12:40, Christian Löhle <CLoehle@...erstone.com> wrote:
>>>>
>>>>
>>>>>>
>>>>>> REQ_FUA is in general supported for eMMC cards, where it translates into so-called "reliable writes". To support these write operations, CMD23 (MMC_CAP_CMD23) needs to be supported by the mmc host too, which is common but not always the case.
>>>>>>
>>>>>> For some eMMC devices, it has been reported that reliable writes are quite costly, leading to performance degradations.
>>>>>>
>>>>>> To improve the situation, let's avoid announcing REQ_FUA support if the eMMC supports an internal cache, as that allows us to rely solely on flush-requests (REQ_OP_FLUSH) instead, which seems to be a lot cheaper.
>>>>>> Note that those mmc hosts that lack CMD23 support are already using this type of configuration, whatever that could mean.
>>>>>
>>>>> Just note that reliable write is strictly weaker than turning cache off/flushing, if card loses power during cache off/flush programming / busy, sector-wise atomicity is not mandated by the spec.
>>>>> (And that is assuming cache off/flush is actually respected by the card as intended by the spec, should some cards be checked?) Maybe some FS people can also chime in?
>>>>
>>>> Never mind, the sector-wise atomicity should not matter on 5.1 cards or if the block length isn't being played with, which it isn't in our case.
>>>> If reliable write is implemented only according to spec, I don't see why the cache flushing should be less expensive, which would only make sense if
>>>> a) < sector chunks are committed to flash
>>>> b) reliable write is implemented much more strictly than the spec requires, ensuring atomicity for the entire write.
>>>
>>> Right, I agree!
>>>
>>> Note 1) Reliable write was introduced way before cache management in
>>> the eMMC spec. So, if the support for reliable write would have a
>>> stricter implementation than needed, I would not be surprised.
>>
>> I am not sure when you say stricter than needed.  Historically
>> file systems assumed that sectors are updated atomically i.e.
>> there is never a sector with a mixture of old and new data.
>> The eMMC spec does not guarantee that, except for reliable
>> write.
> 
> Yes, I agree. By stricter, I was merely wondering whether the eMMC
> makes the entire write request (multiple sectors) atomic, or whether the
> guarantee is only on a per-sector basis.
> 
> According to the eMMC spec, that seems to be implementation specific.
> One option could be heavier than the other, I guess.
> 
>>
>> File systems may use REQ_FUA for important information, like the
>> superblock or a journal commit record, so using reliable write
>> for REQ_FUA would seem to give better protection against file system
>> corruption than a cache flush which could leave a sector
>> half-written.
> 
> Yes, I agree. If we should fully conform to what is stated in the eMMC
> spec, we should probably keep the current path to support REQ_FUA.
> 
>>
>> On the other hand, sudden power loss is probably rare in battery
>> powered systems because they are designed to monitor the battery
>> power and shut down when it gets too low.
>>
>> And file systems can use checksums to detect half-written updates.
>>
>> And there is anyway no protection for other (non REQ_FUA) writes a
>> file system might do and expect not to tear sectors.
>>
>> And you are more likely to smash the screen than bounce the battery
>> out and cause an unrecoverable file system error.
> 
> Right, these are good arguments for why $subject patch perhaps makes
> sense to move forward with anyway.

Yes
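
To illustrate the checksum argument quoted above, here is a minimal
user-space sketch of how a per-record checksum exposes a half-written
sector. It is not kernel or file system code: the record layout, the
512-byte sector size, and the names record_seal(), record_is_intact()
and torn_write() are assumptions made up for the example, and the
CRC-32 routine is a plain bitwise implementation rather than what any
particular file system uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE  512u
#define PAYLOAD_SIZE (SECTOR_SIZE - sizeof(uint32_t))

/* Hypothetical on-media record: payload followed by its checksum. */
struct record {
	uint8_t  payload[PAYLOAD_SIZE];
	uint32_t csum;
};

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static uint32_t crc32_bytes(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Compute and store the checksum before the record is written out. */
void record_seal(struct record *r)
{
	r->csum = crc32_bytes(r->payload, sizeof(r->payload));
}

/* A record is intact only if the stored checksum matches the payload. */
bool record_is_intact(const struct record *r)
{
	return r->csum == crc32_bytes(r->payload, sizeof(r->payload));
}

/*
 * Simulate power loss in the middle of a sector update: only the first
 * torn_at bytes of the new record reach the media, the rest stays old.
 */
void torn_write(struct record *on_media, const struct record *new_rec,
		size_t torn_at)
{
	assert(torn_at <= sizeof(*on_media));
	memcpy(on_media, new_rec, torn_at);
}
```

With an old record on the media, calling torn_write() with half the
sector size leaves a mix of old and new bytes that record_is_intact()
rejects, while a complete write passes the check. Detection is all the
checksum provides; the reader still needs an older valid copy or a
journal to recover from.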

> 
> Moreover, it seems like some eMMC vendors don't really have any
> concerns with us moving away from reliable writes, to instead use only
> "cache flushing". I guess it can indicate that the regular writes may
> already be treated in an atomic kind of way, but who knows.

Indeed

> 
>>
>> Nevertheless, the commit message of this patch reads like the change
>> is an optimization, whereas it seems more like a policy change.
>> The commit message should perhaps say something like:
>> "The consensus is that the benefit of improved performance by not
>> using reliable-write for REQ_FUA is much greater than any potential
>> benefit that reliable-write might provide to avoid file system
>> corruption in the event of sudden power loss."
> 
> I agree. I will improve it along the lines of what you suggest.
> 
>>
>> As for allowing for the policy to be overridden, perhaps an mmc_core
>> module option?
> 
> Even if I am not very fond of module parameters, this seems like a
> reasonable thing to use for this case.
> 
> I was also looking at using a card quirk.

Yes that makes sense
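
For discussion, the policy with an override could be modelled roughly
as below. This is a stand-alone user-space sketch, not the mmc block
driver: the enum fua_policy values, struct card_caps, struct
queue_features and mmc_model_queue_features() are hypothetical names,
and the tri-state override only stands in for whatever a module option
or card quirk would end up carrying. The one behaviour it assumes from
the block layer is that advertising FUA also implies advertising a
volatile write cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical override, e.g. the value a module option might carry. */
enum fua_policy {
	FUA_POLICY_AUTO,      /* proposed default: no FUA if cache is on */
	FUA_POLICY_FORCE_ON,  /* keep reliable write for REQ_FUA if possible */
	FUA_POLICY_FORCE_OFF, /* never advertise FUA */
};

/* Hypothetical summary of what host and card can do. */
struct card_caps {
	bool host_has_cmd23;          /* host supports CMD23 */
	bool card_has_reliable_write; /* card supports reliable write */
	bool card_cache_enabled;      /* internal cache is turned on */
};

/* What would be announced to the block layer in this model. */
struct queue_features {
	bool write_cache; /* flush requests will be sent */
	bool fua;         /* REQ_FUA will be sent */
};

struct queue_features
mmc_model_queue_features(const struct card_caps *caps, enum fua_policy policy)
{
	struct queue_features q = { false, false };
	bool fua_possible;

	assert(caps != NULL);

	/* Reliable write needs both CMD23 on the host and card support. */
	fua_possible = caps->host_has_cmd23 && caps->card_has_reliable_write;

	switch (policy) {
	case FUA_POLICY_FORCE_ON:
		q.fua = fua_possible;
		break;
	case FUA_POLICY_FORCE_OFF:
		q.fua = false;
		break;
	case FUA_POLICY_AUTO:
	default:
		/* Rely on cache flushes alone when the cache is enabled. */
		q.fua = fua_possible && !caps->card_cache_enabled;
		break;
	}

	/* FUA is only meaningful together with the write-cache flag. */
	q.write_cache = caps->card_cache_enabled || q.fua;
	return q;
}
```

In this model a card with cache enabled, reliable write and a CMD23
capable host gets flush-only behaviour under FUA_POLICY_AUTO, and the
old behaviour back under FUA_POLICY_FORCE_ON. A card quirk could feed
the same function by selecting the policy per card instead of globally.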

> 
>>
>>>
>>> Note 2) In the eMMC v5.1 spec, the cache flushing support has been
>>> extended to allow an explicit barrier operation. Perhaps, we should
>>> let that option take precedence over a regular flush+barrier, for
>>> REQ_OP_FLUSH!?
>>>
>>>>
>>>> I guess the cards which increase performance do b)? Or something else?
>>>
>>> That's the tricky part to know, as it's the internals of the eMMC.
>>
>> It is the natural conclusion though.  The eMMC probably does not
>> update mapping information with every write, instead if power is
>> lost, it scans the updated areas at the next initialization. (The
>> power-off notify feature would commit the mapping information to
>> media to avoid that).  So a reliable write might have to:
>> 1. write information to record that the old mapping
>> should be used, not what might be discovered by scanning
>> 2. do the actual write
>> 3. write mapping information to record the new mapping
> 
> Yes. And depending on the eMMC device, some may be more clever than
> others for how to actually deal with this.
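
The three steps listed above can be sketched as a toy model, to show
why the extra bookkeeping would make a reliable write more expensive
than a plain one. Everything here is an assumption for illustration:
real eMMC firmware is not visible to the host, and struct media, the
cut points, do_write() and recover() are invented names. The model
only captures the idea that, without the "use the old mapping" marker,
a scan after power loss may find a partially written sector.

```c
#include <assert.h>
#include <stdbool.h>

#define SECTOR_BYTES 512

enum sector_state { SECTOR_OLD, SECTOR_NEW, SECTOR_TORN };

/* Points at which power is cut in the simulation. */
enum cut_point {
	CUT_BEFORE_MARKER, /* nothing has been written yet */
	CUT_AFTER_MARKER,  /* step 1 done */
	CUT_MID_DATA,      /* step 2 half done */
	CUT_AFTER_DATA,    /* step 2 done */
	CUT_NEVER,         /* step 3 done as well */
};

/* Hypothetical media state for one logical sector. */
struct media {
	bool use_old_marker;   /* step 1: ignore scan results, keep old mapping */
	int  new_bytes_written; /* step 2: progress of the data write */
	bool new_mapping;      /* step 3: new mapping committed */
};

/* Perform a write up to the cut point; plain writes skip steps 1 and 3. */
struct media do_write(bool reliable, enum cut_point cut)
{
	struct media m = { false, 0, false };

	if (cut == CUT_BEFORE_MARKER)
		return m;
	m.use_old_marker = reliable;            /* step 1 */
	if (cut == CUT_AFTER_MARKER)
		return m;
	m.new_bytes_written = SECTOR_BYTES / 2; /* step 2, first half */
	if (cut == CUT_MID_DATA)
		return m;
	m.new_bytes_written = SECTOR_BYTES;     /* step 2, second half */
	if (cut == CUT_AFTER_DATA)
		return m;
	m.new_mapping = reliable;               /* step 3 */
	return m;
}

/* What the card would present for the sector at the next initialization. */
enum sector_state recover(const struct media *m)
{
	assert(m->new_bytes_written >= 0 &&
	       m->new_bytes_written <= SECTOR_BYTES);

	if (m->new_mapping)
		return SECTOR_NEW;
	if (m->use_old_marker)
		return SECTOR_OLD;
	/* No marker: the result depends on what the scan discovers. */
	if (m->new_bytes_written == 0)
		return SECTOR_OLD;
	return m->new_bytes_written == SECTOR_BYTES ? SECTOR_NEW : SECTOR_TORN;
}
```

In this model every cut point of a reliable write recovers to either
the old or the new sector, whereas a plain write cut at CUT_MID_DATA
recovers to SECTOR_TORN. The price is two extra metadata writes per
request, which is one plausible reading of the reported cost.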
> 
>>
>>>
>>> Although, it seems like both Avri (WDC) and Bean (Micron) would be
>>> happy to proceed with $subject patch, which makes me more comfortable
>>> to move forward.
>>>
>>>> Anyway, regarding FUA, I don't have any concerns regarding reliability with cache flush.
>>>> I can add some performance comparisons with some eMMCs I have around though.
>>>
>>> That would be great, thanks a lot for helping out with testing!
>>>
> 
> Kind regards
> Uffe