Message-ID: <1b02ddae-35ae-8ff7-760f-bc9bafee4541@gmail.com>
Date: Wed, 5 Sep 2018 08:07:25 -0400
From: "Austin S. Hemmelgarn" <ahferroin7@...il.com>
To: 焦晓冬 <milestonejxd@...il.com>,
R.E.Wolff@...wizard.nl
Cc: martin@...htvoll.de, jlayton@...hat.com,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: POSIX violation by writeback error
On 2018-09-05 04:37, 焦晓冬 wrote:
> On Wed, Sep 5, 2018 at 4:04 PM Rogier Wolff <R.E.Wolff@...wizard.nl> wrote:
>>
>> On Wed, Sep 05, 2018 at 09:39:58AM +0200, Martin Steigerwald wrote:
>>> Rogier Wolff - 05.09.18, 09:08:
>>>> So when a mail queuer puts mail in the mailq files and the mail processor
>>>> can get them out of there intact, nobody is going to notice. (I know
>>>> mail queuers should call fsync and report errors when that fails, but
>>>> there are bound to be applications where calling fsync is not
>>>> appropriate (*))
>>>
>>> AFAIK at least Postfix MDA only reports mail as being accepted over SMTP
>>> once fsync() on the mail file completed successfully. And I'd expect
>>> every sensible MDA to do this. I don't know how Dovecot MDA, which I
>>> currently use for sieve support, does this though.
>>
>
> Is every implementation of mail editor really going to call fsync()? Why
> would they call fsync(), when fsync() is meant to persist the file
> on disk, which is apparently unnecessary if the task of delivering to SMTP
> won't start again after reboot?
Not mail clients, the actual servers. If they implement the SMTP
standard correctly, they _have_ to call fsync() before they return that
an email was accepted for delivery or relaying, because SMTP requires
that a successful return means that the system can actually attempt
delivery, which is not guaranteed if they haven't verified that it's
actually written out to persistent storage.
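
To make that ordering concrete, here is a minimal sketch in C of the
accept path. The name queue_message() and the one-file-per-message layout
are made up for illustration and are not taken from Postfix or any other
real server; the only point is that success is reported strictly after
fsync() has returned 0.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not from any real MTA): store one message in a
 * new queue file. Returns 0 only after fsync() has confirmed the data
 * reached stable storage; returns -1 on any error. */
int queue_message(const char *path, const char *msg, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t n = write(fd, msg, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted, retry */
            goto fail;
        }
        msg += n;
        len -= (size_t)n;
    }

    /* Writeback errors typically surface here, not at write(). */
    if (fsync(fd) < 0)
        goto fail;
    if (close(fd) < 0) {
        unlink(path);
        return -1;
    }
    return 0;   /* only now is it safe to tell the sender "accepted" */

fail:
    close(fd);
    unlink(path);   /* do not leave a possibly incomplete queue file */
    return -1;
}
```

A server built along these lines would send its positive SMTP reply only
when queue_message() returns 0. A real implementation would likely also
fsync() the containing directory, so that the new directory entry itself
survives a crash.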
>
>> Yes. That's why I added the remark that mailers will call fsync and know
>> about it on the write side. I encountered a situation in the last few
>> days that, had a developer run into it while developing, would have
>> caused him to write:
>> /* Calling this fsync causes unacceptable performance */
>> // fsync (fd);
>>
>> I know of an application somewhere that does realtime-gathering of
>> call-records (number X called Y for Z seconds). They come in from a
>> variety of sources, get de-duplicated, standardized and written to
>> files. Then different output modules push the data to the different
>> consumers within the company. Billing among them.
>>
>> Now getting old data there would be pretty bad. And calling fsync
>> all the time might have performance issues....
>>
>> That's the situation where "old data is really bad".
>>
>> But when apt-get upgrade replaces your /bin/sh and gets a write error,
>> returning an error on subsequent reads is really bad.
>
> At this point, the /bin/sh may be partially old and partially new. Executing
> this corrupted binary is also dangerous, though.
But the system may still be usable in that state, while returning an
error there guarantees it isn't. This is, in general, not the best
example though, because no sane package manager directly overwrites
_anything_. They all do some variation on replace-by-rename and call
fsync() _before_ renaming, so this situation is not realistically going
to happen on any real system.
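
For reference, a rough sketch in C of the replace-by-rename pattern I
mean. The function name and its parameters are hypothetical and this is
not lifted from dpkg, rpm or any other package manager; it only shows the
order of operations: write a temporary file, fsync() it, rename() it over
the target, then fsync() the directory.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: replace `target` with new contents without ever
 * exposing a half-written file. `tmp` must be in the same directory
 * (and so on the same filesystem) as `target`; `dir` is that directory.
 * Returns 0 on success, -1 on error. */
int replace_by_rename(const char *dir, const char *tmp, const char *target,
                      const char *data, size_t len)
{
    int dfd, rc;
    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t n = write(fd, data, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted, retry */
            goto fail;
        }
        data += n;
        len -= (size_t)n;
    }

    /* Flush the new contents first. If this fails, the old file has
     * not been touched and the temporary is simply discarded. */
    if (fsync(fd) < 0)
        goto fail;
    if (close(fd) < 0) {
        unlink(tmp);
        return -1;
    }

    /* Atomically switch the name over to the new file. */
    if (rename(tmp, target) < 0) {
        unlink(tmp);
        return -1;
    }

    /* Make the rename itself durable by syncing the directory. */
    dfd = open(dir, O_RDONLY);
    if (dfd < 0)
        return -1;
    rc = fsync(dfd);
    close(dfd);
    return rc;

fail:
    close(fd);
    unlink(tmp);
    return -1;
}
```

With this ordering a write error shows up at fsync() while the old
/bin/sh is still in place, so readers see either the complete old file
or the complete new one, never a mix.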