Message-ID: <OFCA827B78.F987C4A1-ON8825742C.00781E5A-8825742C.0079F2C9@us.ibm.com>
Date: Tue, 15 Apr 2008 15:11:54 -0700
From: Bryan Henderson <hbryan@...ibm.com>
To: lsorense@...lub.uwaterloo.ca (Lennart Sorensen)
Cc: Bodo Eggert <7eggert@....de>, Diego Calleja <diegocg@...il.com>,
Jan Kara <jack@...e.cz>, Jiri Kosina <jkosina@...e.cz>,
linux-fsdevel@...r.kernel.org,
Linux Kernel list <linux-kernel@...r.kernel.org>,
Michal Hocko <mhocko@...e.cz>, Meelis Roos <mroos@...ux.ee>,
Pavel Machek <pavel@....cz>
Subject: Re: file offset corruption on 32-bit machines?
> Well it would take seriously hard work to make a program that would work
> correctly if it was atomic and would break if it isn't. Certainly a
> normal program that just tries to seek and read/write should never have
> any issue.
I can easily imagine such a program. I think you aren't exercising enough
imagination about the kinds of requirements a program might be
implementing.
That lack of imagination (in all of us) is the reason we shouldn't
tolerate something working not as designed or not as expected just because
we went through every use scenario we could think of and it didn't matter
in any of them. Just focus on the layer in question.
The easiest way to imagine a program not doing locking and being useful
anyway (as long as the kernel is thread-safe) is to use the same arguments
you use for the kernel doing it: there's a higher level user responsible
for locking. The code in question doesn't guarantee that user writes all
its stuff to the right place, but at least it guarantees that that user's
lack of locking doesn't screw some other user of the file. It does that
by ensuring it never seeks to a place the user doesn't own and that no two
separate users ever access the file at the same time.
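To make that layering concrete, here is a minimal sketch of such a middle layer. Everything in it (struct region, region_contains, region_write) is a hypothetical name made up for illustration, not code from any real program. It covers only the "never seeks to a place the user doesn't own" half; keeping separate users from running at the same time is assumed to happen elsewhere in the layer.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical: the byte range of the shared file that one user owns. */
struct region {
    int   fd;     /* the shared file */
    off_t start;  /* first byte the user owns */
    off_t len;    /* number of bytes the user owns */
};

/* Nonzero if the n bytes at region-relative offset rel lie inside r. */
static int region_contains(const struct region *r, off_t rel, size_t n)
{
    assert(r != NULL);
    if (rel < 0 || rel > r->len)
        return 0;
    return (uintmax_t)n <= (uintmax_t)(r->len - rel);
}

/*
 * Write n bytes at region-relative offset rel. Returns the byte count,
 * or -1 if the request leaves the region or a system call fails.
 * Takes no lock: serializing the calls made on behalf of one user is
 * left to that user, the higher level.
 */
static ssize_t region_write(const struct region *r, off_t rel,
                            const void *buf, size_t n)
{
    if (!region_contains(r, rel, n))
        return -1;
    if (lseek(r->fd, r->start + rel, SEEK_SET) == (off_t)-1)
        return -1;
    return write(r->fd, buf, n);
}
```

The containment guarantee is only as good as lseek(): if the kernel can leave the offset somewhere other than r->start + rel, the following write() lands in a region this user does not own, and the layer has broken its promise without doing anything wrong itself.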
I'd even like to accommodate the poor user trying to debug the broken
locking in his application. He sees the file getting corrupted and
immediately thinks, "what if my thread serialization isn't working right?"
But he notices that the corruption isn't consistent with that hypothesis.
He knows he was working with only the beginning and the end of the file
and the corruption happened in the middle. So he wastes a week
considering other hypotheses, including a kernel bug, until someone points
out a paragraph in the lseek() man page that says, contrary to all Unix
convention, that particular function and system call is not thread-safe,
and it doesn't necessarily seek to the place mentioned in its argument.
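For anyone who wants to see how "beginning and end" can turn into "middle", here is a small model. It assumes the mechanism suggested by the thread subject: on a 32-bit machine the 64-bit file offset is stored as two 32-bit words with nothing making the pair atomic. The struct and function names are made up for illustration and are not kernel code; the interleaving is replayed by hand in a single thread so the result is deterministic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a 64-bit offset kept as two separate 32-bit words. */
struct split_offset {
    uint32_t lo;
    uint32_t hi;
};

static void store_lo(struct split_offset *p, uint64_t v)
{
    p->lo = (uint32_t)v;
}

static void store_hi(struct split_offset *p, uint64_t v)
{
    p->hi = (uint32_t)(v >> 32);
}

static uint64_t load_offset(const struct split_offset *p)
{
    return ((uint64_t)p->hi << 32) | p->lo;
}

/*
 * Replay one bad interleaving of two unserialized seeks, to offsets a
 * and b. Seeker A stores its low word, seeker B then stores both of its
 * words, and finally A stores its high word. The result pairs A's high
 * word with B's low word.
 */
static uint64_t torn_seek(uint64_t a, uint64_t b)
{
    struct split_offset pos = { 0, 0 };

    store_lo(&pos, a);  /* A: first half of its update */
    store_lo(&pos, b);  /* B: complete update ... */
    store_hi(&pos, b);  /* ... both words */
    store_hi(&pos, a);  /* A: second half, clobbering B's high word */
    return load_offset(&pos);
}
```

With a = 0x1000 (4 KiB into the file) and b = 0x1FFFFF000 (just under 8 GiB), torn_seek(a, b) gives 0xFFFFF000, which is just under 4 GiB: an offset neither caller asked for, in the middle of the file.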
--
Bryan Henderson IBM Almaden Research Center
San Jose CA Filesystems