Message-ID: <20201218012725.GD6908@magnolia>
Date: Thu, 17 Dec 2020 17:27:25 -0800
From: "Darrick J. Wong" <darrick.wong@...cle.com>
To: Wang Shilong <wangshilong1991@...il.com>
Cc: Saranya Muruganandam <saranyamohan@...gle.com>,
Ext4 Developers List <linux-ext4@...r.kernel.org>,
"Theodore Ts'o" <tytso@....edu>, adilger.kernel@...ger.ca,
Li Xi <lixi@....com>, Wang Shilong <wshilong@....com>
Subject: Re: [RFC PATCH v3 02/61] e2fsck: copy context when using
multi-thread fsck
On Fri, Dec 18, 2020 at 09:13:25AM +0800, Wang Shilong wrote:
> On Fri, Dec 18, 2020 at 8:01 AM Darrick J. Wong <darrick.wong@...cle.com> wrote:
> >
> > On Wed, Nov 18, 2020 at 07:38:48AM -0800, Saranya Muruganandam wrote:
> > > From: Li Xi <lixi@....com>
> > >
> > > This patch only copies the context to a new one when -m is enabled;
> > > it doesn't actually start any thread. When pass1 finishes, the new
> > > context is copied back to the original context.
> > >
> > > Since the signal handler only changes the original context, add a
> > > global_ctx field to "struct e2fsck_struct" and use it to check
> > > whether a cancel signal is pending.
> > >
> > > This patch also handles the long jump properly so that all the
> > > existing tests pass even after the context has been copied.
> > > Otherwise, test f_expisize_ea_del would fail when aborting.
> > >
> > > Signed-off-by: Li Xi <lixi@....com>
> > > Signed-off-by: Wang Shilong <wshilong@....com>
> > > Signed-off-by: Saranya Muruganandam <saranyamohan@...gle.com>
> > > ---
> > > e2fsck/pass1.c | 114 +++++++++++++++++++++++++++++++++++++++++++++----
> > > e2fsck/unix.c | 1 +
> > > 2 files changed, 107 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
> > > index 8eecd958..64d237d3 100644
> > > --- a/e2fsck/pass1.c
> > > +++ b/e2fsck/pass1.c
> > > @@ -1144,7 +1144,22 @@ static int quota_inum_is_reserved(ext2_filsys fs, ext2_ino_t ino)
> > > return 0;
> > > }
> > >
> > > -void e2fsck_pass1(e2fsck_t ctx)
> > > +static int e2fsck_should_abort(e2fsck_t ctx)
> > > +{
> > > + e2fsck_t global_ctx;
> > > +
> > > + if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
> > > + return 1;
> > > +
> > > + if (ctx->global_ctx) {
> > > + global_ctx = ctx->global_ctx;
> > > + if (global_ctx->flags & E2F_FLAG_SIGNAL_MASK)
> > > + return 1;
> > > + }
> > > + return 0;
> > > +}
> > > +
> > > +void e2fsck_pass1_thread(e2fsck_t ctx)
> > > {
> > > int i;
> > > __u64 max_sizes;
> > > @@ -1360,7 +1375,7 @@ void e2fsck_pass1(e2fsck_t ctx)
> > > if (ino > ino_threshold)
> > > pass1_readahead(ctx, &ra_group, &ino_threshold);
> > > ehandler_operation(old_op);
> > > - if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
> > > + if (e2fsck_should_abort(ctx))
> > > goto endit;
> > > if (pctx.errcode == EXT2_ET_BAD_BLOCK_IN_INODE_TABLE) {
> > > /*
> > > @@ -1955,7 +1970,7 @@ void e2fsck_pass1(e2fsck_t ctx)
> > > if (process_inode_count >= ctx->process_inode_size) {
> > > process_inodes(ctx, block_buf);
> > >
> > > - if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
> > > + if (e2fsck_should_abort(ctx))
> > > goto endit;
> > > }
> > > }
> > > @@ -2068,6 +2083,89 @@ endit:
> > > else
> > > ctx->invalid_bitmaps++;
> > > }
> > > +
> > > +static errcode_t e2fsck_pass1_thread_prepare(e2fsck_t global_ctx, e2fsck_t *thread_ctx)
> > > +{
> > > + errcode_t retval;
> > > + e2fsck_t thread_context;
> > > +
> > > + retval = ext2fs_get_mem(sizeof(struct e2fsck_struct), &thread_context);
> >
> > Hm, so I guess the strategy here is that parallel e2fsck makes
> > per-thread copies of the ext2_filsys and e2fsck_t global contexts?
> > And then after the threaded parts complete, each thread merges its
> > per-thread contexts back into the global one, right?
>
> Yes.
>
> >
> > This means that we have to be careful to track which fields in those
> > cloned contexts have been updated by the thread so that we can copy them
> > back and not lose any data.
> >
> > I'm wondering if for future maintainability it would be better to track
> > the per-thread data in a separate structure to make it very explicit
> > which data (sub)structures are effectively per-thread and hence don't
> > require locking?
>
> Maybe a per-thread structure would be easier to maintain, but I am not
> sure we could remove locking completely.
>
> Locking is mostly used for fixes, because fixing is serialized now,
> and for some global structures that are accessed rarely but simplify
> the code.
<nod> I was assuming that you'd still put a lock in the global structure
and use it for data fields that aren't so frequently accessed.
--D
> >
> > (I ask that mostly because I'm having a hard time figuring out which
> > fields are supposed to be shared and which ones aren't...)
> >
> > --D
> >
> > > + if (retval) {
> > > + com_err(global_ctx->program_name, retval, "while allocating memory");