Use background Job for search indices #760

Open
sebastienros opened this issue May 23, 2017 · 26 comments

@sebastienros
Member

Once we have the Jobs Queue feature, use a job when rebuilding/resetting an index.
This will avoid running a potentially long-running task on the request thread when there are many content items to process.

@lahma
Contributor

lahma commented Feb 1, 2022

Five years later, I'm still trying to figure out, why aren't you using Quartz.NET 😉

@Skrypt
Contributor

Skrypt commented Feb 1, 2022

Maybe because of the tenant context.

@lahma
Contributor

lahma commented Feb 1, 2022

Maybe. You could have job/trigger groups per tenant though. If you are using a single database for multiple tenants, that also maps to Quartz in a way; you just need a discriminator.
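
For illustration, one way to read the "groups per tenant" idea is to use the tenant name as the group part of the Quartz job and trigger keys, so the group acts as the discriminator. This is a minimal sketch assuming Quartz.NET 3.x; `RebuildIndexJob`, `TenantGroups`, and the key and data names are hypothetical and not part of Orchard Core or Quartz.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Quartz;
using Quartz.Impl.Matchers;

// Hypothetical job, used only to have something to schedule.
public sealed class RebuildIndexJob : IJob
{
    public Task Execute(IJobExecutionContext context) => Task.CompletedTask;
}

public static class TenantGroups
{
    // Schedules a one-off job whose job and trigger keys use the tenant name as the group.
    public static async Task ScheduleForTenantAsync(IScheduler scheduler, string tenantName, string indexName)
    {
        var job = JobBuilder.Create<RebuildIndexJob>()
            .WithIdentity($"rebuild-{indexName}", tenantName)
            .UsingJobData("IndexName", indexName)
            .Build();

        var trigger = TriggerBuilder.Create()
            .WithIdentity($"rebuild-{indexName}-now", tenantName)
            .StartNow()
            .Build();

        await scheduler.ScheduleJob(job, trigger);
    }

    // Lists only the jobs that belong to one tenant by matching on the group.
    public static Task<IReadOnlyCollection<JobKey>> GetTenantJobsAsync(IScheduler scheduler, string tenantName)
        => scheduler.GetJobKeys(GroupMatcher<JobKey>.GroupEquals(tenantName));
}
```

The same group matcher can be passed to other scheduler operations that accept one, which is what would make per-tenant listing or pausing possible on a shared store.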

@Skrypt
Contributor

Skrypt commented Feb 1, 2022

Also, these long-running tasks could be executed using SignalR, but in Orchard 1 we used simple recursive AJAX calls on a controller, maybe because it was simpler (fewer dependencies) that way. At the same time, I understand that we should probably start thinking about a common way to handle long-running tasks. I was about to start working on the "rebuild" button for the Lucene indices today, but maybe we should elaborate a plan about this and discuss Quartz.NET.

I know we already have Background Tasks. Though, no Jobs Queue feature yet... unless I missed it.

I'll let people debate Quartz vs Hangfire or any other similar alternatives. But I bet that we already have one vote for Quartz.NET.😉

@Skrypt
Contributor

Skrypt commented Feb 2, 2022

@lahma Maybe you should be the one working on this ... if you want ... and have time.

@deanmarcussen
Member

@lahma I did actually get started on something related to this. Video here https://www.youtube.com/watch?v=q6MsI0CSbPY

The infrastructure is quite light, compared to Quartz.Net - not trying to reinvent the wheel, but I didn't go down the Quartz.Net route with the infrastructure, due to the multi tenancy / databases and service providers.

I'll try and get a draft pr opened this week, it's been lagging, due to time constraints, and maybe we can talk about how we could use Quartz.Net for it instead?

(as I said the actual infrastructure I've written for it, is pretty lightweight, so it's unlikely to be as resilient...)

@hishamco
Member

hishamco commented Feb 2, 2022

@deanmarcussen is that what you started for the Archive & Publish Later modules? If yes, I hope to see some code ASAP ;)

@lahma
Contributor

lahma commented Feb 2, 2022

@Skrypt I might be a bit time-constrained, but I can try to help where I can. I have some scheduling experience at least.

@deanmarcussen sounds cool, I'd be happy to see a version of that. By no means am I pushing Quartz.NET as a golden hammer here, it started with a joke 😉 Both Hangfire and Quartz.NET (there are others too) offer some extra features and both come with pros and cons of course.

If there's a suitable abstraction for using the service, the implementation could probably be changed with some effort. It all depends on which "more advanced" features are needed, as those might create problems due to implementation differences.

@lahma
Contributor

lahma commented Feb 2, 2022

We recently had a discussion about multi-tenancy with Quartz and here's one solution that was used, just for reference: quartznet/quartznet#1486

@lahma
Contributor

lahma commented Feb 2, 2022

OK, watched the video and have some thoughts. I'm mostly speaking from a Quartz.NET perspective as it's a bit more familiar library to me. NOTE: I haven't looked into OC's implementation, not even the UI configuration, so I don't know the needs and wants.

  • love the confusion between naming and concepts, naming is just hard and I've had to explain to myself and others which is which in Quartz context (job, trigger, fire instance etc), and these can differ between scheduling implementations
  • a lot of the stuff discussed (priority, concurrency, pausing, manual triggering, job/thread/task pools and sizes) are solved but non-trivial problems, you might want to consider existing solutions or limit what you are going to implement to a minimal version, unless you see a benefit in having your own
  • you follow unix-like cron logic in the UI at least (minutes), Quartz has seconds and is non-compliant compared to the classic version, but that can of course be hidden by not showing the first part (seconds), which allows more fine-grained scheduling (which I think was also discussed here)
  • time zones, these can get tricky if you want to support publishing content at 08AM Finnish time, don't get me started with DST
  • I don't know how persistence works in OC tasks, but running in a cluster (multiple web instances) also requires distributed locking
  • retries and misfires, when an instance is down during the expected fire time what should happen? Quartz has different strategies for this like "fire now" or "skip and continue with schedule", here it's important to distinguish one-shot and repeating jobs
  • trigger/task types, your UI shows cron which is quite flexible but can be a bit clumsy for some tasks (Quartz has different trigger types for this reason)
  • calendars (a Quartz concept), a way to refine schedule by checking something extra if given calculated schedule is valid - "don't fire if it's not a bank day"

Many of these are solved problems like I stated before, but maybe things to consider in your implementation.
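
As a rough illustration of three of the points above (the seconds field in Quartz cron expressions, time zones, and misfire strategies), here is a minimal sketch assuming Quartz.NET 3.x. The class name, trigger name, and the choice of misfire policy are assumptions made for the example, not anything taken from Orchard Core.

```csharp
using System;
using Quartz;

public static class PublishTriggers
{
    // Builds a trigger that fires every day at 08:00 in the given time zone.
    public static ITrigger DailyAtEight(string tenantName, TimeZoneInfo timeZone)
    {
        return TriggerBuilder.Create()
            .WithIdentity("publish-daily", tenantName)
            // Quartz cron expressions have a leading seconds field: sec min hour day-of-month month day-of-week.
            .WithCronSchedule("0 0 8 * * ?", cron => cron
                // Evaluate the expression in this zone so DST shifts are handled by the time zone data.
                .InTimeZone(timeZone)
                // "Skip and continue with schedule": a missed fire time is dropped.
                // WithMisfireHandlingInstructionFireAndProceed() would be the "fire now" alternative.
                .WithMisfireHandlingInstructionDoNothing())
            .Build();
    }
}
```

A caller could pass something like `TimeZoneInfo.FindSystemTimeZoneById("Europe/Helsinki")`; which ID form is accepted (IANA or Windows) depends on the platform and .NET version.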

@Skrypt
Contributor

Skrypt commented Feb 2, 2022

Looks like I missed that recent meeting which makes a big difference in my planning now. 😉
Is there any way to show for example the progress of a running background job with your solution @deanmarcussen ?
I would like to be able, for example, on the Lucene Indices list view to trigger a rebuild and display a progress bar that shows an approximation of the task completion. Also, since you seem to have job instances, it would be interesting to see if this progress bar could still be displayed after a page refresh. One thing that I thought about is that we could also use something like this component https://getbootstrap.com/docs/5.0/components/toasts/ to notify an admin UI user that the job actually completed.

Maybe that would require that we use SignalR for that matter?

@deanmarcussen
Member

  • love the confusion between naming and concepts, naming is just hard and I've had to explain to myself and others which is which in Quartz context (job, trigger, fire instance etc), and these can differ between scheduling implementations

😄

  • a lot of the stuff discussed (priority, concurrency, pausing, manual triggering, job/thread/task pools and sizes) are solved but non-trivial problems, you might want to consider existing solutions or limit what you are going to implement to a minimal version, unless you see a benefit in having your own

Going with limiting what we do, as some are super non trivial, but the interfaces are extendable, so if anyone wants to do custom things, they can.

  • you follow unix-like cron logic in the UI at least (minutes), Quartz has seconds and is non-compliant compared to the classic version, but that can of course be hidden by not showing the first part (seconds), which allows more fine-grained scheduling (which I think was also discussed here)

Yes, went with Cron for the repeat scheduler. Again extendable, if anyone wants something different.
The scheduler itself is driven by our other background task infrastructure which is only accurate to 1min.

  • time zones, these can get tricky if you want to support publishing content at 08AM Finnish time, don't get me started with DST

Haha, not a chance I'm getting into that game... Extendable, of course... and the publish later part already handles that time conversion, in the case of publishing to a local time zone.

  • I don't know how persistence works in OC tasks, but running in a cluster (multiple web instances) also requires distributed locking

Yes, persistence and queuing are based around a distributed lock.

  • retries and misfires, when an instance is down during the expected fire time what should happen? Quartz has different strategies for this like "fire now" or "skip and continue with schedule", here it's important to distinguish one-shot and repeating jobs

Hadn't thought about that one, worth considering what should happen. Perhaps a window of time could be a configuration option.

  • trigger/task types, your UI shows cron which is quite flexible but can be a bit clumsy for some tasks (Quartz has different trigger types for this reason)

It's intended to be largely code-driven for the actual trigger; we currently have Now, Delay, Utc, and Cron for rescheduling.
We have a good background task infrastructure as well, so this is largely about extending that infra, to support queues, for firing at specific times, and

  • calendars (a Quartz concept), a way to refine schedule by checking something extra if given calculated schedule is valid - "don't fire if it's not a bank day"

Cool! Yes, probably something to handle by implementing queue handlers.

Many of these are solved problems like I stated before, but maybe things to consider in your implementation.

Failure, due to server shutdown, or just general failure is the one I'm still trying to figure out!

Really hoping I can get some time to finish this feature off, appreciate your thoughts @lahma

@Skrypt there will be an api to poll, for status, and scheduling a job could return an id for polling. Similar to creating a resource in Azure.

If you used signalr for progress (we do on the front end, works well), I would anticipate the job itself should be sending those messages, rather than the infra surrounding it. (progress can be so hard to calculate on some jobs / easy on others)
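
To make the "schedule returns an id, then poll for status" idea concrete, here is a minimal in-memory sketch. Everything in it (`JobStatusTracker`, `JobStatus`, `JobState`) is hypothetical and not the API from the branch; a real implementation would persist statuses so that any node in a cluster can answer a poll. The job reports its own progress through `IProgress<int>`, in line with the suggestion that the job rather than the surrounding infrastructure should do so.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public enum JobState { Queued, Running, Completed, Failed }

public sealed record JobStatus(JobState State, int PercentComplete, string? Error = null);

// Hypothetical in-memory tracker, for illustration only.
public sealed class JobStatusTracker
{
    private readonly ConcurrentDictionary<Guid, JobStatus> _statuses = new();

    // Starts the work on the thread pool and returns an id the caller can poll with.
    public Guid Start(Func<IProgress<int>, Task> work)
    {
        var id = Guid.NewGuid();
        _statuses[id] = new JobStatus(JobState.Queued, 0);
        _ = Task.Run(() => RunAsync(id, work));
        return id;
    }

    // Returns the latest known status, or null for an unknown id.
    public JobStatus? GetStatus(Guid id) => _statuses.TryGetValue(id, out var status) ? status : null;

    private async Task RunAsync(Guid id, Func<IProgress<int>, Task> work)
    {
        _statuses[id] = new JobStatus(JobState.Running, 0);
        try
        {
            await work(new InlineProgress(percent => _statuses[id] = new JobStatus(JobState.Running, percent)));
            _statuses[id] = new JobStatus(JobState.Completed, 100);
        }
        catch (Exception ex)
        {
            _statuses[id] = new JobStatus(JobState.Failed, 0, ex.Message);
        }
    }

    // Progress<T> may post callbacks asynchronously; this reports inline so updates stay ordered.
    private sealed class InlineProgress : IProgress<int>
    {
        private readonly Action<int> _onReport;
        public InlineProgress(Action<int> onReport) => _onReport = onReport;
        public void Report(int value) => _onReport(value);
    }
}
```

An admin page could then call `GetStatus(id)` on a timer (or push the same values over SignalR) to drive a progress bar that survives a page refresh, as long as the id is kept.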

sebastienros added this to the backlog milestone Feb 3, 2022

@Skrypt
Contributor

Skrypt commented Feb 3, 2022

Yes progress is always an estimate which can be hard to calculate but better something than nothing I guess. When we talk about Lucene indexing it is most of the time taking the same amount of time for each content item unless there are IndexingProviders involved or Handlers.

@lahma
Contributor

lahma commented Feb 3, 2022

@Skrypt there will be an api to poll, for status, and scheduling a job could return an id for polling. Similar to creating a resource in Azure.

If you used signalr for progress (we do on the front end, works well), I would anticipate the job itself should be sending those messages, rather than the infra surrounding it. (progress can be so hard to calculate on some jobs / easy on others)

Polling for task status sounds good, allows listing of in-progress jobs and with persistence works inside cluster where job could be running on other node.

Quartz also has concept of "maintenance schedulers" which don't run jobs but allow CRUD for schedules. Not all nodes are made equal, "node running this job needs to have MS Office installed".

@jtkech
Member

jtkech commented Mar 4, 2023

@sebastienros

Once we have the Jobs Queue feature, use a job when rebuilding/resetting an index.
This will avoid running a potentially long-running task on the request thread when there are many content items to process.

Just for info, for now we have at least a simple background job implementation allowing a job to be executed in an isolated scope after the current HTTP request is completed.

/// <summary>
/// Executes a background job in an isolated <see cref="ShellScope"/> after the current HTTP request is completed.
/// </summary>
public static Task ExecuteAfterEndOfRequestAsync(string jobName, Func<ShellScope, Task> job)

We already use it to rebuild indexes in the related Lucene recipe step, but not yet in the content handler.

await HttpBackgroundJob.ExecuteAfterEndOfRequestAsync("lucene-index-rebuild", async (scope) =>
{
    var luceneIndexSettingsService = scope.ServiceProvider.GetRequiredService<LuceneIndexSettingsService>();
    var luceneIndexingService = scope.ServiceProvider.GetRequiredService<LuceneIndexingService>();
    var indices = model.IncludeAll ? (await luceneIndexSettingsService.GetSettingsAsync()).Select(x => x.IndexName).ToArray() : model.Indices;
    foreach (var indexName in indices)
    {
        var luceneIndexSettings = await luceneIndexSettingsService.GetSettingsAsync(indexName);
        if (luceneIndexSettings != null)
        {
            await luceneIndexingService.RebuildIndexAsync(indexName);
            await luceneIndexingService.ProcessContentItemsAsync(indexName);
        }
    }
});

@Piedone
Member

Piedone commented Apr 12, 2024

@deanmarcussen can you point to the code of where you started this work?

@hishamco
Member

hishamco commented Apr 12, 2024

If I'm not wrong this is the branch https://github.com/OrchardCMS/OrchardCore/tree/deanmarcussen/jobs

@deanmarcussen
Member

deanmarcussen commented Apr 12, 2024

Five years later, I'm still trying to figure out, why aren't you using Quartz.NET 😉

@Piedone @lahma
I needed this functionality for my day job, and the implementation I started here was -ok-, but after a bit of time spent it was obvious Quartz.NET was significantly more robust, so I implemented a multi-tenancy wrapper around the IJob.

Would recommend, very stable.
We now run all our workflows as background jobs in Quartz (they are long running) and many other jobs.

Happy to share the code, it is in the end, not that hard.

@giannik
Contributor

giannik commented Apr 13, 2024

@deanmarcussen would be great if you could share any code on using quartz.net with multi tenancy in orchard core.

@Piedone
Member

Piedone commented Apr 14, 2024

Thanks, Hisham!

Dean: yes, please do. "Just" integrating Quartz looks like a more appealing solution than implementing our own schedule task infrastructure (#5755).

@hishamco
Member

Maybe @lahma is interested in this, since he mentioned it here :)

@deanmarcussen
Member

@Piedone the concept with quartz is reasonably straightforward.

Quartz is a singleton so needs to be loaded in at the Orchard Hosts Startup

something like this.

Note: we only use sql server, so I haven't looked into the sqlite support with Quartz. If it isn't supported it would be a great pr for someone to do to Quartz.Net

        services.AddQuartz(q =>
        {
            q.UsePersistentStore(x =>
            {
                // Changes in quartz suggest this runs before the start delay now, so it has been disabled.
                x.PerformSchemaValidation = false;
                x.UseProperties = true;
                x.UseSqlServer(quartzConnectionString);
                x.UseNewtonsoftJsonSerializer();
            });
        });

        services.AddQuartzHostedService(options =>
        {
            options.StartDelay = TimeSpan.FromSeconds(60); // This allows autosetup to run and complete migrations.
            options.AwaitApplicationStarted = true; // The default, but being specific.
            // when shutting down we want jobs to complete gracefully
            options.WaitForJobsToComplete = true;
        });

the quartz job is also loaded in as a scoped service here, as it is a host service, not a tenant service.

services.AddScoped<MyQuartzJob>();

the job itself takes the ShellName in its data map

    public async Task Execute(IJobExecutionContext context)
    {
        try
        {
            var shellName = context.JobDetail.JobDataMap.GetString("ShellName");
            ThrowHelper.ThrowIfNullOrEmpty(shellName);

            if (_shellHost.TryGetShellContext(shellName, out var shellContext))
            {
                // These are both async locals so should work independently.
                _httpContextAccessor.HttpContext = shellContext.CreateHttpContext();
                // baseUrl and SetBaseUrl are defined elsewhere in the author's code (not shown here).
                SetBaseUrl(_httpContextAccessor.HttpContext, baseUrl);

                _actionContextAccessor.ActionContext = await _httpContextAccessor.HttpContext.MakeActionContextAsync();
            }
            else
            {
                throw new InvalidOperationException($"Could not get shell context for shell '{shellName}'");
            }

            var shellScope = await _shellHost.GetScopeAsync(shellName);

            await shellScope.UsingAsync(async scope =>
            {
                // do work inside the tenant's shell here.
                // some try catching and rethrowing to handle errors inside the shell and also support quartz.net requirements.
            });
        }
        catch
        {
            // follow quartz.net general error handling rules here.
        }
    }

submit job to quartz with the current shell name

        var job = JobBuilder
            .Create<MyQuartzJob>()
            .RequestRecovery()
            .UsingJobData("ShellName", _shellSettings.Name)
            .Build();
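
The snippet above only covers the job definition. A minimal sketch of building it and handing it to the scheduler could look like the following, assuming Quartz.NET 3.x and an `ISchedulerFactory` resolved from the host container. `TenantJobSubmitter` is a hypothetical helper and the run-once-immediately trigger is an assumption; the original post does not show which trigger was used.

```csharp
using System.Threading.Tasks;
using Quartz;

// Hypothetical helper, for illustration only.
public sealed class TenantJobSubmitter
{
    private readonly ISchedulerFactory _schedulerFactory;

    public TenantJobSubmitter(ISchedulerFactory schedulerFactory) => _schedulerFactory = schedulerFactory;

    // Builds a job carrying the shell name in its data map and schedules it to run once, immediately.
    public async Task SubmitAsync<TJob>(string shellName) where TJob : IJob
    {
        var job = JobBuilder.Create<TJob>()
            .RequestRecovery()
            .UsingJobData("ShellName", shellName)
            .Build();

        // No identity is given, so Quartz generates the job and trigger keys.
        var trigger = TriggerBuilder.Create()
            .StartNow()
            .Build();

        var scheduler = await _schedulerFactory.GetScheduler();
        await scheduler.ScheduleJob(job, trigger);
    }
}
```

From tenant code this would be called along the lines of `await submitter.SubmitAsync<MyQuartzJob>(_shellSettings.Name);`.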

there's a bunch more stuff around error handling, but that's specific to our use case and how we chose to manage those errors.

This proved more useful for us than the background job code I started for OC. We also replaced the workflows background job with a scheduled Quartz job, which gives better accuracy about when it will fire and allows us to track those separately in a UI across all shells (and gives us slightly finer-grained control: preventing similar long-running, processor-intensive background jobs from different shells executing at the same time, or providing rate limits on how many can execute, etc.).

Happy to help out with advice if someone wants to pick it up as a feature for OC.

@Piedone
Member

Piedone commented Apr 19, 2024

Thanks, awesome! Perhaps we could somehow substitute the storage Quartz uses so it stores data in a tenant-specific document or at least table managed via YesSql?

Do you just use the Quartz UI sideloaded to OC, or is it somehow integrated into the admin?

@deanmarcussen
Member

Probably the point on storage, and Quartz itself, is that the way I chose to implement it (and I think the best way) is as a host-specific singleton, with the storage at host level. This a) is simple and b) allows the host to throttle work (allowing the tenant to throttle work just means the host can be overloaded).

I have no particular UI related to it (I largely use it to background workflows, so they provide enough UI in and of themselves, too much in some cases).

I could imagine an Orchard Core implementation might take what is on the branch I did for jobs (as that has some UI in it) and provide a slight wrapper around the Quartz.NET IJob interface, to remove the need to involve IShellHost etc. (so wrapping away the need to provide and find shell scopes).
So the actual IJob itself might take a reference to the shell name and the OC.Job (I forget what I called it), then when it runs, retrieve that OC.Job in the correct shell scope and execute it.
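
One possible shape for such a wrapper, sketched without any Orchard Core or Quartz types so it stays self-contained: `ITenantJob` and `TenantJobDispatcher` are hypothetical names, and the `runInTenantScope` delegate is a stand-in for the shell lookup and `UsingAsync` call shown earlier in the thread. A single host-level Quartz `IJob` could read the shell name and job type from its data map and delegate to the dispatcher.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical tenant-facing contract: implementations never touch IShellHost or Quartz.
public interface ITenantJob
{
    Task ExecuteAsync(IServiceProvider tenantServices);
}

// Hypothetical host-level dispatcher that a single Quartz IJob could delegate to.
public sealed class TenantJobDispatcher
{
    // Given a shell name, runs the callback with that tenant's service provider.
    private readonly Func<string, Func<IServiceProvider, Task>, Task> _runInTenantScope;

    public TenantJobDispatcher(Func<string, Func<IServiceProvider, Task>, Task> runInTenantScope)
        => _runInTenantScope = runInTenantScope;

    // Resolves the tenant job inside the tenant's scope and executes it there.
    public Task DispatchAsync(string shellName, Type jobType)
    {
        if (!typeof(ITenantJob).IsAssignableFrom(jobType))
        {
            throw new ArgumentException($"{jobType} does not implement ITenantJob.", nameof(jobType));
        }

        return _runInTenantScope(shellName, services =>
        {
            var job = services.GetService(jobType) as ITenantJob
                ?? throw new InvalidOperationException($"{jobType} is not registered in shell '{shellName}'.");
            return job.ExecuteAsync(services);
        });
    }
}
```

Keeping the scope handling behind one delegate means tenant modules would only implement `ITenantJob`, while the host keeps control over when and how many jobs run.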

This should give some ui.

Quartz.UI itself, because of the host singleton factor, probably only belongs on the Default host itself, I believe there are a couple of community packages available for it, but have not investigated heavily.

@Piedone
Member

Piedone commented Apr 19, 2024

OK, great, thanks for elaborating!

Piedone changed the title from "Use background Job for indices" to "Use background Job for search indices" Apr 28, 2024

@Piedone
Member

Piedone commented May 21, 2024

Related: #10625.
