Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cron job to cleanup hook_task table #13080

Merged
merged 30 commits into from
Jan 26, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
30 commits
Select commit Hold shift + click to select a range
42073c2
sketching out hook_task cleanup code
bhalbright Sep 26, 2020
e505ced
continued work on cleanup hook_task
bhalbright Sep 29, 2020
6278f52
added tests for age based hook_task cleanup
bhalbright Sep 29, 2020
65eea98
mostly local debug
bhalbright Oct 7, 2020
1ea1f6a
Merge branch 'master' of https://github.com/go-gitea/gitea into prune…
bhalbright Oct 7, 2020
f541590
changed to use a time duration for the default cleanup
bhalbright Oct 7, 2020
2253354
fixed oldenthan default and unit tests
bhalbright Oct 7, 2020
49304b1
fixed defaults
bhalbright Oct 8, 2020
693a6bd
gofmt
bhalbright Oct 8, 2020
2bfdb06
Merge branch 'master' of https://github.com/go-gitea/gitea into prune…
bhalbright Oct 9, 2020
d1fe028
fixed method names on cleanup hook task unit tests
bhalbright Oct 9, 2020
cfc2984
fixed unit test
bhalbright Oct 9, 2020
b49384e
attempt to fix unit test
bhalbright Oct 9, 2020
9699c77
gofmt
bhalbright Oct 9, 2020
cbe3d89
working on unit test
bhalbright Oct 9, 2020
809f333
added delivered date to hook task test that updates delivered to true
bhalbright Oct 10, 2020
89df8c3
gofmt
bhalbright Oct 10, 2020
04faa7c
added null check
bhalbright Oct 10, 2020
ff4fe65
Merge branch 'master' of https://github.com/go-gitea/gitea into prune…
bhalbright Oct 11, 2020
8556252
Merge branch 'master' of https://github.com/go-gitea/gitea into prune…
bhalbright Oct 18, 2020
d69e6e2
Merge branch 'master' of https://github.com/go-gitea/gitea into prune…
bhalbright Oct 26, 2020
475d3d0
Merge branch 'master' of https://github.com/go-gitea/gitea into prune…
bhalbright Nov 6, 2020
882a7e5
Merge branch 'master' of https://github.com/go-gitea/gitea into prune…
bhalbright Nov 7, 2020
f7b4cad
Merge branch 'master' into prunehooktask
6543 Jan 19, 2021
01a50f3
Merge branch 'master' of https://github.com/go-gitea/gitea into prune…
bhalbright Jan 20, 2021
b8afeb2
fix lint errors
bhalbright Jan 20, 2021
2e4710e
ran gofmt
bhalbright Jan 20, 2021
d57a65f
Merge branch 'master' into prunehooktask
6543 Jan 20, 2021
14b6bef
Merge branch 'master' of https://github.com/go-gitea/gitea into prune…
bhalbright Jan 24, 2021
f7ed1cd
Merge branch 'master' into prunehooktask
6543 Jan 26, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 15 additions & 0 deletions custom/conf/app.example.ini
Original file line number Diff line number Diff line change
Expand Up @@ -1023,6 +1023,21 @@ SCHEDULE = @every 24h
; deleted branches than OLDER_THAN ago are subject to deletion
OLDER_THAN = 24h

; Cleanup hook_task table
[cron.cleanup_hook_task_table]
; Whether to enable the job
ENABLED = true
; Whether to always run at start up time (if ENABLED)
RUN_AT_START = false
; Time interval for job to run
SCHEDULE = @every 24h
; OlderThan or PerWebhook. How the records are removed, either by age (i.e. how long ago hook_task record was delivered) or by the number to keep per webhook (i.e. keep most recent x deliveries per webhook).
CLEANUP_TYPE = OlderThan
; If CLEANUP_TYPE is set to OlderThan, then any delivered hook_task records older than this expression will be deleted.
OLDER_THAN = 168h
; If CLEANUP_TYPE is set to PerWebhook, this is number of hook_task records to keep for a webhook (i.e. keep the most recent x deliveries).
NUMBER_TO_KEEP = 10

; Extended cron task - not enabled by default

; Delete all unactivated accounts
Expand Down
9 changes: 9 additions & 0 deletions docs/content/doc/advanced/config-cheat-sheet.en-us.md
Original file line number Diff line number Diff line change
Expand Up @@ -694,6 +694,15 @@ NB: You must `REDIRECT_MACARON_LOG` and have `DISABLE_ROUTER_LOG` set to `false`
- `RUN_AT_START`: **true**: Run repository statistics check at start time.
- `SCHEDULE`: **@every 24h**: Cron syntax for scheduling repository statistics check.

### Cron - Cleanup hook_task Table (`cron.cleanup_hook_task_table`)

- `ENABLED`: **true**: Enable cleanup hook_task job.
- `RUN_AT_START`: **false**: Run cleanup hook_task at start time (if ENABLED).
- `SCHEDULE`: **@every 24h**: Cron syntax for cleaning hook_task table.
- `CLEANUP_TYPE` **OlderThan** OlderThan or PerWebhook Method to cleanup hook_task, either by age (i.e. how long ago hook_task record was delivered) or by the number to keep per webhook (i.e. keep most recent x deliveries per webhook).
- `OLDER_THAN`: **168h**: If CLEANUP_TYPE is set to OlderThan, then any delivered hook_task records older than this expression will be deleted.
- `NUMBER_TO_KEEP`: **10**: If CLEANUP_TYPE is set to PerWebhook, this is number of hook_task records to keep for a webhook (i.e. keep the most recent x deliveries).

#### Cron - Update Migration Poster ID (`cron.update_migration_poster_id`)

- `SCHEDULE`: **@every 24h** : Interval as a duration between each synchronization, it will always attempt synchronization when the instance starts.
Expand Down
87 changes: 87 additions & 0 deletions models/webhook.go
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
package models

import (
"context"
"encoding/json"
"fmt"
"strings"
Expand Down Expand Up @@ -39,6 +40,26 @@ func ToHookContentType(name string) HookContentType {
return hookContentTypes[name]
}

// HookTaskCleanupType is the type of cleanup to perform on hook_task
type HookTaskCleanupType int

const (
// OlderThan hook_task rows will be cleaned up by the age of the row
OlderThan HookTaskCleanupType = iota
// PerWebhook hook_task rows will be cleaned up by leaving the most recent deliveries for each webhook
PerWebhook
)

var hookTaskCleanupTypes = map[string]HookTaskCleanupType{
"OlderThan": OlderThan,
"PerWebhook": PerWebhook,
}

// ToHookTaskCleanupType returns HookTaskCleanupType by given name.
func ToHookTaskCleanupType(name string) HookTaskCleanupType {
return hookTaskCleanupTypes[name]
}

// Name returns the name of a given web hook's content type
func (t HookContentType) Name() string {
switch t {
Expand Down Expand Up @@ -738,3 +759,69 @@ func FindRepoUndeliveredHookTasks(repoID int64) ([]*HookTask, error) {
}
return tasks, nil
}

// CleanupHookTaskTable deletes rows from hook_task as needed.
func CleanupHookTaskTable(ctx context.Context, cleanupType HookTaskCleanupType, olderThan time.Duration, numberToKeep int) error {
log.Trace("Doing: CleanupHookTaskTable")

if cleanupType == OlderThan {
deleteOlderThan := time.Now().Add(-olderThan).UnixNano()
deletes, err := x.
Where("is_delivered = ? and delivered < ?", true, deleteOlderThan).
Delete(new(HookTask))
if err != nil {
return err
}
log.Trace("Deleted %d rows from hook_task", deletes)
} else if cleanupType == PerWebhook {
hookIDs := make([]int64, 0, 10)
err := x.Table("webhook").
Where("id > 0").
Cols("id").
Find(&hookIDs)
if err != nil {
return err
}
for _, hookID := range hookIDs {
select {
case <-ctx.Done():
return ErrCancelledf("Before deleting hook_task records for hook id %d", hookID)
default:
}
if err = deleteDeliveredHookTasksByWebhook(hookID, numberToKeep); err != nil {
return err
}
}
}
log.Trace("Finished: CleanupHookTaskTable")
return nil
}

func deleteDeliveredHookTasksByWebhook(hookID int64, numberDeliveriesToKeep int) error {
log.Trace("Deleting hook_task rows for webhook %d, keeping the most recent %d deliveries", hookID, numberDeliveriesToKeep)
var deliveryDates = make([]int64, 0, 10)
err := x.Table("hook_task").
Where("hook_task.hook_id = ? AND hook_task.is_delivered = ? AND hook_task.delivered is not null", hookID, true).
Cols("hook_task.delivered").
Join("INNER", "webhook", "hook_task.hook_id = webhook.id").
OrderBy("hook_task.delivered desc").
Limit(1, int(numberDeliveriesToKeep)).
Find(&deliveryDates)
if err != nil {
return err
}

if len(deliveryDates) > 0 {
deletes, err := x.
Where("hook_id = ? and is_delivered = ? and delivered <= ?", hookID, true, deliveryDates[0]).
Delete(new(HookTask))
if err != nil {
return err
}
log.Trace("Deleted %d hook_task rows for webhook %d", deletes, hookID)
} else {
log.Trace("No hook_task rows to delete for webhook %d", hookID)
}

return nil
}
114 changes: 114 additions & 0 deletions models/webhook_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,10 @@
package models

import (
"context"
"encoding/json"
"testing"
"time"

api "code.gitea.io/gitea/modules/structs"

Expand Down Expand Up @@ -223,3 +225,115 @@ func TestUpdateHookTask(t *testing.T) {
assert.NoError(t, UpdateHookTask(hook))
AssertExistsAndLoadBean(t, hook)
}

func TestCleanupHookTaskTable_PerWebhook_DeletesDelivered(t *testing.T) {
assert.NoError(t, PrepareTestDatabase())
hookTask := &HookTask{
RepoID: 3,
HookID: 3,
Typ: GITEA,
URL: "http://www.example.com/unit_test",
Payloader: &api.PushPayload{},
IsDelivered: true,
Delivered: time.Now().UnixNano(),
}
AssertNotExistsBean(t, hookTask)
assert.NoError(t, CreateHookTask(hookTask))
AssertExistsAndLoadBean(t, hookTask)

assert.NoError(t, CleanupHookTaskTable(context.Background(), PerWebhook, 168*time.Hour, 0))
AssertNotExistsBean(t, hookTask)
}

func TestCleanupHookTaskTable_PerWebhook_LeavesUndelivered(t *testing.T) {
assert.NoError(t, PrepareTestDatabase())
hookTask := &HookTask{
RepoID: 2,
HookID: 4,
Typ: GITEA,
URL: "http://www.example.com/unit_test",
Payloader: &api.PushPayload{},
IsDelivered: false,
}
AssertNotExistsBean(t, hookTask)
assert.NoError(t, CreateHookTask(hookTask))
AssertExistsAndLoadBean(t, hookTask)

assert.NoError(t, CleanupHookTaskTable(context.Background(), PerWebhook, 168*time.Hour, 0))
AssertExistsAndLoadBean(t, hookTask)
}

func TestCleanupHookTaskTable_PerWebhook_LeavesMostRecentTask(t *testing.T) {
assert.NoError(t, PrepareTestDatabase())
hookTask := &HookTask{
RepoID: 2,
HookID: 4,
Typ: GITEA,
URL: "http://www.example.com/unit_test",
Payloader: &api.PushPayload{},
IsDelivered: true,
Delivered: time.Now().UnixNano(),
}
AssertNotExistsBean(t, hookTask)
assert.NoError(t, CreateHookTask(hookTask))
AssertExistsAndLoadBean(t, hookTask)

assert.NoError(t, CleanupHookTaskTable(context.Background(), PerWebhook, 168*time.Hour, 1))
AssertExistsAndLoadBean(t, hookTask)
}

func TestCleanupHookTaskTable_OlderThan_DeletesDelivered(t *testing.T) {
assert.NoError(t, PrepareTestDatabase())
hookTask := &HookTask{
RepoID: 3,
HookID: 3,
Typ: GITEA,
URL: "http://www.example.com/unit_test",
Payloader: &api.PushPayload{},
IsDelivered: true,
Delivered: time.Now().AddDate(0, 0, -8).UnixNano(),
}
AssertNotExistsBean(t, hookTask)
assert.NoError(t, CreateHookTask(hookTask))
AssertExistsAndLoadBean(t, hookTask)

assert.NoError(t, CleanupHookTaskTable(context.Background(), OlderThan, 168*time.Hour, 0))
AssertNotExistsBean(t, hookTask)
}

func TestCleanupHookTaskTable_OlderThan_LeavesUndelivered(t *testing.T) {
assert.NoError(t, PrepareTestDatabase())
hookTask := &HookTask{
RepoID: 2,
HookID: 4,
Typ: GITEA,
URL: "http://www.example.com/unit_test",
Payloader: &api.PushPayload{},
IsDelivered: false,
}
AssertNotExistsBean(t, hookTask)
assert.NoError(t, CreateHookTask(hookTask))
AssertExistsAndLoadBean(t, hookTask)

assert.NoError(t, CleanupHookTaskTable(context.Background(), OlderThan, 168*time.Hour, 0))
AssertExistsAndLoadBean(t, hookTask)
}

func TestCleanupHookTaskTable_OlderThan_LeavesTaskEarlierThanAgeToDelete(t *testing.T) {
assert.NoError(t, PrepareTestDatabase())
hookTask := &HookTask{
RepoID: 2,
HookID: 4,
Typ: GITEA,
URL: "http://www.example.com/unit_test",
Payloader: &api.PushPayload{},
IsDelivered: true,
Delivered: time.Now().AddDate(0, 0, -6).UnixNano(),
}
AssertNotExistsBean(t, hookTask)
assert.NoError(t, CreateHookTask(hookTask))
AssertExistsAndLoadBean(t, hookTask)

assert.NoError(t, CleanupHookTaskTable(context.Background(), OlderThan, 168*time.Hour, 0))
AssertExistsAndLoadBean(t, hookTask)
}
8 changes: 8 additions & 0 deletions modules/cron/setting.go
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,14 @@ type UpdateExistingConfig struct {
UpdateExisting bool
}

// CleanupHookTaskConfig represents a cron task with settings to cleanup hook_task
type CleanupHookTaskConfig struct {
BaseConfig
CleanupType string
OlderThan time.Duration
NumberToKeep int
}

6543 marked this conversation as resolved.
Show resolved Hide resolved
// GetSchedule returns the schedule for the base config
func (b *BaseConfig) GetSchedule() string {
return b.Schedule
Expand Down
17 changes: 17 additions & 0 deletions modules/cron/tasks_basic.go
Original file line number Diff line number Diff line change
Expand Up @@ -109,6 +109,22 @@ func registerUpdateMigrationPosterID() {
})
}

func registerCleanupHookTaskTable() {
RegisterTaskFatal("cleanup_hook_task_table", &CleanupHookTaskConfig{
BaseConfig: BaseConfig{
Enabled: true,
RunAtStart: false,
Schedule: "@every 24h",
},
CleanupType: "OlderThan",
OlderThan: 168 * time.Hour,
NumberToKeep: 10,
}, func(ctx context.Context, _ *models.User, config Config) error {
realConfig := config.(*CleanupHookTaskConfig)
return models.CleanupHookTaskTable(ctx, models.ToHookTaskCleanupType(realConfig.CleanupType), realConfig.OlderThan, realConfig.NumberToKeep)
})
}

func initBasicTasks() {
registerUpdateMirrorTask()
registerRepoHealthCheck()
Expand All @@ -119,4 +135,5 @@ func initBasicTasks() {
if !setting.Repository.DisableMigrations {
registerUpdateMigrationPosterID()
}
registerCleanupHookTaskTable()
}
1 change: 1 addition & 0 deletions options/locale/locale_en-US.ini
Original file line number Diff line number Diff line change
Expand Up @@ -2065,6 +2065,7 @@ dashboard.resync_all_sshprincipals.desc = (Not needed for the built-in SSH serve
dashboard.resync_all_hooks = Resynchronize pre-receive, update and post-receive hooks of all repositories.
dashboard.reinit_missing_repos = Reinitialize all missing Git repositories for which records exist
dashboard.sync_external_users = Synchronize external user data
dashboard.cleanup_hook_task_table = Cleanup hook_task table
dashboard.server_uptime = Server Uptime
dashboard.current_goroutine = Current Goroutines
dashboard.current_memory_usage = Current Memory Usage
Expand Down