[Fixes #7826] Implement tests for harvesters (#7827)
* Bump urllib3 from 1.26.2 to 1.26.3 (#6908)

Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.2 to 1.26.3.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/1.26.3/CHANGES.rst)
- [Commits](urllib3/urllib3@1.26.2...1.26.3)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Toni <toni.schoenbuchner@csgis.de>

* [Fixes #6880] Circle CI upload tests fail irregularly (#6881)

* [Fixes #6880] Circle CI upload tests fail irregularly

* CircleCI test fix: sometimes expires due to upload timeout in the test environment

* - Avoid infinite loop on upload testing

* Revert "CircleCI test fix: sometimes expires due to upload timeout in the test environment"

This reverts commit 66139fd.

Co-authored-by: Alessio Fabiani <alessio.fabiani@geo-solutions.it>
Co-authored-by: afabiani <alessio.fabiani@gmail.com>

* [Fixes #6914] Remove "add to basket" tool for documents and maps (#6915)

* Added malnajdi as contributor

* [Fixes #6910] meaningful filename for document download (#6911)

* get meaningful document filenames on download

* - Strip the extension from the document title before slugifying it (e.g.: image.jpg instead of imagejpg.jpg)

Co-authored-by: afabiani <alessio.fabiani@gmail.com>
Co-authored-by: Alessio Fabiani <alessio.fabiani@geo-solutions.it>

* - CircleCI Upload Tests: further reduce the risk of an infinite loop on "wait_for_progress"

* [Fixes #6916] gsimporter.api.NotFound caused by missing trailing slash at the end of GEOSERVER_LOCATION (#6913)

* [Fixes #6916] gsimporter.api.NotFound caused by missing trailing slash at the end of GEOSERVER_LOCATION

* [Fixes #6916] unit test for GEOSERVER_LOCATION

* Bump django-cors-headers from 3.6.0 to 3.7.0 (#6901)

Bumps [django-cors-headers](https://github.com/adamchainz/django-cors-headers) from 3.6.0 to 3.7.0.
- [Release notes](https://github.com/adamchainz/django-cors-headers/releases)
- [Changelog](https://github.com/adamchainz/django-cors-headers/blob/master/HISTORY.rst)
- [Commits](adamchainz/django-cors-headers@3.6.0...3.7.0)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump amqp from 5.0.3 to 5.0.5 (#6905)

Bumps [amqp](https://github.com/celery/py-amqp) from 5.0.3 to 5.0.5.
- [Release notes](https://github.com/celery/py-amqp/releases)
- [Changelog](https://github.com/celery/py-amqp/blob/master/Changelog)
- [Commits](celery/py-amqp@v5.0.3...v5.0.5)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump pip from 21.0 to 21.0.1 (#6900)

Bumps [pip](https://github.com/pypa/pip) from 21.0 to 21.0.1.
- [Release notes](https://github.com/pypa/pip/releases)
- [Changelog](https://github.com/pypa/pip/blob/master/NEWS.rst)
- [Commits](pypa/pip@21.0...21.0.1)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump coverage from 5.3.1 to 5.4 (#6903)

Bumps [coverage](https://github.com/nedbat/coveragepy) from 5.3.1 to 5.4.
- [Release notes](https://github.com/nedbat/coveragepy/releases)
- [Changelog](https://github.com/nedbat/coveragepy/blob/master/CHANGES.rst)
- [Commits](nedbat/coveragepy@coverage-5.3.1...coverage-5.4)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump pytest from 6.2.1 to 6.2.2 (#6907)

Bumps [pytest](https://github.com/pytest-dev/pytest) from 6.2.1 to 6.2.2.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/master/CHANGELOG.rst)
- [Commits](pytest-dev/pytest@6.2.1...6.2.2)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump djangorestframework-gis from 0.16 to 0.17 (#6902)

Bumps [djangorestframework-gis](https://github.com/openwisp/django-rest-framework-gis) from 0.16 to 0.17.
- [Release notes](https://github.com/openwisp/django-rest-framework-gis/releases)
- [Changelog](https://github.com/openwisp/django-rest-framework-gis/blob/master/CHANGES.rst)
- [Commits](openwisp/django-rest-framework-gis@v0.16.0...v0.17.0)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* - Align setup.cfg to requirements.txt

* [Fixes #6922][REST API v2] Expose the curated thumbnail URL if it has… (#6923)

* [Fixes #6922][REST API v2] Expose the curated thumbnail URL if it has been uploaded

* - Add REST APIs test suite to CircleCI

* [Fixes #6918] Removal of QGIS support (#6919)

* [Cleanup and Refactor] Remove QGIS server backend dependencies

* [Cleanup and Refactor] Remove QGIS server backend dependencies

* - Fix LGTM issues

* allow Basic authenticated requests in LOCKDOWN mode

* fix to avoid circular import

* flake8 check fix

* added tests

* - Align to upstream master branch

* Add harvester tests

* Add more tests and rename some stuff in order to conform to django test naming conventions

* Rebase on top of master

* Update tests

* rebase with latest master changes

* Fix tests according to latest rebase and add migration for harvesting

The migration file allows custom harvesters to be added without Django generating additional migration files each time the value of the `choices` field changes

The value for `choices` is now obtained dynamically inside the migration file

* Fix harvester settings not being found

* Fix migrations

* Fix pep8 issues and minor test fix

Co-authored-by: Giovanni Allegri <giohappy@gmail.com>
Co-authored-by: allyoucanmap <bovio.stefano@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Toni <toni.schoenbuchner@csgis.de>
Co-authored-by: Alessio Fabiani <alessio.fabiani@geo-solutions.it>
Co-authored-by: afabiani <alessio.fabiani@gmail.com>
Co-authored-by: Florian Hoedt <gannebamm@gmail.com>
Co-authored-by: Mohammed Y. Alnajdi <mohdnagfy@gmail.com>
Co-authored-by: biegan <bieganowski.rev@gmail.com>
Co-authored-by: Ricardo Garcia Silva <ricardo@kartoza.com>
Co-authored-by: Ricardo Garcia Silva <ricardo.garcia.silva@gmail.com>
12 people committed Sep 7, 2021
1 parent 19a53e4 commit e20df76
Showing 20 changed files with 1,475 additions and 32 deletions.
24 changes: 15 additions & 9 deletions geonode/harvesting/api/serializers.py
@@ -83,7 +83,7 @@ def get_links(self, obj):
},
request=self.context["request"],
),
"harvestable-resources": reverse(
"harvestable_resources": reverse(
"harvestable-resources-list",
kwargs={
"harvester_id": obj.id,
@@ -118,10 +118,6 @@ class Meta:
"links",
)

def validate_harvester_type_specific_configuration(self, value):
logger.debug(f"inside validate_harvester_type_specific_configuration instance: {self.instance}")
return value

def validate(self, data):
"""Perform object-level validation
@@ -138,7 +134,7 @@ def validate(self, data):
"""

worker_config_field = "harvester_type_specific_configuration"
worker_type_field = "worker_type"
worker_type_field = "harvester_type"
worker_type = data.get(
worker_type_field, getattr(self.instance, worker_type_field, None))
worker_config = data.get(
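The lookup in this hunk reads each field from the incoming payload first and falls back to the saved instance, which is what allows a partial update to be validated. Since the model field is named `harvester_type`, the old `"worker_type"` key would presumably never have matched anything. A minimal sketch of the pattern outside Django REST framework; `resolve_field` and the `SimpleNamespace` stand-in for a saved harvester are hypothetical names used only for illustration:

```python
from types import SimpleNamespace
from typing import Any, Mapping, Optional


def resolve_field(data: Mapping[str, Any], instance: Optional[Any], field: str) -> Any:
    """Prefer the value in the incoming payload, else the one on the saved instance."""
    return data.get(field, getattr(instance, field, None))


# Hypothetical stand-in for an already-saved Harvester record.
existing = SimpleNamespace(harvester_type="pkg.harvesters.Existing")

# Partial update: the payload omits the field, so the stored value is used.
from_instance = resolve_field({}, existing, "harvester_type")
# The payload supplies the field, so it takes precedence.
from_payload = resolve_field(
    {"harvester_type": "pkg.harvesters.New"}, existing, "harvester_type")
# Creation with no instance and no value in the payload yields None.
missing = resolve_field({}, None, "harvester_type")
```
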
@@ -157,7 +153,6 @@ def validate(self, data):
)
return data

# FIXME: ensure supplied worker-specific config validates our json-schema
def create(self, validated_data):
desired_status = validated_data.get("status", models.Harvester.STATUS_READY)
if desired_status != models.Harvester.STATUS_READY:
@@ -263,13 +258,24 @@ class Meta:
"unique_identifier",
"title",
"should_be_harvested",
"available",
"last_updated",
"status",
"remote_resource_type",
]
read_only_fields = [
"title",
"last_updated",
"status",
"remote_resource_type",
]

def create(self, validated_data):
# TODO: check if there is no other property being set other than `should_be_harvested`
# NOTE: We are implementing `create()` rather than `update` intentionally, even if the
# user is not allowed to create new records (check the `views.py` module) - the rationale
# being that since we keep a harvestable_resource's `id` private it would be more involved
# to deal with its update than with its creation. We are providing a custom `UpdateListModelMixin` class
# that allows for bulk update of multiple instances simultaneously. This mixin class is instantiating
# this serializer class without providing an instance and then calling its `save()` method
harvestable_resource = models.HarvestableResource.objects.get(
harvester=self.context["harvester"],
unique_identifier=validated_data["unique_identifier"]
36 changes: 23 additions & 13 deletions geonode/harvesting/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -23,22 +23,32 @@
"""

import typing

from django.conf import settings

_default_harvesters = [

_DEFAULT_HARVESTERS: typing.Final = [
"geonode.harvesting.harvesters.geonode.GeonodeLegacyHarvester",
# "geonode.harvesting.harvesters.geonode.GeonodeCswHarvester",
"geonode.harvesting.harvesters.wms.OgcWmsHarvester",
# "geonode.harvesting.harvesters.geonode.GeonodeCswHarvester",
]

try:
_configured_harvester_classes = getattr(settings, "HARVESTER_CLASSES")
HARVESTER_CLASSES = (
_default_harvesters +
[i for i in _configured_harvester_classes if i not in _default_harvesters]
)
except AttributeError:
HARVESTER_CLASSES = _default_harvesters

HARVESTED_RESOURCE_FILE_MAX_MEMORY_SIZE = getattr(
settings, "HARVESTED_RESOURCE_FILE_MAX_MEMORY_SIZE", settings.FILE_UPLOAD_MAX_MEMORY_SIZE)

def _get_harvester_class_paths(custom_class_paths: typing.List[str]) -> typing.List[str]:
result = _DEFAULT_HARVESTERS[:]
for i in custom_class_paths:
if i not in result:
result.append(i)
return result


def get_setting(setting_key: str) -> typing.Any:
result = {
"HARVESTER_CLASSES": _get_harvester_class_paths(
getattr(settings, "HARVESTER_CLASSES", [])
),
"HARVESTED_RESOURCE_FILE_MAX_MEMORY_SIZE": getattr(
settings, "HARVESTED_RESOURCE_MAX_MEMORY_SIZE", settings.FILE_UPLOAD_MAX_MEMORY_SIZE)
}.get(setting_key, getattr(settings, setting_key, None))
return result
3 changes: 2 additions & 1 deletion geonode/harvesting/harvesters/base.py
@@ -359,7 +359,8 @@ def download_resource_file(url: str, target_name: str) -> str:
file_size = response.headers.get("Content-Length")
content_type = response.headers.get("Content-Type")
charset = response.apparent_encoding
if file_size is not None and int(file_size) < config.HARVESTED_RESOURCE_FILE_MAX_MEMORY_SIZE:
size_threshold = config.get_setting("HARVESTED_RESOURCE_FILE_MAX_MEMORY_SIZE")
if file_size is not None and int(file_size) < size_threshold:
logger.debug("Downloading to an in-memory buffer...")
file_ = uploadedfile.InMemoryUploadedFile(
None, None, target_name, content_type, file_size, charset)
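The condition in this hunk downloads to memory only when the response declares a `Content-Length` strictly below the configured threshold. What happens otherwise is outside the visible hunk, so the sketch below isolates just the decision; `fits_in_memory` is a hypothetical helper name, and the threshold shown is Django's documented default for `FILE_UPLOAD_MAX_MEMORY_SIZE`.

```python
from typing import Optional


def fits_in_memory(content_length: Optional[str], size_threshold: int) -> bool:
    """Return True when the response declares a size below the threshold.

    A missing Content-Length header (None) is never treated as small.
    """
    return content_length is not None and int(content_length) < size_threshold


THRESHOLD = 2621440  # Django's default FILE_UPLOAD_MAX_MEMORY_SIZE (2.5 MB)

small = fits_in_memory("1024", THRESHOLD)
at_limit = fits_in_memory(str(THRESHOLD), THRESHOLD)  # equal is not "below"
unknown = fits_in_memory(None, THRESHOLD)
```

Header values arrive as strings, hence the `int()` conversion before comparing.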
@@ -0,0 +1,31 @@
# Generated by Django 3.2.4 on 2021-06-28 18:26
# Hand edited so that the choices for `models.Harvester.harvester_type` come from settings dynamically
# This prevents Django from autogenerating new migration files for `geonode.harvesting`

from django.db import migrations, models

from .. import config


class Migration(migrations.Migration):

dependencies = [
('harvesting', '0028_harvester_num_harvestable_resources'),
]

operations = [
migrations.AlterField(
model_name='harvester',
name='harvester_type',
field=models.CharField(
choices=[(value, value) for value in config.get_setting("HARVESTER_CLASSES")],
default='geonode.harvesting.harvesters.geonode.GeonodeLegacyHarvester',
help_text=(
'Harvester class used to perform harvesting sessions. New harvester types can be added by an admin by changing the '
'main GeoNode `settings.py` file'
),
max_length=255
),
),
]
10 changes: 6 additions & 4 deletions geonode/harvesting/models.py
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@
)

from . import utils
from .config import HARVESTER_CLASSES
from .config import get_setting

logger = logging.getLogger(__name__)

@@ -127,8 +127,8 @@ class Harvester(models.Model):
"Harvester class used to perform harvesting sessions. New harvester types "
"can be added by an admin by changing the main GeoNode `settings.py` file"
),
choices=(((i, i) for i in HARVESTER_CLASSES)),
default=HARVESTER_CLASSES[0]
choices=(((i, i) for i in get_setting("HARVESTER_CLASSES"))),
default=get_setting("HARVESTER_CLASSES")[0]
)
harvester_type_specific_configuration = models.JSONField(
default=dict,
@@ -157,7 +157,9 @@ class Harvester(models.Model):
editable=False,
)
num_harvestable_resources = models.IntegerField(
default=0)
blank=True,
default=0
)

def __str__(self):
return f"{self.name}({self.id})"
File renamed without changes.
60 changes: 60 additions & 0 deletions geonode/harvesting/tests/factories.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
##############################################
#
# Copyright (C) 2021 OSGeo
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
#########################################################################


import uuid
import datetime
from geonode.harvesting import resourcedescriptor
from geonode.harvesting.harvesters.base import HarvestedResourceInfo, BriefRemoteResource

contact_example = resourcedescriptor.RecordDescriptionContact(
role='role',
name="Test"
)
identification_example = resourcedescriptor.RecordIdentification(
name='Test',
title='Test',
date=datetime.datetime.now(),
date_type='type',
originator=contact_example,
graphic_overview_uri='',
place_keywords=['keyword'],
other_keywords=('test',),
license=['test']
)
distribution_example = resourcedescriptor.RecordDistribution()
resource_description_example = resourcedescriptor.RecordDescription(
uuid=uuid.uuid4(),
point_of_contact=contact_example,
author=contact_example,
date_stamp=datetime.datetime.now(),
identification=identification_example,
distribution=distribution_example
)

resource_info_example = HarvestedResourceInfo(
resource_descriptor=resource_description_example,
additional_information={}
)

brief_remote_resource_example = BriefRemoteResource(
unique_identifier='id',
title='Test',
resource_type='Layer'
)
@@ -1,4 +1,4 @@
#########################################################################
##############################################
#
# Copyright (C) 2021 OSGeo
#
98 changes: 98 additions & 0 deletions geonode/harvesting/tests/harvesters/base.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
##############################################
#
# Copyright (C) 2021 OSGeo
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
#########################################################################

import datetime
from django.contrib.auth import get_user_model
from geonode.tests.base import GeoNodeBaseTestSupport
from geonode.harvesting.models import Harvester, HarvestableResource
from geonode.harvesting.tests.harvesters.test_harvester import TestHarvester
from geonode.layers.models import Dataset


class TestBaseHarvester(GeoNodeBaseTestSupport):
"""
Test Base harvester
"""
remote_url = 'test.com'
name = 'This is geonode harvester'
user = get_user_model().objects.get(username='AnonymousUser')
harvester_type = 'geonode.harvesting.tests.harvesters.test_harvester.TestHarvester'

def setUp(self):
super().setUp()
self.worker = TestHarvester(
remote_url=self.remote_url,
harvester_id=1
)

def test_worker_from_harvester(self):
"""
Test a worker generated from a harvester
"""
harvester = Harvester.objects.create(
remote_url=self.remote_url,
name=self.name,
default_owner=self.user,
harvester_type=self.harvester_type
)
worker = harvester.get_harvester_worker()
self.assertEqual(worker.__class__, TestHarvester)
self.assertEqual(worker.remote_url, self.remote_url)
self.assertEqual(harvester.default_owner, self.user)

def test_worker_from_django_record(self):
"""
Test a worker generated from a harvester record via `from_django_record`
"""
harvester = Harvester.objects.create(
remote_url=self.remote_url,
name=self.name,
default_owner=self.user,
harvester_type=self.harvester_type
)
worker = TestHarvester.from_django_record(harvester)
self.assertEqual(worker.__class__, TestHarvester)
self.assertEqual(worker.remote_url, self.remote_url)
self.assertEqual(harvester.default_owner, self.user)

def test_worker_methods(self):
"""
Test functions in worker
"""
self.assertEqual(self.worker.remote_url, self.remote_url)
self.assertEqual(self.worker.harvester_id, 1)
self.assertTrue(self.worker.allows_copying_resources)
self.assertTrue(self.worker.check_availability())
self.assertEqual(self.worker.get_num_available_resources(), 1)
self.assertEqual(len(self.worker.list_resources()), 1)
self.assertEqual(self.worker.get_geonode_resource_type('type'), Dataset)

harvestable_resource = HarvestableResource(
unique_identifier='1',
title='Test Resource',
harvester=Harvester(
remote_url=self.remote_url,
name=self.name,
default_owner=self.user,
harvester_type=self.harvester_type
),
last_refreshed=datetime.datetime.now()
)
self.assertIsNone(self.worker.get_resource(
harvestable_resource, 1))