Update readnoise monitor to use django database models #1482

Merged
merged 11 commits into from
Feb 15, 2024
145 changes: 134 additions & 11 deletions docs/source/database.rst
@@ -2,14 +2,137 @@
database
********

database_interface.py
---------------------
.. automodule:: jwql.database.database_interface
:members:
:undoc-members:

reset_database.py
-----------------
.. automodule:: jwql.database.reset_database
:members:
:undoc-members:
Introduction
------------

JWQL uses django database models to create tables, update table fields, add new data to
tables, and retrieve data from tables. Instrument monitors, in particular, need to keep a
few points in mind when working with these models.

In general, django model documentation can be found
`on the django website <https://docs.djangoproject.com/en/5.0/#the-model-layer>`_.
Unfortunately, finding a particular bit of documentation in django can be a challenge, so
a few quick-reference notes are provided below.

Retrieving Data
---------------

Django retrieves data directly from its model tables. So, for example, if you want to
select data from the `MIRIMyMonitorStats` table, you must first import the relevant
object:

.. code-block:: python

    from jwql.website.apps.jwql.monitor_models.my_monitor import MIRIMyMonitorStats

Then, you would access the database contents via the `objects` member of the class. For
example, to search the `MIRIMyMonitorStats` table for all entries matching a given
aperture, and to sort them with the most recent date at the top, you might do a query like
the following:

.. code-block:: python

    aperture = "my_miri_aperture"

    records = MIRIMyMonitorStats.objects.filter(aperture__iexact=aperture).order_by("-mjd_end").all()

In the above code,

* The `filter()` function selects matching records from the full table. You can use
  multiple filter statements, or a single filter function with multiple filters. `filter()`
  statements are always combined with an implicit AND.
* If you have a long filter statement and want to separate it from the query statement,
  you can create a dictionary and pass it in with `**` prepended. The dictionary
  equivalent to the above would be `{'aperture__iexact': aperture}`.
* The text before the double underscore is a field name, and the text afterwards describes
  the type of comparison. `iexact` indicates "case-insensitive exact match". You can also
  use a variety of other comparisons, many of which mirror standard SQL operators
  (`contains`, `startswith`, `gte`, etc.).
* If you want to get only records that *don't* match a pattern, then you can use the
  `exclude()` function, which otherwise operates exactly the same as `filter()`.
* In the `order_by()` function, the `-` at the start is used to reverse the sort order,
  and the `mjd_end` is the name of the field to be sorted by.
* The `all()` statement indicates that you want all the values returned; after `filter()`
  it is optional, because `filter()` already returns every matching record. `get()` returns
  a single object rather than an iterable collection, and raises an exception unless
  exactly one record matches. `first()` returns only the first record (or `None` if there
  are no matches), etc.
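
The lookup syntax described in the list above can be illustrated without a database. The
following is a minimal sketch in plain Python, not django code: `toy_filter`, `LOOKUPS`,
and the sample records are hypothetical names invented for this illustration, and only a
handful of comparisons are mimicked. It shows how a `field__comparison` keyword splits
into a field name and a comparison, how several conditions combine with an implicit AND,
and why the keyword form and the `**` dictionary form are interchangeable.

```python
# Stand-in for illustration only: this is NOT django code, just plain Python
# mimicking how "field__lookup" keyword arguments are interpreted.
LOOKUPS = {
    "exact": lambda value, target: value == target,
    "iexact": lambda value, target: str(value).lower() == str(target).lower(),
    "gte": lambda value, target: value >= target,
    "startswith": lambda value, target: str(value).startswith(target),
}


def toy_filter(records, **conditions):
    """Return the records (plain dicts) matching every condition (implicit AND)."""
    matches = []
    for record in records:
        keep = True
        for key, target in conditions.items():
            # Text before the double underscore is the field name,
            # text after it names the comparison ("exact" if omitted).
            field, _, lookup = key.partition("__")
            if not LOOKUPS[lookup or "exact"](record[field], target):
                keep = False
                break
        if keep:
            matches.append(record)
    return matches


records = [
    {"aperture": "MY_MIRI_APERTURE", "mjd_end": 59990.0},
    {"aperture": "my_miri_aperture", "mjd_end": 60010.0},
    {"aperture": "other_aperture", "mjd_end": 60020.0},
]

# Keyword form and dictionary form (unpacked with **) are equivalent.
by_keyword = toy_filter(records, aperture__iexact="my_miri_aperture", mjd_end__gte=60000)
conditions = {"aperture__iexact": "my_miri_aperture", "mjd_end__gte": 60000}
by_dict = toy_filter(records, **conditions)

# Sort newest first, the plain-Python counterpart of order_by("-mjd_end").
newest_first = sorted(toy_filter(records, aperture__iexact="my_miri_aperture"),
                      key=lambda record: record["mjd_end"], reverse=True)
```

In real monitor code the same keywords would be passed to `objects.filter()` on the model
class, and django would translate them into SQL rather than looping in Python.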

As an example of multiple filters, the code below:

.. code-block:: python

    records = MIRIMyMonitorStats.objects.filter(aperture__iexact=ap, mjd_end__gte=60000)

    filters = {
        "aperture__iexact": ap,
        "mjd_end__gte": 60000
    }
    records = MIRIMyMonitorStats.objects.filter(**filters)

shows two different ways of combining a search for a particular aperture *and* only data
taken more recently than MJD=60000.
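
For readers less used to thinking in Modified Julian Dates, the sketch below converts
between MJD and calendar dates using only the standard library. The helper names
`mjd_to_datetime` and `datetime_to_mjd` are hypothetical (they are not JWQL functions),
and the conversion treats every day as exactly 86400 seconds, so it is a convenience for
choosing thresholds rather than a precise time-scale conversion.

```python
import datetime

# MJD 0 corresponds to 1858-11-17 00:00 (treated here as UTC).
MJD_EPOCH = datetime.datetime(1858, 11, 17, tzinfo=datetime.timezone.utc)


def mjd_to_datetime(mjd):
    """Convert a Modified Julian Date to a timezone-aware datetime."""
    return MJD_EPOCH + datetime.timedelta(days=mjd)


def datetime_to_mjd(moment):
    """Convert a timezone-aware datetime to a Modified Julian Date."""
    return (moment - MJD_EPOCH) / datetime.timedelta(days=1)


threshold = mjd_to_datetime(60000)
print(threshold.date().isoformat())  # 2023-02-25
```

So a filter of `mjd_end__gte=60000` keeps data whose `mjd_end` falls on or after
25 February 2023.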

Note that django executes queries lazily, meaning that it will only actually *do* the
query when it needs the results. The above statement, for example, will not actually
run the query. Instead, it will be run when you operate on it, such as

* Getting the length of the result with e.g. `len(records)`
* Printing out any of the results
* Asking for the value of one of the fields (e.g. `records[3].aperture`)
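
Lazy evaluation can be easier to picture with a small analogue. The sketch below is not
how django implements its queries; `LazyQuery` is a hypothetical class built on the
standard-library `sqlite3` module, with an `executions` counter added purely so that the
moment the database is contacted becomes visible.

```python
import sqlite3


class LazyQuery:
    """Toy analogue of a lazily evaluated query (not django's implementation)."""

    def __init__(self, connection, sql, params=()):
        self.connection = connection
        self.sql = sql
        self.params = params
        self._rows = None      # nothing fetched yet
        self.executions = 0    # counts how often the database was actually hit

    def _fetch(self):
        # Run the query only on first use, then reuse the cached rows.
        if self._rows is None:
            self._rows = self.connection.execute(self.sql, self.params).fetchall()
            self.executions += 1
        return self._rows

    def __len__(self):
        return len(self._fetch())

    def __getitem__(self, index):
        return self._fetch()[index]

    def __iter__(self):
        return iter(self._fetch())


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE stats (aperture TEXT, mjd_end REAL)")
connection.executemany("INSERT INTO stats VALUES (?, ?)",
                       [("ap_a", 59990.0), ("ap_a", 60010.0), ("ap_b", 60020.0)])

records = LazyQuery(connection,
                    "SELECT aperture, mjd_end FROM stats WHERE mjd_end >= ? ORDER BY mjd_end DESC",
                    (60000,))
executions_before = records.executions   # still 0: defining the query ran nothing
n_records = len(records)                 # first use triggers the query
latest = records[0]                      # served from the cached rows
```

The practical consequence for monitor code is the same as in this toy: building up a
query with `filter()` and `order_by()` costs nothing until the results are actually used.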

Q Objects
=========

In order to make more complex queries, Django supplies "Q Objects", which are essentially
encapsulated filters which can be combined using logical operators. For more on this, see
`the django Q object documentation <https://docs.djangoproject.com/en/5.0/topics/db/queries/#complex-lookups-with-q-objects>`_.
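
In django itself, a query such as "this aperture OR recent data" would be written along
the lines of `filter(Q(aperture__iexact=ap) | Q(mjd_end__gte=60000))`, with `Q` imported
from `django.db.models`. To show the idea of combinable, encapsulated filters without
requiring a database, the sketch below uses a hypothetical stand-in class named
`Condition`. It is not django's `Q` class and shares none of its internals; it only
mirrors the way `&`, `|`, and `~` combine conditions.

```python
class Condition:
    """Toy stand-in for an encapsulated filter (not django's Q class)."""

    def __init__(self, test):
        self.test = test  # callable taking a record and returning True/False

    def __and__(self, other):
        return Condition(lambda record: self.test(record) and other.test(record))

    def __or__(self, other):
        return Condition(lambda record: self.test(record) or other.test(record))

    def __invert__(self):
        return Condition(lambda record: not self.test(record))

    def __call__(self, record):
        return self.test(record)


is_miri = Condition(lambda r: r["aperture"].lower() == "my_miri_aperture")
is_recent = Condition(lambda r: r["mjd_end"] >= 60000)

records = [
    {"aperture": "MY_MIRI_APERTURE", "mjd_end": 59990.0},
    {"aperture": "my_miri_aperture", "mjd_end": 60010.0},
    {"aperture": "other_aperture", "mjd_end": 60020.0},
    {"aperture": "other_aperture", "mjd_end": 59000.0},
]

# OR: matching aperture, or recent data, or both.
either = [r for r in records if (is_miri | is_recent)(r)]

# AND with NOT: matching aperture but not recent.
old_miri = [r for r in records if (is_miri & ~is_recent)(r)]
```

An OR like the first query is what plain `filter()` keywords cannot express, since those
are always combined with AND.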

Storing Data
------------

Django also uses the model tables (and objects) directly for storing new data. For example,
if you have a monitor table defined as below:

.. code-block:: python

    from django.db import models
    from django.contrib.postgres.fields import ArrayField

    class NIRISSMyMonitorStats(models.Model):
        aperture = models.CharField(blank=True, null=True)
        mean = models.FloatField(blank=True, null=True)
        median = models.FloatField(blank=True, null=True)
        stddev = models.FloatField(blank=True, null=True)
        counts = ArrayField(models.FloatField())
        entry_date = models.DateTimeField(blank=True, null=True)

        class Meta:
            managed = True
            db_table = 'niriss_my_monitor_stats'
            unique_together = (('id', 'entry_date'),)
            app_label = 'monitors'

then you would create a new entry as follows:

.. code-block:: python

    import datetime

    # mean, median, stddev, and counts are numpy values computed by the monitor
    values = {
        "aperture": "my_aperture",
        "mean": float(mean),
        "median": float(median),
        "stddev": float(stddev),
        "counts": list(counts.astype(float)),
        "entry_date": datetime.datetime.now()
    }

    entry = NIRISSMyMonitorStats(**values)
    entry.save()

There are (as usual) a few things to note above:

* Django's core model fields don't include an array data type, so `ArrayField` must be
  imported from the PostgreSQL compatibility layer (`django.contrib.postgres.fields`).
  The `ArrayField` takes, as a required argument, the type of data that makes up the array.
* In the Meta sub-class of the monitor class, the `app_label = 'monitors'` statement is
  required so that django knows that the model belongs to the `monitors` app.
* The `float()` casts are required because the database interface doesn't understand
  numpy data types.
* The `list()` cast is required because the database interface doesn't understand the
  numpy `ndarray` data type.
8 changes: 7 additions & 1 deletion docs/source/website.rst
@@ -44,6 +44,12 @@ monitor_views.py
:members:
:undoc-members:

monitor_models
--------------
.. automodule:: jwql.website.apps.jwql.monitor_models.common
:members:
:undoc-members:

settings.py
-----------
.. automodule:: jwql.website.jwql_proj.settings
@@ -60,4 +66,4 @@ views.py
--------
.. automodule:: jwql.website.apps.jwql.views
:members:
:undoc-members:
:undoc-members:
66 changes: 31 additions & 35 deletions jwql/instrument_monitors/common_monitors/readnoise_monitor.py
@@ -47,14 +47,16 @@
import matplotlib.pyplot as plt # noqa: E348 (comparison to true)
import numpy as np # noqa: E348 (comparison to true)
from pysiaf import Siaf # noqa: E348 (comparison to true)
from sqlalchemy.sql.expression import and_ # noqa: E348 (comparison to true)

from jwql.database.database_interface import FGSReadnoiseQueryHistory, FGSReadnoiseStats # noqa: E348 (comparison to true)
from jwql.database.database_interface import MIRIReadnoiseQueryHistory, MIRIReadnoiseStats # noqa: E348 (comparison to true)
from jwql.database.database_interface import NIRCamReadnoiseQueryHistory, NIRCamReadnoiseStats # noqa: E348 (comparison to true)
from jwql.database.database_interface import NIRISSReadnoiseQueryHistory, NIRISSReadnoiseStats # noqa: E348 (comparison to true)
from jwql.database.database_interface import NIRSpecReadnoiseQueryHistory, NIRSpecReadnoiseStats # noqa: E348 (comparison to true)
from jwql.database.database_interface import session, engine # noqa: E348 (comparison to true)

# Need to set up django apps before we can access the models
from django import setup
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "jwql.website.jwql_proj.settings")
setup()

# PEP8 will undoubtedly complain, but the file is specifically designed so that everything
# importable is a monitor class.
from jwql.website.apps.jwql.monitor_models.readnoise import *

from jwql.shared_tasks.shared_tasks import only_one, run_pipeline, run_parallel_pipeline # noqa: E348 (comparison to true)
from jwql.instrument_monitors import pipeline_tools # noqa: E348 (comparison to true)
from jwql.utils import instrument_properties, monitor_utils # noqa: E348 (comparison to true)
@@ -149,17 +151,9 @@ def file_exists_in_database(self, filename):
file_exists : bool
``True`` if filename exists in the readnoise stats database.
"""

query = session.query(self.stats_table)
results = query.filter(self.stats_table.uncal_filename == filename).all()

if len(results) != 0:
file_exists = True
else:
file_exists = False

session.close()
return file_exists

results = self.stats_table.objects.filter(uncal_filename__iexact=filename).values()
return (len(results) != 0)

def get_amp_stats(self, image, amps):
"""Calculates the sigma-clipped mean and stddev, as well as the
@@ -385,17 +379,19 @@ def most_recent_search(self):
Date (in MJD) of the ending range of the previous MAST query
where the readnoise monitor was run.
"""

query = session.query(self.query_table).filter(and_(self.query_table.aperture == self.aperture,
self.query_table.run_monitor == True)).order_by(self.query_table.end_time_mjd).all() # noqa: E712 (comparison to True)

filter_kwargs = {
'aperture__iexact': self.aperture,
'run_monitor__exact': True
}
query = self.query_table.objects.filter(**filter_kwargs).order_by("-end_time_mjd").all()

if len(query) == 0:
query_result = 59607.0 # a.k.a. Jan 28, 2022 == First JWST images (MIRI)
logging.info(('\tNo query history for {}. Beginning search date will be set to {}.'.format(self.aperture, query_result)))
else:
query_result = query[-1].end_time_mjd
query_result = query[0].end_time_mjd

session.close()
return query_result

def process(self, file_list):
@@ -512,24 +508,24 @@ def process(self, file_list):
'readnoise_filename': os.path.basename(readnoise_outfile),
'full_image_mean': float(full_image_mean),
'full_image_stddev': float(full_image_stddev),
'full_image_n': full_image_n.astype(float),
'full_image_bin_centers': full_image_bin_centers.astype(float),
'full_image_n': list(full_image_n.astype(float)),
'full_image_bin_centers': list(full_image_bin_centers.astype(float)),
'readnoise_diff_image': os.path.basename(readnoise_diff_png),
'diff_image_mean': float(diff_image_mean),
'diff_image_stddev': float(diff_image_stddev),
'diff_image_n': diff_image_n.astype(float),
'diff_image_bin_centers': diff_image_bin_centers.astype(float),
'diff_image_n': list(diff_image_n.astype(float)),
'diff_image_bin_centers': list(diff_image_bin_centers.astype(float)),
'entry_date': datetime.datetime.now()
}
for key in amp_stats.keys():
if isinstance(amp_stats[key], (int, float)):
readnoise_db_entry[key] = float(amp_stats[key])
else:
readnoise_db_entry[key] = amp_stats[key].astype(float)

readnoise_db_entry[key] = list(amp_stats[key].astype(float))
# Add this new entry to the readnoise database table
with engine.begin() as connection:
connection.execute(self.stats_table.__table__.insert(), readnoise_db_entry)
entry = self.stats_table(**readnoise_db_entry)
entry.save()
logging.info('\tNew entry added to readnoise database table')

# Remove the raw and calibrated files to save memory space
@@ -658,15 +654,15 @@ def run(self):
'files_found': len(new_files),
'run_monitor': monitor_run,
'entry_date': datetime.datetime.now()}
with engine.begin() as connection:
connection.execute(self.query_table.__table__.insert(), new_entry)
stats_entry = self.query_table(**new_entry)
stats_entry.save()
logging.info('\tUpdated the query history table')

logging.info('Readnoise Monitor completed successfully.')


if __name__ == '__main__':

module = os.path.basename(__file__).strip('.py')
start_time, log_file = monitor_utils.initialize_instrument_monitor(module)
