Dev
This documentation is for individuals either wanting to contribute to Heatmapper, or deploy it to a server.
For reference, you should look at Running Client-Side PyShiny, as those instructions are also applicable to hosting Heatmapper on a server.
The setup.sh script, located in the root of the repository, can set up a complete environment for running Heatmapper: it creates a virtual environment, installs dependencies, clones Heatmapper, and resolves LFS files. It’s a bash script, so deployment on a Windows server will need to be done manually. To download it from the repo, find setup.sh in the file list. Clicking on it should open GitHub’s file viewer, which has a download button in the top-right corner. Or, if you only have access to a terminal, you can curl the script via:
curl -O https://raw.githubusercontent.com/WishartLab/heatmapper2/main/setup.sh
From there, make it executable:
chmod +x setup.sh
Then, place it into the directory you want Heatmapper to live in. setup.sh will create two directories:
- The Python virtual environment in venv
- The Heatmapper source code in heatmapper2
Once the script is finished, or you’ve manually handled dependencies and installation, you’ll need to activate the virtual environment (assuming you’re using a venv and not just installing dependencies to the system). From the folder containing the venv folder, run:
source venv/bin/activate
You can deactivate the virtual environment at any time by typing:
deactivate
Now, enter the heatmapper2 directory. For batch deployment, there are two scripts which automate the process:
- deploy.sh will deploy each application on the host, starting at port 8000 for Expression, and ending with 8006 for Spatial. Each application will run in a separate process, so the script (and user session) can be closed without tearing down the applications themselves.
- teardown.sh will send a KILL signal to all applications listening on ports 8000-8006. If you’re only selectively hosting Heatmapper’s applications, this might kill unrelated applications if they’re listening on one of those ports.
However, if you want to be more selective about which applications are run, you have two primary options:
- Running it as a PyShiny application. To do this, navigate into the project to run, such as expression, and enter its src directory. From there, execute: shiny run --host 0.0.0.0. The host argument is important, as it makes the application listen on all network interfaces. If you want to enable reloading, so that changes to the src folder are picked up by the running application without needing to stop it, add --reload. To specify a port, use the --port argument.
- Running it as a static, WebAssembly application. This mode will instead serve a connecting client the WebAssembly files, which are then run on their computer. Within the project folder, such as expression, there are two sub-folders, src and site. Simply run python3 -m http.server --directory site --bind localhost 8008, where the value 8008 specifies the port.
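When several applications are started selectively, the PyShiny invocation described above can be assembled programmatically. The following is a minimal sketch and not part of Heatmapper: ShinyCommand is a hypothetical helper, and it only uses the --host, --port, and --reload flags mentioned above.

```python
from typing import List


def ShinyCommand(port: int, host: str = "0.0.0.0", reload: bool = False) -> List[str]:
	"""
	@brief Build the argument list for `shiny run` (hypothetical helper).
	@param port The port the application should listen on.
	@param host The interface to bind; 0.0.0.0 listens on all interfaces.
	@param reload Whether to append --reload so source changes are picked up.
	@returns The command as a list, suitable for subprocess.run.
	"""
	command = ["shiny", "run", "--host", host, "--port", str(port)]
	if reload: command.append("--reload")
	return command


if __name__ == "__main__":
	# Print the command; run it from inside a project's src directory.
	print(" ".join(ShinyCommand(8000, reload=True)))
```

The resulting list could be handed to subprocess.run with cwd set to a project's src directory, such as expression/src.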
This section outlines some general guidance on working within the Heatmapper repository.
For sake of consistency, Python Code should:
- Always use from imports, rather than importing the entire module: do from shiny import App, not import shiny
- Use double quotes rather than single quotes for strings
- Use tabs, rather than spaces
- For naming convention:
  - Local variables should use snake_case
  - Global variables, functions, classes, and Shiny IDs should use PascalCase
- Prefer code that is more concise. If a function only has a single line, put it on the same line as the function definition, such as
async def Reset(): await DataCache.Purge(input)
- Strive to consistently document the code-base. All non-trivial functions should have doc strings, which should follow Doxygen format.
- Use shared.py definitions over creating something custom. If functionality is missing, add it to the shared.py implementation.
- Always use the Cache object for handling input.
- Always use the Filter function to determine column names.
- Always use the NavBar function to create a navigation bar shared across all applications.
- shared.py should always be a symlink within the src folder. Do not copy it.
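As an illustration of the conventions above, the following short sketch applies them to a pair of hypothetical helpers (HasSuffix and IsTable are invented for this example and do not exist in shared.py). It uses only the standard library so that it runs on its own.

```python
from pathlib import Path

# Global variables use PascalCase.
DefaultSuffix = ".csv"


def HasSuffix(path: str, suffix: str = DefaultSuffix) -> bool:
	"""
	@brief Check whether a path ends with the given suffix.
	@param path The path to inspect.
	@param suffix The suffix to compare against, including the leading dot.
	@returns True if the suffixes match, ignoring case.
	"""
	# Local variables use snake_case.
	file_suffix = Path(path).suffix.lower()
	return file_suffix == suffix.lower()


# A single-line function lives on the same line as its definition.
def IsTable(path: str) -> bool: return HasSuffix(path, ".csv") or HasSuffix(path, ".tsv")
```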
When creating a new application, there are a few things to note:
- You should create a DataCache variable from the shared.Cache class, which will handle all your user input. This should be in the server function.
- If you need to extend the Cache, such as adding more file types, create a function that you can pass to the Cache call.
  - Treat it like a switch statement. You will be passed a single argument, path. Compare against the suffix to see if it matches your custom file type. If it doesn’t, return DataCache.DefaultHandler(path). Do not modify the Default Handler, as doing so bogs down all the applications.
- FileSelection should be used to generate the UI for uploading/selecting input. Importantly:
  - It will create the Shiny input IDs SourceFile for whether the user is selecting Upload/Example, File for the user-uploaded file, and Example for the selected example. Additionally, it will create the ExampleInfoButton and ExampleInfo IDs. ID conflicts cause Shiny to fail.
  - You will need to manually set ExampleInfo. The easiest way is to make a reactive function that looks at a dictionary defined in the server: def ExampleInfo(): return Info[input.Example()]
  - The multiple argument should be used with caution. It requires you to manually handle parsing input. See Spatial for an implementation.
- The MainTab function supports adding additional tabs via the *args argument. See Spatial or Expression for implementations. It will create the IDs Interactive, which should be your main page Heatmap, and Table, which you shouldn’t need to touch, as it handles creating all the associated values; the tab set itself has an ID of MainTab. You may need to add the IDs Update and Reset so that your reactive functions update when the user updates the table.
- You will need to manually filter columns. This involves calling Filter in a reactive function with the following arguments:
  - The input, usually (await DataCache.Load(input)).columns
  - The type of column to look for; see shared.py for values.
  - A UI element to update, such as NameColumn
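The custom Cache handler described above (a function that receives path, checks the suffix, and otherwise defers to the default handler) might look like the following sketch. StubCache is a stand-in written purely so the example runs on its own; it is not the real shared.Cache, and the .json handling is an invented file type for illustration. How the function is passed to the Cache call is not shown, since that signature is not given here.

```python
import json
from pathlib import Path


class StubCache:
	"""Stand-in for shared.Cache so this sketch is self-contained; not the real class."""

	@staticmethod
	def DefaultHandler(path): return f"default:{Path(path).name}"


DataCache = StubCache()


def HandleData(path):
	"""
	@brief Parse custom file types, deferring everything else to the default handler.
	@param path The path of the file to parse.
	@returns The parsed object for known suffixes, else the default handler's result.
	"""
	suffix = Path(path).suffix.lower()
	# Treat the suffix checks like the cases of a switch statement.
	if suffix == ".json": return json.loads(Path(path).read_text())
	# No custom type matched: fall through to the default handler.
	return DataCache.DefaultHandler(path)
```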
When changes are made within the code-base, they are not automatically reflected in the WebAssembly site, which can cause incongruity when pushed to GitHub. Run the rebase.sh script at the root of the repository to bring the WebAssembly sites of all applications up to date.
Heatmapper is designed to be easily deployed for different purposes, and to this effect most of the interface can be modified without having to modify the code itself (technically you do modify code, but only so that the configuration is bundled into the WebAssembly build).
Each project contains a config.py file, a Python file which provides defaults and overrides for every configurable option in that program. However, the base config.py is within Heatmapper’s version control system, which means that modifying it can cause clashes when attempting to update. For that reason, you should copy config.py, creating a file named user.py. Heatmapper will first check if user.py exists, and use that for configuration, only falling back to config.py if the former doesn’t exist. Do not modify config.py.
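The user.py-before-config.py lookup can be pictured with the sketch below. It is an assumption about how such a fallback could be written, not a copy of Heatmapper's own loading code; FirstAvailable is a hypothetical helper built on the standard library's importlib.util.find_spec.

```python
from importlib.util import find_spec


def FirstAvailable(*names: str) -> str:
	"""
	@brief Return the first top-level module name that can be imported.
	@param names Candidate module names, in order of preference.
	@returns The first name for which a module spec is found.
	"""
	for name in names:
		# find_spec returns None when a top-level module cannot be located.
		if find_spec(name) is not None: return name
	raise ModuleNotFoundError(f"None of {names} could be found")
```

Inside a project's src directory, importlib.import_module(FirstAvailable("user", "config")) would then load user.py when present and config.py otherwise.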
Consider the configuration provided in Pairwise:
# Distance/Correlation
"MatrixType": Config(selected="Distance", visible=True),
This variable is attached to the associated input.MatrixType, which defines whether the user wants to select a Distance Matrix or a Correlation Matrix. Let’s break it down:
- MatrixType is the input name. It cannot be modified, as it’s explicitly used within the main program. You cannot add new configurations (every user input that can be modified is already present in the file).
- Config is from shared.py, and is simply a class that wraps configuration. Every configuration is a Config object.
- selected is the only required argument of any configuration. This specifies what Heatmapper should assign as the default value when loading the application. A comment above each Config outlines what your values can be. Some configurations use value instead, which is simply because some inputs “select” a value, such as the titular ui.input_select, whereas others simply have a value, such as ui.input_checkbox. The configuration already provides the correct keyword, so this has no impact on actually configuring the application so long as the original configuration keyword isn’t deleted.
- visible is an optional argument that defaults to True. When visible is True, the associated user input in the sidebar will be visible when loading the application, and the user can make modifications to the value. When visible is False, the input will be hidden from the sidebar, and the user will be unable to change the selected value. This is useful where an application has no need for the option to be available (such as only needing to display Distance Matrices) and helps declutter the sidebar and prevent user confusion.
- Finally, something that is not shown in any of the default configurations is that the Config class takes any keyword argument and stores it, applying it directly to the Shiny input object. Therefore, if we wanted to make sure the MatrixType radio buttons are not inline, we could modify the configuration to "MatrixType": Config(selected="Distance", inline=False). You may notice that Heatmapper already defines inline=True within Pairwise’s code, but Config objects check for these conflicts, and the Configuration takes precedence. You can therefore override all of the parameters of the input, save the input type itself. Refer to Shiny’s documentation if you want to make any such changes, and note that some modifications may cause issues with the application (e.g. specifying multiple=True where Heatmapper does not expect multiple inputs).
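The precedence rule described above, where keywords stored in the configuration win over keywords written in the application code, can be sketched as follows. ConfigSketch is a simplified stand-in and not the real shared.Config; returning None for a hidden input is an assumption made for the example, and FakeRadioButtons merely echoes its keyword arguments in place of a Shiny input function.

```python
class ConfigSketch:
	"""Simplified stand-in for shared.Config; the real class may differ."""

	def __init__(self, visible=True, **kwargs):
		self.visible = visible
		# Every other keyword is stored and later applied to the input.
		self.overrides = kwargs

	def UI(self, widget, **kwargs):
		"""
		@brief Build the input, letting configuration keywords win conflicts.
		@param widget The function that creates the input.
		@param kwargs The keywords written in the application code.
		@returns The created input, or None when the input is hidden.
		"""
		# Later entries win, so the configuration overrides the code's values.
		merged = {**kwargs, **self.overrides}
		return widget(**merged) if self.visible else None


def FakeRadioButtons(**kwargs): return kwargs


if __name__ == "__main__":
	matrix_type = ConfigSketch(selected="Distance", inline=False)
	print(matrix_type.UI(FakeRadioButtons, id="MatrixType", inline=True))
```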
Heatmapper has some configurations that do not have an associated value. There are two such types, both of which warrant additional explanation:
- Configurations that are only there for visibility. Consider: "DownloadTable": Config(). This is an input that doesn’t expose any “values”; it’s simply a button. These configurations exist to toggle visibility of features through the visible keyword.
- Configurations that are dynamic inputs. Examples include "Keys": Config() in Spatial, and "KeyColumn": Config() in Geomap. These inputs are dynamically updated by Heatmapper because input files often use different column names for the same data, such as some files using NAME, others using KEY, etc. While these configurations support both the selected= and visible= keywords, the behavior differs in important ways:
  - When visible=True, the selected value will be used as the default, so long as it exists in the data. If you define selected="NAME", Heatmapper will default to that column name (case-sensitive; remember, the user can still change this value when the input is visible) until a file is provided where the column does not exist. When that happens, Heatmapper will use its Filtering mechanism and automatically choose a more appropriate column name.
  - When visible=False, the selected value is constant and unchanging. Even if the column doesn’t exist in the input data, Heatmapper will use it. This means that you need to be very careful with what you select for a default value, and what input you provide to the application: if the column name doesn’t exist, Heatmapper will not rectify the incongruity and will simply fail to render.
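The two behaviors of a dynamic input can be summarized as a small decision function. This is only a sketch of the rules as described above, not Heatmapper's implementation; ResolveColumn and its choose parameter (standing in for the Filtering mechanism) are hypothetical names.

```python
from typing import Callable, Sequence


def ResolveColumn(columns: Sequence[str], selected: str, visible: bool, choose: Callable[[Sequence[str]], str]) -> str:
	"""
	@brief Decide which column a dynamic input ends up using.
	@param columns The column names present in the loaded file.
	@param selected The configured default column name (case-sensitive).
	@param visible Whether the input is shown in the sidebar.
	@param choose Stand-in for the Filtering mechanism's automatic choice.
	@returns The column name that would be used.
	"""
	# Hidden inputs are constant, even if the column is missing from the data.
	if not visible: return selected
	# Visible inputs keep the configured default while it exists in the data.
	if selected in columns: return selected
	# Otherwise the automatic choice takes over.
	return choose(columns)
```

With visible=False and a missing column, the function still returns the configured name, which mirrors the failure-to-render case described above.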
Column filtering is an important facet of Heatmapper’s design, so it’s recommended not to touch the dynamic inputs, and especially not to disable their visibility, as doing so ties the application to hard-coded values that are antithetical to its design. However, if your use-case requires very specific file formats, where the column names are known and will not change, disabling the Filtering can reduce user confusion.
If you’re working within the code-base, you may wonder how to actually work with Configuration values. In essence, they’re just wrappers on Shiny’s input values (if the input UIs aren’t visible, that’s literally all they are). They can’t be used as reactive decorators, but with caching you shouldn’t need to use reactive decorators in the first place.
Configuration variables are optional. You can use regular Shiny inputs just as well as you can use configuration values, but while you don’t need to use the former to use the latter, the reverse is not true. To create a Configuration value, there are three steps:
- Define the Config entry within the config.py file. See the above Configuration section on its structure.
- Wrap the ui.input value in the app_ui with the Configuration’s UI member. For example, if you have a config "MatrixType": Config(), you’ll want to take the Shiny input with id="MatrixType" within the app_ui, and change it to: config.MatrixType.UI(ui.input, id="MatrixType", ...) Some things to note:
  - The ui.input object does not take the keyword arguments itself; don’t do ui.input(id="MatrixType", ...)
  - You must exclusively use keyword arguments, and they’ll be passed to the ui.input object
- Replace uses of input.X() with config.X(). Don’t use them in reactive decorators.
Heatmapper employs two types of Caching, Web Resource Caching and Computation Caching:
Web Resource Caching should always be utilized, and if you fetch information using the DataCache it will be done automatically. You’ll need to use the FileSelection function within your app_ui. If you need to fetch more than just a single example, you can fetch any arbitrary content using the Cache. Consider an example from Geomap. Firstly, you need to define a reactive variable, and an updater function:
JSON = reactive.value(None)
#...
@reactive.effect
@reactive.event(input.JSONUpload, input.JSONSelection, input.JSONFile)
async def UpdateGeoJSON(): JSON.set(await DataCache.Load(
input,
source_file=input.JSONUpload(),
example_file=input.JSONSelection(),
source=URL,
input_switch=input.JSONFile(),
default=None
))
# ...
json = JSON()
Some things to note:
- Use a reactive variable. Constantly querying the Cache is wasteful and inefficient.
- Ensure you have reactive decorators. This is one of the only functions in Heatmapper that should have decorators, as it will cause the reactive variable to be modified, and will trigger all functions that rely on it.
- Make it asynchronous; as with decorators, this will be the only function where you should do this, and you should only let the server call this function. When you need the value, call the variable: json = JSON().
- You cannot use configuration values for the reactive values. You need to use regular Shiny input values.
- Note the arguments to the Cache:
  - source_file is a Shiny ui.input_file. Shiny and Heatmapper handle taking user input and parsing it.
  - example_file dictates the name of the example file. It can take one of two formats:
    - A file name relative to the source variable. By default, this points to your example directory, so if you have a file stored in example_input/my_test, input.JSONSelection() can simply be my_test.
    - A URL. If the example_file starts with https://, source will be completely ignored and the example_file will be fetched directly. Look at Geomap’s example files to see how one of the examples is fetched from outside the normal place, simply by using a URL.
  - source defines where example_file will be located. Usually, this is the example_input folder for the application, but you can set it wherever you want. For this example, the URL points to data within the Geomap folder. Importantly, this source has to be local when running as a server, or remote when running as WebAssembly. You’ll need to use the Pyodide variable in shared to know what mode Heatmapper is running in; for Geomap, it sets the URL to ../data in server mode and a link to GitHub otherwise. Heatmapper expects example files to be located on disk when not running under Pyodide.
  - input_switch defines the input that determines whether we’re expecting an example file, or a user-uploaded file. If it’s equal to "Upload", it’ll be looking at source_file, otherwise it looks at example_file.
  - default defines what to return if there’s nothing to return. This defaults to a DataFrame, but you may want to change it to whatever type you expect to return, otherwise you might get unexpected objects when there is nothing to return.
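How source and example_file combine, as described in the notes above, can be sketched with two small functions. These are illustrative assumptions rather than the Cache's actual code: ExampleSource and ResolveExample are hypothetical names, the pyodide flag stands in for the Pyodide variable in shared, and the file names and the example.invalid address in the usage are placeholders.

```python
def ExampleSource(pyodide: bool, local: str, remote: str) -> str:
	"""
	@brief Pick where example files live for the current mode.
	@param pyodide True when running as WebAssembly, False when running as a server.
	@param local The on-disk location used in server mode.
	@param remote The remote location used under WebAssembly.
	@returns The location that example files should be resolved against.
	"""
	return remote if pyodide else local


def ResolveExample(source: str, example_file: str) -> str:
	"""
	@brief Work out where an example file would be fetched from.
	@param source The base location for example files.
	@param example_file A file name relative to source, or a full https:// URL.
	@returns The final location of the example file.
	"""
	# A full URL bypasses source entirely.
	if example_file.startswith("https://"): return example_file
	return f"{source.rstrip('/')}/{example_file}"


if __name__ == "__main__":
	base = ExampleSource(False, "../data", "https://example.invalid/data")
	print(ResolveExample(base, "my_test"))
```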
Heatmapper also supports arbitrary computation caching, although you’ll need to go out of your way to use it. In essence, you’ll be using three functions in your Cache object: In(), Get(), and Store(). Firstly, you’ll need to make a list of inputs that this computation uses. That way, changes to inputs will ensure that an invalid cached object isn’t returned. Heatmapper makes no effort to ensure all your inputs are accounted for. Consider the Image Caching used by Pairwise, Expression, and Image. Firstly, at the start of each Heatmap call, it creates a list of inputs:
inputs = [
input.File() if input.SourceFile() == "Upload" else input.Example(),
input.Image(),
config.ColorMap(),
config.Opacity(),
config.Algorithm(),
config.Levels(),
config.Features(),
config.TextSize(),
config.DPI(),
]
Notice that we take the value of these (i.e. it’s a list of values, not a list of reactive objects), and that we can be conditional about what values truly make up the hash (we don’t need both File and Example, we just need whatever is selected). Then, we use the first function, In():
if not DataCache.In(inputs):
# ...
It’s recommended to check the absence of the object in the Cache, compute it and place it in the cache, and then return it so that both branches in the condition have the same return statement. In the case that the object isn’t in the Cache, the application will do the regular computation to create the output, and then stores it in the Cache:
b = BytesIO()
fig.savefig(b, format="png", dpi=config.DPI())
b.seek(0)
DataCache.Store(b.read(), inputs)
Note that we cannot store Matplotlib plots directly; instead, we save the plot as an image, and store the image’s bytes within the Cache, associating them with the inputs used to make it. Finally, we return the object within the Cache:
b = DataCache.Get(inputs)
with NamedTemporaryFile(delete=False, suffix=".png") as temp:
	temp.write(b)
	temp.close()
img: types.ImgData = {"src": temp.name, "height": f"{config.Size()}vh"}
return img
The Temporary File shenanigans aren’t important; what is important is that we use Get() to retrieve that binary stream, and then return it appropriately.
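The check, compute, store, and return pattern described above can be tried in isolation with the sketch below. ComputationCacheSketch is a stand-in that mimics only the In(), Store(), and Get() calls named above; how the real Cache derives its key from the inputs is not documented here, so the tuple-of-strings key is an assumption, as are the Render helper and the input values in the usage.

```python
class ComputationCacheSketch:
	"""Stand-in mimicking the In/Store/Get calls; not the real shared.Cache."""

	def __init__(self): self._objects = {}

	@staticmethod
	def Key(inputs): return tuple(str(value) for value in inputs)

	def In(self, inputs): return self.Key(inputs) in self._objects

	def Store(self, obj, inputs): self._objects[self.Key(inputs)] = obj

	def Get(self, inputs): return self._objects[self.Key(inputs)]


def Render(cache, inputs, compute):
	"""
	@brief Return the cached object for these inputs, computing it only if absent.
	@param cache The cache object providing In, Store, and Get.
	@param inputs The list of input values the computation depends on.
	@param compute A function that produces the object when it is not cached.
	@returns The cached object.
	"""
	# Compute and store only when missing, so both paths share one return.
	if not cache.In(inputs): cache.Store(compute(), inputs)
	return cache.Get(inputs)


if __name__ == "__main__":
	cache = ComputationCacheSketch()
	print(Render(cache, ["example.csv", "Viridis", 300], lambda: b"image bytes"))
```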