Kyle Kernick edited this page Apr 4, 2024 · 6 revisions

Overview

Heatmapper largely works on tables, internally represented as pandas.DataFrame objects. For that reason, most of the applications support the following file types:

  • .csv via the pandas.read_csv() function.
  • .xls, .xlsx, .odf via the pandas.read_excel() function. It uses the default arguments, so the first sheet is taken.
  • .txt, .dat, .tsv, or .tab via the pandas.read_table() function. This function is also the fallback when an application does not recognize a file extension, so any other extension technically works as long as the input is a general delimited file.

Invalid values are filtered out regardless of which of these file types is loaded.
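
The extension-based dispatch described above can be sketched roughly as follows. This is an illustration only, not Heatmapper's actual loader: the names READERS, reader_name_for, and load_table are hypothetical, and the only pandas functions assumed are the three readers already named in the list.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the pandas reader named above.
READERS = {
    ".csv": "read_csv",
    ".xls": "read_excel",
    ".xlsx": "read_excel",
    ".odf": "read_excel",
}


def reader_name_for(filename: str) -> str:
    """Return the name of the pandas reader for a file, based on its extension."""
    suffix = Path(filename).suffix.lower()
    # Anything unrecognized (.txt, .dat, .tsv, .tab, ...) falls back to read_table.
    return READERS.get(suffix, "read_table")


def load_table(filename: str):
    """Load a Table File into a pandas.DataFrame using default reader arguments."""
    import pandas as pd  # imported here so the lookup above works without pandas

    return getattr(pd, reader_name_for(filename))(filename)
```

Because read_table is the fallback, a file such as data.xyz would be handed to pandas.read_table() in this sketch and parsed as a delimited file.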

Heatmapper determines the content of a column from its name: rather than reading values from the first column, it looks for a column named “Value.” This means that the order of columns does not matter. The sections below outline the supported files for each program and, where applicable, the information expected within those files. Unless otherwise noted, an application supports the file types listed above, referred to from here on as Table Files.

This document refers to Column Types, each of which is a set of common column names associated with a particular kind of data. Heatmapper looks for these column names, irrespective of case, and uses a matching column for the corresponding purpose. For example, to get the names associated with the data, Heatmapper looks for a Name column, which matches the column names “name”, “orf”, “uniqid”, “face”, and “triangle.” If no column name matches, Heatmapper falls back to a set difference: all of the columns, minus those it knows are not valid (it won’t offer a column named “longitude” when looking for a Name). These are then presented in the UI so that the user can choose one. The following table lists the Column Types used by Heatmapper and the names each explicitly looks for. For a more up-to-date list, check the Source Code.

Column Type    Expected Values (Case-Insensitive)
Time           TIME, DATE, YEAR
Name           NAME, ORF, UNIQID, FACE, TRIANGLE
Value          VALUE, WEIGHT, INTENSITY, IN_TISSUE
Longitude      LONGITUDE, LONG
Latitude       LATITUDE, LAT
X              X
Y              Y
Z              Z
Cluster        CELL TYPE, CELLTYPE_MAPPED_REFINED, CLUSTER, CELL_CLASS, CELL_SUBCLASS, CELL_CLUSTER
Free           Free dictates that Heatmapper does not expect a standardized column name. However, it will still filter out all the other column names mentioned in this table.
Spatial        SPATIAL

This document uses italicized references to these column types. So when a Heatmap uses a Name column, Heatmapper will use any column that matches the above values. If multiple matching columns exist, the heatmap usually offers a drop-down in its sidebar to choose one manually.
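
The matching and fallback rules described above can be sketched as follows. This is a simplified illustration rather than Heatmapper's source: COLUMN_TYPES and candidate_columns are hypothetical names, and the sets are simply copied from the table.

```python
# Column Types and the names they match, copied from the table above.
COLUMN_TYPES = {
    "Time": {"time", "date", "year"},
    "Name": {"name", "orf", "uniqid", "face", "triangle"},
    "Value": {"value", "weight", "intensity", "in_tissue"},
    "Longitude": {"longitude", "long"},
    "Latitude": {"latitude", "lat"},
    "X": {"x"},
    "Y": {"y"},
    "Z": {"z"},
    "Cluster": {"cell type", "celltype_mapped_refined", "cluster",
                "cell_class", "cell_subclass", "cell_cluster"},
    "Spatial": {"spatial"},
}


def candidate_columns(columns, column_type):
    """Return the columns to offer for a Column Type, preserving file order."""
    wanted = COLUMN_TYPES.get(column_type, set())  # "Free" has no expected names
    matches = [c for c in columns if str(c).lower() in wanted]
    if matches:
        return matches
    # Fallback: every column except those claimed by some Column Type.
    known = set().union(*COLUMN_TYPES.values())
    return [c for c in columns if str(c).lower() not in known]
```

With columns Longitude, LAT, Weight, and Sample, asking for Value yields only Weight, while asking for Name finds no match and falls back to Sample, the one column not claimed by any Column Type.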

Note: There is no limit on the size of the input, in either file size or dimensions. However, when running under WebAssembly, files are loaded into RAM, which could cause performance issues if the client computer lacks sufficient free memory. Additionally, arbitrary limits may be imposed on the server in order to balance demand.

Expression

Expression supports reading Table Files. It uses the Name column to associate a name with the respective data. For an example, see Here. All other columns are plotted, with unique names generated for columns that have no name or share a name with another column.

Pairwise

Pairwise supports reading Table Files, .PDB files, and .FASTA files.

For table files, Heatmapper supports three different formats:

  1. A table with a Name, X, Y, and Z column. These are turned into a distance matrix, labelled with the Name column. See here for an example.
  2. A table that contains a distance/correlation matrix. Column names are optional, and are used to label the data if they exist. See here for an example.
  3. A table that contains a distance/correlation matrix whose first column holds the names. See here for an example.
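
As an illustration of the coordinate format, the sketch below turns Name, X, Y, Z rows into a labelled distance matrix. It assumes Euclidean distance, which this page does not state explicitly, and distance_matrix is a hypothetical helper rather than Heatmapper's own function.

```python
import math


def distance_matrix(rows):
    """Build a labelled pairwise distance matrix from (name, x, y, z) rows."""
    names = [name for name, *_ in rows]
    points = [coords for _, *coords in rows]
    # Assumption: Euclidean distance between every pair of points.
    matrix = [[math.dist(p, q) for q in points] for p in points]
    return names, matrix


names, matrix = distance_matrix([
    ("A", 0.0, 0.0, 0.0),
    ("B", 3.0, 4.0, 0.0),
    ("C", 3.0, 4.0, 12.0),
])
```

The result is symmetric with zeros on the diagonal, and the names list supplies the row and column labels.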

For .PDB files, Heatmapper parses the file, takes the coordinates of all atoms in the specified Chain (see Interface), and computes a distance/correlation matrix from them. See here for an example.

For .FASTA files, Heatmapper parses the file, partitions the sequences into K-Mers (see Interface), and generates a distance/correlation matrix based on the counts of each K-Mer. See here for an example.
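
The K-Mer counting step can be sketched as below. This is a minimal illustration, assuming overlapping K-Mers and a plain FASTA layout; read_fasta and kmer_counts are hypothetical helpers, and how Heatmapper derives the final matrix from the counts is not shown here.

```python
from collections import Counter


def read_fasta(text):
    """Parse FASTA text into a {header: sequence} dict."""
    records, header = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence lines may be wrapped, so join them under the last header.
            records[header] += line.upper()
    return records


def kmer_counts(sequence, k):
    """Count every overlapping substring of length k in a sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
```

Each sequence then has a count per K-Mer, and these count profiles are what a pairwise comparison between sequences would operate on.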

Image

Image has two types of input: data as a Table File, and images.

For table files, the format can either be:

  • A value matrix, where row and column are treated as X and Y coordinates on a 2D plane, and each cell holds the value at that X, Y position over the image. See here for an example.
  • A table with an explicit Value, X, and Y column. See here for an example.
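
The two table formats carry the same information, as the sketch below suggests by flattening a value matrix into explicit X, Y, Value entries. It assumes the column index is X and the row index is Y, which is a guess about orientation, and matrix_to_points is a hypothetical name.

```python
def matrix_to_points(matrix):
    """Flatten a value matrix into (x, y, value) tuples."""
    return [
        (x, y, value)
        for y, row in enumerate(matrix)  # assumption: row index is Y
        for x, value in enumerate(row)   # assumption: column index is X
    ]
```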

For images, Heatmapper supports .bmp, .gif, .h5, .hdf, .ico, .jpeg, .jpg, .tif, .tiff, .webp, .png file formats. Heatmapper uses the PIL library, which means it supports more than just these file types. If you have an esoteric image format, there’s a good chance Heatmapper can work with it.

Geomap

Geomap has two types of input, data as Table Files, and GeoJSON files.

For table files, the format depends on whether Heatmapper is plotting Temporal or Static Choropleths:

  • For Static Choropleths, a Name and a Value column are used, where the values in the Name column align with the provided GeoJSON names. See here for an example. In that example, Heatmapper automatically selects 1990, 2000, and 2013 as possible value columns, which can be switched between in the sidebar.
  • For Temporal Choropleths, there are two operating modes:
    • Tables with an explicit Time column. In this case, values are grouped together based on this column, and then plotted linearly. Take this example, where “date” is the grouping column.
    • Tables where each column besides the Name column holds the values for the associated names at an associated date. Heatmapper cuts the column name and treats only the first “word” as a date, so examples such as this one work as expected. Here is another example.
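
The two temporal modes can be sketched as follows, with each table row represented as a dict. This is an illustration under assumptions: the function names are hypothetical, and the first “word” of a column name is taken to mean the text before the first whitespace.

```python
from collections import defaultdict


def group_by_time(rows, time_key, name_key, value_key):
    """Mode 1: group name/value pairs by the value of the Time column."""
    frames = defaultdict(dict)
    for row in rows:
        frames[row[time_key]][row[name_key]] = row[value_key]
    return dict(frames)


def frames_from_wide(rows, name_key):
    """Mode 2: treat every non-Name column as one time step.

    Only the first whitespace-separated word of the column name is kept
    as the date label.
    """
    frames = defaultdict(dict)
    for row in rows:
        for column, value in row.items():
            if column != name_key:
                frames[column.split()[0]][row[name_key]] = value
    return dict(frames)
```

Both functions produce one name-to-value mapping per date, so a column named “2020 cases” and a Time value of “2020” end up as the same kind of frame.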

GeoJSON is a standardized format, so any file with a .geojson extension should work. Make sure that the names in the file match the Name column in the table. (You can view the GeoJSON names in the GeoJSON tab, and modify the table in the Table tab.)
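
A quick way to check for mismatched names before uploading is sketched below. It assumes each feature stores its name under a "name" key in its properties, which varies between GeoJSON files, so name_key is a hypothetical parameter to adjust; unmatched_names is likewise not part of Heatmapper.

```python
import json


def unmatched_names(geojson_text, table_names, name_key="name"):
    """Return table names with no matching feature in a GeoJSON FeatureCollection."""
    features = json.loads(geojson_text).get("features", [])
    # "properties" may be null in valid GeoJSON, hence the `or {}`.
    geo_names = {(f.get("properties") or {}).get(name_key) for f in features}
    return [name for name in table_names if name not in geo_names]
```

An empty result means every name in the table has a matching feature in the GeoJSON file.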

Geocoordinate

Geocoordinate takes a single Table File as input. Like Geomap, the formatting depends on whether it is displaying static or temporal heatmaps:

  • For static heatmaps, a Longitude and a Latitude column are required. If a Value column exists, it is used accordingly; otherwise uniform values are assigned. See here for an example without an explicit Value column (hence Heatmapper assigns uniform values), and here for one where values are provided. Also, compare the previous example to this one to see that the ordering of the columns does not matter (Weight and Value can come before or after Long/Lat).
  • For temporal heatmaps, an explicit Time column must be provided. Values are then grouped by that column and plotted linearly.
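
The static case can be sketched as below, with rows represented as dicts. The uniform value of 1.0 is an assumption, since this page does not say which value Heatmapper assigns, and first_match and to_points are hypothetical helpers.

```python
LONGITUDE = {"longitude", "long"}
LATITUDE = {"latitude", "lat"}
VALUE = {"value", "weight", "intensity", "in_tissue"}


def first_match(columns, expected):
    """Return the first column whose lowercased name is in `expected`, else None."""
    return next((c for c in columns if c.lower() in expected), None)


def to_points(rows, default_value=1.0):
    """Turn table rows (dicts) into (longitude, latitude, value) tuples."""
    if not rows:
        return []
    columns = list(rows[0])
    lon, lat = first_match(columns, LONGITUDE), first_match(columns, LATITUDE)
    if lon is None or lat is None:
        raise ValueError("a Longitude and a Latitude column are required")
    value = first_match(columns, VALUE)
    # Without a Value column, every point gets the same (assumed) weight.
    return [
        (row[lon], row[lat], row[value] if value is not None else default_value)
        for row in rows
    ]
```

Because columns are located by name, a table ordered Weight, Longitude, Latitude yields the same points as one ordered Latitude, Longitude, Weight.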

3D

3D takes in two types of input, a model, and data.

For models, Heatmapper uses pyvista.read() to load .obj files. Because .obj is a standard file format, any file with that extension should work in Heatmapper.

For data, there are two different types:

  • A Table File, where a Value column provides an explicit value for each face of the model. If a Name column is provided, it must hold the numerical value of the face; otherwise the values are applied linearly, in row order. See here for an example.
  • A texture, for which .png and .jpg are supported. It will be applied uniformly to the model.
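
How a Table File maps onto faces can be sketched as follows. Several details here are assumptions rather than documented behavior: faces are treated as 0-indexed, faces without a row receive a fill value, and rows pointing outside the model are ignored. face_values is a hypothetical helper.

```python
def face_values(values, n_faces, names=None, fill=0.0):
    """Assign one value per model face.

    With `names`, each entry is the numeric index of the face that the
    matching value belongs to; without it, values are applied in order.
    """
    result = [fill] * n_faces
    indices = range(len(values)) if names is None else (int(n) for n in names)
    for index, value in zip(indices, values):
        if 0 <= index < n_faces:  # ignore rows that point outside the model
            result[index] = value
    return result
```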

Spatial

Spatial has the most varied input of all the Heatmaps; there are two operating modes for user input:

  • Preprocessed AnnData files. Files that have already been generated by AnnData, with the .h5ad file extension, can be read directly without any other files. Files of this kind can be found floating around, and are what Heatmapper outputs when you click Download Table.
  • Multiple files from a Space Ranger output. Specifically, you’ll need to provide the following files:
    • A .h5 counts file
    • The contents of the spatial folder, specifically:
      • Two tissue_{hires/lowres}_image.png images
      • A scalefactors_json.json
      • A tissue_positions.csv
    • See here for SquidPy’s source, here for an explanation of Space Ranger Output, and here for Heatmapper’s wrapper.
    • You’ll need to upload all these files at once, rather than in batches, which can be slightly awkward given the counts file is not usually contained in the spatial folder. To make things easier, copy the .h5 counts file into the spatial folder, and you can then either rubber-band all the files, or shift-click to select all of them.
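
Since all files must be uploaded together, it can help to check the selection first. The sketch below lists what is missing from a set of file names; REQUIRED_FILES is built from the list above, missing_uploads is a hypothetical helper, and any file ending in .h5 is assumed to be the counts file.

```python
from pathlib import PurePath

# Fixed file names from the spatial folder, as listed above.
REQUIRED_FILES = {
    "tissue_hires_image.png",
    "tissue_lowres_image.png",
    "scalefactors_json.json",
    "tissue_positions.csv",
}


def missing_uploads(filenames):
    """Return a list describing what is missing from an upload."""
    names = {PurePath(f).name for f in filenames}
    missing = sorted(REQUIRED_FILES - names)
    # Assumption: any .h5 file in the selection is the counts file.
    if not any(n.endswith(".h5") for n in names):
        missing.append("a .h5 counts file")
    return missing
```

An empty list means the selection contains everything needed for a Space Ranger upload.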