Draft: Concatenate #38

fjhheras · 2021-11-04T14:50:52Z

No description provided.

codecov · 2021-11-04T15:11:13Z

Codecov Report

Merging #38 (c754f6b) into develop (7fe6ecd) will increase coverage by 0.23%.
The diff coverage is 76.19%.

@@             Coverage Diff             @@
##           develop      #38      +/-   ##
===========================================
+ Coverage    55.97%   56.21%   +0.23%     
===========================================
  Files           18       20       +2     
  Lines         1079     1119      +40     
===========================================
+ Hits           604      629      +25     
- Misses         475      490      +15

Flag	Coverage Δ
unittests	`56.21% <76.19%> (+0.23%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
trajectorytools/trajectories/concatenate.py	`64.10% <64.10%> (ø)`
trajectorytools/trajectories/trajectories.py	`91.03% <95.65%> (ø)`
trajectorytools/trajectories/__init__.py	`100.00% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7fe6ecd...c754f6b. Read the comment docs.

antortjim · 2021-11-09T13:26:29Z

When testing the extend method defined here https://github.com/shaliulab/trajectorytools/blob/eac2e1a1d8cba9c2a5bef29c52c16d315effeb77/trajectorytools/trajectories.py#L317 (which takes another trajectory object, loaded with tt.Trajectories.from_idtrackerai, and concatenates them by matching identities using a distance metric)
I noticed calling tt.Trajectories.from_idtrackerai yields a Trajectory with 2 frames less than the input video. Reading the code I found the culprit here

trajectorytools/trajectorytools/interpolate.py

Line 187 in 7640b6e

def velocity_acceleration_backwards(t, k_v_history=0.0):

and I understood these 2 datapoints are missing because because how math works, we need at least 2 previous datapoints to compute an acceleration (ergo acceleration is not available for 2 datapoints). I solved this by just copying the closest velocity and acceleration measurement as shown here:

https://github.com/shaliulab/trajectorytools/blob/eac2e1a1d8cba9c2a5bef29c52c16d315effeb77/trajectorytools/interpolate.py#L187

This fulfilled my needs because I only really need the positions (s) for my application, and even if I needed v or a, is still fine to just pad 2 datapoints on every video. In my case my videos have 12 FPS for 5 minutes, so this results in 2 consecutive missing points interpolated, i.e. 1/6 of a second, every 5 minutes.

I have taken the changes in this PR to my fork of trajectorytools and tested it on my data. I made some extra changes:

I implement some logic to conveniently retrieve the path to the trajectory files in a experiment and filter them (to say only a few consecutive chunks for instance)
def pick_trajectory_file(session_folder)
def get_trajectories(idtrackerai_collection_folder)
I had to change the logic in _concatenate_two_np because the incoming shape of the arrays (at least when I run trajectorytools) is not (individuals, frames, 2) but (frames, individuals, 2). This latter shape is what I expected from using trajectorytools in the past, so I am inclined to think I fixed a silly bug, but let me know otherwise! https://github.com/shaliulab/trajectorytools/blob/aaf393ada6b2e456f4355c7e9b06999eccb4905d/trajectorytools/trajectories/concatenate.py#L33

After implementing these changes and running trajectorytools again, I get a trajectory with the expected number of individuals but still 2 frames missing. Note this is better than what I had before, because before I had 2 missing frames per chunk and now I just have 2 missing, regardless of how many chunks were loaded.

For my application I will need either

a trajectory with the same number of frames as the idtrackerai data
OR
a way to backtrack which frame does any trajectory datapoint belong to. This is possible if we add an extra key to the trajectories that just states the frame number of any given data point

I am now looking into why there are still 2 frames missing

Thank you!
Antonio

PS For completeness, I organize my data using my fork of https://github.com/loopbio/imgstore here https://github.com/shaliulab/imgstore . I have a folder with several .avi files of 5 minutes@12FPS named 000000.avi, 0000001.avi, ... I then run idtrackerai on each sequentially and I end up with new folders called session_000000, session_000001, ... This way what I call idtrackerai_collection_folder is the main folder containing all data (i.e. all the *.avi files and session_* folders) and session_folder is just one of the session_* folders

antortjim · 2021-11-09T13:43:06Z

I guess there are still 2 frames missing because the computation of v and a happens after the concatenation done in _concatenate_idtrackerai_dicts. It happens in the call to import_idtrackerai_dict when the Trajectory is initialized.
This is the behavior one would get if all the chunked data was a single chunk. However, wouldn't it make sense to not discard the position data, and instead keep it and interpolate v and a? This way we get a Trajectory with the same number of frames as the input, which is what most users could expect, I think (at least I would), and makes it easier to match datapoints with frames. As it is now, I will match the data from the second frame to the first frame, and so on (which is not a major issue, but weird, I think).

antortjim · 2021-11-09T13:57:40Z

I have made a MWE to test this PR here

with the dataset here (first 3 chunks) https://www.dropbox.com/sh/ww4b3cy3q8o21l4/AADXO-ICsOFAdZThSpan_C91a?dl=0

and this snippet

import trajectorytools.trajectories.concatenate as concatenate

experiment_folder = "ABSPATH_TO_DOWNLOADED_DATASET"

chunks = list(range(0, 3))
trajectories = concatenate.get_trajectories(experiment_folder)
trajectories = [trajectories[f"session_{str(chunk).zfill(6)}"] for chunk in chunks]
trajectories

trajectory = concatenate.from_several_idtracker_files(trajectories)
print(trajectory.s.shape)
# (10798, 6, 2)
# i.e. 3600 * 3 - 2 frames, 6 individuals

fjhheras · 2021-11-16T12:42:17Z

Hi! Thank you for looking at this. I am away for a few days, so I cannot look at it carefully.

I created this draft merging your changes from your fork, so we can work there: https://github.com/fjhheras/trajectorytools/pull/43/files

I do not see the tests passing. What is the output of pytest tests/ ?

Could you change

on:
push:
branches: [ master, develop ]
pull_request:
branches: [ master, develop ]

to

on:
push:
branches: [ master, develop ]
pull_request

so the test run automatically? I could change it as well, but you would need to rebase your branch

antortjim · 2021-11-16T13:26:38Z

I am not very familiar with the CI workflows, but if I understood you correctly, this is what I had to do now? shaliulab@13064cb Let me know otherwise

The output of the tests is the following
pytest.log

I think the issues are arising from the fact that I had to change the shape of the trajectories (as mentioned in my second bullet point here #38 (comment))

from

individualsxframesx2

to

framesxindividualsx2

because this latter shape is what I expected from my previous experience in trajectorytools. But the functions in your tests have hardcoded the shape individualsxframesx2 as the expectation. If individualsxframesx2 is indeed correct, I just need to undo this change of mine.

fjhheras · 2021-11-16T15:28:44Z

I am not very familiar with the CI workflows, but if I understood you correctly, this is what I had to do now? shaliulab@13064cb Let me know otherwise

You need to keep the line pull_requests, bu without the colon. I think that would do

If this does not trigger the automatic tests and linter, you can run them offline:

to autoformat your code using black (see https://github.com/fjhheras/trajectorytools/blob/master/.github/CONTRIBUTING.md): black -l 79 .
To run the linter : flake8 trajectorytools --max-line-length 87 --extend-ignore=E203,W503,F401,F403,E741 --show-source --statistics

(Please, skip the part about the CHANGELOG. I will update it and add you as contributor when we are about to merge concatenate)

fjhheras · 2021-11-16T15:45:02Z

I think the issues are arising from the fact that I had to change the shape of the trajectories (as mentioned in my second bullet point here #38 (comment))

You are completely right. The shape must be framesxindividualsx2 and your change fixes it.

I did it right the first time and then changed it back to the wrong shape because the artificial trajectories we used for another test were wrong! I would be grateful if you could add an axis = 1 as a last argument to function circular_trajectories in tests/trajectories/test_trajectories2.py and then change tests in tests/trajectories/test_concatenate.py accordingly.

If you do not have time, or you do not feel confident modifying the tests, we can always merge to concatenate branch and I can do the changes there. But I think it should be more or less straightforward (just correcting the shape mismatch!)

antortjim · 2021-11-23T10:55:31Z

I have fixed the .github/workflows/pythonpackage_coverage.yml and run black to standardise the syntax. But running the flake8 line is giving me the following error:
flake8: error: no such option: --extend-ignore

without the flag it seems to work without errors but it does not change anything (I think it's just reporting non seriuous stuff)
I have installed flake8 by issuing a sudo apt install flake8 on an Ubuntu 20.04

flake8 --version
3.5.0 (mccabe: 0.6.1, pycodestyle: 2.3.1, pyflakes: 1.6.0) CPython 3.6.9 on Linux

fjhheras · 2021-11-23T11:02:12Z

I have installed flake8 by issuing a sudo apt install flake8 on an Ubuntu 20.04

Yes, I think your flake8 is a bit old. I suggest installing it using conda or pip in your environment (instead of the ubuntu package). For reference, this is my version:

3.9.2 (mccabe: 0.6.1, pycodestyle: 2.7.0, pyflakes: 2.3.1) CPython 3.7.10 on
Linux

fjhheras added 4 commits November 4, 2021 13:39

trajectories as submodule

3f30188

Organizing trajectories submodule

0f641c1

Reorganizing tests for trajectories

07d0974

concatenate from_several_positions and test

8fded57

fjhheras requested a review from pacorofe November 4, 2021 14:50

fjhheras changed the title ~~Concatenate~~ Draft: Concatenate Nov 4, 2021

Flake8 complaints

c6b598b

fjhheras added 3 commits November 4, 2021 15:22

Separating idtrackerai import to file

c577913

Concatenate idtracker ai files

b2e7b1f

removed idtracker file because import problems

c754f6b

antortjim added a commit to shaliulab/trajectorytools that referenced this pull request Nov 16, 2021

Follow fjhheras#38 (comment)

13064cb

antortjim mentioned this pull request Nov 23, 2021

Draft: update concatenate #43

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Draft: Concatenate #38

Draft: Concatenate #38

fjhheras commented Nov 4, 2021

codecov bot commented Nov 4, 2021 •

edited

Loading

antortjim commented Nov 9, 2021 •

edited

Loading

antortjim commented Nov 9, 2021 •

edited

Loading

antortjim commented Nov 9, 2021 •

edited

Loading

fjhheras commented Nov 16, 2021 •

edited

Loading

antortjim commented Nov 16, 2021 •

edited

Loading

fjhheras commented Nov 16, 2021 •

edited

Loading

fjhheras commented Nov 16, 2021 •

edited

Loading

antortjim commented Nov 23, 2021

fjhheras commented Nov 23, 2021 •

edited

Loading

Draft: Concatenate #38

Are you sure you want to change the base?

Draft: Concatenate #38

Conversation

fjhheras commented Nov 4, 2021

codecov bot commented Nov 4, 2021 • edited Loading

Codecov Report

antortjim commented Nov 9, 2021 • edited Loading

antortjim commented Nov 9, 2021 • edited Loading

antortjim commented Nov 9, 2021 • edited Loading

fjhheras commented Nov 16, 2021 • edited Loading

antortjim commented Nov 16, 2021 • edited Loading

fjhheras commented Nov 16, 2021 • edited Loading

fjhheras commented Nov 16, 2021 • edited Loading

antortjim commented Nov 23, 2021

fjhheras commented Nov 23, 2021 • edited Loading

codecov bot commented Nov 4, 2021 •

edited

Loading

antortjim commented Nov 9, 2021 •

edited

Loading

antortjim commented Nov 9, 2021 •

edited

Loading

antortjim commented Nov 9, 2021 •

edited

Loading

fjhheras commented Nov 16, 2021 •

edited

Loading

antortjim commented Nov 16, 2021 •

edited

Loading

fjhheras commented Nov 16, 2021 •

edited

Loading

fjhheras commented Nov 16, 2021 •

edited

Loading

fjhheras commented Nov 23, 2021 •

edited

Loading