Skip to content
This repository has been archived by the owner on Jun 30, 2024. It is now read-only.

Draft: Concatenate #38

Open
wants to merge 8 commits into
base: develop
Choose a base branch
from
Open

Draft: Concatenate #38

wants to merge 8 commits into from

Conversation

fjhheras
Copy link
Owner

@fjhheras fjhheras commented Nov 4, 2021

No description provided.

@fjhheras fjhheras changed the title Concatenate Draft: Concatenate Nov 4, 2021
@codecov
Copy link

codecov bot commented Nov 4, 2021

Codecov Report

Merging #38 (c754f6b) into develop (7fe6ecd) will increase coverage by 0.23%.
The diff coverage is 76.19%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop      #38      +/-   ##
===========================================
+ Coverage    55.97%   56.21%   +0.23%     
===========================================
  Files           18       20       +2     
  Lines         1079     1119      +40     
===========================================
+ Hits           604      629      +25     
- Misses         475      490      +15     
Flag Coverage Δ
unittests 56.21% <76.19%> (+0.23%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
trajectorytools/trajectories/concatenate.py 64.10% <64.10%> (ø)
trajectorytools/trajectories/trajectories.py 91.03% <95.65%> (ø)
trajectorytools/trajectories/__init__.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7fe6ecd...c754f6b. Read the comment docs.

@antortjim
Copy link

antortjim commented Nov 9, 2021

When testing the extend method defined here https://github.com/shaliulab/trajectorytools/blob/eac2e1a1d8cba9c2a5bef29c52c16d315effeb77/trajectorytools/trajectories.py#L317 (which takes another trajectory object, loaded with tt.Trajectories.from_idtrackerai, and concatenates them by matching identities using a distance metric)
I noticed calling tt.Trajectories.from_idtrackerai yields a Trajectory with 2 frames less than the input video. Reading the code I found the culprit here

def velocity_acceleration_backwards(t, k_v_history=0.0):

and I understood these 2 datapoints are missing because because how math works, we need at least 2 previous datapoints to compute an acceleration (ergo acceleration is not available for 2 datapoints). I solved this by just copying the closest velocity and acceleration measurement as shown here:

https://github.com/shaliulab/trajectorytools/blob/eac2e1a1d8cba9c2a5bef29c52c16d315effeb77/trajectorytools/interpolate.py#L187

This fulfilled my needs because I only really need the positions (s) for my application, and even if I needed v or a, is still fine to just pad 2 datapoints on every video. In my case my videos have 12 FPS for 5 minutes, so this results in 2 consecutive missing points interpolated, i.e. 1/6 of a second, every 5 minutes.

I have taken the changes in this PR to my fork of trajectorytools and tested it on my data. I made some extra changes:

  • I implement some logic to conveniently retrieve the path to the trajectory files in a experiment and filter them (to say only a few consecutive chunks for instance)
    def pick_trajectory_file(session_folder)
    def get_trajectories(idtrackerai_collection_folder)

  • I had to change the logic in _concatenate_two_np because the incoming shape of the arrays (at least when I run trajectorytools) is not (individuals, frames, 2) but (frames, individuals, 2). This latter shape is what I expected from using trajectorytools in the past, so I am inclined to think I fixed a silly bug, but let me know otherwise! https://github.com/shaliulab/trajectorytools/blob/aaf393ada6b2e456f4355c7e9b06999eccb4905d/trajectorytools/trajectories/concatenate.py#L33

After implementing these changes and running trajectorytools again, I get a trajectory with the expected number of individuals but still 2 frames missing. Note this is better than what I had before, because before I had 2 missing frames per chunk and now I just have 2 missing, regardless of how many chunks were loaded.

For my application I will need either

  1. a trajectory with the same number of frames as the idtrackerai data
    OR
  2. a way to backtrack which frame does any trajectory datapoint belong to. This is possible if we add an extra key to the trajectories that just states the frame number of any given data point

I am now looking into why there are still 2 frames missing

Thank you!
Antonio

PS For completeness, I organize my data using my fork of https://github.com/loopbio/imgstore here https://github.com/shaliulab/imgstore . I have a folder with several .avi files of 5 minutes@12FPS named 000000.avi, 0000001.avi, ... I then run idtrackerai on each sequentially and I end up with new folders called session_000000, session_000001, ... This way what I call idtrackerai_collection_folder is the main folder containing all data (i.e. all the *.avi files and session_* folders) and session_folder is just one of the session_* folders

@antortjim
Copy link

antortjim commented Nov 9, 2021

I guess there are still 2 frames missing because the computation of v and a happens after the concatenation done in _concatenate_idtrackerai_dicts. It happens in the call to import_idtrackerai_dict when the Trajectory is initialized.
This is the behavior one would get if all the chunked data was a single chunk. However, wouldn't it make sense to not discard the position data, and instead keep it and interpolate v and a? This way we get a Trajectory with the same number of frames as the input, which is what most users could expect, I think (at least I would), and makes it easier to match datapoints with frames. As it is now, I will match the data from the second frame to the first frame, and so on (which is not a major issue, but weird, I think).

@antortjim
Copy link

antortjim commented Nov 9, 2021

I have made a MWE to test this PR here

with the dataset here (first 3 chunks) https://www.dropbox.com/sh/ww4b3cy3q8o21l4/AADXO-ICsOFAdZThSpan_C91a?dl=0

and this snippet

import trajectorytools.trajectories.concatenate as concatenate

experiment_folder = "ABSPATH_TO_DOWNLOADED_DATASET"

chunks = list(range(0, 3))
trajectories = concatenate.get_trajectories(experiment_folder)
trajectories = [trajectories[f"session_{str(chunk).zfill(6)}"] for chunk in chunks]
trajectories

trajectory = concatenate.from_several_idtracker_files(trajectories)
print(trajectory.s.shape)
# (10798, 6, 2)
# i.e. 3600 * 3 - 2 frames, 6 individuals

@fjhheras
Copy link
Owner Author

fjhheras commented Nov 16, 2021

Hi! Thank you for looking at this. I am away for a few days, so I cannot look at it carefully.

I created this draft merging your changes from your fork, so we can work there: https://github.com/fjhheras/trajectorytools/pull/43/files

I do not see the tests passing. What is the output of pytest tests/ ?

Could you change

on:
push:
branches: [ master, develop ]
pull_request:
branches: [ master, develop ]

to

on:
push:
branches: [ master, develop ]
pull_request

so the test run automatically? I could change it as well, but you would need to rebase your branch

antortjim added a commit to shaliulab/trajectorytools that referenced this pull request Nov 16, 2021
@antortjim
Copy link

antortjim commented Nov 16, 2021

I am not very familiar with the CI workflows, but if I understood you correctly, this is what I had to do now? shaliulab@13064cb Let me know otherwise

The output of the tests is the following
pytest.log

I think the issues are arising from the fact that I had to change the shape of the trajectories (as mentioned in my second bullet point here #38 (comment))

from

individualsxframesx2

to

framesxindividualsx2

because this latter shape is what I expected from my previous experience in trajectorytools. But the functions in your tests have hardcoded the shape individualsxframesx2 as the expectation. If individualsxframesx2 is indeed correct, I just need to undo this change of mine.

@fjhheras
Copy link
Owner Author

fjhheras commented Nov 16, 2021

I am not very familiar with the CI workflows, but if I understood you correctly, this is what I had to do now? shaliulab@13064cb Let me know otherwise

You need to keep the line pull_requests, bu without the colon. I think that would do

If this does not trigger the automatic tests and linter, you can run them offline:

(Please, skip the part about the CHANGELOG. I will update it and add you as contributor when we are about to merge concatenate)

@fjhheras
Copy link
Owner Author

fjhheras commented Nov 16, 2021

I think the issues are arising from the fact that I had to change the shape of the trajectories (as mentioned in my second bullet point here #38 (comment))

You are completely right. The shape must be framesxindividualsx2 and your change fixes it.

I did it right the first time and then changed it back to the wrong shape because the artificial trajectories we used for another test were wrong! I would be grateful if you could add an axis = 1 as a last argument to function circular_trajectories in tests/trajectories/test_trajectories2.py and then change tests in tests/trajectories/test_concatenate.py accordingly.

If you do not have time, or you do not feel confident modifying the tests, we can always merge to concatenate branch and I can do the changes there. But I think it should be more or less straightforward (just correcting the shape mismatch!)

@antortjim
Copy link

I have fixed the .github/workflows/pythonpackage_coverage.yml and run black to standardise the syntax. But running the flake8 line is giving me the following error:
flake8: error: no such option: --extend-ignore

without the flag it seems to work without errors but it does not change anything (I think it's just reporting non seriuous stuff)
I have installed flake8 by issuing a sudo apt install flake8 on an Ubuntu 20.04

flake8 --version
3.5.0 (mccabe: 0.6.1, pycodestyle: 2.3.1, pyflakes: 1.6.0) CPython 3.6.9 on Linux

@fjhheras
Copy link
Owner Author

fjhheras commented Nov 23, 2021

I have installed flake8 by issuing a sudo apt install flake8 on an Ubuntu 20.04

Yes, I think your flake8 is a bit old. I suggest installing it using conda or pip in your environment (instead of the ubuntu package). For reference, this is my version:

3.9.2 (mccabe: 0.6.1, pycodestyle: 2.7.0, pyflakes: 2.3.1) CPython 3.7.10 on
Linux

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants