Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow time-based data-splitting with multiple datasources as 'split.basis' #261

Merged
merged 8 commits into from
May 27, 2024

Conversation

MaLoefUDS
Copy link
Contributor

@MaLoefUDS MaLoefUDS commented Apr 27, 2024

Prerequisites

  • I adhere to the coding conventions (described here) in my code.
  • I have updated the copyright headers of the files I have modified.
  • I have written appropriate commit messages, i.e., I have recorded the goal, the need, the needed changes, and the location of my code modifications for each commit. This includes also, e.g., referencing to relevant issues.
  • I have put signed-off tags in all commits.
  • I have updated the changelog file NEWS.md appropriately.
  • I have checked whether I need to adjust the showcase file showcase.R with respect to my changes.
  • The pull request is opened against the branch dev.

Scope

This PR aims to resolve issue #254

Description

Allow time-based data-splitting with multiple datasources as split.basis

Changelog

Added

  • Add test for new functionality

Changed

  • Adjust handling of split.basis in split.data.by.time.or.bins and split.data.time.based to allow for splitting by multiple datasources

Copy link
Collaborator

@bockthom bockthom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR @MaLoefUDS.

Before I start with my actual review, just a general comment: Could you please reference the issue number somewhere in the PR? Either in the description of the PR or inside the commit messages of suitable commits, such that we have a connection between PR and issue? Thanks!

In general, this PR looks good. I just have a few comments.

util-split.R Outdated Show resolved Hide resolved
util-split.R Outdated Show resolved Hide resolved
util-split.R Outdated Show resolved Hide resolved
util-split.R Show resolved Hide resolved
util-split.R Outdated Show resolved Hide resolved
tests/test-split-data-time-based.R Outdated Show resolved Hide resolved
tests/test-split-data-time-based.R Outdated Show resolved Hide resolved
bockthom
bockthom previously approved these changes May 15, 2024
Copy link
Collaborator

@bockthom bockthom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now.

@MaLoefUDS
Copy link
Contributor Author

I will update the NEWS.md later today 👍

Copy link
Collaborator

@bockthom bockthom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I just have two comments regarding comments.

NEWS.md Outdated Show resolved Hide resolved
showcase.R Show resolved Hide resolved
Allow to explicitly specify 'split.basis' as a vector when using
'split.data.time.based'. The resulting splits will then be constructed
from the union of elements from all datasources in 'split.basis'.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
* Concretize the roxygen documentation for 'split.data.by.bins' and
  'split.data.by.time.or.bins' regarding the 'split.basis' parameter
* Reword comments in 'split.data.activity.based' to not include
  'split basis' but rather 'activity type' for consistency with the
  parameter name
* Remove unecessary default value for 'split.basis' in
  'split.data.by.time.or.bins' since defaults are now handled in all
  wrapper functions

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Copy link

codecov bot commented May 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.89%. Comparing base (4d35daa) to head (06d4d6a).

Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #261      +/-   ##
==========================================
+ Coverage   79.84%   79.89%   +0.04%     
==========================================
  Files          16       16              
  Lines        4898     4905       +7     
==========================================
+ Hits         3911     3919       +8     
+ Misses        987      986       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@MaLoefUDS
Copy link
Contributor Author

🔔 Ping! Do not miss this PR :)

@bockthom
Copy link
Collaborator

🔔 Ping! Do not miss this PR :)

Thanks for the ping. However, this time, the ping was not necessary 😄 Christian and I planned to have a look at this PR today anyway. So, I assume you will either receive comments later today or we will merge it later today.

hechtlC
hechtlC previously approved these changes May 27, 2024
Copy link
Contributor

@hechtlC hechtlC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me thank you @MaLoefUDS !
I have one formatting comment that you have to address before merging but I approve the PR nonetheless. Please fix this and anything @bockthom might still find.

util-split.R Outdated Show resolved Hide resolved
Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
@bockthom
Copy link
Collaborator

Please fix this and anything @bockthom might still find.

There is nothing else to find (at least, for me) 😄
So, I will merge right away after the missing space is fixed.

@bockthom bockthom merged commit a87ff24 into se-sic:dev May 27, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants