Introduce issue-based artifact-network creation, new tests and fix current tests + minor bug fixes #244

Merged 20 commits on Dec 20, 2023

Commits on Sep 5, 2023

  1. Create edges for issue-based artifact-networks

    Introduce edge construction into the 'get.artifact.network.issue' function.
    Connect 'add_link' and 'referenced_by' issue-events by an edge.
    
    This works towards se-sic#239.
    
    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Sep 5, 2023
    98a93ee
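
The edge construction described in this commit can be illustrated with a minimal Python sketch (the project itself is written in R, so this is not its actual code). It assumes that an 'add_link' event is recorded on the referencing issue, that the matching 'referenced_by' event is recorded on the referenced issue, and that 'event.info.1' names the issue on the other end. The names `IssueEvent` and `build_issue_edges` are hypothetical.

```python
from typing import NamedTuple


class IssueEvent(NamedTuple):
    issue_id: str      # issue on which the event was recorded
    event_name: str    # e.g. "add_link", "referenced_by", "commented"
    event_info_1: str  # for link events: the issue on the other end (assumption)


def build_issue_edges(events):
    """Return (referencing, referenced) pairs for matching link events."""
    add_links = [e for e in events if e.event_name == "add_link"]
    # Index 'referenced_by' events by (referenced issue, referencing issue).
    referenced_bys = {
        (e.issue_id, e.event_info_1)
        for e in events
        if e.event_name == "referenced_by"
    }
    edges = []
    for link in add_links:
        # The counterpart event lives on the target issue and points back.
        if (link.event_info_1, link.issue_id) in referenced_bys:
            edges.append((link.issue_id, link.event_info_1))
    return edges


events = [
    IssueEvent("<issue-github-1>", "add_link", "<issue-github-2>"),
    IssueEvent("<issue-github-2>", "referenced_by", "<issue-github-1>"),
    IssueEvent("<issue-github-2>", "commented", "open"),
    # An 'add_link' without a matching 'referenced_by' yields no edge here.
    IssueEvent("<issue-github-3>", "add_link", "<issue-github-4>"),
]
edges = build_issue_edges(events)
print(edges)
```

Only the first two events form a pair, so the sketch produces a single edge from issue 1 to issue 2. How unmatched events are handled in the real function is not stated in the commit message; dropping them is an assumption of this sketch.
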
  2. Adjust testing data and add tests for issue-based artifact-networks

    Add new 'add_link' and 'referenced_by' issue events to the testing data to
    allow for new tests. Add a test for the construction of an issue-based
    artifact-network with 'issues.only.comments' turned on and off.
    
    This works towards se-sic#239.
    
    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Sep 5, 2023
    9f840c0
  3. Adjust tests to the changed testing data

    This works towards se-sic#239.
    
    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Sep 5, 2023
    56e1b35
  4. Reformat 'event.info.1' column of issue-events if necessary

    For some issue-events the 'event.info.1' column is to be interpreted as
    a reference to an issue. In this case it is more consistent to reformat
    the column entry into the <issue-[issue.source]-[event.info.1]> format.
    
    This works towards se-sic#239.
    
    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Sep 5, 2023
    62ff9d0
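
As a rough illustration of the reformatting described in this commit, the following Python sketch (not the project's R code) rewrites 'event.info.1' into the `<issue-[issue.source]-[event.info.1]>` format for events that reference another issue. The set of reference events, the template constant, and the function name are assumptions made for the example.

```python
# Assumed template mirroring the <issue-[issue.source]-[event.info.1]> format.
ISSUE_ID_FORMAT = "<issue-{source}-{number}>"

# Assumption: for these events 'event.info.1' refers to another issue.
REFERENCE_EVENTS = {"add_link", "referenced_by"}


def reformat_event_info(event):
    """Return a copy of the event with 'event.info.1' in issue-id format.

    Events that do not reference an issue are returned unchanged.
    """
    if event["event.name"] not in REFERENCE_EVENTS:
        return dict(event)
    formatted = ISSUE_ID_FORMAT.format(
        source=event["issue.source"],
        number=event["event.info.1"],
    )
    return {**event, "event.info.1": formatted}


link = reformat_event_info(
    {"event.name": "add_link", "issue.source": "github", "event.info.1": "6"}
)
comment = reformat_event_info(
    {"event.name": "commented", "issue.source": "github", "event.info.1": "open"}
)
print(link["event.info.1"])
print(comment["event.info.1"])
```

Keeping the template in one constant means the referencing side and the issue's own ID are built the same way, which is what makes the two comparable when edges are constructed.
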
  5. Replace 'IssueEvent' vertex attribute by 'Issue' in multi-networks

    To be consistent with bipartite networks it is necessary to rename the
    vertex attribute 'IssueEvent' to 'Issue' in multi-networks.
    
    This works towards se-sic#239.
    
    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Sep 5, 2023
    26d7b7e
  6. Add 'split.data.by.bins.vector' and fix miscellaneous bugs in splitting

    Modify 'split.data.time.based' to be able to split by activity-based bins.
    Rename the function to 'split.data.by.time.or.bins'. Introduce wrapper
    functions 'split.data.by.bins.vector' and 'split.data.time.based' to call
    'split.data.by.time.or.bins'.
    
    Add an 'include.duplicate.ids' parameter to 'split.get.bins.activity.based'
    to obtain bins covering all data elements from 'df' by which the split
    is being performed, regardless of whether the element IDs are unique.
    
    In 'split.data.activity.based', after calculating the bins to place data
    elements into, replace the time-based splitting by
    'split.data.by.bins.vector'. Time-based splitting is incorrect when the
    date of the last element in a bin equals the date of the first element
    of the next bin.
    
    Adjust the calculation of 'offset.end' in 'split.data.activity.based' to
    fix a bug: when the last window was short, the end offset crossed the
    border of the last window and overlapped into the second-to-last one.
    Because of this overlap, the last sliding windows were not calculated
    as expected.
    
    This works towards se-sic#239.
    
    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Sep 5, 2023
    ece569c
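
The boundary problem that motivates splitting by a bins vector can be reproduced with a small Python sketch (an illustration only, not the project's R implementation). It assumes half-open time intervals `[start, end)` for the time-based split and a fixed number of elements per bin for the activity-based split; all function names are hypothetical.

```python
from datetime import datetime, timedelta

commits = [
    ("a", datetime(2023, 9, 1, 10, 0)),
    ("b", datetime(2023, 9, 1, 11, 0)),
    ("c", datetime(2023, 9, 1, 11, 0)),  # same timestamp as "b"
    ("d", datetime(2023, 9, 1, 12, 0)),
]


def activity_bins(n_items, per_bin):
    """Assign each element (by position) the index of its bin."""
    return [i // per_bin for i in range(n_items)]


def split_by_bins_vector(items, bins_vector):
    """Group items by the bin index given for each item."""
    ranges = {}
    for item, bin_index in zip(items, bins_vector):
        ranges.setdefault(bin_index, []).append(item)
    return [ranges[key] for key in sorted(ranges)]


def split_time_based(items, boundaries):
    """Group items into half-open intervals [start, end)."""
    return [
        [item for item in items if start <= item[1] < end]
        for start, end in zip(boundaries, boundaries[1:])
    ]


def ids_of(ranges):
    return [[item_id for item_id, _ in r] for r in ranges]


bins_vector = activity_bins(len(commits), per_bin=2)
by_vector = ids_of(split_by_bins_vector(commits, bins_vector))

# Boundaries taken from the first element of each bin, plus an end
# just after the last element.
boundaries = [commits[0][1], commits[2][1], commits[3][1] + timedelta(seconds=1)]
by_time = ids_of(split_time_based(commits, boundaries))

print(by_vector)  # [['a', 'b'], ['c', 'd']]
print(by_time)    # [['a'], ['b', 'c', 'd']]
```

Because "b" and "c" share a timestamp, the time boundary cannot separate them, so the time-based split moves "b" into the second range. Splitting by the bins vector keeps the two-per-bin assignment intact.
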

Commits on Oct 12, 2023

  1. Re-add previously removed testing data and adjust tests

    This works towards se-sic#239.
    
    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Oct 12, 2023
    9112db0

Commits on Oct 18, 2023

  1. Introduce constant for issue id formatting

    This works towards se-sic#239.
    
    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Oct 18, 2023
    0e2df73
  2. Get directedness of issue-based artifact-networks from configuration

    Instead of creating only undirected issue-based artifact-networks, we now
    take the directedness information from the network config. Edges are
    already created in a way that they can be interpreted as directed
    edges from the referencing issue to the referenced issue.
    
    Replace the for loop in edge creation by the more efficient 'mclapply'
    and fix some minor formatting inconsistencies.
    
    This works towards se-sic#239.
    
    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Oct 18, 2023
    771bcc8
  3. Rework sliding window approach of 'split.data.activity.based'

    Rework the algorithm to create sliding windows in activity-based splitting.
    Instead of cutting off half a range's worth of elements at the end before
    building sliding windows (which creates a lot of edge cases), build sliding
    windows with every element up to the last one. Then remove the last
    incomplete range. Its contents are fully included in the second-to-last
    range, so it is redundant.
    Sometimes the last incomplete range is a regular range, whereas previously
    the last range always had to be a regular range. Removing the last
    incomplete range therefore requires updating the tests.
    
    Additionally, fix and improve the documentation of the splitting methods
    and fix minor spelling mistakes.
    
    This works towards se-sic#239.
    
    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Oct 18, 2023
    48ef4fa
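
The reworked sliding-window idea can be sketched in Python as follows (an illustration under stated assumptions, not the project's R code). The sketch assumes that windows start every half window size, works on a plain list instead of a data frame, and uses the hypothetical name `sliding_windows`.

```python
def sliding_windows(items, window_size):
    """Build overlapping windows over all items, then drop the last one.

    Windows start every half window size (assumption). The last window
    holds at most half a window of elements, all of which already appear
    in the window before it, so it is redundant and removed.
    """
    step = window_size // 2
    if step < 1:
        raise ValueError("window_size must be at least 2")
    windows = [
        items[start:start + window_size]
        for start in range(0, len(items), step)
    ]
    if len(windows) > 1:
        windows.pop()  # contents are fully covered by the previous window
    return windows


print(sliding_windows(list(range(10)), 4))
# [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
print(sliding_windows(list(range(9)), 4))
# [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8]]
```

In the second call the final kept window is shorter than the window size, but it still contributes a new element, so only the window after it is dropped. No elements need to be cut off before the windows are built.
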
  4. Rename 'split.data.by.bins' to 'split.dataframe.by.bins'

    This renaming makes sense as the method only splits dataframes. Rename
    'split.data.by.bins.vector' to 'split.data.by.bins' as it is more readable
    and easier to understand.
    
    This works towards se-sic#239.
    
    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Oct 18, 2023
    ed5feb2

Commits on Nov 15, 2023

  1. Fix comments and simplify logical condition

    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Nov 15, 2023
    1018fbd

Commits on Nov 19, 2023

  1. Validate the format of the 'bins' parameter when splitting data by bins

    Check the type of the 'bins' parameter and its components in the wrapper
    functions 'split.data.time.based' and 'split.data.by.bins'. Move
    'split.data.by.time.or.bins' into a new category for internal helper
    functions to discourage direct invocation.
    
    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Nov 19, 2023
    ed0a530
  2. Parameterize issue-based artifact-network tests using directedness

    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Nov 19, 2023
    cdc00f0

Commits on Dec 2, 2023

  1. Improve input validation

    Check if the 'bins' parameter of 'split.data.by.bins' actually
    contains a 'bins' component. Use 'get.date.from.string' instead of
    accessing 'lubridate::ymd_hms' directly, to encapsulate date conversion.
    Allow the 'vector' component of 'bins' to be of any subclass of
    'numeric' instead of explicitly 'numeric'.
    
    Disallow lists that contain elements that do not represent a date
    in 'split.data.time.based', as they do not comply with the expected
    format of bins for 'split.data.by.time.or.bins'.
    
    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Dec 2, 2023
    5e5ecba
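
The kind of input validation described in this commit and the one from Nov 19 can be sketched in Python like this (an illustration only, not the project's R code). It assumes that the time-based variant takes a sequence of dates, that the bins-based variant takes a structure with a 'vector' and a 'bins' component, and that dates may be given as strings in the assumed format `%Y-%m-%d %H:%M:%S`. All function names are hypothetical.

```python
from datetime import datetime
from numbers import Real

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed date string format


def parse_date(value):
    """Return a datetime for date-like input, or None if it is not a date."""
    if isinstance(value, datetime):
        return value
    if isinstance(value, str):
        try:
            return datetime.strptime(value, DATE_FORMAT)
        except ValueError:
            return None
    return None


def validate_time_bins(bins):
    """Require a sequence in which every element represents a date."""
    if not isinstance(bins, (list, tuple)):
        raise ValueError("'bins' must be a sequence of dates")
    parsed = [parse_date(b) for b in bins]
    if any(p is None for p in parsed):
        raise ValueError("'bins' contains elements that are not dates")
    return parsed


def validate_bins_with_vector(bins):
    """Require both components: a numeric 'vector' and date 'bins'."""
    if not isinstance(bins, dict) or "vector" not in bins or "bins" not in bins:
        raise ValueError("'bins' must contain 'vector' and 'bins'")
    vector = bins["vector"]
    if not isinstance(vector, (list, tuple)):
        raise ValueError("'vector' must be a sequence of numbers")
    # Accept any real number type (int, float, ...), but not booleans.
    if not all(isinstance(v, Real) and not isinstance(v, bool) for v in vector):
        raise ValueError("'vector' must be numeric")
    return {"vector": list(vector), "bins": validate_time_bins(bins["bins"])}


valid = validate_bins_with_vector({
    "vector": [1, 1, 2],
    "bins": ["2023-01-01 00:00:00", "2023-02-01 00:00:00", "2023-03-01 00:00:00"],
})
print(valid["bins"][0])
```

Checking against the abstract `Real` type rather than a single concrete type mirrors the idea of allowing any subclass of 'numeric'. Malformed input, such as a missing 'bins' component or a non-date element, raises a `ValueError` instead of failing later inside the splitting logic.
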

Commits on Dec 10, 2023

  1. Test input validation of splitting functions

    Add tests that call 'split.data.time.based' and 'split.data.by.bins'
    with various malformed 'bins' parameters and expect failure.
    
    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Dec 10, 2023
    958f272

Commits on Dec 19, 2023

  1. Adjust test data to be consistent and realistic

    Reverse order of reference from Jira issue 332 and Jira issue 328, since
    previously, Jira issue 332 was referenced before its creation.
    Copy GitHub and Jira issue data from 'test_feature/feature' also to
    'test_proximity/proximity' to keep the data consistent.
    
    Adjust all affected tests to comply with the changed testing data.
    
    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Dec 19, 2023
    ea4fe8d

Commits on Dec 20, 2023

  1. Add and update a few comments in tests

    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Dec 20, 2023
    eeb0a12
  2. Update 'NEWS.md'

    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Dec 20, 2023
    cdb3cb2
  3. Move changes in 'NEWS.md' to new unversioned release

    Also use backticks instead of single ticks for proper markdown highlighting.
    Improve changelog messages by clarifying and properly focusing on the
    important changes as well as adding more relevant commit hashes.
    
    Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
    MaLoefUDS committed Dec 20, 2023
    7f78966