Commit

Merge pull request #184 from bockthom/thomas-updates
Fix sliding-window creation in various situations & add tests for sliding-window functionality

Reviewed-by: Christian Hechtl <hechtl@cs.uni-saarland.de>
Reviewed-by: Claus Hunsen <hunsen@fim.uni-passau.de>
hechtlC authored Dec 1, 2020
2 parents f8c7cd2 + 4afe8b8 commit 78e99a1
Showing 15 changed files with 1,593 additions and 200 deletions.
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -129,7 +129,7 @@ The current build status is as follows:
* Code must be reviewed by one other project member and, if needed, be properly adapted/fixed.
* We add the `Reviewed-by` tag only for the merge commit.

- There will be another checklist for you when you open an actual pull request provided by [the corresponding template](.github/PULL_REQUEST_TEMPLATE/pull-request.md).
+ There will be another checklist for you when you open an actual pull request provided by [the corresponding template](.github/PULL_REQUEST_TEMPLATE.md).

## Style Conventions

6 changes: 6 additions & 0 deletions NEWS.md
@@ -6,12 +6,18 @@
- Add a new file `util-tensor.R` containing the class `FourthOrderTensor` to create (author x relation x author x relation) tensors from a list of networks (with each network having a different relation) and its corresponding utility function `get.author.networks.for.multiple.relations` (PR #173, c136b1f6127d73c25f08ae2f317246747aa9ea2b, e4ee0dc926b22ff75d5fd801c1f131bcff4c22eb, 051a5f0287022f97e2367ed0e9591b9df9dbdb3d)
- Add function `calculate.EDCPTD.centrality` for calculating the EDCPTD centrality for a fourth-order tensor in the above described form (c136b1f6127d73c25f08ae2f317246747aa9ea2b, e4ee0dc926b22ff75d5fd801c1f131bcff4c22eb, 051a5f0287022f97e2367ed0e9591b9df9dbdb3d)
- Add new file `util-networks-misc.R` which contains miscellaneous functions for processing network data and creating and converting various kinds of adjacency matrices: `get.author.names.from.networks`, `get.author.names.from.data`, `get.expanded.adjacency`, `get.expanded.adjacency.matrices`, `get.expanded.adjacency.matrices.cumulated`, `convert.adjacency.matrix.list.to.array` (051a5f0287022f97e2367ed0e9591b9df9dbdb3d)
+ - Add tests for sliding-window functionality and make parameterized tests possible (a3ad0a81015c7f23bce958d5c1922e3b82b28bda, 2ed84ac55d434f62341297b1aa9676c12e383491, PR #184)

### Changed/Improved
- Adjust the function `get.authors.by.data.source`: Rename its single parameter to `data.sources` and change the function so that it can extract the authors for multiple data sources at once. The default value of the parameter is a vector containing all the available data sources (commits, mails, issues) (051a5f0287022f97e2367ed0e9591b9df9dbdb3d)
- Adjust recommended R version to 3.6.3 in README (92be262514277acb774ab2885c1c0d1c10f03373)
- Add R version 4.0 to test suite and adjust package installation in `install.R` to improve compatibility with Travis CI (40aa0d80e2a94434a8be75925dbefbde6d3518b2, 1ba036758a63767e2fcef525c98f5a4fd6938c39, #161)

+ ### Fixed
+ - Fix sliding-window creation in various splitting functions (`split.network.time.based`, `split.networks.time.based`, `split.data.time.based`, `split.data.activity.based`, `split.network.activity.based`) and also fix the computation of overlapping ranges in the function `construct.overlapping.ranges` to make sure that the last and the second-last range do not cover the same range) (1abc1b8dbfc65ccad0cbbc8e33b209e39d2f8118, c34c42aef32a30b82adc53384fd6a1b09fc75dee, 097cebcc477b1b65056d512124575f5a78229c3e, 9a1b6516f490b72b821be2d5365d98cac1907b2f, 0fc179e2735bec37d26a68c6c351ab43770007d2, cad28bf221f942eb25e997aaa2de553181956680, 7602af2cf46f699b2285d53819dec614c71754c6, PR #184)
+ - Fix off-by-1 error in the function `get.data.cut.to.same.date` (f0744c0e14543292cccb1aa9a61f822755ee7183)
+ - Fix missing or wrongly set layout when plotting networks (#186, 720cc7ba7bdb635129c7669911aef8e7c6200a6b, 877931b94f87ca097c2f8f3c55e4b4bcc6087742)
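
The overlapping-ranges fix in the changelog entry above can be illustrated with a small sketch. The following base-R snippet is only an illustration under stated assumptions, not the project's `construct.overlapping.ranges`: the helper `sliding.ranges` is hypothetical, it uses plain numbers instead of dates, and it assumes that each window starts half a window length after the previous one. It shows how a window that is clipped at the end of the data can cover nothing new compared to its predecessor, and how dropping it avoids that.

```r
## Hypothetical helper (NOT the project's 'construct.overlapping.ranges'):
## build sliding-window ranges [lower, upper) of length 'width' over [start, end),
## assuming that each window starts half a window after the previous one.
sliding.ranges = function(start, end, width) {
    lower = seq(start, end, by = width / 2)
    ## clip windows that would reach beyond the end of the data
    upper = pmin(lower + width, end)
    ranges = data.frame(lower = lower, upper = upper)
    ## drop empty windows
    ranges = ranges[ranges$lower < ranges$upper, ]
    ## drop clipped windows that end where an earlier window already ends,
    ## as they are fully contained in that earlier window
    ranges = ranges[!duplicated(ranges$upper), ]
    rownames(ranges) = NULL
    return(ranges)
}

## yields the ranges [0, 4), [2, 6), [4, 8), [6, 10);
## the clipped window [8, 10) is dropped because [6, 10) already covers it
ranges = sliding.ranges(start = 0, end = 10, width = 4)
print(ranges)
```

Without the last filtering step, the final two ranges would be `[6, 10)` and `[8, 10)`, which is the kind of redundant trailing range the changelog entry refers to.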


## 3.6

1 change: 1 addition & 0 deletions README.md
@@ -124,6 +124,7 @@ Alternatively, you can run `Rscript install.R` to install the packages.
- `logging`: Logging
- `sqldf`: For advanced aggregation of `data.frame` objects
- `testthat`: For the test suite
+ - `patrick`: For the test suite
- `ggplot2`: For plotting of data
- `ggraph`: For plotting of networks (needs `udunits2` system library, e.g., `libudunits2-dev` on Ubuntu!)
- `markovchain`: For core/peripheral transition probabilities
1 change: 1 addition & 0 deletions install.R
@@ -33,6 +33,7 @@ packages = c(
"logging",
"sqldf",
"testthat",
+ "patrick",
"ggplot2",
"ggraph",
"markovchain",
4 changes: 3 additions & 1 deletion tests.R
@@ -12,6 +12,7 @@
## 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
##
## Copyright 2017, 2019 by Claus Hunsen <hunsen@fim.uni-passau.de>
+ ## Copyright 2020 by Thomas Bock <bockthom@cs.uni-saarland.de>
## All Rights Reserved.

## / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
@@ -42,8 +43,9 @@ sessionInfo()

logging::loginfo("Running test suite.")

- ## load package 'testthat'
+ ## load packages 'testthat' and 'patrick'
requireNamespace("testthat")
+ requireNamespace("patrick")

## starting tests
do.tests = function(dir) {
17 changes: 9 additions & 8 deletions tests/test-data-cut.R
@@ -16,6 +16,7 @@
## Copyright 2018 by Claus Hunsen <hunsen@fim.uni-passau.de>
## Copyright 2018 by Barbara Eckl <ecklbarb@fim.uni-passau.de>
## Copyright 2018 by Thomas Bock <bockthom@fim.uni-passau.de>
+ ## Copyright 2020 by Thomas Bock <bockthom@cs.uni-saarland.de>
## Copyright 2018 by Jakob Kronawitter <kronawij@fim.uni-passau.de>
## All Rights Reserved.

@@ -62,14 +63,14 @@ test_that("Cut commit and mail data to same date range.", {
artifact.type = c("Feature", "Feature"),
artifact.diff.size = as.integer(c(1, 1)))

- mail.data.expected = data.frame(author.name = c("Thomas"),
- author.email = c("thomas@example.org"),
- message.id = c("<65a1sf31sagd684dfv31@mail.gmail.com>"),
- date = get.date.from.string("2016-07-12 16:04:40"),
- date.offset = as.integer(c(100)),
- subject = c("Re: Fw: busybox 2 tab"),
- thread = sprintf("<thread-%s>", c(9)),
- artifact.type = "Mail")
+ mail.data.expected = data.frame(author.name = c("Thomas", "Olaf"),
+ author.email = c("thomas@example.org", "olaf@example.org"),
+ message.id = c("<65a1sf31sagd684dfv31@mail.gmail.com>", "<9b06e8d20801220234h659c18a3g95c12ac38248c7e0@mail.gmail.com>"),
+ date = get.date.from.string(c("2016-07-12 16:04:40", "2016-07-12 16:05:37")),
+ date.offset = as.integer(c(100, 200)),
+ subject = c("Re: Fw: busybox 2 tab", "Re: Fw: busybox 10"),
+ thread = sprintf("<thread-%s>", c(9, 9)),
+ artifact.type = c("Mail", "Mail"))

commit.data = x.data$get.data.cut.to.same.date(data.sources = data.sources)$get.commits()
rownames(commit.data) = 1:nrow(commit.data)
17 changes: 9 additions & 8 deletions tests/test-networks-cut.R
@@ -14,6 +14,7 @@
## Copyright 2017 by Christian Hechtl <hechtl@fim.uni-passau.de>
## Copyright 2018 by Claus Hunsen <hunsen@fim.uni-passau.de>
## Copyright 2018 by Thomas Bock <bockthom@fim.uni-passau.de>
+ ## Copyright 2020 by Thomas Bock <bockthom@cs.uni-saarland.de>
## Copyright 2018 by Jakob Kronawitter <kronawij@fim.uni-passau.de>
## All Rights Reserved.

@@ -62,14 +63,14 @@ test_that("Cut commit and mail data to same date range.", {
artifact.type = c("Feature", "Feature"),
artifact.diff.size = as.integer(c(1, 1)))

- mail.data.expected = data.frame(author.name = c("Thomas"),
- author.email = c("thomas@example.org"),
- message.id = c("<65a1sf31sagd684dfv31@mail.gmail.com>"),
- date = get.date.from.string(c("2016-07-12 16:04:40")),
- date.offset = as.integer(c(100)),
- subject = c("Re: Fw: busybox 2 tab"),
- thread = sprintf("<thread-%s>", c(9)),
- artifact.type = "Mail")
+ mail.data.expected = data.frame(author.name = c("Thomas", "Olaf"),
+ author.email = c("thomas@example.org", "olaf@example.org"),
+ message.id = c("<65a1sf31sagd684dfv31@mail.gmail.com>", "<9b06e8d20801220234h659c18a3g95c12ac38248c7e0@mail.gmail.com>"),
+ date = get.date.from.string(c("2016-07-12 16:04:40", "2016-07-12 16:05:37")),
+ date.offset = as.integer(c(100, 200)),
+ subject = c("Re: Fw: busybox 2 tab", "Re: Fw: busybox 10"),
+ thread = sprintf("<thread-%s>", c(9, 9)),
+ artifact.type = c("Mail", "Mail"))

commit.data = x$get.project.data()$get.commits()
rownames(commit.data) = 1:nrow(commit.data)
41 changes: 28 additions & 13 deletions tests/test-networks-equal-constructions.R
@@ -13,6 +13,7 @@
##
## Copyright 2018 by Christian Hechtl <hechtl@fim.uni-passau.de>
## Copyright 2018 by Claus Hunsen <hunsen@fim.uni-passau.de>
+ ## Copyright 2020 by Thomas Bock <bockthom@cs.uni-saarland.de>
## All Rights Reserved.


@@ -86,7 +87,8 @@ compare.edge.and.vertex.lists = function(split.author.networks.one = NULL, split
}
}

- test_that("Compare the bipartite and author network constructed in two ways with author/artifact relation 'cochange'", {
+ patrick::with_parameters_test_that("Compare the bipartite and author network constructed in two ways
+ with author/artifact relation 'cochange', ", {

## configuration object for the datapath
proj.conf = ProjectConf$new(CF.DATA, CF.SELECTION.PROCESS, CASESTUDY, ARTIFACT)
@@ -106,7 +108,7 @@ test_that("Compare the bipartite and author network constructed in two ways with

## split the networks
split.networks = split.networks.time.based(networks = list(author.network, bipartite.network),
- time.period = splitting.period, sliding.window = FALSE)
+ time.period = splitting.period, sliding.window = test.sliding.window)

## separate the author and bipartite networks
split.author.networks.one = split.networks[[1]]
@@ -116,7 +118,8 @@ test_that("Compare the bipartite and author network constructed in two ways with
multi.network = network.builder$get.multi.network()

## split the network
- multi.network.split = split.network.time.based(network = multi.network, time.period = splitting.period)
+ multi.network.split = split.network.time.based(network = multi.network, time.period = splitting.period,
+ sliding.window = test.sliding.window)

split.author.networks.two = list()
split.bipartite.networks.two = list()
@@ -134,10 +137,13 @@ test_that("Compare the bipartite and author network constructed in two ways with
## created with different approaches
compare.edge.and.vertex.lists(split.author.networks.one, split.author.networks.two,
split.bipartite.networks.one, split.bipartite.networks.two)
- })
+ }, patrick::cases(
+ "sliding window: FALSE" = list(test.sliding.window = FALSE),
+ "sliding window: TRUE" = list(test.sliding.window = TRUE)
+ ))

- test_that("Compare the bipartite and author network constructed in two ways with author relation 'mail' and artifact relation
- 'cochange'", {
+ patrick::with_parameters_test_that("Compare the bipartite and author network constructed in two ways
+ with author relation 'mail' and artifact relation 'cochange', ", {

## configuration object for the datapath
proj.conf = ProjectConf$new(CF.DATA, CF.SELECTION.PROCESS, CASESTUDY, ARTIFACT)
@@ -158,7 +164,7 @@ test_that("Compare the bipartite and author network constructed in two ways with

## split the networks
split.networks = split.networks.time.based(networks = list(author.network, bipartite.network),
- time.period = splitting.period, sliding.window = FALSE)
+ time.period = splitting.period, sliding.window = test.sliding.window)

## separate the author and bipartite networks
split.author.networks.one = split.networks[[1]]
@@ -168,7 +174,8 @@ test_that("Compare the bipartite and author network constructed in two ways with
multi.network = network.builder$get.multi.network()

## split the network
- multi.network.split = split.network.time.based(network = multi.network, time.period = splitting.period)
+ multi.network.split = split.network.time.based(network = multi.network, time.period = splitting.period,
+ sliding.window = test.sliding.window)

split.author.networks.two = list()
split.bipartite.networks.two = list()
@@ -187,9 +194,13 @@ test_that("Compare the bipartite and author network constructed in two ways with
## created with different approaches
compare.edge.and.vertex.lists(split.author.networks.one, split.author.networks.two,
split.bipartite.networks.one, split.bipartite.networks.two)
- })
+ }, patrick::cases(
+ "sliding window: FALSE" = list(test.sliding.window = FALSE),
+ "sliding window: TRUE" = list(test.sliding.window = TRUE)
+ ))

- test_that("Compare the bipartite and author network constructed in two ways with author and artifact relation 'mail'", {
+ patrick::with_parameters_test_that("Compare the bipartite and author network constructed in two ways
+ with author and artifact relation 'mail', ", {

## configuration object for the datapath
proj.conf = ProjectConf$new(CF.DATA, CF.SELECTION.PROCESS, CASESTUDY, ARTIFACT)
@@ -210,7 +221,7 @@ test_that("Compare the bipartite and author network constructed in two ways with

## split the networks
split.networks = split.networks.time.based(networks = list(author.network, bipartite.network),
- time.period = splitting.period, sliding.window = FALSE)
+ time.period = splitting.period, sliding.window = test.sliding.window)

## separate the author and bipartite networks
split.author.networks.one = split.networks[[1]]
@@ -220,7 +231,8 @@ test_that("Compare the bipartite and author network constructed in two ways with
multi.network = network.builder$get.multi.network()

## split the network
- multi.network.split = split.network.time.based(network = multi.network, time.period = splitting.period)
+ multi.network.split = split.network.time.based(network = multi.network, time.period = splitting.period,
+ sliding.window = test.sliding.window)

split.author.networks.two = list()
split.bipartite.networks.two = list()
@@ -239,4 +251,7 @@ test_that("Compare the bipartite and author network constructed in two ways with
## created with different approaches
compare.edge.and.vertex.lists(split.author.networks.one, split.author.networks.two,
split.bipartite.networks.one, split.bipartite.networks.two)
- })
+ }, patrick::cases(
+ "sliding window: FALSE" = list(test.sliding.window = FALSE),
+ "sliding window: TRUE" = list(test.sliding.window = TRUE)
+ ))
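
The three parameterized tests above run one shared test body once per named case. As a rough, package-free sketch of that idea (not how `patrick` is implemented), the following base-R snippet passes the entries of each case as arguments to a test body. The function `test.body` and the step sizes it uses are hypothetical and chosen only for illustration.

```r
## named cases, in the same shape as the cases used in the tests above
cases = list(
    "sliding window: FALSE" = list(test.sliding.window = FALSE),
    "sliding window: TRUE" = list(test.sliding.window = TRUE)
)

## Hypothetical test body: receives the parameters of one case as arguments.
## The step sizes 2 and 4 are illustrative assumptions only.
test.body = function(test.sliding.window) {
    step = if (test.sliding.window) 2 else 4
    starts = seq(0, 8, by = step)
    ## the same property is checked for every case
    stopifnot(all(starts >= 0), all(starts <= 8))
    return(length(starts))
}

## run the body once per case; the results keep the case names
results = lapply(cases, function(case) do.call(test.body, case))
print(results)
```

Keeping the case names as list names means a failing case can be identified by its label, which is similar in spirit to how the case names are appended to the test descriptions above.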