Add Commit Networks #263

Merged
merged 16 commits into from
Aug 28, 2024
4 changes: 4 additions & 0 deletions NEWS.md
@@ -12,17 +12,21 @@
- Add line-based code coverage reports into CI pipeline. Coverage reports are generated by `coverage.R` (PR #262, 10cac49d005e87c3964cc61711e7f5acef749626, b3b9f4ac7a9911bd00293c68fac88e0f9033bdfb, c815d18dc6266d620a7a145493417b87ac08679e, e8093525fdaf46e54f2f7fcc6358ca7892e795e5, 32d04823e2007c63d2a43ce59bea3057327c19a7)
- Add the possibility to split data time-based by multiple data sources (PR #261, 1088395f46b84028c8d7c463ca86b5dc38500c26, e1f79fc9e40cd6f41c946be42db364b2101cfe10, 0bb187fec0fd801d7634bf8d5180525770f6ab0b, 371a97ac6ebf3de4fe9360dea79d62e2ed3ef585)
- Add tests for uncovered functionality in `util-misc.R` and `util-networks.R` (PR #264, ff30f3238b1bf2539280d0d055a5d925c197c271, af80551d0615a49b86e45ff596bd75941ee88f91)
- Add commit network as a new type of network. It uses commits as vertices and connects them either via cochange or commit interactions. This includes adding new config parameters and a function for adding vertex attributes to a commit network (PR #263, ab73271781e8e9a0715f784936df4b371d64c338, ab73271781e8e9a0715f784936df4b371d64c338, cd9a930fcb54ff465c2a5a7c43cfe82ac15c134d)
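
As an illustration of the commit-network idea in the entry above, here is a minimal base-R sketch. It is not coronet's implementation: `commit.data` and `build.cochange.commit.edges` are hypothetical names, and the sketch assumes that "cochange" means two commits are connected whenever they changed the same file.

```r
## Hypothetical commit data: one row per (commit, file) pair.
commit.data = data.frame(
    hash = c("c1", "c2", "c2", "c3", "c4"),
    file = c("a.R", "a.R", "b.R", "b.R", "c.R"),
    stringsAsFactors = FALSE
)

## Hypothetical helper (not part of coronet): connect all commits that
## changed the same file, yielding an undirected edge list.
build.cochange.commit.edges = function(data) {
    edges.per.file = lapply(split(data[["hash"]], data[["file"]]), function(hashes) {
        hashes = unique(hashes)
        if (length(hashes) < 2) {
            return(NULL) # a file touched by a single commit yields no edge
        }
        pairs = t(combn(hashes, 2))
        data.frame(from = pairs[, 1], to = pairs[, 2], stringsAsFactors = FALSE)
    })
    edges.per.file = Filter(Negate(is.null), edges.per.file)
    if (length(edges.per.file) == 0) {
        return(data.frame(from = character(0), to = character(0), stringsAsFactors = FALSE))
    }
    edges = do.call(rbind, unname(edges.per.file))
    rownames(edges) = NULL
    return(edges)
}

## Commits are the vertices; commits without a partner stay isolated.
vertices = unique(commit.data[["hash"]])
edges = build.cochange.commit.edges(commit.data)
```

In this toy data, `c4` remains an isolated vertex because no other commit touches `c.R`.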

### Changed/Improved

- Change the default value for the `issues.from.source` configuration parameter. Instead of reading JIRA and GitHub issues together, which was the previous default, the new default value causes only GitHub issue data to be read. To restore the previous default behavior and read data from both issue sources, this now needs to be manually configured when needed. (PR #264, 5ff83c364f6bfc1e6ff95e9c5f1087e031c48a5d, 8c8080cb9caf115f19d9f145ad6e6c108b131a67, 8bcbc81db521877908d2e5c2989082ed672f2a3b)
- Replace deprecated `igraph` functions by their preferred alternatives (PR #264, 0df9d5bf6bafbb5d440f4c47db4ec901cf11f037)
- Deprecate support for R version 3.6 (PR #264, c8e6f45111e487fadbe7f0a13c7595eb23f3af6e, fb3f5474259d4a88f4ff545691cca9d1ccde90e3)
- Explicitly add R version 4.4 to the CI test pipeline (c8e6f45111e487fadbe7f0a13c7595eb23f3af6e)
- Refactor function `construct.edge.list.from.key.value.list` to be more readable (PR #263, 05c3bc09cb1d396fd59c34a88030cdca58fd04dd)

### Fixed

- Fix the creation of edgelists for issue-based artifact-networks by correctly iterating over the issue data (PR #264, 321d85043112971c04998249c14a0677a32c9004)
- Fix networks based upon commit interaction data to also have the attribute `artifact.type` (PR #263, 849123a8b7d898fbb1343745ecffc1f6000c9367)
- Fix endless recursion that could occur when commit interaction data is configured and commit data is empty (PR #263, 3fb7437b68950303916b62984fa449732c70353e)

## 4.4

4 changes: 2 additions & 2 deletions README.md
@@ -630,7 +630,7 @@ Updates to the parameters can be done by calling `NetworkConf$update.variables(.
- `author.relation`
* The relation(s) among authors, encoded as edges in an author network
* **Note**: The author--artifact relation in bipartite and multi networks is configured by `artifact.relation`!
* possible values: [*`"mail"`*, `"cochange"`, `"issue"`]
* possible values: [*`"mail"`*, `"cochange"`, `"issue"`, `"commit.interaction"`]
- `author.directed`
* The directedness of edges in an author network
* [`TRUE`, *`FALSE`*]
@@ -649,7 +649,7 @@ Updates to the parameters can be done by calling `NetworkConf$update.variables(.
- `artifact.relation`
* The relation(s) among artifacts, encoded as edges in an artifact network
* **Note**: Additionally, this relation configures also the author--artifact relation in bipartite and multi networks!
* possible values: [*`"cochange"`*, `"callgraph"`, `"mail"`, `"issue"`]
* possible values: [*`"cochange"`*, `"callgraph"`, `"mail"`, `"issue"`, `"commit.interaction"`]
- `artifact.directed`
* The directedness of edges in an artifact network
* **Note**: This parameter only affects the `issue` relation, as the `cochange` relation is always undirected, while the `callgraph` relation is always directed. For the `mail` relation, we currently do not have data available to exhibit edge information.
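
To make the allowed values above concrete, the following sketch checks a requested relation against the values listed in this README excerpt. `ALLOWED.RELATIONS` and `check.relation` are hypothetical names used for illustration only; they are not part of coronet, whose own validation may work differently.

```r
## Allowed values as listed in the README excerpt above.
ALLOWED.RELATIONS = list(
    author.relation = c("mail", "cochange", "issue", "commit.interaction"),
    artifact.relation = c("cochange", "callgraph", "mail", "issue", "commit.interaction")
)

## Hypothetical helper (not coronet API): validate the values requested for a parameter.
check.relation = function(parameter, values) {
    allowed = ALLOWED.RELATIONS[[parameter]]
    if (is.null(allowed)) {
        stop(sprintf("Unknown parameter '%s'.", parameter))
    }
    invalid = setdiff(values, allowed)
    if (length(invalid) > 0) {
        stop(sprintf("Invalid value(s) for '%s': %s", parameter, paste(invalid, collapse = ", ")))
    }
    return(values)
}

## A valid combination passes through unchanged ...
valid = check.relation("author.relation", c("mail", "commit.interaction"))

## ... while "callgraph" is rejected for author networks.
rejected = tryCatch(
    check.relation("author.relation", "callgraph"),
    error = function(e) conditionMessage(e)
)
```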
2 changes: 2 additions & 0 deletions util-conf.R
@@ -63,6 +63,8 @@ ARTIFACT.CODEFACE = list(
"file" = "File"
)

ARTIFACT.COMMIT.INTERACTION = "CommitInteraction"


## / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
## Conf --------------------------------------------------------------------
12 changes: 7 additions & 5 deletions util-data.R
@@ -415,11 +415,13 @@ ProjectData = R6::R6Class("ProjectData",
#'
#' This method should be called whenever the field \code{commit.interactions} is changed.
update.commit.interactions = function() {
-    stacktrace = get.stacktrace(sys.calls())
-    caller = get.second.last.element(stacktrace)
-    if (self$is.data.source.cached("commit.interactions") &&
-        (is.na(caller)|| paste(caller, collapse = " ") != "self$set.commits(commit.data)")) {
-        if (!self$is.data.source.cached("commits.unfiltered")) {
+    if (self$is.data.source.cached("commit.interactions")) {
+        ## check if caller was 'set.commits'. If so, or if commits are already filtered,
+        ## do not get the commits again.
+        stacktrace = get.stacktrace(sys.calls())
+        caller = get.second.last.element(stacktrace)
+        if (!self$is.data.source.cached("commits.unfiltered") &&
+            (is.na(caller) || paste(caller, collapse = " ") != "self$set.commits(commit.data)")) {
self$get.commits()
}
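
The guard in this hunk can be illustrated with a toy model. The sketch below is not coronet's `ProjectData` class: it uses a plain environment, and it replaces the stack-trace inspection (`get.stacktrace(sys.calls())`) with an explicit, hypothetical `called.from.setter` flag. It only shows how such a guard keeps a getter, a setter, and an update method from calling each other endlessly when the commit data is empty.

```r
## Toy model (assumption: names only loosely mirror the diff above).
make.toy.data = function() {
    self = new.env()
    ## commit interactions are "cached"; commits are not read yet
    self$cache = list(commit.interactions = data.frame(hash = character(0)))
    self$calls = 0

    self$is.cached = function(name) !is.null(self$cache[[name]])

    self$set.commits = function(commit.data) {
        self$cache[["commits.unfiltered"]] = commit.data
        ## changing the commits requires refreshing the interactions
        self$update.commit.interactions(called.from.setter = TRUE)
    }

    self$get.commits = function() {
        self$calls = self$calls + 1
        if (!self$is.cached("commits.unfiltered")) {
            self$set.commits(data.frame(hash = character(0))) # empty commit data
        }
        return(self$cache[["commits.unfiltered"]])
    }

    self$update.commit.interactions = function(called.from.setter = FALSE) {
        if (self$is.cached("commit.interactions")) {
            ## guard: fetch commits only if they are not cached yet and
            ## this update was not triggered by the setter itself
            if (!self$is.cached("commits.unfiltered") && !called.from.setter) {
                self$get.commits()
            }
        }
        invisible(NULL)
    }
    return(self)
}

toy = make.toy.data()
toy$update.commit.interactions()
```

With the guard in place, `get.commits` runs exactly once in this toy model, even though the commit data is empty.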

4 changes: 2 additions & 2 deletions util-networks.R
@@ -249,7 +249,7 @@
colnames(edges)[2] = "from"
colnames(edges)[4] = "hash"
if (nrow(edges) > 0) {
-    edges[["artifact.type"]] = "CommitInteraction"
+    edges[["artifact.type"]] = ARTIFACT.COMMIT.INTERACTION
}
author.net.data = list(vertices = vertices, edges = edges)
## construct the network
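
The change above swaps a string literal for the constant defined in `util-conf.R`. The sketch below shows the pattern in isolation; `tag.edges` is a hypothetical helper, not a coronet function. The `nrow(edges) > 0` guard matters because, in base R, assigning a length-one value to a column of a zero-row data frame raises an error, which is presumably why the original code checks first.

```r
## Constant as introduced in util-conf.R by this change.
ARTIFACT.COMMIT.INTERACTION = "CommitInteraction"

## Hypothetical helper (not part of coronet): tag edges with the artifact type,
## leaving empty edge lists untouched.
tag.edges = function(edges) {
    if (nrow(edges) > 0) {
        edges[["artifact.type"]] = ARTIFACT.COMMIT.INTERACTION
    }
    return(edges)
}

edges = data.frame(from = c("c2", "c3"), to = c("c1", "c2"), stringsAsFactors = FALSE)
empty.edges = edges[0, ]

tagged = tag.edges(edges)
tagged.empty = tag.edges(empty.edges)
```

Referring to the constant in every place keeps the attribute value consistent between author networks and commit networks.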
@@ -700,8 +700,8 @@

## do not compute anything more than once
if (!is.null(private$commits.network.commit.interaction)) {
    logging::logdebug("get.commit.network.commit.interaction: finished. (already existing)")
    return(private$commits.network.commit.interaction)
}

Codecov (codecov/patch) warning: added lines #L703-L704 in util-networks.R were not covered by tests.
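
The "do not compute anything more than once" check is a caching pattern. A minimal sketch of the same idea with a closure instead of an R6 private field follows; `make.network.builder` and its `compute` argument are hypothetical names, not coronet code.

```r
## Hypothetical builder with a private cache, loosely modeled on the
## 'private$commits.network.commit.interaction' field in the diff.
make.network.builder = function(compute) {
    cached.network = NULL
    computations = 0

    get.network = function() {
        ## do not compute anything more than once
        if (!is.null(cached.network)) {
            return(cached.network)
        }
        computations <<- computations + 1
        cached.network <<- compute()
        return(cached.network)
    }

    list(get.network = get.network, count = function() computations)
}

builder = make.network.builder(function() list(vertices = c("c1", "c2"), edges = 1))
first = builder$get.network()
second = builder$get.network() # served from the cache
```

A test that calls the getter twice, as in the last two lines, would exercise the early-return branch that Codecov reports as uncovered.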

## get the hashes that appear in the commit-interaction data as the vertices of the network
@@ -715,7 +715,7 @@
edges = edges[, c("base.hash", "commit.hash", "func", "interacting.author",
"file", "base.author", "base.func", "base.file")]
if (nrow(edges) > 0) {
-    edges[["artifact.type"]] = "CommitInteraction"
+    edges[["artifact.type"]] = ARTIFACT.COMMIT.INTERACTION
}
colnames(edges)[1] = "to"
colnames(edges)[2] = "from"
@@ -1443,9 +1443,9 @@

## Skip artifacts with many, many edges
if (number.edges > network.conf$get.value("skip.threshold")) {
    logging::logwarn("Skipping edges for %s '%s' due to amount (> %s).",
                     attr(set, "group.type"), attr(set, "group.name"), network.conf$get.value("skip.threshold"))
    return(NULL)
}

Codecov (codecov/patch) warning: added lines #L1446-L1448 in util-networks.R were not covered by tests.
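
The threshold check above can be sketched in isolation. The helper below is hypothetical and makes a simplifying assumption that is not taken from coronet: every pair of items in a group is connected, so the number of edges is `choose(n, 2)`. It also uses base R's `warning` in place of `logging::logwarn`.

```r
## Hypothetical helper (not coronet's implementation): connect all items of a
## group pairwise unless that would exceed 'skip.threshold' edges.
construct.group.edges = function(items, group.name, skip.threshold = Inf) {
    items = unique(items)
    number.edges = choose(length(items), 2) # assumption: every pair is connected

    ## Skip groups with many, many edges
    if (number.edges > skip.threshold) {
        warning(sprintf("Skipping edges for '%s' due to amount (> %s).",
                        group.name, skip.threshold), call. = FALSE)
        return(NULL)
    }
    if (length(items) < 2) {
        return(data.frame(from = character(0), to = character(0), stringsAsFactors = FALSE))
    }
    pairs = t(combn(items, 2))
    return(data.frame(from = pairs[, 1], to = pairs[, 2], stringsAsFactors = FALSE))
}

## Three commits yield three edges, which is below the threshold.
small = construct.group.edges(c("c1", "c2", "c3"), "a.R", skip.threshold = 5)

## Ten commits would yield 45 edges, so the group is skipped.
skipped = suppressWarnings(
    construct.group.edges(paste0("c", 1:10), "big.R", skip.threshold = 5)
)
```

A test with a low threshold, like the second call, would cover the skipping branch flagged by Codecov.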

if (network.type == "commit") {
@@ -1523,9 +1523,9 @@

## Skip artifacts with many, many edges
if (number.edges > network.conf$get.value("skip.threshold")) {
    logging::logwarn("Skipping edges for %s '%s' due to amount (> %s).",
                     attr(set, "group.type"), attr(set, "group.name"), network.conf$get.value("skip.threshold"))
    return(NULL)
}

Codecov (codecov/patch) warning: added lines #L1526-L1528 in util-networks.R were not covered by tests.

## get vertex data
@@ -1533,7 +1533,7 @@

## break if there is no author
if (length(vertices) < 1) {
    return(NULL)
}

Codecov (codecov/patch) warning: added line #L1536 in util-networks.R was not covered by tests.

## if there is only one author, just create the vertex, but no edges
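
The two comments above describe boundary cases: no author at all, and exactly one author. The sketch below walks through them with a hypothetical helper, `build.network.data`, which is not coronet's implementation and assumes that several authors are simply connected pairwise.

```r
## Hypothetical helper (not coronet's implementation) mirroring the cases
## from the excerpt: no author, one author, several authors.
build.network.data = function(authors) {
    vertices = unique(authors)

    ## break if there is no author
    if (length(vertices) < 1) {
        return(NULL)
    }

    ## if there is only one author, just create the vertex, but no edges
    edges = data.frame(from = character(0), to = character(0), stringsAsFactors = FALSE)
    if (length(vertices) > 1) {
        pairs = t(combn(vertices, 2))
        edges = data.frame(from = pairs[, 1], to = pairs[, 2], stringsAsFactors = FALSE)
    }
    return(list(vertices = vertices, edges = edges))
}

none = build.network.data(character(0))
single = build.network.data("Alice")
several = build.network.data(c("Alice", "Bob", "Carol"))
```

Calling the helper with an empty vector, as in the first call, corresponds to the `return(NULL)` line that Codecov reports as uncovered.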