Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FileHistory Feature #963

Merged
merged 1 commit into from
Apr 14, 2015
Merged

FileHistory Feature #963

merged 1 commit into from
Apr 14, 2015

Conversation

ThomasBarnekow
Copy link
Contributor

Based on my research, this feature was discussed in the past but I could not find the functionality in libgit2sharp. As I needed it as well, I implemented it and would like to share it.

The FileHistory class provides a history of commits in which a given file was created, changed, or renamed. Based on that commit history, it also provides the collection of Blobs representing a change of the file's contents, i.e., ignoring renames.

I've also implemented a FileHistoryExtensionsclass that provides extension methods for the Repositoryclass. These could potentially be moved to the RepositoryExtensionsclass.

Usage would be as follows:

using (Repository repo = new Repository(repoPath))
{
    FileHistory history = repo.GetFileHistory(relativePath);
    IEnumerable<FileHistoryEntry> relevantCommits = history.RelevantCommits();
    IEnumerable<Blob> relevantBlobs = history.RelevantBlobs();
}

@@ -0,0 +1,144 @@
<?xml version="1.0" encoding="utf-8"?>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't have landed here. 🍊

@ThomasBarnekow
Copy link
Contributor Author

I reverted the undesired changes to the solution and project files and also deleted the NuGet files that were added. The question is whether git settings should ignore these files.

@nulltoken
Copy link
Member

Hey @ThomasBarnekow thanks for this. It looks like the CI servers cringe a bit 😉 In order oavoid this, please specify some Signatures in your Commit calls. Otherwise, the signature is going to be guessed from the ambient .gitconfig configuration store (which doesn't exist on the CI servers).

@nulltoken
Copy link
Member

Related topics #893 and #89

@ThomasBarnekow
Copy link
Contributor Author

Hey @nulltoken, you are welcome. Can you be more specific on "specify[ing] some Signatures in your Commit calls)?

@nulltoken
Copy link
Member

@ThomasBarnekow I think leveraging this overload would do the trick.

@ThomasBarnekow
Copy link
Contributor Author

I also saw that the MetaFixture test complained about missing constructors to support abstraction in test contexts. It also wanted the public methods and properties to be virtual. In addition to adding the Signature to the test commits, I also made those required changes.

When I ran the tests on my machine, my test cases and the MetaFixture test cases all passed. However, other test cases failed but I can't see any connection between my stuff and those test cases.

@ThomasBarnekow
Copy link
Contributor Author

The last change fixes an error I've seen on the CI system.

One challenge is that my tests passed nicely on my machine but failed on the CI server. Thus, it would be helpful to have some guide saying how tests should be set up to make sure they also run on the CI server, i.e., they should also fail on my machine if they would fail on the CI server.

BTW, a few of the tests, e.g., LibGit2Sharp.Tests.RefSpecFixture.SettingInvalidRefSpecsThrows fail on my machine due to an UnauthorizedAccessException. For example:

System.UnauthorizedAccessException : Access to the path 'C:\Visual Studio 2010\Projects\GitHub\LibGit2Sharp\LibGit2Sharp.Tests\bin\Debug\TestRepos\1xokwaiz.xg3\testrepo_wd\.git\objects\pack\pack-a81e489679b7d3418f9ab594bda8ceb37dd4c695.idx' is denied.

How do I deal with that?

@nulltoken
Copy link
Member

BTW, a few of the tests, e.g., LibGit2Sharp.Tests.RefSpecFixture.SettingInvalidRefSpecsThrows fail on my machine due to an UnauthorizedAccessException. For example:

System.UnauthorizedAccessException : Access to the path 'C:\Visual Studio
2010\Projects\GitHub\LibGit2Sharp\LibGit2Sharp.Tests\bin\Debug\TestRepos\1xokwaiz.xg3\testrepo
_wd\.git\objects\pack\pack-a81e489679b7d3418f9ab594bda8ceb37dd4c695.idx' is denied.

How do I deal with that?

Hmmm. This is unexpected. Can you please share the whole stack trace? Would it possible that a local service (Antivirus, Search indexer, ...) may hold a temporary lock of those files?

@nulltoken
Copy link
Member

The last change fixes an error I've seen on the CI system.

One challenge is that my tests passed nicely on my machine but failed on the CI server. Thus, it
would be helpful to have some guide saying how tests should be set up to make sure they also run
on the CI server, i.e., they should also fail on my machine if they would fail on the CI server.

The only case I can think of relies on the methods/tests that would depend on configuration. The bulletproof approach to cope with this is to isolate the tests from ambient configuration by injecting some fake configuration stores. Another approach (as hinted here, is to pass explicit parameters to the methods and not let the library retrieve the missing required piece of data from config.

@ThomasBarnekow
Copy link
Contributor Author

Here's a full stack trace:

LibGit2Sharp.Tests.DiffTreeToTreeFixture.CanCompareTwoVersionsOfAFileWithATrailingNewlineDeletion(contextLines: 4, expectedPatchLength: 193) [FAIL]
   System.IO.IOException : The process cannot access the file 'C:\Visual Studio 2010\Projects\GitHub\LibGit2Sharp\LibGit2Sharp.Tests\bin\Debug\TestRepos\eodajdlt.kwj\testrepo_wd\README' because it is being used by another process.
   Stack Trace:
      at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
      at System.IO.File.InternalDelete(String path, Boolean checkHost)
      C:\Visual Studio 2010\Projects\GitHub\LibGit2Sharp\LibGit2Sharp.Tests\TestHelpers\DirectoryHelper.cs(63,0): at LibGit2Sharp.Tests.TestHelpers.DirectoryHelper.DeleteDirectory(String directoryPath)
      C:\Visual Studio 2010\Projects\GitHub\LibGit2Sharp\LibGit2Sharp.Tests\TestHelpers\DirectoryHelper.cs(68,0): at LibGit2Sharp.Tests.TestHelpers.DirectoryHelper.DeleteDirectory(String directoryPath)
      C:\Visual Studio 2010\Projects\GitHub\LibGit2Sharp\LibGit2Sharp.Tests\TestHelpers\BaseFixture.cs(212,0): at LibGit2Sharp.Tests.TestHelpers.BaseFixture.Dispose()
      at Xunit.Sdk.LifetimeCommand.Execute(Object testClass)
      at Xunit.Sdk.ExceptionAndOutputCaptureCommand.Execute(Object testClass)

It is possible, though, that this is related to a local service. I've not found out which one that might be, because I still had 2-3 errors like the one above after deactivating my antivirus and malware services.

One further thing I noted is that 42 "temporary" folders are left in the TestRepos subdirectory after one test run and a previously empty TestRepos folder. My folder contained a few hundred of these, though. Not sure whether that's expected.

@nulltoken
Copy link
Member

@ThomasBarnekow The stack trace shows that the failure happened upon the removal of the test repo.

Running a full test suite would create a lot more than 42 temporary folders. So I'd say most of them have been successfully removed. Take a look at the DirectoryHelper type. We guard against IOExceptions and output a message when such race conditions happen. Adding a clause for your specific exception may help you while you hunt for the evil process.

@ThomasBarnekow
Copy link
Contributor Author

@nulltoken Indeed, I saw that many test folders were created and successfully removed. The only "issue" I'd have is that, after a while, the leftovers might become a problem.

Your guarding against IOException does not mean the test will be passed, though. They are reported as a failure. Not sure whether that should be considered a failed test case.

Would it make sense to include a "housekeeping" test case that is run at the very end to remove whatever folders are left? I would not have any issue with some leftover folders. But during the limited time I've been using this, hundreds of folders have been created that way.

@nulltoken
Copy link
Member

after a while, the leftovers might become a problem.

I agree. But to my best knowledge, this should only happen when an external process (anti-virus, ...) races (and wins) against the test runner.

Your guarding against IOException does not mean the test will be passed, though. They are reported as a failure. Not sure whether that should be considered a failed test case.

Actually, when an IOException occur, it's swallowed and a Trace message is output to the console. In your case, it looks like you're fighting against a different exception that doesn't derive from IOException.

Not sure whether that should be considered a failed test case.

IMHO I don't think that failing to clean the test repo once the test is done should be seen as a failure.

Would it make sense to include a "housekeeping" test case that is run at the very end to remove whatever folders are left?

Why not? But I think this may be a little out of the scope of this PR. How about logging a new issue in order to keep track of it? As it may take some time to iron in a way that's sensible (For instance, NCrunch and XUnit v2 run tests in parallel. In that perspective, it may be tricky to determine the last test. Another option would be to add a finalizer to BaseFixture, but if it takes too long to execute, it will be shut down before it's done.)

@ThomasBarnekow
Copy link
Contributor Author

All agreed. And these failures are totally unrelated to my code. I'd be interested to have it pulled as soon as possible and don't see any issue standing in its way.

Mit freundlichen Grüßen / Kind regards

Dr. Thomas Barnekow

Sent from my iPhone

On 17.02.2015, at 19:24, nulltoken notifications@github.com wrote:

after a while, the leftovers might become a problem.

I agree. But to my best knowledge, this should only happen when an external process (anti-virus, ...) races (and wins) against the test runner.

Your guarding against IOException does not mean the test will be passed, though. They are reported as a failure. Not sure whether that should be considered a failed test case.

Actually, when an IOException occur, it's swallowed and a Trace message is output to the console. In your case, it looks like you're fighting against a different exception that doesn't derive from IOException.

Not sure whether that should be considered a failed test case.

IMHO I don't think that failing to clean the test repo once the test is done should be seen as a failure.

Would it make sense to include a "housekeeping" test case that is run at the very end to remove whatever folders are left?

Why not? But I think this may be a little out of the scope of this PR. How about logging a new issue in order to keep track of it? As it may take some time to iron in a way that's sensible (For instance, NCrunch and XUnit v2 run tests in parallel. In that perspective, it may be tricky to determine the last test. Another option would be to add a finalizer to BaseFixture, but if it takes too long to execute, it will be shut down before it's done.)


Reply to this email directly or view it on GitHub.

@nulltoken
Copy link
Member

I'd be interested to have it pulled as soon as possible and don't see any issue standing in its way.

I'll start to review the code in the following hours. I also think some other contributors may chime in as this is a very interesting feature (@Aimeast, @Jameson, ...).

if (lastSha != null)
{
CommitFilter endOfPreviousRangeFilter = GetCommitFilter(filter, commitRange.Last());
Commit nextCommit = commitLog.QueryBy(endOfPreviousRangeFilter).Skip(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Can we have these like one linq chain each line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably could and I actually thought about that (because I've written a good number of pure functional transforms for XML and like that style). But I decided to go down this route because I felt it would be easier to read.

What benefits do you see in "having these like one line chain each line"?

@nulltoken
Copy link
Member

Thanks a lot for this contribution!

I'm actually not sure about the implementation (as the git native --follow option copes with many corner cases). So if that's ok with you, I'd like to start by improving a bit the test coverage of this feature.

Could you please take a look at the very basic use cases that are described here and create some additional unit tests that would verify them? You can take a peek at CloneFixture.cs to see how we usually work with remote test repositories.

return repo.Commit("Changed " + relativePath, repo.Config.BuildSignature(DateTimeOffset.Now));
}

protected string CreateFile(string repoPath, string relativePath, string text)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've already got a very similar helper method named Touch(). Could you please rather leverage it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@ThomasBarnekow
Copy link
Contributor Author

@nulltoken I've just cloned your follow-test repo and quickly run the three test cases using my interactive test program (so no unit tests yet). With the exception of the first one, the results are the same.

The first one is currently not handled by my feature because it includes a rename AND a change in one commit. Thus, it only shows the top-most commit but not the second one. The question is how I can safely establish that these two commits or objects are part of the latest object's (so-renamed.txt) history? Currently, I'm only using the file name and the SHA, both of which are changed when renaming AND updating.

@ThomasBarnekow
Copy link
Contributor Author

@nulltoken Another question would be whether this feature should do exactly what git log --follow <filename> does. For example, that does not show merge commits. If, however, the objective is to see all commits in which a file was indeed changed, which can include merge commits, these should be included as well.

My application, for example, would be to identify all commits in which a Word document's contents were changed and then show potentially all document revisions side by side (with track changes highlighting changes between commits). If a merge commit included manual merge activities, that would be relevant as well. git log --follow would ignore that (and my feature ignores these as well but probably shouldn't or should at least be configurable to that extent).

@nulltoken
Copy link
Member

The first one is currently not handled by my feature because it includes a rename AND a change in one commit. Thus, it only shows the top-most commit but not the second one. The question is how I can safely establish that these two commits or objects are part of the latest object's (so-renamed.txt) history? Currently, I'm only using the file name and the SHA, both of which are changed when renaming AND updating.

The Diff.Compare() method could be quite handy to do this. See the DiffTreeToTreeFixture.cs suite for a better grasp. It's far possible leveraging it may be a bit more efficient than recursively harvesting the Trees

@ThomasBarnekow
Copy link
Contributor Author

Thanks, that's currently the most important part for me. Does that also work for binaries? I'd be specifically interested in Word documents, which are essentially zipped collections of XML files.

@ThomasBarnekow
Copy link
Contributor Author

@nulltoken Squashed all commits into one and pushed.

Quick question. I have ReSharper installed and this always updates the existing LibGit2Sharp.sln.DotSettings. I've never committed those changes. Do you have an idea why that happens? Would it be safe to add the changes? It adds the following line:

<s:Boolean x:Key="/Default/Environment/SettingsMigration/IsMigratorApplied/=JetBrains_002EReSharper_002EPsi_002ECSharp_002ECodeStyle_002ESettingsUpgrade_002EMigrateThisQualifierSettings/@EntryIndexedValue">True</s:Boolean>

nulltoken added a commit that referenced this pull request Apr 14, 2015
@nulltoken nulltoken merged commit 0a0dce7 into libgit2:vNext Apr 14, 2015
@nulltoken
Copy link
Member

It's in. Thanks for this amazing job!

✨ ✨ ✨ ✨ ✨ ✨ ✨

@ThomasBarnekow
Copy link
Contributor Author

Thanks! And you are most welcome.

@nulltoken
Copy link
Member

@nulltoken Squashed all commits into one and pushed.

Quick question. I have ReSharper installed and this always updates the existing LibGit2Sharp.sln.DotSettings. I've never committed those changes. Do you have an idea why that happens? Would it be safe to add the changes? It adds the following line:

<s:Boolean x:Key="/Default/Environment/SettingsMigration/IsMigratorApplied/=JetBrains_002EReSharper_002EPsi_002ECSharp_002ECodeStyle_002ESettingsUpgrade_002EMigrateThisQualifierSettings/@EntryIndexedValue">True</s:Boolean>

I'm also regularly bothered by this. It's happening when a new version of Resharper is installed.

Let's merge the change it. Could you please send a PR to take care of this?

@ThomasBarnekow
Copy link
Contributor Author

Sure, will do.

On 14.04.2015, at 19:50, nulltoken notifications@github.com wrote:

@nulltoken https://github.com/nulltoken Squashed all commits into one and pushed.

Quick question. I have ReSharper installed and this always updates the existing LibGit2Sharp.sln.DotSettings. I've never committed those changes. Do you have an idea why that happens? Would it be safe to add the changes? It adds the following line:

<s:Boolean x:Key="/Default/Environment/SettingsMigration/IsMigratorApplied/=JetBrains_002EReSharper_002EPsi_002ECSharp_002ECodeStyle_002ESettingsUpgrade_002EMigrateThisQualifierSettings/@EntryIndexedValue">True/s:Boolean
I'm also regularly bothered by this. It's happening when a new version of Resharper is installed http://stackoverflow.com/questions/29096939/resharper-dotsettings-file-getting-modified-every-time.

Let's merge the change it. Could you please send a PR to take care of this?


Reply to this email directly or view it on GitHub #963 (comment).

@nulltoken
Copy link
Member

Published as NuGet pre-release package LibGit2Sharp.0.22.0-pre20150415174523

@ThomasBarnekow ThomasBarnekow deleted the file-history branch April 15, 2015 22:14
@linkerro
Copy link

linkerro commented Jul 9, 2015

Has the usage for this feature changed from the first comment?

@quisse
Copy link

quisse commented Jul 14, 2015

@linkerro
I think the usage now is:

using(var repo = new Repository(somePath))
{
    IEnumerable<LogEntry> history = repo.Commits.QueryBy(filepath);
}

(Please correct me if I'm wrong)

@linkerro
Copy link

That is correct. I started reading the tests and got it working, but this should stay here for reference, if not in the wiki :)

@ThomasBarnekow
Copy link
Contributor Author

@linkerro and @quisse Sorry for not commenting earlier. The usage described by @quisse is essentially correct (would have new Repository(somePath)). And the wiki should be updated with at least a short description.

@quisse
Copy link

quisse commented Jul 14, 2015

@ThomasBarnekow edited

@carlosmn carlosmn mentioned this pull request Apr 19, 2016
@deepak-arya
Copy link

deepak-arya commented Jul 4, 2016

Hi @ThomasBarnekow thanks this gr8 library. Unfortunately i didn't find FileHistory class. i looked into code and found this call as "internal" access. Have this class been removed from public interface? i am creating a VS addin which requires to have history of files. i would like to know if this class is still there.
Thanks...

@txdv
Copy link
Contributor

txdv commented Jul 4, 2016

His last comment literally points out that @quisse last comment is the current way of achieving that.

@deepak-arya
Copy link

deepak-arya commented Jul 5, 2016

yahh.. thanks @txdv . i got it. i am using below code which always return zero length LogEntry enumerator. Am i using it correct way @quisse

public static void GetFileHistory(string filepath,string repopath)
{
using (var repo = new Repository(repopath))
{
// here i am using absolute file path
IEnumerable history = repo.Commits.QueryBy(filepath);
//problem is here. it does not go into loop.
foreach(LogEntry e in history)
{
Console.WriteLine(e.Commit.Message);
}
}
}

/// here is what i get
GetEnumerator() result view is always null
image

@txdv
Copy link
Contributor

txdv commented Jul 13, 2016

Are you passing the entire path? If you have a file in a subdirectory, you need to pass Path.Combine("diectory", "file") to the QueryBy

@mirsaeedi
Copy link

As @deepak-arya said Repository.Commits.QueryBy always returns an empty collection. What is the resolution to that? I pass the absolute path to the method.

@ethomson
Copy link
Member

ethomson commented Jan 5, 2018

@mirsaeedi Please ask a question on StackOverflow for support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.