Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues with manipulating filesystem entries that are symlinks #60851

Open
hamarb123 opened this issue Oct 26, 2021 · 10 comments
Open

Issues with manipulating filesystem entries that are symlinks #60851

hamarb123 opened this issue Oct 26, 2021 · 10 comments
Milestone

Comments

@hamarb123
Copy link
Contributor

This is more of an issue now that symlinks are/are becoming officially supported by .NET.

Discussed somewhat here.

On Windows, you can know if a particular path is a file or directory and cache the value (assuming it isn't deleted). Because of this, you can use either the FileInfo class, DirectoryInfo class, File class, or Directory class - you can even just use a FileSystemInfo for any common APIs (eg. Exists, Delete, Move, etc.). It would be nice for completeness' sake if there was a static equivalent of FileSystemInfo (like the File class), but it is not much of an issue on Windows since you can just one or the other.

On Unix however, since we're now starting to support symlinks properly, there is an issue with the way that directory symlinks are treated which is caused by Unix having no distinction between file and directory symlinks.

Consider the following code:

var path = "/path/to/directory";
Directory.CreateDirectory(path);
var fsi = Directory.CreateSymbolicLink(path + ".link", path);
Directory.Delete(path);
Console.WriteLine(fsi.Exists);

On Windows, this code would return true, but on Unix this will return false because it is now considered a file symlink (you'd have to run fsi = new FileInfo(fsi.FullName) to get it to say true).

On Unix, you might cache a FileSystemInfo and then try to use it later, only to discover that it "doesn't exist", which could be bad for logic. You might be trying to delete a FileSystemInfo and discover that it "doesn't exist" which isn't good if it does exist. I don't think there's much that can be done here since it would be a breaking change, but perhaps it could allow common apis on both FileInfos and DirectoryInfos on Unix if it is a symlink (but this could have other issues).

On the other hand (ie. static apis), if you want to delete a path that is a symlink that you know exists, then you could try using File.Delete(path), but that wouldn't work if a directory got created as the target while you weren't looking, so you could try:

if (File.Exists(path)) File.Delete(path);
else if (Directory.Exists(path)) Directory.Delete(path);

which would work most of the time, but can cause 4x api calls than necessary and can cause a race condition if another process deletes/creates an unrelated file/directory while your code is running.

I think that there should be both static and instance apis to address this, but personally, the static apis are a bit more important to me, and there are possibly some issues with the instance apis since there is likely code that checks if a FileSystemInfo is of type FileInfo and DirectoryInfo to determine what it's meant to represent which possibly should be different for reparse points after their target is deleted/created.

@jozkee suggested that the File class could be enabled to for this functionality for more than just what .NET currently considers a file, which would work for most APIs, except Exists since changing what it returns to be true for any sort of directory is a massive breaking change, perhaps there could be a new overload for this method and other methods with a similar issue with an extra parameter specifying when it should return true. It would also be quite useful if there was an exists-like method that could return No, File, Directory, and possibly ReparsePoint (perhaps as an overload with out bool) as well rather than just yes or no.

My personal use-case is a folder backing up program. In this program there are possible significant changes between enumerating the FileSystemInfos and using them which is fine for normal files and directories since at most they will cease to exist, but for the symlinks, they may become another type at any point. I currently use a helper class for the reparse points which is unavoidable and provides some benefit anyway (since they need different behaviour for some APIs), but the issue is when it comes to doing something with the reparse point, since I have to (whenever I suspect changes ie. I call APIs on any other files/directory in-between) first check whether it is a file or a directory before calling APIs. Note: these API calls can be delayed at any point due to requiring user input but not wanting to pause the program, so another check is required if it is delayed.
Consider the following code:

var entries = ...; //enumerate the entries of a folder, modified by other functions (eg. sorting)
foreach (var entry in entries)
{
//do something depending on source directory eg. Delete, Move etc.
}

During this code, you would have to check if the entry is a reparse point and then check if it is still a file/folder before you do anything since you could have done something to its target, and then if so, make a new instance before you pass it to your other functions, meanwhile you still have a race condition and ugly code that should be unnecessary.

If you wanted to delete it for example, then it could be solved with an API that just deletes it without worrying if you're claiming it's a file (on Unix at least). And many of the other APIs (specifically common APIs) would also easily be able to do the same on Unix (could be easily emulated on Windows I think).

So essentially what I'm asking for is some kind of FileSystemEntry.Delete(path) (and other common APIs), whether it's a new set of APIs or just allowing the File class to do this.

Thanks for considering this and reading to here :)

@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.IO untriaged New issue has not been triaged by the area owner labels Oct 26, 2021
@ghost
Copy link

ghost commented Oct 26, 2021

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Issue Details

This is more of an issue now that symlinks are/are becoming officially supported by .NET.

Discussed somewhat here.

On Windows, you can know if a particular path is a file or directory and cache the value (assuming it isn't deleted). Because of this, you can use either the FileInfo class, DirectoryInfo class, File class, or Directory class - you can even just use a FileSystemInfo for any common APIs (eg. Exists, Delete, Move, etc.). It would be nice for completeness' sake if there was a static equivalent of FileSystemInfo (like the File class), but it is not much of an issue on Windows since you can just one or the other.

On Unix however, since we're now starting to support symlinks properly, there is an issue with the way that directory symlinks are treated which is caused by Unix having no distinction between file and directory symlinks.

Consider the following code:

var path = "/path/to/directory";
Directory.CreateDirectory(path);
var fsi = Directory.CreateSymbolicLink(path + ".link", path);
Directory.Delete(path);
Console.WriteLine(fsi.Exists);

On Windows, this code would return true, but on Unix this will return false because it is now considered a file symlink (you'd have to run fsi = new FileInfo(fsi.FullName) to get it to say true).

On Unix, you might cache a FileSystemInfo and then try to use it later, only to discover that it "doesn't exist", which could be bad for logic. You might be trying to delete a FileSystemInfo and discover that it "doesn't exist" which isn't good if it does exist. I don't think there's much that can be done here since it would be a breaking change, but perhaps it could allow common apis on both FileInfos and DirectoryInfos on Unix if it is a symlink (but this could have other issues).

On the other hand (ie. static apis), if you want to delete a path that is a symlink that you know exists, then you could try using File.Delete(path), but that wouldn't work if a directory got created as the target while you weren't looking, so you could try:

if (File.Exists(path)) File.Delete(path);
else if (Directory.Exists(path)) Directory.Delete(path);

which would work most of the time, but can cause 4x api calls than necessary and can cause a race condition if another process deletes/creates an unrelated file/directory while your code is running.

I think that there should be both static and instance apis to address this, but personally, the static apis are a bit more important to me, and there are possibly some issues with the instance apis since there is likely code that checks if a FileSystemInfo is of type FileInfo and DirectoryInfo to determine what it's meant to represent which possibly should be different for reparse points after their target is deleted/created.

@jozkee suggested that the File class could be enabled to for this functionality for more than just what .NET currently considers a file, which would work for most APIs, except Exists since changing what it returns to be true for any sort of directory is a massive breaking change, perhaps there could be a new overload for this method and other methods with a similar issue with an extra parameter specifying when it should return true. It would also be quite useful if there was an exists-like method that could return No, File, Directory, and possibly ReparsePoint (perhaps as an overload with out bool) as well rather than just yes or no.

My personal use-case is a folder backing up program. In this program there are possible significant changes between enumerating the FileSystemInfos and using them which is fine for normal files and directories since at most they will cease to exist, but for the symlinks, they may become another type at any point. I currently use a helper class for the reparse points which is unavoidable and provides some benefit anyway (since they need different behaviour for some APIs), but the issue is when it comes to doing something with the reparse point, since I have to (whenever I suspect changes ie. I call APIs on any other files/directory in-between) first check whether it is a file or a directory before calling APIs. Note: these API calls can be delayed at any point due to requiring user input but not wanting to pause the program, so another check is required if it is delayed.
Consider the following code:

var entries = ...; //enumerate the entries of a folder, modified by other functions (eg. sorting)
foreach (var entry in entries)
{
//do something depending on source directory eg. Delete, Move etc.
}

During this code, you would have to check if the entry is a reparse point and then check if it is still a file/folder before you do anything since you could have done something to its target, and then if so, make a new instance before you pass it to your other functions, meanwhile you still have a race condition and ugly code that should be unnecessary.

If you wanted to delete it for example, then it could be solved with an API that just deletes it without worrying if you're claiming it's a file (on Unix at least). And many of the other APIs (specifically common APIs) would also easily be able to do the same on Unix (could be easily emulated on Windows I think).

So essentially what I'm asking for is some kind of FileSystemEntry.Delete(path) (and other common APIs), whether it's a new set of APIs or just allowing the File class to do this.

Thanks for considering this and reading to here :)

Author: hamarb123
Assignees: -
Labels:

area-System.IO, untriaged

Milestone: -

@Symbai
Copy link

Symbai commented Oct 26, 2021

#53577

@hamarb123
Copy link
Contributor Author

#53577

Thanks. That issue is different to this one, this one is mainly about having a unified API for common methods for files and directories, with symlinks being a specific example as to why it is needed (IMO).

@iSazonov
Copy link
Contributor

This comes us back to question what is File/Directory design intention either processing of object or its target? If the design intention is to process target (that is intuitively) let's design new API for object - symlink itself.

@hamarb123
Copy link
Contributor Author

I'm not sure I follow. I'm specifically referring to how you can't be guaranteed to delete (or check if it exists, move it etc.) a symlink that you know exists in the amount of operations it should take since you have to check if it's still a file or now a directory symlink on Unix.
It could also be useful in other cases since it probably takes an operation to check if it's a file or directory in File.Delete and Directory.Delete etc. on Unix that can be saved.
But I certainly don't disagree with the idea of a symlink type in .NET.

@carlossanlop
Copy link
Member

Thanks for opening an issue, @hamarb123. It seems there are multiple things reported in the main description, so allow me to summarize your concerns, and let me know if they were accurate:

A) There is a mismatch in the behavior of Exists between Windows and Unix when the target of a dir symlink does not exist.
B) To delete a file or directory, you have to check if they exist, then delete them if they do.
C) To check if a symlink exists, we should just need to use File.Exists or FileInfo.Exists, independently of the target.

Below is some code I tested in Windows and Ubuntu.

  • Regarding A: There is only a mismatch in Directory.Exists. The rest of the APIs seem to work fine.
  • Regarding B: You don't need to call File.Exists or FileInfo.Exists before deleting a file. Those APIs fail silently if the file does not exist. You do have to call it if you need to call Directory.Exists or DirectoryInfo.Exists, or wrap it in a try catch, because we throw DirectoryNotFoundException.
  • Regarding C: I think we should look into the code of DirectoryInfo.Exists and understand why that code has the expected behavior in both Windows and Unix, and determine if we should do the same in Directory.Exists. The alternative proposed by @jozkee could also be a way to solve it, but I think that's a workaround, not an actual fix of the root cause.

Windows

FileInfo
var file = new FileInfo("file.txt");
var link = new FileInfo("symlink");

file.Create().Dispose();
link.CreateAsSymbolicLink("file.txt");

file.Exists // True
link.Exists // True

file.Delete();

file.Exists // False
link.Exists // True

file.Delete(); // Succeeds
File
File.Create("file.txt").Dispose();
File.CreateSymbolicLink("link", "file.txt");

File.Exists("file.txt") // True
File.Exists("link") // True

File.Delete("file.txt");

File.Exists("file.txt") // False
File.Exists("link") // True

File.Delete("file.txt"); // Succeeds
DirectoryInfo
var dir = new DirectoryInfo("dir");
var link = new DirectoryInfo("link");

dir.Create();
link.CreateAsSymbolicLink("dir");

dir.Exists // True
link.Exists // True

dir.Delete();

dir.Exists // False
link.Exists // True

dir.Delete(); // Throws DirectoryNotFoundException
Directory
Directory.CreateDirectory("dir");
Directory.CreateSymbolicLink("link", "dir");

Directory.Exists("dir") // True
Directory.Exists("link") // True

Directory.Delete("dir");

Directory.Exists("dir") // False
Directory.Exists("link") // TRUE (expected behavior)

Directory.Delete("dir"); // Throws DirectoryNotFoundException

Unix

FileInfo
var file = new FileInfo("file.txt");
var link = new FileInfo("symlink");

file.Create().Dispose();
link.CreateAsSymbolicLink("file.txt");

file.Exists // True
link.Exists // True

file.Delete();

file.Exists // False
link.Exists // True

file.Delete(); // Succeeds
File
File.Create("file.txt").Dispose();
File.CreateSymbolicLink("link", "file.txt");

File.Exists("file.txt") // True
File.Exists("link") // True

File.Delete("file.txt");

File.Exists("file.txt") // False
File.Exists("link") // True

File.Delete("file.txt"); // Succeeds
DirectoryInfo
var dir = new DirectoryInfo("dir");
var link = new DirectoryInfo("link");

dir.Create();
link.CreateAsSymbolicLink("dir");

dir.Exists // True
link.Exists // True

dir.Delete();

dir.Exists // False
link.Exists // True

dir.Delete(); // Throws DirectoryNotFoundException
Directory (problem)
Directory.CreateDirectory("dir");
Directory.CreateSymbolicLink("link", "dir");

Directory.Exists("dir") // True
Directory.Exists("link") // True

Directory.Delete("dir");

Directory.Exists("dir") // False
Directory.Exists("link") // FALSE (Mismatch with Windows, unexpected behavior)

Directory.Delete("dir"); // Throws DirectoryNotFoundException

@carlossanlop carlossanlop removed the untriaged New issue has not been triaged by the area owner label Oct 27, 2021
@hamarb123
Copy link
Contributor Author

hamarb123 commented Oct 27, 2021

@carlossanlop, in those tests, what happens with running the last delete on the link. This is what my issue is (you've got it showing deleting the source again, not sure if this is a mistake or not).

eg. end of Directory:

Directory.Exists("dir");
Directory.Exists("link"); //mismatch

Directory.Delete("link"); // This is what I'm interested in

@carlossanlop
Copy link
Member

you've got it showing deleting the source again, not sure if this is a mistake or not

I did it on purpose to show that deleting a directory when it doesn't exist, throws exception, but deleting a file when it doesn't exist, does not throw any exception (point B).

Here's an updated set of tests, which include deleting the link after the target was deleted.

Windows

FileInfo
var file = new FileInfo("file.txt");
var link = new FileInfo("symlink");

file.Create().Dispose();
link.CreateAsSymbolicLink("file.txt");

file.Exists // True (Expected)
link.Exists // True (Expected)

file.Delete(); // Succeeds (Expected)

file.Exists // False (Expected)
link.Exists // True (Expected)

link.Delete(); // Succeeds (Expected)

file.Exists // False (Expected)
link.Exists // False (Expected)

file.Delete(); // File did not exist, but doesn't throw (Expected)
File
File.Create("file.txt").Dispose();
File.CreateSymbolicLink("link", "file.txt");

File.Exists("file.txt") // True (Expected)
File.Exists("link") // True (Expected)

File.Delete("file.txt"); // Succeeds (Expected)

File.Exists("file.txt") // False (Expected)
File.Exists("link") // True (Expected)

File.Delete("link"); // Succeeds (Expected)

File.Exists("file.txt") // False (Expected)
File.Exists("link") // False (Expected)

File.Delete("file.txt"); // File did not exist, but doesn't throw (Expected)
DirectoryInfo
var dir = new DirectoryInfo("dir");
var link = new DirectoryInfo("link");

dir.Create();
link.CreateAsSymbolicLink("dir");

dir.Exists // True (Expected)
link.Exists // True (Expected)

dir.Delete(); // Succeeds (Expected)

dir.Exists // False (Expected)
link.Exists // True (Expected)

link.Delete(); // Succeeds (Expected)

dir.Exists // False (Expected)
link.Exists // False (Expected)

dir.Delete(); // Dir did not exist, throws DirectoryNotFoundException (Expected)
Directory
Directory.CreateDirectory("dir");
Directory.CreateSymbolicLink("link", "dir");

Directory.Exists("dir") // True (Expected)
Directory.Exists("link") // True (Expected)

Directory.Delete("dir"); // Succeeds (Expected)

Directory.Exists("dir") // False (Expected)
Directory.Exists("link") // True (Expected)

Directory.Delete("link"); // Succeeds (Expected)

Directory.Exists("dir") // False (Expected)
Directory.Exists("link") // False (Expected)

Directory.Delete("dir"); // Dir did not exist, throws DirectoryNotFoundException (Expected)

Unix

FileInfo
var file = new FileInfo("file.txt");
var link = new FileInfo("symlink");

file.Create().Dispose();
link.CreateAsSymbolicLink("file.txt");

file.Exists // True (Expected)
link.Exists // True (Expected)

file.Delete(); // Succeeds (Expected)

file.Exists // False (Expected)
link.Exists // True (Expected)

link.Delete(); // Succeeds (Expected)

file.Exists // False (Expected)
link.Exists // False (Expected)

file.Delete(); // File did not exist, but doesn't throw (Expected)
File
File.Create("file.txt").Dispose();
File.CreateSymbolicLink("link", "file.txt");

File.Exists("file.txt") // True (Expected)
File.Exists("link") // True (Expected)

File.Delete("file.txt"); // Succeeds (Expected)

File.Exists("file.txt") // False (Expected)
File.Exists("link") // True (Expected)

File.Delete("link"); // Succeeds (Expected)

File.Exists("file.txt") // False (Expected)
File.Exists("link") // False (Expected)

File.Delete("file.txt"); // File did not exist, but doesn't throw (Expected)
DirectoryInfo
var dir = new DirectoryInfo("dir");
var link = new DirectoryInfo("link");

dir.Create();
link.CreateAsSymbolicLink("dir");

dir.Exists // True (Expected)
link.Exists // True (Expected)

dir.Delete(); // Succeeds (Expected)

dir.Exists // False (Expected)
link.Exists // True (Expected)

link.Delete(); // PROBLEM: link exists, but throws DirectoryNotFoundException

dir.Exists // False (Expected)
link.Exists // True (Delete above did not delete the link)

dir.Delete(); // Dir did not exist, throws DirectoryNotFoundException (Expected)
Directory
Directory.CreateDirectory("dir");
Directory.CreateSymbolicLink("link", "dir");

Directory.Exists("dir") // True (Expected)
Directory.Exists("link") // True (Expected)

Directory.Delete("dir"); // Succeeds (Expected)

Directory.Exists("dir") // False (Expected)
Directory.Exists("link") // PROBLEM: False (Mismatch with Windows, unexpected behavior)

Directory.Delete("link"); // PROBLEM: link exists, but throws DirectoryNotFoundException

Directory.Exists("dir") // False (Expected)
Directory.Exists("link") // True (Delete above did not delete the link)

Directory.Delete("dir"); // Dir did not exist, throws DirectoryNotFoundException (Expected)

@hamarb123
Copy link
Contributor Author

A) There is a mismatch in the behavior of Exists between Windows and Unix when the target of a dir symlink does not exist.

Yes, I'd like an API that tells me if a path exists - this could just be a new API like File.Exists(string path, bool includeDirectories) (and for FileInfo too, perhaps it could also have a variant like File.Exists(string path, bool includeDirectories, out bool wasFile and/or some indication it was a link). It might be better for this possible new API to have a struct defining what to include, ie. yes/no for normal files, yes/no for normal directories, yes/no for reparse points (I'm counting reparse points as not being normal files or folders in terms of the inclusion here) and use that as the argument instead of includeDirectories, and then instead of wasFile, it could return whether it was a each of a file, directory and a reparse point with another flags enum. It may be useful to add this variant to some sort of enumerate function as well.

B) To delete a file or directory, you have to check if they exist, then delete them if they do.

I thought I had to do this, but thanks to your code examples, it seems like I can just use the correct one on Windows and always use FileInfo for symlinks on Unix as long as I don't need to check Exists (this will result in a nice performance boost for symlinks since I don't have to check it constantly 😁).

C) To check if a symlink exists, we should just need to use File.Exists or FileInfo.Exists, independently of the target.

Yes, the directory ones should also always return whether the link exists or not too IMO since the symlinks can be returned as a DirectoryInfo, or be listed in EnumerateDirectories (I think). ie. what I'm suggesting is possibly that File.Exists, FileInfo.Exists, Directory.Exists and DirectoryInfo.Exists could just return true if it's a symlink that exists on Unix - but this could be avoided with the API suggestion above (assuming the API is also on DirectoryInfo).

@hamarb123
Copy link
Contributor Author

hamarb123 commented Oct 27, 2021

In my project, I'm now using FileInfo for all my symlinks and it works well. The only issue is obviously the Exists check, luckily in most of my code existence is assumed and exceptions are all handled, so there were only a few cases where I had to update that logic.
I do still think that Exists should be changed/added to though.

@jeffhandley jeffhandley added this to the Future milestone Jul 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants