-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(turborepo): Size comes from the thing being read. #6092
Changes from 1 commit
803fdd0
45ef65e
cda506c
22ccf73
4137dd2
e41afc1
542f7e5
ec94789
41992d5
f5246a7
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
use std::{fs::Metadata, io::Read}; | ||
use std::io::{ErrorKind, Read}; | ||
|
||
use globwalk::fix_glob_pattern; | ||
use hex::ToHex; | ||
|
@@ -9,13 +9,13 @@ use wax::{any, Glob, Pattern}; | |
|
||
use crate::{package_deps::GitHashes, Error}; | ||
|
||
fn git_like_hash_file(path: &AbsoluteSystemPath, metadata: &Metadata) -> Result<String, Error> { | ||
fn git_like_hash_file(path: &AbsoluteSystemPath) -> Result<String, Error> { | ||
let mut hasher = Sha1::new(); | ||
let mut f = path.open()?; | ||
let mut buffer = Vec::new(); | ||
f.read_to_end(&mut buffer)?; | ||
let size = f.read_to_end(&mut buffer)?; | ||
hasher.update("blob ".as_bytes()); | ||
gsoltis marked this conversation as resolved.
Show resolved
Hide resolved
|
||
hasher.update(metadata.len().to_string().as_bytes()); | ||
hasher.update(size.to_string().as_bytes()); | ||
hasher.update([b'\0']); | ||
hasher.update(buffer.as_slice()); | ||
let result = hasher.finalize(); | ||
|
@@ -30,17 +30,19 @@ pub(crate) fn hash_files( | |
let mut hashes = GitHashes::new(); | ||
for file in files.into_iter() { | ||
let path = root_path.resolve(file.as_ref()); | ||
let metadata = match path.symlink_metadata() { | ||
Ok(metadata) => metadata, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Point of failure one: |
||
Err(e) => { | ||
if allow_missing && e.is_io_error(std::io::ErrorKind::NotFound) { | ||
continue; | ||
match git_like_hash_file(&path) { | ||
Ok(hash) => hashes.insert(file.as_ref().to_unix(), hash), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So, now we just attempt to hash the file, and let We can then investigate the returned error to see if it is |
||
Err(e) => match e { | ||
Error::Io(ref io_error, _) => { | ||
if allow_missing && io_error.kind() == ErrorKind::NotFound { | ||
continue; | ||
} else { | ||
return Err(e); | ||
} | ||
} | ||
return Err(e.into()); | ||
} | ||
_ => return Err(e), | ||
}, | ||
}; | ||
nathanhammond marked this conversation as resolved.
Show resolved
Hide resolved
|
||
let hash = git_like_hash_file(&path, &metadata)?; | ||
hashes.insert(file.as_ref().to_unix(), hash); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Point of failure two: |
||
} | ||
Ok(hashes) | ||
} | ||
|
@@ -109,7 +111,7 @@ pub(crate) fn get_package_file_hashes_from_processing_gitignore<S: AsRef<str>>( | |
if metadata.is_symlink() { | ||
continue; | ||
} | ||
let hash = git_like_hash_file(path, &metadata)?; | ||
Comment on lines
113
to
115
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This line of code + I believe that we should be following symlinks, and if it is a broken symlink, hash it the same way There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This would also mean that if we What should we do about There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Interesting behaviour, did a little test and it seems that ls-tree does indeed not expose symlinks. Git does at least give the file mode so we can manually check for symlinks ( I think in this case, there is an optimisation we could apply with a caveat:
These two behaviours conflict but I don't think we should support symlinks outside the repo root anyways so perhaps erroring is sufficient. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The intention with the change was to hash symlink files, not their targets. There's been a series of differently-incorrect behavior here:
In terms of what I think should happen is the following:
|
||
let hash = git_like_hash_file(path)?; | ||
hashes.insert(relative_path, hash); | ||
} | ||
Ok(hashes) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
read_to_end
already attempts to do this optimally in terms of syscall count, we don't really need to help out.read_to_end
is this:buffer_capacity_required
is this: