Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Skip already installed modules #2176

Closed
wants to merge 2 commits into from
Closed

[WIP] Skip already installed modules #2176

wants to merge 2 commits into from

Conversation

voliva
Copy link

@voliva voliva commented Dec 7, 2016

yarn install was very slow on Windows because it takes very long to copy files from the cache folder to node_modules folder, specially if it's not whitelisted in Windows Defender. See #990 for further details.
This made yarn very slow vs npm when adding new packages, specially if it had installed huge packages such as Webpack or React (or both), as all previous packages were re-coppied.

This PR solves this problem by skipping those packages that are already installed and have the exact same version that the ones that are about to be copied.

I'm not too sure about the correctness of this because I didn't go too deep into yarn's src architecture, so I can't be sure if I'm missing something... But for a normal use case it seems to work: Only the new packages or those that are needed to update are copied, while the previous ones are just ignored.

@@ -137,6 +138,15 @@ export default class PackageLinker {
phantomFiles.push(path.join(dest, file));
}

try {
let obj = await fs.readJson(dest + "\\package.json");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The path shouldn't contain \\ as that is only valid for Windows. This should use path.join(dest, 'package.json') instead to work for all OSes.

@@ -147,6 +157,7 @@ export default class PackageLinker {
},
});
}
this.reporter.info('skipped ' + skipped.length + ' modules because they are already installed');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be a string here, it should use this.reporter.lang like everywhere else.

@voliva voliva changed the title Skip already installed modules [WIP] Skip already installed modules Dec 12, 2016
@voliva
Copy link
Author

voliva commented Dec 12, 2016

Fixed comments from @wyze and some lint warnings/errors. Will still need to check the tests (Removes extraneous files is failing)

@bestander
Copy link
Member

The good thing about yarn is that I can play around with node_modules - add console.log, edit code - and then clear everything up with yarn install --force.
I would not want to have a skip behavior by default.
Also, any ideas how we could whitelist the file operations in windows defender and file indexer?

@voliva
Copy link
Author

voliva commented Dec 13, 2016

@bestander I think most of the times when someone runs yarn add / yarn install, node_modules folder can be assumed unchanged, so it makes sense that the default behaviour is just to skip the installed folders.

What's a good point though is that when calling yarn install --force it should indeed replace all the installed modules, as that would be the expected behaviour (the user is asking yarn to install everything regardless of the current node_modules state).

@bestander
Copy link
Member

bestander commented Dec 13, 2016

Since 0.17 we fixed integrity checking that runs during install, so it won't do installation if integrity file is in place and dependencies did not change.
So this optimization is only for cases when many dependencies changed since last install because yarn does not do actual copy if individual files did not change.

@bestander
Copy link
Member

Why I am a bit opposed to this is that this is another point of state in the system.
We already have quite complicated caches, copy that does not rewrite files and integrity checks that makes it easy for bugs to creep in.
Adding one more point where installation can be incorrect because of the state of the system is adding more risk.
I would rather us investigate all other possible ways to optimise first.

@xantorres
Copy link

@bestander Yarn is terribly slow in Windows, even without windows defender activated and having a single dependency updated, at the point where it is totally impractical and unusable in a daily basis. We should either merge this PR or find another solution otherwise no Windows user will ever use this application

@bestander
Copy link
Member

There are some changes that are coming or landed in the latest version of Yarn:

  • number of IO operations reduced with improved hoisting algorithm
  • you can opt-in to use hard links with --link-duplicates if there are many modules installed more than once
  • soon install command will do a structure check similar to the one in this PR before running fetch and link phases
  • we improve linker and fetcher code to make less IO operations all the time, feel free to revise the code and send a PR to remove redundant disk IO calls

Besides that, read #990 with tests regarding Windows IO operations compared to Linux on the same machine.
In general NTFS is just slower than Ext3 everywhere: npm, Yarn and basic file copy.

I'll close this PR for the reasons in a previous comment

@bestander bestander closed this Mar 3, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants