Skip to content

Add a node-install goal to Pants for installing node_modules#6367

Merged
stuhood merged 21 commits into
pantsbuild:masterfrom
nsaechao:nsaechao/issue-6332_install_goal
Sep 7, 2018
Merged

Add a node-install goal to Pants for installing node_modules#6367
stuhood merged 21 commits into
pantsbuild:masterfrom
nsaechao:nsaechao/issue-6332_install_goal

Conversation

@nsaechao

@nsaechao nsaechao commented Aug 18, 2018

Copy link
Copy Markdown
Contributor

Problem

Adding node-install is the first step to support javascript source-level dependencies. See #6332.
node-install is a gap-fill for setting up the development environment similar to the environment that is produced by the working dir in .pants.d. This goal should resolve all third party dependencies in addition to the source deps and then be consumable by NPM/Yarn as if the developer has ran "npm install".

Solution

Add a bare-bones node-install goal that will use the target-specific package manager to run the install task in the target directory.
Currently, only node_modules with the 'yarn' package manager will be supported for the node-install goal.

Result

A new node-install goal is added.
The node-install goal will be able to take node_module targets as input.
When executing the node-install goal, walk down the dependency graph and install each node_module target into the target root.

The NodeResolve task now exposes new optional product NodePathsLocal product. The existing NodePaths product remains the same. The NodePathsLocal provides similar mapping of targets and PATHS but installs in the same directory as the target definition rather than in the hidden pants working environment. Despite being seemingly similar, both products are exclusive and don't share any work done. See review discussion for more details.

Example

src/node/target/package.json
"dependencies": {
  "example": "0.0.0"
}

./pants node-install src/node/target:target

output:
src/node/target/node_modules

@nsaechao

Copy link
Copy Markdown
Contributor Author

@stuhood, @baroquebobcat: This is an initial WIP ticket for adding the node-install task. I would greatly appreciate early feedback on the design if you have the time...

The main focus so far is how the new goal is added and how the product types are specified. Things I still have not added are testing and checks against non-source node_modules.

@stuhood stuhood left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Ny.

I think both the class docs and the PR description need more info on what node-install is, how it is used, etc. A link to the issue you opened would be good too.



class NodePathsLocal(NodePaths):
"""Maps Node package targets to their local NODE_PATH chroot."""

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference between NodePathsLocal and NodePaths needs more docs/comments... possibly in both the parent class and the subclass. It would be good to discuss why they are not the same class.

And in general, it's not a great idea to subclass concrete classes, as it makes the interactions between the parent and child harder to reason about.

from pants.contrib.node.tasks.node_resolve import NodeResolve


class NodeResolveLocal(NodeResolve):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto subclassing concrete classes: rather than subclassing a concrete class, you should almost always prefer to extract an abstract base class with two (in this case) subclasses. That helps you to define the API, and clarifies why the classes actually need to be different (basically: it's exactly the abstract members).

@nsaechao nsaechao changed the title [WIP] Add a node-install goal to Pants for installing node_modules Add a node-install goal to Pants for installing node_modules Aug 24, 2018
@nsaechao

Copy link
Copy Markdown
Contributor Author

I am having some difficulties in isolating the task execution order. The current result is that the "resolve" task execution is executed before the "node-resolve-local" task. This causes the node-install goal to install the node_packages twice. Once in the .pants.d dir and another in the source target dir. The underlying issue is that node-install consumes products from resolve.node (like node/npm/yarn distributions).

@stuhood

stuhood commented Aug 27, 2018

Copy link
Copy Markdown
Member

I am having some difficulties in isolating the task execution order. The current result is that the "resolve" task execution is executed before the "node-resolve-local" task. This causes the node-install goal to install the node_packages twice. Once in the .pants.d dir and another in the source target dir. The underlying issue is that node-install consumes products from resolve.node (like node/npm/yarn distributions).

The design doc and PR description don't really explain why node-resolve-local is a separate task from node-resolve. I know we had discussed this potentially being two optional products exposed by the same task... is that worth exploring? Happy to talk offline about this, or to discuss it on #6332.

@stuhood stuhood left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Comment thread contrib/node/src/python/pants/contrib/node/register.py Outdated
Comment thread contrib/node/src/python/pants/contrib/node/register.py Outdated
Comment thread contrib/node/src/python/pants/contrib/node/register.py Outdated
Comment thread contrib/node/src/python/pants/contrib/node/subsystems/resolvers/npm_resolver.py Outdated
Comment thread contrib/node/src/python/pants/contrib/node/tasks/node_resolve.py Outdated
resolver_for_target_type = self._resolver_for_target(target).global_instance()
resolver_for_target_type.resolve_target(self, target, vt.results_dir, node_paths)
node_paths.resolved(target, vt.results_dir)
if self.context.products.is_required_data(NodePathsLocal):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any cases in which this should represent a symlink into .pants.d rather than always double installing things?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By exposing two products and treating them both as optional, we effectively removed the "double installation" problem.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except when they've both been requested?

It's not critical, it's just unfortunate that ./pants node-install can never (re)use the work done by the private copy.

topological_order=True):
"""Returns an invalidation_check with all targets invalidated.
"""
# Generate a cache_key that never matches to ensure work is always done.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're going to do this, then you don't actually need to use an invalidation check.

This usage of the invalidation check is incorrect anyway though I think... it's meant to be used as context manager, such that when the context exits, targets are marked valid.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, I admit this is a hack job to re-use topological sort for targets. I can copy-paste the topological sort and it should make more sense.

resolve_locally=True,
force=True,
install_optional=True,
frozen_lockfile=False)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

frozen_lockfile=False is really fishy... is the default behaviour of yarn install to ignore the lockfile...?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--frozen-lockfile=False is the default behavior of yarn install. We turn on frozen-lockfile for .pants.d installs because we want them to atomic. Outside of .pants.d, it is meaningful to produce a new lockfile such as the case of adding a new dependency.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. It makes some sense to align with Yarn's default. But this is still a pretty wacky default, IMO. It might be worth diverging from the default to be a bit more sane, but up to you.

@nsaechao

Copy link
Copy Markdown
Contributor Author

Thanks @stuhood for the reviews, I've expanded the documentations and got rid of the awkward invalidation check and replaced it with topological sort which is what I only intended to use.

@stuhood stuhood requested review from benjyw and removed request for jsirois September 4, 2018 18:08
@benjyw

benjyw commented Sep 4, 2018

Copy link
Copy Markdown
Contributor

This looks fine as far as my limited understanding permits, but to clarify, what is the difference between what the new task does and what the existing NodeResolve does?

@stuhood stuhood left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, no blockers that I see. One last round of feedback, and then will merge on green CI, unless Benjy has followup questions/feedback.

@@ -4,9 +4,8 @@ Hello!

You can also run one of the following command to greet with name:

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/run one of the/run the/

'--install-optional', type=bool, default=False, fingerprint=True,
help='If enabled, install optional dependencies.')
register(
'--production-only', type=bool, default=False, fingerprint=True,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an --install-optional flag, which implies that maybe this should be --install-production-only? Alternatively, could rename that flag to --optional and mark the old one deprecated? Example: https://github.com/pantsbuild/pants/blob/master/testprojects/src/python/plugins/dummy_options/tasks/dummy_options.py#L20-L21



class NodeInstall(NodeTask):
"""Installs a node_module target into the source directory

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first line of a task doc is the one that shows up in ./pants help $task, so replacing the word "source" with something more meaningful would be good. Example:

$ ./pants help list

list options:
Lists all targets matching the target specs.
..

round_manager.require_data(NodePathsLocal)

@classmethod
def supports_passthru_args(cls):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still relevant.



class NodePathsLocal(NodePathsBase):
"""Maps Node package targets to their local NODE_PATH chroot.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "local" bit is I think still not clear, as evidenced by Benjy's question.

Maybe you could have some/all of these docstrings refer to the docstring of NodeResolve, which you expanded?

class NodePathsLocal(NodePathsBase):
"""Maps Node package targets to their local NODE_PATH chroot.

NodePathsLocal is exposed as a product type for node-resolve-local.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still relevant.

pants working directory.

NodePathsLocal is similar to NodePaths with the difference that the resolved location
is local to the target source root.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"local to" doesn't have much meaning... could just say "next to", or refer explicitly to "the directory that the target is defined in".

resolver_for_target_type = self._resolver_for_target(target).global_instance()
resolver_for_target_type.resolve_target(self, target, vt.results_dir, node_paths)
node_paths.resolved(target, vt.results_dir)
if self.context.products.is_required_data(NodePathsLocal):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except when they've both been requested?

It's not critical, it's just unfortunate that ./pants node-install can never (re)use the work done by the private copy.

resolve_locally=True,
force=True,
install_optional=True,
frozen_lockfile=False)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. It makes some sense to align with Yarn's default. But this is still a pretty wacky default, IMO. It might be worth diverging from the default to be a bit more sane, but up to you.

@nsaechao

nsaechao commented Sep 6, 2018

Copy link
Copy Markdown
Contributor Author

@benjyw

This looks fine as far as my limited understanding permits, but to clarify, what is the difference between what the new task does and what the existing NodeResolve does?

Previously NodeResolve only exposed a single product NodePaths that contains a mapping of targets and PATHS to chroots that has all third party dependencies "installed" into the virtual pants working directory. With this change, the NodeResolve task exposes both the NodePaths and NodePathsLocal products as optional. The existing NodePaths product remains the same. The NodePathsLocal provides similar mapping of targets and PATHS but installs in the same directory as the target definition.

The stick is that, both products are exclusive and don't share any work done.

@nsaechao

nsaechao commented Sep 6, 2018

Copy link
Copy Markdown
Contributor Author

@stuhood: After some more thought and experimentation, setting --force=True by default doesn't make sense. My main concern would be that Yarn copies file dependencies rather than installs so it would not pick up local changes. This concern was unfounded as I will be wiring the install, bypassing yarn's interpretation of file:/ deps. I've removed the --force flag from node-install.

@benjyw benjyw left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, all seems reasonable as far as I can tell (which, admittedly, isn't very far...)

@stuhood stuhood left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Will merge on green CI.

@stuhood

stuhood commented Sep 7, 2018

Copy link
Copy Markdown
Member

Will merge this as soon as #6469 goes in. Thanks!

@stuhood stuhood merged commit 371504e into pantsbuild:master Sep 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants