Using Packwerk to Modularize a Large Monolith - Seeking Feedback and Suggestions

This message was imported from the Ruby/Rails Modularity Slack server. Find more info in the import thread.

Message originally sent by slack user U7832IJ3CGA

Hello friends. I’m a developer at Gusto. We’ve just begun our journey to modularize our large monolith. Unsurprisingly, getting started is very difficult. I thought I’d share a vision I have for using packwerk to iterate towards modularity with a repeatable developer workflow. I’d love feedback from this community about this idea. Thanks!

https://docs.google.com/document/d/16yR9LOFIDId3PAUqBQ7sE6q-dKjkAnpaKhZPh3Tg8b0/edit?usp=sharing (comments welcome!)

A little context for readers (this document was originally intended for an internal audience): Gusto has “team annotations” as comments at the top of all our Ruby files. However, these don’t really get used in many developer workflows, so there is low confidence that the annotations are “correct”. Additionally, the annotations are almost certainly not a reflection of the boundaries we want in components. But they’re a starting point we can use to get conversations going between teams!

Message originally sent by slack user U7832IJ3CGA

<@U783MOJYF8Z> and <@U78EWU4EQKZ> for this proposal to be implemented as is, I think there are two features I would need added to packwerk:

• The ability to identify an arbitrary list of files as being part of a package’s public API
• The ability to ask packwerk check to only evaluate a specified list of files.
Would you be open to these modifications?

Message originally sent by slack user U783MOJYF8Z

Hey Brian!

“The ability to identify an arbitrary list of files as being part of a package’s public API”

This we discussed previously. I would rather not support it in packwerk, because we want the structure of the package to be obvious to humans. If it’s arbitrary files, the only way to know whether something is public or not is to look through a potentially big list in the package manifest (package.yml).

What we were planning to do, but then shelved because we don’t currently need it, is to make the public folder configurable. Currently that’s always app/public, but it could be any folder. The only drawback is that, depending on your autoloading setup, you may need to create a corresponding namespace.

Would that help? If not, can you elaborate on why you want to specify an arbitrary list of files?

“The ability to ask packwerk check to only evaluate a specified list of files”

I’ll let <@U78EWU4EQKZ> answer this one because I’m not completely up to date on that aspect of packwerk.

Message originally sent by slack user U78EWU4EQKZ

You can run packwerk check for a folder like so: packwerk check bloop/blah. Assuming you want packwerk check to run on CI only for a specified list of files, perhaps you can specify their folders this way. I would be curious to understand why you’d only want the check to run on a subset of files and not the whole application, though.
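
To illustrate the folder-based workaround, here is a minimal sketch that reduces a list of files to the folders containing them and prints one packwerk check command per folder. The helper name folders_for and the file paths are hypothetical; only the packwerk check <folder> form comes from the message above.

```ruby
require "pathname"

# Hypothetical helper: reduce a list of files to the unique folders that
# contain them, so each folder can be passed to `packwerk check <folder>`.
def folders_for(files)
  files.map { |file| Pathname.new(file).dirname.to_s }.uniq.sort
end

# Example input; these paths are made up for illustration.
changed = [
  "app/models/payroll/run.rb",
  "app/models/payroll/tax.rb",
  "app/services/benefits/enroll.rb",
]

folders_for(changed).each do |folder|
  puts "packwerk check #{folder}"
end
```

Note that checking a folder also checks files in it that were not in the original list, so this is coarser than a true file-level check.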

Message originally sent by slack user U7832IJ3CGA

Both of your answers make perfect sense given what I understand about the state of Shopify’s monolith. You already have these folders in place, so that’s all Shopify needs to support. However, for others who are at an early stage of their modularization journey, file-level package definition would allow package boundary lines to begin being drawn without the very large cost of moving all the files around.

Does this make sense?

Message originally sent by slack user U7832IJ3CGA

Of course, the goal would be to eventually extract these packages into folders. However, we all know that it’s very easy to get the boundaries wrong at first. That’s why it’s so important to keep the cost of experimentation and change as low as possible.

Message originally sent by slack user U783MOJYF8Z

Package == folder is one of the basic assumptions that packwerk is based on.

Why is it expensive to move files around?

Message originally sent by slack user U7832IJ3CGA

“Why is it expensive to move files around?”

Rails conventions mostly. But maybe I’m wrong?

Message originally sent by slack user U783MOJYF8Z

It all really depends on how you want to partition your app into packages.

But what we did was essentially just add components/*/app/* to the autoload paths and then move files around.

The only drawback was that git got a little confused about the file history, but git log --follow tracks a file across moves.

Of course you don’t have to adopt the components pattern; you could also start by adding more autoload paths to your existing app folder (like app/some_package).
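
As a rough sketch of the components/*/app/* approach, the helper below collects the matching directories so they can be appended to the autoload paths. The function name component_autoload_paths and the example package names are assumptions, and this is not necessarily how Shopify wired it up.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: every components/*/app/* directory under root, sorted.
def component_autoload_paths(root)
  Dir.glob(File.join(root, "components", "*", "app", "*"))
     .select { |path| File.directory?(path) }
     .sort
end

# In a Rails app this could be wired up in config/application.rb, e.g.:
#   config.autoload_paths += component_autoload_paths(Rails.root.to_s)

# Demo against a throwaway directory tree (package names are made up).
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "components/payroll/app/models"))
  FileUtils.mkdir_p(File.join(root, "components/payroll/app/public"))
  puts component_autoload_paths(root).map { |path| path.delete_prefix("#{root}/") }
end
```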

Message originally sent by slack user U783MOJYF8Z

You only get into trouble with namespaces if the file path relative to the autoload path changes.
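
A simplified sketch of why that is: Rails autoloading derives the expected constant name from the file path relative to its autoload root, so a move that keeps the relative path keeps the constant. The helper expected_constant is hypothetical and ignores custom inflections and acronyms.

```ruby
# Simplified model of the path-to-constant convention: strip the autoload
# root and ".rb", then camelize each path segment and join with "::".
def expected_constant(file, autoload_root)
  relative = file.delete_prefix("#{autoload_root}/").delete_suffix(".rb")
  relative.split("/").map { |segment|
    segment.split("_").map(&:capitalize).join
  }.join("::")
end

# Same relative path before and after the move: the constant is unchanged.
before = expected_constant("app/models/payroll/tax_form.rb", "app/models")
after  = expected_constant("components/payroll/app/models/payroll/tax_form.rb",
                           "components/payroll/app/models")

# Relative path changed (the payroll/ folder was dropped): the expected
# constant changes too, so the class would need to be renamed.
flattened = expected_constant("components/payroll/app/models/tax_form.rb",
                              "components/payroll/app/models")

puts before     # Payroll::TaxForm
puts after      # Payroll::TaxForm
puts flattened  # TaxForm
```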

Message originally sent by slack user U783MOJYF8Z

I may be overlooking some problem though - curious to hear if the above makes sense to you.

Message originally sent by slack user U7832IJ3CGA

Best way to find out is to experiment. I’ll get back to you.

However, consider how much easier it would be to just define packages via a list of files. Would this be very difficult to implement in packwerk?

Message originally sent by slack user U783MOJYF8Z

I think it could probably be implemented without too much effort, but it doesn’t make sense to me. I imagine packages that are invisible to humans to be very confusing to use.

Can you provide an example for what that would look like?

Message originally sent by slack user U783MOJYF8Z

I mean what the structure of your application would look like, not the implementation in packwerk

Message originally sent by slack user U7832IJ3CGA

I 100% agree that packages that are invisible to humans don’t make sense. However, I think these “virtual packages” can be a useful intermediate state for large monoliths that don’t have the folder structure in place to use packwerk.

For those monoliths, “virtual packages” defined by individual files give teams an opportunity to start conversations about what the boundaries should be while keeping the cost of experimenting very, very low.

I’d encourage you to re-read my proposal and think about how packwerk could be used to enable these conversations. https://docs.google.com/document/d/16yR9LOFIDId3PAUqBQ7sE6q-dKjkAnpaKhZPh3Tg8b0/edit?usp=sharing

As these conversations happen, and people start to feel confident that the boundaries are correct, then it makes sense to do the work to extract the files into folders and make the package structure visible to humans.

In short, the hard work is deciding where to draw the boundaries. We should assume we’re not going to get this quite right at first. “Virtual packages” give us a forcing function to have these conversations and learn with a very very low cost.

Does that make sense?
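
To make the “virtual package” idea concrete, here is a minimal sketch of packages defined purely by file lists, with a check for references that cross a boundary into a non-public file. This is not a packwerk feature; the data structure, the helper names, and the file paths are all invented for illustration.

```ruby
# Hypothetical virtual package definitions: each package is a list of files
# plus the subset of those files treated as its public API.
VIRTUAL_PACKAGES = {
  "payroll" => {
    files: ["app/models/pay_run.rb", "app/services/run_payroll.rb"],
    public: ["app/services/run_payroll.rb"],
  },
  "benefits" => {
    files: ["app/models/plan.rb"],
    public: [],
  },
}.freeze

# Name of the package that lists the file, or nil if no package does.
def package_of(file, packages)
  packages.find { |_name, definition| definition[:files].include?(file) }&.first
end

# references is an array of [from_file, to_file] pairs. A reference is a
# violation when it crosses into another package's non-public file.
def privacy_violations(references, packages)
  references.select do |from, to|
    to_package = package_of(to, packages)
    to_package &&
      package_of(from, packages) != to_package &&
      !packages[to_package][:public].include?(to)
  end
end

references = [
  ["app/models/plan.rb", "app/services/run_payroll.rb"], # public API: allowed
  ["app/models/plan.rb", "app/models/pay_run.rb"],       # private file: violation
  ["app/services/run_payroll.rb", "app/models/pay_run.rb"], # same package: allowed
]

privacy_violations(references, VIRTUAL_PACKAGES).each do |from, to|
  puts "#{from} references private file #{to}"
end
```

Redrawing a boundary in this model means editing a list rather than moving files, which is the low cost of experimentation argued for above.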

Message originally sent by slack user U783MOJYF8Z

I agree that finding out where to draw the boundaries is the hardest part. I think we just have different assumptions about the cost of moving files around.

Message originally sent by slack user U7832IJ3CGA

Very reasonable. I’ll validate this and get back. Thanks so much for listening and great work with this project! Thank you!

Message originally sent by slack user U783MOJYF8Z

certainly it is nice to be able to experiment with the position of the boundary without having to refactor things

Message originally sent by slack user U783MOJYF8Z

I think bin/rails dev:generate_package_yml_from_team_annotations could even create the folder and move the files for you
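
A rough sketch of what the file-moving step might plan, under loudly stated assumptions: the annotation format ("# @team Name" within the first few lines) is a guess and not Gusto’s actual format, and the components/<package>/ destination follows the components pattern mentioned earlier in the thread. The sketch only computes the moves; it does not touch the filesystem.

```ruby
# ASSUMPTION: a team annotation looks like "# @team Payroll". The real
# annotation format is not given in this thread.
TEAM_ANNOTATION = /\A#\s*@team\s+(?<team>.+?)\s*\z/

# Team named in the first few lines of a file's source, or nil.
def team_for(source)
  source.each_line.first(5).each do |line|
    match = TEAM_ANNOTATION.match(line.chomp)
    return match[:team] if match
  end
  nil
end

# sources maps file path => file contents. Returns [from, to] pairs that
# keep each file's original path underneath components/<package>/.
def planned_moves(sources)
  sources.each_with_object([]) do |(path, source), moves|
    team = team_for(source)
    next unless team

    package = team.downcase.gsub(/[^a-z0-9]+/, "_")
    moves << [path, File.join("components", package, path)]
  end
end

# Example input; the files and teams are made up.
sources = {
  "app/models/pay_run.rb" => "# @team Payroll\nclass PayRun; end\n",
  "app/models/plan.rb"    => "# @team Benefits Admin\nclass Plan; end\n",
  "app/models/misc.rb"    => "class Misc; end\n",
}

planned_moves(sources).each { |from, to| puts "#{from} -> #{to}" }
```

Because each file keeps its path under app/, its path relative to an autoload root such as components/<package>/app/models stays the same, which avoids the namespace issue noted earlier.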

Message originally sent by slack user U7832IJ3CGA

Agreed. That’s what we’d do.