Best Practices for Modularization: When to Use Engines and Gems in Addition to Packwerk?

gpassero · August 27, 2022, 2:48pm

This message was imported from the Ruby/Rails Modularity Slack server. Find more info in the import thread.

I’ve read through @stephan’s books and, of course, the blog articles coming out of Shopify and Gusto. It seems like best practice in this area is still to leverage both packages (packwerk) and engines for modularization. Besides reuse in other projects, and given that packwerk has already been accepted as the primary modularization mechanism, is there another reason that keeps engines/gems on the table as a tool for some situations? What are those situations?

stephan · August 28, 2022, 12:47am

Given the capabilities of packwerk plus additional tooling, I think these days: Use gems+engines iff you want to publish them to a gem server.

system · August 29, 2022, 1:59am

Message excluded from import.

palkan · August 29, 2022, 7:37pm

I would say that choosing gems/engines vs packwerk-only is like choosing between strong and eventual consistency.

Engines/gems bring actual separation/isolation (which could be called strong modularization); packwerk-driven modularization still allows packages to “leak”, so, I consider it to be eventual modularization indicating that it’s an intermediate step towards engines.

gpassero · August 29, 2022, 9:25pm

@palkan Have you found a way to reduce/minimize the negatives of strong modularization via engines (e.g., dependency management confusion)?

palkan · August 29, 2022, 9:44pm

I have heard about the engine bloat problem (though I have never experienced it myself): you may end up creating a new engine every time you need to break a cyclic dependency that you didn’t expect when designing the initial schema. Re-arranging engines can be tough and dropping in a mediator is much easier, until you find yourself dealing with dozens of engines.

There could also be problems with dev tools that are not monorepo-compatible. For example, the VS Code Steep plugin doesn’t support monorepos yet.

system · August 29, 2022, 10:35pm

Message excluded from import.

AlexEvanczuk · August 30, 2022, 1:51pm

I consider it to be eventual modularization indicating that it’s an intermediate step towards engines.

Where I work (Gusto), we’ve been working on package_protections, which we have open sourced. The idea here is that packwerk can represent a “terminal state.” That is, we want packwerk to be able to solve all of the problems engines do and more, and more cheaply. I compared engines and packages in a Gusto blog post and in a conference talk.

AlexEvanczuk · August 30, 2022, 1:53pm

Yes, model classes in separate packages coupling to each other is one instance of a cyclic dependency.

Here’s a simple example:

# packs/users/app/models/user.rb
class User < ApplicationRecord
  has_many :blog_posts
end

# packs/blog_posts/app/models/blog_post.rb
class BlogPost < ApplicationRecord
  belongs_to :user
end

AlexEvanczuk · August 30, 2022, 1:54pm

We can also get cyclic dependencies any time code in two packages references each other in both directions. It doesn’t have to be the same code/files either.
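
As a rough illustration of the point above, here is a minimal sketch that finds groups of packages that reference each other cyclically. It is not how packwerk itself works; the graph, the `packs/billing` package, and the `cyclic_package_groups` helper are hypothetical names made up for this example. It relies only on Ruby's standard `TSort` module.

```ruby
require "tsort"

# Hypothetical dependency graph: package name => packages it references.
# The first two packs mirror the User / BlogPost example above.
PACKAGE_DEPENDENCIES = {
  "packs/users"      => ["packs/blog_posts"], # User has_many :blog_posts
  "packs/blog_posts" => ["packs/users"],      # BlogPost belongs_to :user
  "packs/billing"    => ["packs/users"]       # one-directional, not part of a cycle
}.freeze

# Returns each group of two or more packages that depend on each other cyclically.
def cyclic_package_groups(graph)
  each_node  = ->(&block) { graph.each_key(&block) }
  each_child = ->(node, &block) { graph.fetch(node, []).each(&block) }

  TSort.strongly_connected_components(each_node, each_child)
       .select { |component| component.size > 1 }
       .map(&:sort)
end

CYCLES = cyclic_package_groups(PACKAGE_DEPENDENCIES)
puts CYCLES.inspect
```

A strongly connected component with more than one member is exactly a set of packages where each can reach the others, which is what "bidirectional" references amount to at the package level.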

system · August 30, 2022, 1:57pm

Message excluded from import.

AlexEvanczuk · August 30, 2022, 2:01pm

why would cyclic dependencies be created between packages? It seems, on the surface, to evince a design problem that should otherwise be solved

For the same reason poor design is ever introduced: because it was the cheapest or most sensible option given the knowledge/constraints when the decision was made. These can very easily occur organically, and without something like packwerk, a cyclic dependency is largely invisible. In fact, if you have a single monolithic codebase, there is nothing to indicate something is cyclic, because everything is part of the same “mega-package.” Cyclic dependencies only show up conceptually once you declare (with packwerk or otherwise) that one subsystem belongs to package A and another subsystem belongs to package B.
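
To make that last point concrete, here is a small sketch, assuming a made-up table of constant references and two made-up ownership maps. The same code produces no cross-package edges when everything belongs to one package, and a bidirectional edge once ownership is declared. The `package_edges` helper is hypothetical and only illustrates the idea; it is not a packwerk API.

```ruby
require "set"

# Hypothetical constant-level references: referencing constant => constants it mentions.
REFERENCES = {
  "User"     => ["BlogPost"],
  "BlogPost" => ["User"]
}.freeze

# Translates constant references into package-to-package edges,
# dropping references that stay inside a single package.
def package_edges(references, owner_of)
  references.each_with_object(Set.new) do |(source, targets), edges|
    targets.each do |target|
      from = owner_of.fetch(source)
      to   = owner_of.fetch(target)
      edges << [from, to] unless from == to
    end
  end
end

# Everything in one "mega-package": no cross-package edges, so no visible cycle.
MONOLITH = { "User" => "app", "BlogPost" => "app" }.freeze

# Same code with declared ownership: edges now appear in both directions.
SPLIT = { "User" => "packs/users", "BlogPost" => "packs/blog_posts" }.freeze

puts package_edges(REFERENCES, MONOLITH).to_a.inspect
puts package_edges(REFERENCES, SPLIT).to_a.sort.inspect
```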

AlexEvanczuk · August 30, 2022, 2:04pm

why are engines more expensive than static analysis alternatives?

By static analysis alternatives I’m assuming you mean packwerk (which uses static analysis to determine if system boundaries are being violated).

There are a ton of reasons (this might make a good topic for us to discuss over Zoom!). Here are some general points:
• Cheaper to change. With packwerk, moving a file to a different package costs almost nothing (1 minute of work, very low risk). With engines, it can create a huge rabbit hole of changes when the other engine doesn’t have the dependencies that file requires.
◦ A huge part of discovering boundaries is getting it wrong. Packwerk allows you to state your ideal packages and then provides a template to get you there. This is just not possible with engines. Packwerk lets you state the ideal, adjust the ideal, and quickly make small, visible iterations towards the ideal boundaries. With engines, every step takes longer, and it forces you to solve modularization problems in areas of potentially low business value.
• Cheaper to create. With packwerk + stimpack, creating a package requires less time and boilerplate.
• Cheaper to maintain. Packwerk packages don’t need their own test helpers, their own gem dependency manifest (i.e. gemspec), and more. They share those of the overarching application.
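
To give a feel for how little a package needs compared with an engine, here is a minimal sketch that scaffolds one in a temporary directory. It assumes the `packs/<name>` layout from the example earlier in the thread and a `package.yml` with packwerk's `enforce_dependencies` and `dependencies` keys. The `create_pack` helper is hypothetical and does not reproduce any generator shipped with packwerk or stimpack.

```ruby
require "fileutils"
require "tmpdir"
require "yaml"

# Scaffolds a minimal package directory under `root` (hypothetical helper).
# There is no gemspec, no engine class, and no separate test helper:
# just a folder for code and a small manifest.
def create_pack(root, name, dependencies: [])
  pack_dir = File.join(root, "packs", name)
  FileUtils.mkdir_p(File.join(pack_dir, "app", "models"))

  manifest = {
    "enforce_dependencies" => true,
    "dependencies" => dependencies.map { |dep| "packs/#{dep}" }
  }
  File.write(File.join(pack_dir, "package.yml"), manifest.to_yaml)
  pack_dir
end

# Usage: create a blog_posts pack that declares a dependency on users.
Dir.mktmpdir do |root|
  pack_dir = create_pack(root, "blog_posts", dependencies: ["users"])
  puts File.read(File.join(pack_dir, "package.yml"))
end
```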

system · August 30, 2022, 2:09pm

Message excluded from import.

system · August 30, 2022, 2:41pm

Message excluded from import.

AlexEvanczuk · August 30, 2022, 2:50pm

I hear ya! If we could go back in time and do things differently, perhaps we would. On the other hand — perhaps our initial choices are what led to the success we have now. In a very real way, organizations will incur debt now for the purpose of faster growth. I’m sure there are ways to have our cake and eat it too (design things “the right way from the start” without having a hit to velocity), but hard to say.

I think there are two problems that have a lot of overlap, but perhaps should be solved independently:
1. You are starting a new Rails app (or technical organization in general): what patterns should you use from the start to prevent problems of these categories (scaling complexity/lifespan/number of contributions in the application)?
2. You already have a huge, entangled monolith. Knowing what you should have done is not helpful to you. How do we gradually and systematically move towards a well-factored system?

I’m talking about problem (2). I wonder if we should start a channel #greenfield-development to capture solutions to problem (1).

system · August 30, 2022, 2:55pm

Message excluded from import.

palkan · August 30, 2022, 4:50pm

Packwerk packages don’t need their own test helpers,

What about running package tests in isolation? (Without loading the whole monolith)

AlexEvanczuk · August 30, 2022, 4:56pm

Yes! That’s definitely a long-term plan. There are actually two categories of package test isolation:

1. When running tests, only load the code that your test depends on. I believe this is what we’re talking about.
2. Only run the tests that could break (conditional builds). That is, if I change A, only run CI tests for things that depend on A to reduce compute costs and time. Just wanted to include this for completeness.

At Gusto, we’ve begun exploring (2), but we haven’t yet explored (1). I believe Rails itself, with autoloading (that is, when eager loading is off), already does (1) automatically. That is, it loads certain categories of things (initializers, rspec helpers, and more) no matter what. However, it only loads application code when it’s referenced, so if your code (and its dependencies) don’t reference something, it won’t load.

For things that automatically load, it’s a matter of extracting those into something that doesn’t get loaded automatically, and only requiring/loading that as necessary. We haven’t done a ton of exploration into this as we believe there are a lot of prerequisites we need to focus on first to make this more valuable.
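
For category (2), the selection step can be sketched as a walk over the reversed dependency graph: start from the changed packages and collect everything that depends on them, directly or transitively. The graph and the `packages_to_test` helper below are hypothetical and are not a description of Gusto's tooling.

```ruby
require "set"

# Hypothetical graph: package => packages it depends on.
DEPENDENCIES = {
  "packs/users"      => [],
  "packs/blog_posts" => ["packs/users"],
  "packs/comments"   => ["packs/blog_posts"],
  "packs/billing"    => []
}.freeze

# Returns the changed packages plus every package that depends on them,
# directly or transitively, sorted by name.
def packages_to_test(dependencies, changed)
  # Invert the graph: package => packages that depend on it.
  dependents = Hash.new { |hash, key| hash[key] = [] }
  dependencies.each do |package, deps|
    deps.each { |dep| dependents[dep] << package }
  end

  # Breadth-first walk over the inverted graph.
  affected = Set.new
  queue = changed.dup
  until queue.empty?
    package = queue.shift
    next unless affected.add?(package) # skip packages already visited
    queue.concat(dependents[package])
  end
  affected.to_a.sort
end

puts packages_to_test(DEPENDENCIES, ["packs/users"]).inspect
puts packages_to_test(DEPENDENCIES, ["packs/billing"]).inspect
```

In this sketch a change to `packs/users` selects three packages, while a change to `packs/billing` selects only itself, which is where the savings in compute and time would come from.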