How can I use packs to limit the specs run on a PR/branch to only affected tests?

system · April 4, 2023, 8:02pm

This message was imported from the Ruby/Rails Modularity Slack server. Find more info in the import thread.

Message originally sent by slack user U729K0K9ZF5

Hey all, just getting into the ecosystem and excited about what I am seeing!

One thing I haven’t seen yet - has anyone used packs to limit the specs run on a PR/branch/etc. to only affected tests? I am thinking something like:
• Get file list from git diff comparing to origin/main
• Reduce that list to a list of packages with changes
• Get the list of all recursive dependencies of those packages (and add to the previous list)
• Run specs for only those packages (covered by https://github.com/rubyatscale/packs-rails#rspec-integration)
Thanks!!

Related: Suggestions may belong in a different channel (<#C01P94J78V6|future-of-packwerk>?), but it would be really powerful to generalize this ability similar to how affected works in nx (https://nx.dev/concepts/affected)

system · April 4, 2023, 8:04pm

Message originally sent by slack user U717GMJTWHJ

Alex can probably answer this better than me, but we definitely do that at Gusto… to some extent. Because our pack boundaries are far from perfect, we still run most packs in a CI run, but for draft PRs, we only run the specs for a subset of packs.

system · April 4, 2023, 8:10pm

Message originally sent by slack user U729K0K9ZF5

Thanks! Ya, I am imagining this as kind of a carrot for making progress on gradual modularity - if you improve your module boundaries you will get faster PR builds!

system · April 4, 2023, 8:14pm

Message originally sent by slack user U717GMJTWHJ

Yeah, for large apps, that’s a meaningful amount of time!

system · April 4, 2023, 8:30pm

Message originally sent by slack user U729K0K9ZF5

Do you happen to know if there is part of the Public packwerk API (https://github.com/Shopify/packwerk/blob/main/lib/packwerk.rb#L16) that easily gives you all dependents of a package? Looks like at min you can get the PackageSet of all packages, then filter by package.dependencies?

system · April 4, 2023, 8:30pm

Message originally sent by slack user U717GMJTWHJ

Not off the top of my head, sorry.

system · April 4, 2023, 8:32pm

Message originally sent by slack user U717GMJTWHJ

Yeah… that’ll give you declared dependencies.

system · April 4, 2023, 8:32pm

Message originally sent by slack user U717GMJTWHJ

Not sure how to get actual dependencies.

system · April 4, 2023, 8:54pm

Message originally sent by slack user U729K0K9ZF5

Right! Cool, will see if anyone else jumps on the thread and I will keep digging. Appreciate it!

AlexEvanczuk · April 5, 2023, 12:37pm

Hi <@U729K0K9ZF5>! Let me try answer some of your questions:
• Packwerk can only give the declared/stated dependencies for a single package
• At Gusto, during draft builds, we only run tests for the package(s) changed and the package(s) affected by that change, as determined by traversing the stated dependency list upwards (i.e. all ancestors can be affected by a descendant change)
• During non-draft builds, we do the same, except we use dependency violations AND stated dependencies as edges to determine what tests to run.
There isn’t an open source API as far as I know that can do this, but it’d be interesting if we open-source this. However in general, you’ll want to do something like this:

AlexEvanczuk · April 5, 2023, 12:37pm


require 'parse_packwerk' # Gusto open sourced this to make it easier to query packwerk YML files <https://github.com/rubyatscale/parse_packwerk>
files_that_changed = ['packs/my_pack/path/to/file.rb', 'path/to/other/file.rb'] # getting this list will be custom to your setup
packs_that_changed = files_that_changed.map{|f| ParsePackwerk.package_from_path(f) }.uniq
packs_to_test = []
packs_to_test += packs_that_changed

class Traverser
  def initialize(visited)
    @visited = visited
    @incoming_dependency_list = {}
    @incoming_violation_list = {}
    Packs.all.each do |ancestor|
      p.dependencies.each do |descendant_name|
        descendant = Packs.find(descendant_name)
        @incoming_dependency_list[descendant] = ancestor
      end

      p.violations.select(&:dependency).map(&:to_package_name).uniq.each do |descendant_name|
        descendant = Packs.find(descendant_name)
        @incoming_violation_list[descendant] = ancestor
      end
    end
  end

  def get_ancestors(pack, visited: [])
    return [] if visited.include?(pack)
    @visited << pack
    ancestors = []

    @incoming_dependency_list[pack].each do |ancestor|
      ancestors << ancestor
    end

    if !draft_build
      @incoming_violation_list[pack].each do |ancestor|
        ancestors << ancestor
      end
    end

    ancestors
  end
end

traverser = Traverser.new

packs_that_changed.each do |pack|
  packs_to_test += traverser.get_ancestors(pack)
end

run_tests_for(packs_to_test) # this will also be custom to your setup

AlexEvanczuk · April 5, 2023, 12:38pm

You’ll need to verify + write some tests for that – I didn’t execute it, but that’s the general idea of what we do.

AlexEvanczuk · April 5, 2023, 12:38pm

Really excited about other folks using packs and packwerk for conditional builds, so let me know how it goes and if I can help!

system · April 5, 2023, 1:59pm

Message originally sent by slack user U729K0K9ZF5

Amazing! Was not expecting code, so appreciate the detailed response!

we use dependency violations
This is from the CLI output? That is what I assumed from the statement that packwerk will only give declared deps.

Though, it seems if one set things up to fail a build on dep violations, the stated ones would be sufficient. At the trade-off of more rigidity, less “gradual-arity”.

AlexEvanczuk · April 5, 2023, 2:02pm

Not the CLI output per-se, but the package_todo.yml files generated by the CLI. “statement that packwerk will only give declared deps.” Which statement do ya mean exactly? Packwerk IMO doesn’t have great APIs to query YML information, but declared deps/violations are all queryable for sure.

the stated ones would be sufficient

They are different use cases. While a build will fail on new violations, the user can address them or add them as todos with bin/packwerk update . Regardless of their decision, the conditional build system will need to query these to get a more accurate picture of what tests to run. Ultimately whether to run conditional builds based on stated dependencies and/or violations is up to you, since they both have tradeoffs. Even when running with both, some things will still be missed (e.g. I can change something in a way packwerk doesn’t pick up that affects other packs), so you’ll want to make sure some environment runs all tests until you’re confident enough in your conditional build system.

system · April 5, 2023, 4:13pm

Message originally sent by slack user U729K0K9ZF5

Ah, right, sorry for the confusion. That is super clear. So it is more like:
• Make sure the CLI has updated package.yml files before reading them
• Get package dep info from yml files (declared + violations)
• Assume that deps may have been missed and run the full suite pre-prod (or other important environments)
Thanks again, really appreciate the detailed and clear answers!

AlexEvanczuk · April 5, 2023, 4:32pm

Yes that about sums it up! Happy to chat through more specifics (or general stuff too!) anytime

AlexEvanczuk · April 5, 2023, 5:42pm

I almost forgot: if any of your packs do not enforce dependencies, you may want to ALWAYS run those packs’ tests to be safe (again based on your own trade offs).

system · April 8, 2023, 3:20am

Message originally sent by slack user U7209900GOJ

I’ll have to see if I can dredge up some of my helpers - I did a refactor of our smallest app at work to Engines like 2y ago as an experiment, and got I think 4 engines running specs through GitHub actions based on which engine had files modified from master + all engines that depended on it. Not sure how easy it’d adapt to packwerk though.

jamie_ca · April 8, 2023, 10:13pm

Yeah, so definitely this is engine-specific, but https://gist.github.com/jamie/a23ac99f04ce4ec784699810558594ba
Theoretically should just need the logic in ComponentSupport#scan_dependencies updated to scan packs instead of gemspecs to build up the dependency graph, and then everything else flows to build up a tree of CI jobs that only run if (a) they or their dependencies changed, and (b) their dependencies build passed.