Using Graph Theory to Break Up Monoliths in Rails: Seeking Feedback and Suggestions

This message was imported from the Ruby/Rails Modularity Slack server. Find more info in the import thread.

Message originally sent by slack user U7213XMGS3H

I’ve been thinking a lot about how to break up monoliths and I started wondering if Graph Theory could help. I’m motivated by the fact that large Rails monoliths are very entangled and it can be tricky to decide where the boundaries should go. Not knowing a great deal about Graph Theory, I gave myself a weekend to try to determine model clusters as a proxy for packs at my company. The results were very encouraging for a weekend, so I thought I’d share.

My steps were to

  1. Make a rake task that created a JSON file of all of our AR models and their relationships
  2. Throw that into Neo4j
  3. Use the Louvain community detection algo (because I didn’t want to specify in advance how many clusters to identify, I had a directed graph, I didn’t have weights in my graph, and I knew a fair amount about the domain. I think there’s lots of room to play here and figure out some weights or try different approaches.)
  4. Play with the output…
    The community sizes weren’t perfect:
[247,75,60,42,34,31,30,27,20,9,9,8,5,4,3,3,3,3,2,2,2,2,2,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
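The steps above can be sketched roughly as follows. This is not the Neo4j setup from the post: the JSON shape (`from`, `to`, `type`) and the model names are hypothetical, the graph is treated as undirected and unweighted, and only the first (local moving) phase of Louvain is implemented, without the aggregation phase that full Louvain repeats on top of it.

```python
import json
from collections import defaultdict

# Hypothetical export: one record per association, declaring model -> target.
# The real rake task's JSON shape is not shown in the thread.
EXPORT = json.loads("""
[
  {"from": "Invoice", "to": "LineItem", "type": "has_many"},
  {"from": "Invoice", "to": "Payment",  "type": "has_many"},
  {"from": "Payment", "to": "LineItem", "type": "has_many"},
  {"from": "Ticket",  "to": "Reply",    "type": "has_many"},
  {"from": "Ticket",  "to": "Agent",    "type": "belongs_to"},
  {"from": "Reply",   "to": "Agent",    "type": "belongs_to"},
  {"from": "Ticket",  "to": "Invoice",  "type": "belongs_to"}
]
""")


def build_adjacency(records):
    """Undirected, unweighted view: direction and duplicates are dropped."""
    adj = defaultdict(set)
    for rec in records:
        a, b = rec["from"], rec["to"]
        if a != b:  # ignore self-referential associations
            adj[a].add(b)
            adj[b].add(a)
    return adj


def local_moving(adj):
    """Greedy modularity moves: the first phase of Louvain only."""
    two_m = sum(len(nbrs) for nbrs in adj.values())
    community = {n: n for n in adj}               # every model starts alone
    total_degree = {n: len(adj[n]) for n in adj}  # degree sum per community
    moved = True
    while moved:
        moved = False
        for node in sorted(adj):  # fixed order keeps the result repeatable
            k = len(adj[node])
            current = community[node]
            total_degree[current] -= k  # take the node out before scoring
            links = defaultdict(int)
            for nbr in adj[node]:
                links[community[nbr]] += 1
            # Gain of joining community c, up to a constant factor of 1/m.
            best = current
            best_gain = links[current] - total_degree[current] * k / two_m
            for c in sorted(links):
                gain = links[c] - total_degree[c] * k / two_m
                if gain > best_gain:
                    best, best_gain = c, gain
            community[node] = best
            total_degree[best] += k
            if best != current:
                moved = True
    groups = defaultdict(list)
    for n, c in community.items():
        groups[c].append(n)
    return sorted(sorted(g) for g in groups.values())


communities = local_moving(build_adjacency(EXPORT))
print(communities)
```

On this toy input the two tightly linked triangles end up as separate communities, with the single `Ticket` to `Invoice` association crossing the boundary.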

247 models is too many for one pack, as is 75 or 60. But the medium sized ones were really interesting. They looked like they actually represented reasonable boundaries.
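To make the "too big / interesting / too small" split a bit more concrete, here is a small sketch over the non-singleton sizes listed above. The 20 and 50 cut-offs are made-up thresholds for illustration, not numbers from the post.

```python
# Non-singleton community sizes copied from the list above; the long tail of
# single-model communities is left out.
SIZES = [247, 75, 60, 42, 34, 31, 30, 27, 20, 9, 9, 8,
         5, 4, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2]

# Hypothetical bounds for a "pack-sized" community (assumption).
MIN_PACK, MAX_PACK = 20, 50


def bucket(sizes, lo=MIN_PACK, hi=MAX_PACK):
    """Split community sizes into too big, pack-sized candidates, too small."""
    too_big = [s for s in sizes if s > hi]
    candidates = [s for s in sizes if lo <= s <= hi]
    too_small = [s for s in sizes if s < lo]
    return too_big, candidates, too_small


too_big, candidates, too_small = bucket(SIZES)
share_too_big = sum(too_big) / sum(SIZES)

print("too big:   ", too_big)
print("candidates:", candidates)
print("too small: ", too_small)
print(f"share of non-singleton models in oversized communities: {share_too_big:.0%}")
```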

Has anyone tried anything like this before? Any ideas for me? I feel like this would be an incredible blog post if my clusters were ~30% better.

What <@U78EF4XFY8W> asked was also going to be my question: the 247 is probably a bunch of the biggest AR models, right?

Message originally sent by slack user U7213XMGS3H

it’s the most tangled

Message originally sent by slack user U7213XMGS3H

they’re not all big

one thing to try: take only the 247 and remove everything else and do the analysis again. You’ll probably get another power-law distribution
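A minimal sketch of that re-run setup, assuming the graph is available as a plain list of `(from, to)` pairs and the big community as a set of model names (both hypothetical here): keep only the associations whose two ends are inside the community, then feed that smaller edge list back into whatever clustering was used.

```python
def induced_edges(edges, keep):
    """Keep only edges with both endpoints inside the chosen community."""
    keep = set(keep)
    return [(a, b) for a, b in edges if a in keep and b in keep]


# Hypothetical example data.
edges = [
    ("User", "Order"),
    ("Order", "LineItem"),
    ("User", "Ticket"),
    ("Ticket", "Reply"),
]
big_community = {"User", "Order", "LineItem"}

sub_edges = induced_edges(edges, big_community)
print(sub_edges)
```

Associations that cross the community boundary (here `User` to `Ticket`) drop out, so the second pass only sees structure internal to the big cluster.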

I have done quite a bit of these kinds of analyses… today is busy, but I will remind myself to add more thoughts and experiments

Message originally sent by slack user U7213XMGS3H

Thanks!

Message originally sent by slack user U7213XMGS3H

When you’re less busy I’m very interested in the tools you used.

Message originally sent by slack user U7213U6VS93

I also wonder if we took out some of the “god object” ones, would we see some more localized clusters?
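One way to try this, sketched under assumptions: call anything above a chosen degree a "god object", drop it together with its associations, and look at what stays connected. The `max_degree` threshold and the model names are hypothetical, and plain connected components stand in here for a real community detection run.

```python
from collections import defaultdict


def drop_hubs(edges, max_degree):
    """Remove nodes whose degree exceeds max_degree, plus their edges."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    hubs = {n for n, d in degree.items() if d > max_degree}
    kept = [(a, b) for a, b in edges if a not in hubs and b not in hubs]
    return hubs, kept


def connected_components(edges):
    """Undirected components; models left with no edges do not appear."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, result = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            node = stack.pop()
            comp.append(node)
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        result.append(sorted(comp))
    return result


# Hypothetical example: User touches everything.
edges = [
    ("User", "Order"), ("User", "Invoice"), ("User", "Post"), ("User", "Comment"),
    ("Order", "Invoice"), ("Post", "Comment"),
]
hubs, kept = drop_hubs(edges, max_degree=3)
components = connected_components(kept)
print(hubs, components)
```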

Message originally sent by slack user U7213XMGS3H

I tried that! I didn’t spend as much time looking at the domains but they didn’t seem much better

Message originally sent by slack user U7213XMGS3H

like maybe they were incrementally better, but I don’t have any objective way to compare clusters, and just scrolling through them they didn’t seem obviously any different
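One candidate for an objective comparison is modularity, the score Louvain itself tries to maximize. The sketch below computes it for an undirected, unweighted edge list and a given model-to-community mapping; the example data is hypothetical, and it assumes no duplicate edges or self-loops. A higher score only says the partition fits the graph structure better, not that the domains make more sense.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Q = sum over communities of (internal_edges / m) - (degree_sum / 2m)^2."""
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for a, b in edges:
        ca, cb = community_of[a], community_of[b]
        degree_sum[ca] += 1
        degree_sum[cb] += 1
        if ca == cb:
            internal[ca] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Hypothetical graph: two triangles joined by one association.
edges = [
    ("Invoice", "LineItem"), ("Invoice", "Payment"), ("Payment", "LineItem"),
    ("Ticket", "Reply"), ("Ticket", "Agent"), ("Reply", "Agent"),
    ("Ticket", "Invoice"),
]
split = {"Invoice": 0, "LineItem": 0, "Payment": 0,
         "Ticket": 1, "Reply": 1, "Agent": 1}
lumped = {name: 0 for name in split}

q_split = modularity(edges, split)
q_lumped = modularity(edges, lumped)
print(round(q_split, 3), round(q_lumped, 3))
```

Comparing the score before and after a change (removing god objects, adding weights) would give at least one number to look at besides scrolling through the clusters.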

Message excluded from import.

Message excluded from import.

I regularly look up “efferent” when in threads with Scott :smile: … (efferent = outgoing, afferent = incoming)… 100% agree with Scott on the last point. Note that on top of not being in the aggregate, these outgoing connections tend to be of the worst kind: lots of has_many
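For anyone else who has to look the terms up, a small sketch of counting both directions per model. It assumes each edge points from the model that declares the association to its target, which is a convention picked for this example; the models are hypothetical.

```python
from collections import Counter


def coupling(edges):
    """Return {model: (efferent, afferent)} = (outgoing, incoming) edge counts."""
    efferent = Counter(src for src, _ in edges)
    afferent = Counter(dst for _, dst in edges)
    models = set(efferent) | set(afferent)
    return {m: (efferent[m], afferent[m]) for m in models}


# Hypothetical edges: (declaring model, target model).
edges = [
    ("User", "Order"),    # User has_many :orders
    ("User", "Post"),     # User has_many :posts
    ("User", "Comment"),  # User has_many :comments
    ("Order", "User"),    # Order belongs_to :user
]

counts = coupling(edges)
for model, (out_count, in_count) in sorted(counts.items()):
    print(f"{model}: efferent={out_count} afferent={in_count}")
```

A model with a high efferent count dominated by `has_many` is the pattern described above.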

Message originally sent by slack user U7213XMGS3H

I think I follow what you’re both saying but some of this terminology is new to me

Message originally sent by slack user U7213XMGS3H

I may be overstating it… if you were to explain it to my toddler what would you say?

ok. here goes nothing.

(not sure if toddlers are into octopuses…)

You know how most octopuses have eight arms that are about as long as your arm? That way they can feel the important things that are right around them. They live a pretty happy, normal life in the sea. Well in the software sea there is an evil type of octopus… It is called go(d)ctopus (I am really trying here)… It grows hundreds of arms and they grow really long (like, all the way to school)… These arms are so long that they have to be thicker than normal octopus arms (to hold the weight)… And they tightly hold onto things far and wide. When these go(d)ctopuses grow, normal octopuses start to have a hard time moving around normally, because they constantly bump into these arms gripping everything. That makes most octopuses pretty unhappy, because they don’t like it when their software sea is totally entangled from all these arms…

Message originally sent by slack user U72S4H5OTOI

I’m curious how different that would end up being from the strongly connected components of the graph
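For comparison, a minimal sketch of strongly connected components using Kosaraju's two-pass algorithm on a directed edge list. The edges and model names are hypothetical. Unlike Louvain, this groups models only when each can reach the other by following association direction, so results would likely depend a lot on whether both sides of an association are exported as edges.

```python
from collections import defaultdict


def strongly_connected_components(edges):
    """Kosaraju's algorithm, iterative to avoid recursion limits."""
    graph, reverse, nodes = defaultdict(list), defaultdict(list), set()
    for a, b in edges:
        graph[a].append(b)
        reverse[b].append(a)
        nodes.update((a, b))

    # Pass 1: record nodes in order of DFS completion on the original graph.
    order, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:  # all successors explored
                order.append(node)
                stack.pop()

    # Pass 2: walk the reversed graph in reverse completion order.
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        comp, stack = [], [start]
        while stack:
            node = stack.pop()
            comp.append(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        components.append(sorted(comp))
    return sorted(components)


# Hypothetical directed associations (declaring model -> target).
edges = [
    ("User", "Order"), ("Order", "User"),           # declared on both sides
    ("Order", "Product"),                           # one-way reference
    ("Product", "Category"), ("Category", "Product"),
]
sccs = strongly_connected_components(edges)
print(sccs)
```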