Using Graph Theory to Break Up Monoliths in Rails: Seeking Feedback and Suggestions

system · August 24, 2023, 5:01pm

This message was imported from the Ruby/Rails Modularity Slack server. Find more info in the import thread.

Message originally sent by slack user U7213XMGS3H

I’ve been thinking a lot about how to break up monoliths and I started wondering if Graph Theory could help. I’m motivated by the fact that large Rails monoliths are very entangled and it can be tricky to decide where the boundaries should go. Not knowing a great deal about Graph Theory, I gave myself a weekend to try to determine model clusters as a proxy for packs at my company. The results were very encouraging for a weekend, so I thought I’d share.

My steps were to:

1. Make a rake task that created a JSON file of all of our AR models and their relationships
2. Throw that into Neo4j
3. Use the Louvain community detection algo (because I didn’t want to specify in advance how many clusters to identify, I had a directed graph, I didn’t have weights in my graph, and I knew a fair amount about the domain. I think there’s lots of room to play here and figure out some weights or try different approaches.)
4. Play with the output…
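
The export step could look roughly like the sketch below. It assumes the standard ActiveRecord reflection API (`reflect_on_all_associations`, `klass`, `macro`, `polymorphic?`). The helper names `association_edges` and `graph_json` and the JSON shape are hypothetical, not the poster's actual rake task. Polymorphic associations are skipped because they have no single target class.

```ruby
require "json"

# Hypothetical helper: one edge per non-polymorphic association.
# `models` is any list of objects that respond to `name` and
# `reflect_on_all_associations`, e.g. ActiveRecord model classes.
def association_edges(models)
  models.flat_map do |model|
    model.reflect_on_all_associations.filter_map do |reflection|
      # A polymorphic belongs_to has no single target class, so skip it.
      next if reflection.polymorphic?

      {
        from: model.name,
        to: reflection.klass.name,
        macro: reflection.macro.to_s # "has_many", "belongs_to", ...
      }
    end
  end
end

# Hypothetical JSON shape (nodes plus directed edges); adapt it to
# whatever the graph database import expects.
def graph_json(models)
  JSON.pretty_generate({
    nodes: models.map(&:name).sort,
    edges: association_edges(models)
  })
end
```

Inside a rake task that depends on `:environment`, this might be called as `File.write("model_graph.json", graph_json(ApplicationRecord.descendants))` after `Rails.application.eager_load!`, assuming every model inherits from `ApplicationRecord`.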
The community sizes weren’t perfect:

[247,75,60,42,34,31,30,27,20,9,9,8,5,4,3,3,3,3,2,2,2,2,2,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]

247 models is too many for one pack, as is 75 or 60. But the medium sized ones were really interesting. They looked like they actually represented reasonable boundaries.

Has anyone tried anything like this before? Any ideas for me? I feel like this would be an incredible blog post if my clusters were ~30% better

stephan · August 24, 2023, 5:29pm

What <@U78EF4XFY8W> asked was also going to be my question: the 247 is probably a bunch of the biggest AR models, right?

system · August 24, 2023, 5:30pm

Message originally sent by slack user U7213XMGS3H

it’s the most tangled

system · August 24, 2023, 5:30pm

Message originally sent by slack user U7213XMGS3H

they’re not all big

stephan · August 24, 2023, 5:31pm

one thing to try: take only the 247 and remove everything else and do the analysis again. You’ll probably get another power-law distribution
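
A minimal sketch of "take only the 247 and remove everything else", assuming the graph is available as an array of `[from, to]` pairs and the big community as a list of model names. The helper name `induced_subgraph` and the model names are hypothetical.

```ruby
require "set"

# Keep only the edges whose endpoints are both inside `members`,
# i.e. the induced subgraph of one community.
def induced_subgraph(edges, members)
  keep = members.to_set
  edges.select { |from, to| keep.include?(from) && keep.include?(to) }
end

# Hypothetical model names, for illustration only.
edges = [
  ["User", "Account"],
  ["Account", "Invoice"],
  ["Invoice", "LineItem"],
  ["LineItem", "Product"]
]
big_community = ["User", "Account", "Invoice"]

# Edges that leave the community (Invoice -> LineItem and onward) are dropped.
sub_edges = induced_subgraph(edges, big_community)
```

The filtered edge list can then be fed back into the same community detection run.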

stephan · August 24, 2023, 5:31pm

I have done quite a bit of these kinds of analyses… today is busy, but I will remind myself to add more thoughts and experiments

system · August 24, 2023, 5:33pm

Message originally sent by slack user U7213XMGS3H

Thanks!

system · August 24, 2023, 5:33pm

Message originally sent by slack user U7213XMGS3H

When you’re less busy I’m very interested in the tools you used

system · August 24, 2023, 6:05pm

Message originally sent by slack user U7213U6VS93

I also wonder if we took out some of the “god object” ones, would we see some more localized clusters?
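
One way to try this, sketched under the assumption that a "god object" can be approximated by the highest total degree (incoming plus outgoing associations). The helpers `degrees` and `without_hubs`, the choice of `top_n`, and the model names are all hypothetical.

```ruby
# Total degree (incoming + outgoing) per node.
# edges: array of [from, to] pairs.
def degrees(edges)
  counts = Hash.new(0)
  edges.each do |from, to|
    counts[from] += 1
    counts[to] += 1
  end
  counts
end

# Drop the `top_n` highest-degree nodes and every edge touching them.
# Returns [removed_nodes, remaining_edges].
def without_hubs(edges, top_n)
  hubs = degrees(edges).max_by(top_n) { |_node, degree| degree }.map(&:first)
  remaining = edges.reject { |from, to| hubs.include?(from) || hubs.include?(to) }
  [hubs, remaining]
end

# Hypothetical model names, for illustration only.
edges = [
  ["User", "Account"],
  ["User", "Order"],
  ["User", "Comment"],
  ["Order", "LineItem"],
  ["Comment", "Post"]
]

# User has degree 3, more than any other node, so it is the one removed.
hubs, remaining = without_hubs(edges, 1)
```

Re-running the clustering on `remaining` shows whether smaller, more local communities appear once the hubs are gone.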

system · August 24, 2023, 6:15pm

Message originally sent by slack user U7213XMGS3H

I tried that! I didn’t spend as much time looking at the domains but they didn’t seem much better

system · August 24, 2023, 6:16pm

Message originally sent by slack user U7213XMGS3H

like maybe they were incrementally better but I don’t have any objective way to compare clusters and just scrolling through them they didn’t seem obviously any different
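
One candidate for an objective comparison is modularity, the score that Louvain itself tries to maximize. The sketch below assumes edges as `[a, b]` pairs and treats the graph as undirected and unweighted, which is a simplification of the directed graph described above. The function name `modularity` and the example nodes are hypothetical.

```ruby
# Modularity of a partition, treating every edge as undirected and unweighted.
# edges: array of [a, b] pairs; community_of: hash of node => community id.
# Computes the sum over communities of (internal / m) - (degree_sum / 2m)^2.
def modularity(edges, community_of)
  m = edges.size.to_f
  return 0.0 if m.zero?

  internal = Hash.new(0)   # edges with both ends in the same community
  degree_sum = Hash.new(0) # sum of node degrees per community
  edges.each do |a, b|
    ca = community_of.fetch(a)
    cb = community_of.fetch(b)
    degree_sum[ca] += 1
    degree_sum[cb] += 1
    internal[ca] += 1 if ca == cb
  end

  degree_sum.keys.sum do |c|
    internal[c] / m - (degree_sum[c] / (2 * m))**2
  end
end

# Illustrative graph: two triangles joined by a single bridge edge.
edges = [
  ["A", "B"], ["B", "C"], ["A", "C"],
  ["D", "E"], ["E", "F"], ["D", "F"],
  ["C", "D"]
]
split = { "A" => 0, "B" => 0, "C" => 0, "D" => 1, "E" => 1, "F" => 1 }
lumped = edges.flatten.uniq.to_h { |node| [node, 0] }

q_split = modularity(edges, split)   # 5/14, roughly 0.357
q_lumped = modularity(edges, lumped) # 0.0 when everything is one community
```

A higher score for one run than another on the same graph suggests denser, better separated clusters, though it says nothing about whether the clusters make sense as domains.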

system · August 24, 2023, 9:27pm

Message excluded from import.

system · August 25, 2023, 4:50pm

Message excluded from import.

stephan · August 25, 2023, 8:02pm

I regularly look up “efferent” when in threads with Scott … (efferent > outgoing, afferent > incoming)… 100% agree with Scott on the last point. Note that on top of not being in the aggregate, these outgoing connections tend to be of the worst kind: lots of has_many
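
To make the efferent/afferent distinction concrete, here is a small sketch that counts outgoing and incoming associations per model and tallies how many of the outgoing ones are `has_many`. It assumes edges are hashes with `:from`, `:to`, and `:macro` keys; that shape, the helper name `coupling`, and the model names are hypothetical.

```ruby
# edges: array of hashes like { from: "User", to: "Order", macro: "has_many" }
# Returns node => { efferent:, afferent:, has_many_out: } counts.
def coupling(edges)
  stats = Hash.new { |h, k| h[k] = { efferent: 0, afferent: 0, has_many_out: 0 } }
  edges.each do |edge|
    stats[edge[:from]][:efferent] += 1 # outgoing from the declaring model
    stats[edge[:from]][:has_many_out] += 1 if edge[:macro] == "has_many"
    stats[edge[:to]][:afferent] += 1   # incoming to the target model
  end
  stats
end

# Hypothetical model names, for illustration only.
edges = [
  { from: "User", to: "Order", macro: "has_many" },
  { from: "User", to: "Comment", macro: "has_many" },
  { from: "Order", to: "User", macro: "belongs_to" }
]

stats = coupling(edges)
# stats["User"] is { efferent: 2, afferent: 1, has_many_out: 2 }
```

Sorting models by `has_many_out` gives a rough list of the ones reaching furthest into the rest of the app.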

system · August 25, 2023, 8:07pm

Message originally sent by slack user U7213XMGS3H

I think I follow what you’re both saying but some of this terminology is new to me

system · August 25, 2023, 8:08pm

Message originally sent by slack user U7213XMGS3H

I may be overstating it… if you were to explain it to my toddler what would you say?

stephan · August 25, 2023, 9:32pm

ok. here goes nothing.

stephan · August 25, 2023, 9:32pm

(not sure if toddlers are into octopuses…)

stephan · August 25, 2023, 9:40pm

You know how most octopuses have eight arms that are about as long as your arm? That way they can feel the important things that are right around them. They live a pretty happy, normal life in the sea. Well, in the software sea there is an evil type of octopus… It is called go(d)ctopus (I am really trying here)… It grows hundreds of arms and they grow really long (like, all the way to school)… These arms are so long that they have to be thicker than normal octopus arms (to hold the weight)… And they tightly hold onto things far and wide. When these go(d)ctopuses grow, normal octopuses start to have a hard time moving around normally, because they constantly bump into these arms gripping everything. That makes most octopuses pretty unhappy, because they don’t like it when their software sea is totally entangled from all these arms…

system · August 25, 2023, 11:16pm

Message originally sent by slack user U72S4H5OTOI

I’m curious how different that would end up being from the Strongly Connected Components of the graph
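
For anyone wanting to try the comparison, Ruby's standard library can compute strongly connected components via `TSort`. The sketch assumes edges as directed `[from, to]` pairs; the wrapper name `strongly_connected_components` and the model names are hypothetical.

```ruby
require "tsort"

# edges: array of directed [from, to] pairs.
# Returns an array of components, each an array of nodes that can all
# reach one another by following edges.
def strongly_connected_components(edges)
  adjacency = Hash.new { |h, k| h[k] = [] }
  edges.each do |from, to|
    adjacency[from] << to
    adjacency[to] # touch the key so nodes with no outgoing edges are listed too
  end

  each_node = ->(&block) { adjacency.each_key(&block) }
  each_child = ->(node, &block) { adjacency[node].each(&block) }
  TSort.strongly_connected_components(each_node, each_child)
end

# Hypothetical model names, for illustration only.
edges = [
  ["User", "Account"], ["Account", "User"],
  ["Account", "Invoice"],
  ["Invoice", "Payment"], ["Payment", "Invoice"],
  ["Invoice", "Product"]
]

# Three components: User/Account, Invoice/Payment, and Product on its own.
components = strongly_connected_components(edges)
```

Note that when both sides of an association are declared (`has_many` on one model, `belongs_to` on the other), the two models have edges in both directions and so always land in the same component.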