Message originally sent by slack user U72S4H5OTOI
The path I’ve been trying to go down is to build up a graph, pull out the strongly connected components, and then try to find ways to split those up.
Message originally sent by slack user U72S4H5OTOI
So far I’ve tried iterating over every edge in an SCC and counting how many SCCs remain if I remove it. It definitely highlighted some edges that were load-bearing and shouldn’t have been.
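A minimal sketch of that edge-removal experiment, in JavaScript to match the import script later in this thread. The graph shape (an object mapping a model name to the names it points at) and the helper names `stronglyConnectedComponents` and `rankEdgesBySplit` are hypothetical and not taken from the original code. Tarjan's algorithm is used here as one standard way to find SCCs; the message above does not say which algorithm was actually used.

```javascript
// Assumed graph shape (for illustration only): { nodeName: [targetName, ...] }

// Tarjan's algorithm: returns an array of SCCs, each an array of node names.
function stronglyConnectedComponents(graph) {
  const index = new Map();
  const lowlink = new Map();
  const onStack = new Set();
  const stack = [];
  const components = [];
  let counter = 0;

  function visit(node) {
    index.set(node, counter);
    lowlink.set(node, counter);
    counter += 1;
    stack.push(node);
    onStack.add(node);

    for (const next of graph[node] || []) {
      if (!index.has(next)) {
        visit(next);
        lowlink.set(node, Math.min(lowlink.get(node), lowlink.get(next)));
      } else if (onStack.has(next)) {
        lowlink.set(node, Math.min(lowlink.get(node), index.get(next)));
      }
    }

    // `node` is the root of an SCC: pop the stack down to it.
    if (lowlink.get(node) === index.get(node)) {
      const component = [];
      let member;
      do {
        member = stack.pop();
        onStack.delete(member);
        component.push(member);
      } while (member !== node);
      components.push(component);
    }
  }

  for (const node of Object.keys(graph)) {
    if (!index.has(node)) visit(node);
  }
  return components;
}

// For every edge inside one SCC, remove it and count how many SCCs the
// component falls apart into. A higher count means a more load-bearing edge.
function rankEdgesBySplit(graph, component) {
  const members = new Set(component);
  // Restrict the graph to the component so other SCCs don't affect the count.
  const sub = {};
  for (const node of component) {
    sub[node] = (graph[node] || []).filter((target) => members.has(target));
  }

  const results = [];
  for (const from of component) {
    for (const to of sub[from]) {
      const without = { ...sub, [from]: sub[from].filter((t) => t !== to) };
      const sccCount = stronglyConnectedComponents(without).length;
      results.push({ from, to, sccCount });
    }
  }
  return results.sort((a, b) => b.sccCount - a.sccCount);
}

// Made-up example: three models in one cycle, plus one extra back edge.
const graph = {
  User: ["Account"],
  Account: ["Invoice"],
  Invoice: ["User", "Account"],
};
const [scc] = stronglyConnectedComponents(graph);
// Account -> Invoice ranks first: removing it leaves 3 separate SCCs.
console.log(rankEdgesBySplit(graph, scc));
```

The recursion is fine for schema-sized graphs; a very deep graph would need an iterative version.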
Message originally sent by slack user U7213XMGS3H
@stephan, incredible
Message originally sent by slack user U72TYO4E1TM
<@U7213XMGS3H> do you mind sharing the code for the steps that you mentioned (at least the rake task)? Thanks!
Message originally sent by slack user U7213XMGS3H
I just removed some company specific stuff, and so this literal code hasn’t been run, but it should work, or be very close
task :export_models => :environment do
  # Load every model class so that ApplicationRecord.descendants is complete.
  Rails.application.eager_load!

  def index_models
    models = ApplicationRecord.descendants.reject(&:abstract_class?)
    model_info = []
    errors = []

    models.each do |model|
      begin
        columns = model.columns.map do |column|
          {
            name: column.name,
            type: column.type.to_s
          }
        end

        associations = model.reflect_on_all_associations.map do |assoc|
          {
            name: assoc.name,
            type: assoc.macro,
            class_name: assoc.class_name,
            foreign_key: assoc.foreign_key,
            through: assoc.options[:through]
          }
        end

        model_info << {
          name: model.name,
          table_name: model.table_name,
          columns: columns,
          associations: associations
        }
      rescue => e
        # Record the failure and keep going with the remaining models.
        errors << {
          model: model.name,
          error: e.message
        }
      end
    end

    {
      models: model_info.index_by { |m| m[:name] },
      errors: errors,
    }
  end

  indexed_models = index_models
  File.write('models.json', JSON.pretty_generate(indexed_models[:models]))
  # Report skipped models instead of silently dropping them.
  indexed_models[:errors].each { |err| warn "Skipped #{err[:model]}: #{err[:error]}" }
end
Message originally sent by slack user U7213XMGS3H
Then here’s a docker-compose file for a blank neo4j
version: "3"
services:
  neo4j:
    image: neo4j:5.11.0
    container_name: neo4j
    ports:
      - "7474:7474"
      - "7667:7667"
      - "7687:7687"
    environment:
      NEO4J_ACCEPT_LICENSE_AGREEMENT: "yes"
      # This plugin is important for the analysis!
      NEO4J_PLUGINS: '["graph-data-science"]'
    volumes:
      - ./neo4j/data:/data
      - ./neo4j/logs:/logs
      - ./neo4j/import:/import
      - ./neo4j/neo4j.conf:/conf/neo4j.conf
Message originally sent by slack user U7213XMGS3H
Then you need to add dbms.security.procedures.unrestricted=gds.* to conf/neo4j.conf
Message originally sent by slack user U7213XMGS3H
Then I did this for getting the data into neo4j
Message originally sent by slack user U7213XMGS3H
I wrote it in Node because I’m still faster in JS
// Load the JSON written by the rake task above (it writes models.json).
const modelsJson = require("./models.json");
const neo4j = require("neo4j-driver");

const driver = neo4j.driver(
  "bolt://localhost:7687",
  neo4j.auth.basic("neo4j", "the password")
);

async function importSchema() {
  const session = driver.session();
  try {
    const tables = Object.values(modelsJson);

    // First pass: create every node, so the MATCH below can find targets
    // that come later in the file than the table pointing at them.
    for (const table of tables) {
      await session.run("CREATE (:Table {id: $name, tableName: $tableName})", {
        name: table.name,
        tableName: table.table_name,
      });
    }

    // Second pass: create one relationship per association.
    for (const table of tables) {
      for (const association of table.associations) {
        await session.run(
          `
          MATCH (source:Table {id: $sourceId}), (target:Table {id: $targetId})
          CREATE (source)-[r:ASSOCIATION]->(target)
          SET r.type = $type
          SET r.name = $name
          SET r.foreignKey = $foreignKey
          `,
          {
            sourceId: table.name,
            targetId: association.class_name,
            type: association.type,
            name: association.name,
            foreignKey: association.foreign_key,
          }
        );
      }
    }
  } finally {
    await session.close();
  }
}

importSchema().finally(() => driver.close());
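One thing worth checking before the import: the `MATCH` in the script only creates a relationship when both endpoints exist, so an association whose `class_name` is not a key in the exported JSON is skipped without any error. A polymorphic association is a likely example. Here is a small sketch that lists those cases. The helper name `findDanglingAssociations` and the sample data are hypothetical, and the input shape is assumed to match what the rake task writes.

```javascript
// Returns every association whose target class is missing from the export.
// Assumed input shape: { ModelName: { name, associations: [{ name, class_name }] } }
function findDanglingAssociations(models) {
  const known = new Set(Object.keys(models));
  const dangling = [];
  for (const model of Object.values(models)) {
    for (const association of model.associations || []) {
      if (!known.has(association.class_name)) {
        dangling.push({
          model: model.name,
          association: association.name,
          target: association.class_name,
        });
      }
    }
  }
  return dangling;
}

// Made-up sample data, not from a real schema.
const sample = {
  Post: {
    name: "Post",
    associations: [{ name: "author", class_name: "User" }],
  },
  Comment: {
    name: "Comment",
    associations: [
      { name: "post", class_name: "Post" },
      { name: "commentable", class_name: "Commentable" },
    ],
  },
};

console.log(findDanglingAssociations(sample));
```

Running it against the real `models.json` before importing shows how many edges would be missing from the graph, which matters when judging the community results.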
Message originally sent by slack user U7213XMGS3H
Once neo4j is up and running, I ran
call gds.graph.project('schema', 'Table', 'ASSOCIATION')
Message originally sent by slack user U7213XMGS3H
Then
call gds.labelPropagation.stream('schema')
yield nodeId, communityId as community
return gds.util.asNode(nodeId).id as Name, community
order by community, Name
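The `stream` call above only returns rows; it does not store anything on the nodes. The Neovis config later in this thread groups nodes by a `louvainCommunity` property, so something has to write that property first. Below is a hedged sketch, assuming GDS's write mode for the same algorithm (`gds.labelPropagation.write` with `writeProperty`) and reusing the property name from the viz config. The thread does not say whether that property originally came from label propagation or from Louvain. The helper `writeCommunities` is hypothetical, and it takes the session as an argument so it can be tried against a stub.

```javascript
// Assumption: `session` is a neo4j-driver session, or anything else with a
// compatible async run(query, params) method.
async function writeCommunities(
  session,
  graphName = "schema",
  property = "louvainCommunity"
) {
  const result = await session.run(
    `CALL gds.labelPropagation.write($graphName, { writeProperty: $property })
     YIELD communityCount
     RETURN communityCount`,
    { graphName, property }
  );
  // The real driver returns a neo4j Integer here, not a plain JS number.
  return result.records[0].get("communityCount");
}

// Usage with the real driver (needs a running Neo4j with GDS installed):
//   const neo4j = require("neo4j-driver");
//   const driver = neo4j.driver(
//     "bolt://localhost:7687",
//     neo4j.auth.basic("neo4j", "the password")
//   );
//   const session = driver.session();
//   writeCommunities(session)
//     .then((count) => console.log(`communities: ${count}`))
//     .finally(() => session.close().then(() => driver.close()));
```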
Message originally sent by slack user U7213XMGS3H
I found neo4j’s default graph tool wasn’t so great at visualizing the community results, so I used a JS library called neovis.js to do it
<!DOCTYPE html>
<html>
  <head>
    <style type="text/css">
      html,
      body {
        font: 16pt arial;
        margin: 0;
        padding: 0;
      }
      #viz {
        width: 100vw;
        background: grey;
        height: 100vh;
        font: 20pt arial;
      }
    </style>
  </head>
  <body onload="draw()">
    <div id="viz"></div>
    <script src="https://unpkg.com/neovis.js@2.0.2"></script>
    <script type="text/javascript">
      let neoViz;
      function draw() {
        const config = {
          containerId: "viz",
          neo4j: {
            serverUrl: "bolt://localhost:7687",
            serverUser: "neo4j",
            serverPassword: "pass",
          },
          interaction: {
            click: true,
            hover: true,
          },
          visConfig: {
            nodes: {
              shape: "dot",
            },
            edges: {
              arrows: {
                to: { enabled: true },
              },
            },
          },
          labels: {
            Table: {
              group: "louvainCommunity",
              [NeoVis.NEOVIS_ADVANCED_CONFIG]: {
                shape: "dot",
                function: {
                  title: NeoVis.objectToTitleHtml,
                },
              },
            },
          },
          relationships: {
            ASSOCIATION: {
              value: "weight",
            },
          },
          initialCypher: "MATCH (n:Table)-[r:ASSOCIATION]->(m) RETURN *",
        };
        neoViz = new NeoVis.default(config);
        neoViz.render();
      }
    </script>
  </body>
</html>
Message originally sent by slack user U7213XMGS3H
That’s all I got! LMK if this doesn’t work for you. I also might turn this into a blog post. I would have already if the results were like 25% better, so if you make any progress toward that and want to collaborate, LMK!
Message originally sent by slack user U72TYO4E1TM
thank you so much! I’ll give it a try next week