May 15, 2020

Generating SQL Diagrams with Golang

My team maintains a database that has close to one hundred tables, with hundreds of foreign key relationships between those tables. I asked around on Slack, but nobody knew of any schema diagrams hidden away in our documentation folders. There are tools online that can prepare a diagram for you from a sql dump such as [dbdiagram.io]. I was worried about pasting my team’s DB schema into this web interface, so I embarked on generating the diagram myself.

Getting a sql dump

Before we can generate a diagram, we need to dump our SQL schema to a file. If you/your team are following good version control practices, you should already have a checked-in version of your schema as a .sql file, which you can use to recreate your production database.

If you can’t find this, run the following command to dump your schema to the file defs.sql:

mysqldump --no-data <DB_NAME> > defs.sql

Generating a table diagram in Golang

To get started with a new Golang project, create a new directory.

$ export PROJECT = $HOME/go/src/github.com/<YOUR_USERNAME>/sql-graphviz`
$ mkdir $PROJECT
$ cd $PROJECT

First, let’s define a Golang representation of our sql schema. Since we’re interested in exploring the relationships between tables, we can keep our representation simple and limited to tables and foreign keys between tables. Create a new file, schema.go, and define a struct for our tables. A String() method will assist with future debugging.

// schema.go
package main

// Table defines a subset of the sql table schema
type Table struct {
	name    string
    // columns maps from the column name to the column 
    // type (i.e. VARCHAR(11))
	columns map[string]string
}

func (t Table) String() string {
    cols := make([]string, 0)
    for k := range t.columns {
        cols = append(cols, k)
    }
    return fmt.Sprintf("<table name=%s, columns=%v>", t.name, cols)
}

Next define a struct for our foreign keys. A foreign key is the most common way of defining a relationhsip between two tables; it is a field “or collection of fields” in one table that refers to the PRIMARY KEY in another table. Include a String() method for debugging. We will also include an Equal() method for comparing relationKet.

// schema.go

// RelationKey defines one half of a foreign key (either the 
// child table or the referenced/parent table).
type RelationKey struct {
	table   string
	columns []string
}

// Equal checks that RelationKeys are defined over the same table and column sets.
func (rk RelationKey) Equal(rk2 RelationKey) bool {
    // Check if RelationKeys are over the same tables.
    if rk.table != rk2.table {
        return false
    }
    if len(rk.columns) != len(rk2.columns) {
        return false
    }

    // Check if column sets are the same.
    columnKeys := make(map[string]bool, 0)
    for columnName, columnType := range rk.columns {
        key := fmt.Sprintf("%v-%v", columnName, columnType)
        cols[key] = true
    }
    for columnName, columnType := range rk2.columns {
        key := fmt.Sprintf("%v-%v", columnName, columnType)
        cols[key] = false
    }
    for _, val := range columnkeys {
        if val {
            return false
        }
    }

    return true
}

// Relation is a directed mapping between sets of columns on 
// two distinct tables.
type Relation struct {
	a RelationKey
	b RelationKey
}

func (r Relation) Equal(r2 Relation) bool {
    return r.a.Equal(r2.a) && r.b.Equal(r2.b)
}

func (r Relation) String() string {
	return fmt.Sprintf("<%s (%v) -> %s (%v)>", r.a.table, r.a.columns, r.b.table, r.b.columns)
}

Finally, let’s add a schema interface. Separating out interface and implementation is a good practice for adding unit tests, and allows us to think through our ideal API before diving into the internals of the component. Our schema interface will also check calls for duplicate tables and invalid relations (i.e. between non-existent tables/columns), and throw a nice error message if the user tries to define an invalid schema.

// schema.go

// Schema contains a set of tables with columns and relations
// between those columns (foreign keys).
type Schema interface {
    AddTable(table Table) error
    AddRelation(a, b RelationKey) error
    String() string
}

Now we define a struct that implements our schema interface above. We map table names to table information and store our relations in a slice.

type schemaImpl struct {
    Tables map[string]Table
    Relations []Relation
}

// AddTable extends the schema with a new table. It validates
// that a table with the given name does not exist.
func (s *schemaImpl) AddTable(table Table) error {
	name := table.name
	if _, ok := s.Tables[name]; ok {
		return fmt.Errorf("table with name %s already exists", name)
	}
	s.Tables[name] = table
	return nil
}

// AddRelation extends the schema with a new relation/foreign
// key. It validates that the selected tables and columns exist 
// in the schema, and that the relation does not already exist.
func (s *schemaImpl) AddRelation(a, b RelationKey) error {
    // Check that both the tables exist.
	if _, ok := s.Tables[a.table]; !ok {
		return fmt.Errorf("table with name %s does not exist", a.table)
	}
	if _, ok := s.Tables[b.table]; !ok {
		return fmt.Errorf("table with name %s does not exist", b.table)
	}
    // Check that all the columns exist.
	for _, col := range a.columns {
        if _, ok := s.Tables[a.table].columns[col]; !ok {
			return fmt.Errorf("column with name %s does not exist on table with name %s", col, a.table)
		}
	}
	for _, col := range b.columns {
		if _, ok := s.Tables[b.table].columns[col]; !ok {
			return fmt.Errorf("column with name %s does not exist on table with name %s", col, b.table)
		}
    }
    newRel := NewRelation(a, b)
    // Check whether the relation already exists.
    for _, relation := range s.Relations {
        if relation == newRel {
            return fmt.Errorf("new relation already exists on schema %s", newRel.String())
        }
    }
	s.Relations = append(s.Relations, newRel)
	return nil
}

func (s *schemaImpl) String() string {
	tables := make([]string, 0)
	for name := range s.Tables {
		tables = append(tables, name)
	}
	return fmt.Sprintf("<tables=%v relations=%v>", tables, s.Relations)
}

// NewSchema is a simple factory/constructor for schemaImpl.
func NewSchema() Schema {
    return &schemaImpl{
        Tables: make(map[string]Table, 0),
        Relations: make([]Relation, 0),
    }
}