For 8 years I trained Consultants at A.T. Kearney in how to do analysis using Excel and Access. A big part of the theory is knowing which analysis to do in which tool. I have seen Consultants with MBAs from the best schools in the world take weeks doing analysis (poorly) in Excel when all of it could have been (and subsequently was) done in hours in Access. When you want to cut, slice and drill into large quantities of data, Excel just doesn’t cut it, even with vlookups, sumifs and arrays all over the shop. Even worse, I have seen people build pivot tables, copy & paste the output and then do analysis on that pasted data… only for something to change upstream, so that the whole cycle of copying & pasting values has to be repeated… a total nightmare.
One of the big tricks is to get the data schema and relationships right. What is the structure that you want to analyse? Setting up primary keys and ensuring they are all consistent across the entire model. The image below shows an overly simplified example of 5 tables, which form part of how to do bid analysis in the context of Strategic Sourcing.
My experience is that, in general, most people really struggle to get their heads around designing a relational database or writing SQL queries with complex joins.
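To give a flavour of what "getting the schema right" means in practice, here is a minimal sketch using Python's built-in sqlite3 module rather than Access. The three tables (supplier, item, bid), their columns and all the figures are hypothetical, invented purely for illustration; they are not the five tables from the bid analysis model described above.

```python
import sqlite3

# In-memory database; all table names, columns and figures are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce the references
conn.executescript("""
CREATE TABLE supplier (
    supplier_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE item (
    item_id     INTEGER PRIMARY KEY,
    description TEXT NOT NULL,
    volume      INTEGER NOT NULL
);
CREATE TABLE bid (
    supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id),
    item_id     INTEGER NOT NULL REFERENCES item(item_id),
    unit_price  REAL NOT NULL,
    PRIMARY KEY (supplier_id, item_id)  -- one bid per supplier per item
);
INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO item VALUES (10, 'Bolts', 1000), (11, 'Nuts', 500);
INSERT INTO bid VALUES (1, 10, 0.5), (1, 11, 0.25), (2, 10, 0.75), (2, 11, 0.125);
""")


def total_bid_by_supplier(connection):
    """Join bids to suppliers and items, then total price * volume per supplier."""
    query = """
        SELECT s.name, SUM(b.unit_price * i.volume) AS total
        FROM bid AS b
        JOIN supplier AS s ON s.supplier_id = b.supplier_id
        JOIN item AS i ON i.item_id = b.item_id
        GROUP BY s.name
        ORDER BY total
    """
    return connection.execute(query).fetchall()


totals = total_bid_by_supplier(conn)
print(totals)
```

Even in this toy case, the keys have to line up across all three tables before the join returns anything sensible, which is exactly the part most people find hard.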
So, having gotten my head around relational databases and feeling I had a small quantum of expertise in them, it was quite a surprise when the OrgVue team informed me that we would take inspiration from graph databases in order to solve the Org Design problem I had laid out. What were they talking about? And then we could make it schemaless!!! How? A journey of understanding. Graph databases are a totally different paradigm. There are nodes, edges, properties… what are they?
Starting with the Org Design Challenge
There is an org structure, which is hierarchical. The box (node in graphing speak) has reporting lines and can even have dotted lines (breaking that hierarchy from a simple tree). On top of this organisational structural view, I wanted to be able to define processes and accountabilities; objective trees and again, accountabilities… However, I couldn’t be certain which elements clients wanted to define. What about “Customer Data” in another hierarchy, e.g. markets, segments, location… ?
All this, apparently, was ideally solved by a graph.
So, what is a graph?
Wikipedia says: “A graph database uses graph structures with nodes, edges, and properties to represent and store data. By definition, a graph database is any storage system that provides index-free adjacency. This means that every element contains a direct pointer to its adjacent element and no index lookups are necessary.”
Let me break this down:
“Nodes represent entities such as people, business, accounts, or any other item you might want to keep track of.”
Think of a node as the “thing” or a “folder”. The folder contains information. For instance, if it is a person, then the information you may want to collect includes: Name, Age, Gender, Tenure, Nationality, a picture… All this information within the node is called the properties of the node. In the above Wikipedia example, Node with Id: 1 has name: Alice and Age: 18. Other nodes can have different types of properties. It just doesn’t matter. Knock yourself out and give each node as many properties as you like. Furthermore, not all nodes require the same properties to be defined, and what type of data you hold in each property is entirely up to you. Equally, don’t worry about how to link them, what the primary keys are, the relationships and referential integrity… those are not the constructs of the “Graphing World”.
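A minimal sketch of the idea in plain Python, assuming each node is nothing more than an id plus a free-form bag of properties. Node 1 (Alice, age 18) comes from the example above; the other nodes and every other property name are hypothetical, and this is not how OrgVue or any particular graph database stores its data.

```python
# Each node is an id mapped to whatever properties that node happens to have.
# Nodes 2 and 3 are hypothetical and deliberately carry different properties.
nodes = {
    1: {"name": "Alice", "age": 18},
    2: {"name": "Bob", "tenure_years": 4, "nationality": "British"},
    3: {"label": "Customer Data", "kind": "dataset"},
}

# Adding a property to one node needs no schema change anywhere else.
nodes[1]["nationality"] = "Irish"


def property_names(all_nodes):
    """Return every property name used by at least one node, sorted."""
    return sorted({key for props in all_nodes.values() for key in props})


def nodes_with(all_nodes, prop):
    """Return the ids of the nodes that define the given property."""
    return [node_id for node_id, props in all_nodes.items() if prop in props]


print(property_names(nodes))
print(nodes_with(nodes, "nationality"))
```

Nothing here declares up front which properties exist. The set of properties is simply whatever the nodes currently carry.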
The next key concept is edges. “Edges are the lines that connect nodes to nodes or nodes to properties and they represent the relationship between the two”.
Edges have values just like properties. For example the famous RACI & RAPID (or our simplified RAS) grids that define accountabilities for decisions or activities or… for a given set of roles. Other examples could be the percentage of their time that someone spends doing given activities (which is part of what we call the IAA, Individual Activity Analysis) or whether someone is in a given role. The range is endless.
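As a rough illustration, edges can be sketched in Python as (source, target, properties) triples. The people, activities, RACI letters and time shares below are all made up, and the property names ("type", "raci", "share") are assumptions chosen for this sketch rather than anything OrgVue defines.

```python
from collections import defaultdict

# Hypothetical edges: (source node, target node, properties on the edge).
edges = [
    ("Alice", "Approve budget", {"type": "accountability", "raci": "A"}),
    ("Bob", "Approve budget", {"type": "accountability", "raci": "R"}),
    ("Bob", "Prepare forecast", {"type": "time_spent", "share": 0.6}),
    ("Bob", "Supplier meetings", {"type": "time_spent", "share": 0.4}),
]


def who_holds(all_edges, activity, raci):
    """Return the people linked to an activity with the given RACI letter."""
    return [
        source
        for source, target, props in all_edges
        if target == activity and props.get("raci") == raci
    ]


def time_by_person(all_edges):
    """Sum the share of time each person spends across their activities."""
    totals = defaultdict(float)
    for source, _target, props in all_edges:
        if props.get("type") == "time_spent":
            totals[source] += props["share"]
    return dict(totals)


print(who_holds(edges, "Approve budget", "A"))
print(time_by_person(edges))
```

The same list holds two quite different kinds of relationship, and each edge carries only the values that make sense for it.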
The beauty, from my perspective at least, is that you can then slice the data by any property, edge or node. It is free form. For analysts it really feels liberating. It just works. Yes, like Excel or Access (or SQL or any other query language) there is stuff to learn. OrgVue has an Expression Language called “Gizmo”. It allows you to “traverse” the graph. It allows you to create new properties (such as sum of cost; ratio of x:y; roll-up of (x * y + z)… a topic for a much longer piece) or use many of the out-of-the-box properties such as depth; number of children (called outgoing count, e.g. in the context of design this is the Span of Control); is leaf or not (leaf doesn’t have any children, i.e. at the end of the tree).
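The derived properties mentioned above can be sketched in a few lines of plain Python. To be clear, this is not Gizmo and makes no claim about Gizmo's syntax; the org structure, the cost figures and the function names are hypothetical, and counting the root as depth 0 is an assumption of this sketch.

```python
# Hypothetical reporting tree: each manager maps to their direct reports.
reports = {
    "CEO": ["CFO", "COO"],
    "CFO": ["Analyst"],
    "COO": [],
    "Analyst": [],
}
# Hypothetical cost per role, in arbitrary units.
cost = {"CEO": 300, "CFO": 200, "COO": 180, "Analyst": 60}

# Reverse lookup: who does each node report into?
parent = {child: manager for manager, kids in reports.items() for child in kids}


def outgoing_count(node):
    """Number of direct reports (the Span of Control)."""
    return len(reports[node])


def is_leaf(node):
    """A leaf has no children, i.e. it sits at the end of the tree."""
    return outgoing_count(node) == 0


def depth(node):
    """Reporting lines between the node and the root (root is 0 here)."""
    steps = 0
    while node in parent:
        node = parent[node]
        steps += 1
    return steps


def rollup_cost(node):
    """Cost of the node plus the rolled-up cost of everyone beneath it."""
    return cost[node] + sum(rollup_cost(child) for child in reports[node])


for role in reports:
    print(role, depth(role), outgoing_count(role), is_leaf(role), rollup_cost(role))
```

The roll-up is the interesting one: it is defined once, recursively, and then works for any node at any level of the structure.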
As the above should be making clear, there is a whole new language surrounding this graphing paradigm. You have children (reports into Node) and descendants (all children, their children… until you get to the leaves). It all feels very organic. You can create and then explore. Like all things, it takes a bit of time to get used to. But getting used to it is pretty easy. My experience is most people I tried to train in relational databases just never got there. And now, finally, there is a way.
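The difference between children and descendants can be shown with a short traversal. This is a sketch in plain Python over a made-up structure, in which the Analyst reports into the CFO and also has a hypothetical dotted line into the COO, so the traversal remembers which nodes it has already visited.

```python
# Hypothetical structure: "Analyst" appears under both CFO and COO
# (a dotted line), so this is a graph rather than a simple tree.
reports = {
    "CEO": ["CFO", "COO"],
    "CFO": ["Analyst"],
    "COO": ["Analyst"],
    "Analyst": [],
}


def children(node):
    """Direct reports of a node."""
    return list(reports.get(node, []))


def descendants(node):
    """All children, their children, and so on, each listed once."""
    seen = set()
    ordered = []
    stack = list(reversed(children(node)))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already reached via another reporting line
        seen.add(current)
        ordered.append(current)
        stack.extend(reversed(children(current)))
    return ordered


def leaves(node):
    """Descendants that have no children of their own."""
    return [d for d in descendants(node) if not children(d)]


print(children("CEO"))
print(descendants("CEO"))
print(leaves("CEO"))
```

Without the `seen` set the Analyst would be counted twice, which is the kind of double counting a dotted line can introduce.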
As a footnote, where else do graphs and graph databases show up?
- Social networking sites like Facebook use them
- Amazon recommendations are built on one
- Your SatNav – roads are edges
There are lots of exciting examples of graphs. The idea of graphs is not new (just relatively new to me, but then again, I’m not a computer scientist… just an Economist turned Management Consultant turned Entrepreneur… so what do I really know). In fact, graphs and object-oriented databases have been around for ages. The relational database world grew out of the need to maintain the integrity of transactional data sets and to allow easy manipulation of these data. It clearly captured market share to the point where “people like me” didn’t know there was even another option or world. So there is another world. It feels organic. It is schemaless. It just works. Enjoy.