You’ve just received a message from your data analyst and knowledge worker peers: Given the uncertain economy and rigid labor market for data analysts and specialist knowledge workers, they need to find new ways to solve data-intensive problems.
As part of the technical team there to support them, you get looped into a follow-up meeting. During this session, your peers never use the term “graph thinking” specifically, but they do mention how much value they’d get from being able to more easily understand how multiple data entities are connected and interact with each other. They probably don’t even know the spreadsheet they’re showing on their Zoom screen share or the conference room TV is exactly what’s holding them back. As they skillfully move between columns or worksheets to illustrate the important correlations they’ve already found to produce their insights, you start to see an opportunity.
The onus now falls on you to choose the right technology, make proper design decisions, find the right partners, and ultimately deliver a solution that helps your peers swiftly connect data entities to improve how they generate insights and make decisions on behalf of your organization.
Graph thinking: the mindset for solving complex problems faster
First, you need to understand that your peers aren’t just asking for a new business intelligence SaaS app that connects to your existing data lake and creates slightly different dashboards than they already had. They’re looking to apply different analytical processes and methods—an entirely new way of looking at your organization’s data—which requires you to change up how you process and store it.
That’s graph thinking—a mindset that recognizes the connectedness of your organization’s data and prioritizes understanding those relationships in context to solve specific business problems.
Graph thinking is a powerful paradigm shift because it recognizes how important context is to making big decisions. Instead of jumping back and forth between columns or worksheets for data entities they perceive to be connected, your peers can codify when and how entities are connected based on their type of relationship. With supportive technology, like a native graph database, your peers no longer need to worry about where data is stored, allowing them to focus exclusively on developing insights based on shared attributes and the broader web of connections.
If you want a deeper explanation of graph thinking, check out our complete explainer on what graph thinking is, how it’s different from data analysis of the past, and some powerful success stories based on our tangible experiences at GraphGrid with our leading composable graph + AI platform.
Accelerating graph and AI solutions for your data analyst and knowledge worker peers
As much as you might be tempted to start deploying new infrastructure now that you understand what your peers are looking for, successful graph thinking relies more on your design and platform choices before you build, integrate, and deploy a single service.
Some of these essential choices include:
What will your graph data model look like?
When you model graph data, you decide which entities in your dataset should become nodes, the fundamental units of a graph database, which are most often the people, places, and things relevant to your business. You also need to define how these nodes are connected by creating edges between them, add labels for relevant categories, and then add context with properties.
A graph database treats the connections between nodes with the same significance as the nodes themselves, which means your graph model influences how your data/knowledge peers explore the data, enrich it, and develop knowledge.
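To make those building blocks concrete, here is a minimal sketch of nodes, edges, labels, and properties in plain Python. It is an illustration only: the `Node`, `Edge`, and `Graph` classes and the sample data are hypothetical, and a real graph database manages storage and indexing very differently.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity in the graph: a person, place, or thing."""
    node_id: str
    labels: set                      # categories, e.g. {"Person"}
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A typed, directed connection between two nodes."""
    source: str                      # node_id of the start node
    target: str                      # node_id of the end node
    edge_type: str                   # e.g. "WORKS_AT"
    properties: dict = field(default_factory=dict)


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node_id, labels, **properties):
        self.nodes[node_id] = Node(node_id, set(labels), properties)

    def add_edge(self, source, target, edge_type, **properties):
        # Refuse edges that point at nodes the graph does not know about.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before connecting them")
        self.edges.append(Edge(source, target, edge_type, properties))

    def neighbors(self, node_id, edge_type=None):
        """Return ids of nodes reachable over one outgoing edge."""
        return [
            e.target
            for e in self.edges
            if e.source == node_id
            and (edge_type is None or e.edge_type == edge_type)
        ]


# Hypothetical sample data.
graph = Graph()
graph.add_node("p1", ["Person"], name="Ada")
graph.add_node("c1", ["Company"], name="Example Corp")
graph.add_edge("p1", "c1", "WORKS_AT", since=2021)
print(graph.neighbors("p1", "WORKS_AT"))  # prints ['c1']
```

Notice that the edge carries its own type and properties, just like a node does. That is the sense in which connections get the same significance as the entities they join.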
Here’s a simplified graph data model from a GraphGrid project to help the United States Department of the Treasury discover fraud:
Data modeling for a graph database is challenging for newcomers because there is no “starter kit” or formula you can follow. Unlike relational databases or spreadsheets, which let you add more columns and worry about querying/filtering later, a graph data model is only valid if you design it around the business problems your peers are trying to solve.
Always start with the business problems your peers are trying to solve, then work backward by thinking of your future data model as a whiteboard sketch. There will likely be certain entities or relationships you’ll want to highlight with bigger letters or a standout color—by prioritizing these in your data model as nodes and edges and enriching them with properties, you’ll start your leap to graph thinking on the right foot.
In the above example, the US Treasury was most interested in simplifying how analysts investigate fraud and non-compliance, which meant they wanted to prioritize people and organizations, with connections defined by ownership, business dealings, familial relationships, and more. Other data is then stored as properties to these nodes, which ensures analysts can query, search for, and discover the all-important context of a given node and its immediate connections.
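A model shaped like that one might be sketched as follows. Every name, relationship type, and property here is invented for illustration and is not taken from the actual Treasury project. The hypothetical `context_of` helper shows the kind of lookup an analyst cares about: a node's properties plus its immediate connections.

```python
# Hypothetical nodes: people and organizations, each with descriptive properties.
nodes = {
    "alice": {"label": "Person", "name": "Alice Example", "country": "US"},
    "bob": {"label": "Person", "name": "Bob Example", "country": "US"},
    "acme": {"label": "Organization", "name": "Acme Holdings", "registered": 2015},
    "shell": {"label": "Organization", "name": "Shell Co", "registered": 2022},
}

# Hypothetical edges: (source, relationship type, target).
edges = [
    ("alice", "OWNS", "acme"),
    ("alice", "SIBLING_OF", "bob"),
    ("bob", "OWNS", "shell"),
    ("acme", "TRANSACTED_WITH", "shell"),
]


def context_of(node_id):
    """Collect a node's properties plus every edge that touches it."""
    connections = []
    for source, rel, target in edges:
        if source == node_id:
            connections.append((rel, "out", nodes[target]["name"]))
        elif target == node_id:
            connections.append((rel, "in", nodes[source]["name"]))
    return {"properties": nodes[node_id], "connections": connections}


for rel, direction, other in context_of("acme")["connections"]:
    print(rel, direction, other)
# prints:
# OWNS in Alice Example
# TRANSACTED_WITH out Shell Co
```

Because ownership, family ties, and business dealings are all explicit edges, the question "who and what is this organization connected to?" becomes a single lookup instead of a hunt across worksheets.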
Getting your graph data model right isn’t a single stop, but a journey—don’t get caught up in aiming for perfection. Take a single day to model around one of the business problems and then let your peers get to work. As your organization learns more about graph thinking, and your peers discover other challenges they’d like to overcome using your data, you can add to and alter the topology of your graph data. Nothing is set in stone.
Can you use an open source data format?
You’ve almost certainly played around with the different formats available for spreadsheet data, like Excel Workbook (.xlsx), comma-separated values (.csv), and beyond. For the most part, the spreadsheet data model is simple enough that it’s relatively easy to import/export from one format to another without losing fidelity or features.
Graph data formats don’t have the same luxury. Because a graph database must natively store the connections between various nodes and treat those connections as equally important as the nodes themselves, a simple format of values and indexes won’t work.
Because of this complexity, many providers of graph databases and their associated technologies and platforms have developed proprietary data formats to support their unique features or optimize for specific qualities, like performance or flexibility.
When you use a platform with a proprietary data format, the results of your data model, and any additional connections your data/knowledge peers have made within your graph platform, become locked in. There’s no exporting those insights back to a standard data format if you take your graph thinking journey elsewhere.
When you choose a platform that uses open source data formats, you avoid that dreaded vendor lock-in. For example, GraphGrid’s composable graph + AI platform uses ONgDB as its open native graph database, which means you can always see what happens beneath the hood. You also keep open the opportunity to add or reconfigure the constituent parts of your platform, or your data model itself, as your graph thinking journey progresses.
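One way to picture what a portable graph export has to carry is the sketch below, which writes nodes and edges out as plain JSON records and reads them back. The layout is made up for illustration. It is not ONgDB’s storage format or any vendor’s export format.

```python
import json

# A spreadsheet-style row holds values only; nothing says how rows relate.
rows = [
    {"id": "alice", "name": "Alice Example"},
    {"id": "acme", "name": "Acme Holdings"},
]

# A graph export has to carry the connections as first-class records too.
graph_export = {
    "nodes": rows,
    "edges": [
        {"source": "alice", "target": "acme", "type": "OWNS", "since": 2015},
    ],
}

serialized = json.dumps(graph_export, indent=2)
restored = json.loads(serialized)

# The relationship survives the round trip alongside the node values.
assert restored == graph_export
print(restored["edges"][0]["type"])  # prints OWNS
```

The point of the round trip is that the relationships your peers add are data in their own right. If the only way to read them back is through one vendor’s tooling, they cannot leave with you.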
What kind of query language do you require?
The choice of data format also affects the graph query language you and your data/knowledge peers utilize to search for and filter your data. You should ask some tough questions about any graph database and what its query language supports:
- Is the language expressive and declarative, allowing users to focus on the data they want to retrieve instead of worrying about the joins/traversals required to do so?
- Can users filter and sort results based on node properties, edge types, and more?
- Is the syntax human-readable, especially to your less technical users, like business analysts?
- Can users easily write depth-based queries that traverse the graph?
- Does the language implement some or all of the Graph Query Language (GQL) standard, which would make queries and knowledge interoperable with multiple systems?
GraphGrid’s platform uses Geequel, an openCypher implementation, as its graph query language, allowing users to represent complex ideas with a few lines of code and simple syntax.
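To show what a depth-based query does behind the scenes, here is a small Python sketch of a depth-limited traversal. The ownership data and the `owned_within` function are hypothetical. The openCypher-style pattern in the comment is illustrative only and has not been checked against Geequel specifically.

```python
from collections import deque

# Hypothetical ownership edges: owner -> list of things owned.
OWNS = {
    "alice": ["acme"],
    "acme": ["subsidiary_a", "subsidiary_b"],
    "subsidiary_a": ["shell_co"],
}

# Roughly the openCypher-style pattern this sketch imitates (illustrative only):
#   MATCH (p {id: 'alice'})-[:OWNS*1..2]->(o) RETURN o.id


def owned_within(start, max_depth):
    """Breadth-first search that stops expanding after max_depth hops."""
    found = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not traverse past the requested depth
        for target in OWNS.get(node, []):
            if target not in seen:
                seen.add(target)
                found.append(target)
                queue.append((target, depth + 1))
    return found


print(owned_within("alice", 2))  # prints ['acme', 'subsidiary_a', 'subsidiary_b']
```

A declarative query language lets users state the one-line pattern and leaves the queueing and depth tracking to the database, which is why expressiveness and readability belong on your list of questions.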
What systems/services do you need to support the use case of your peers?
No single tool, service, or platform can enable graph thinking on its own. You need to launch an entire ecosystem, one you’ve intentionally and carefully designed to support your peers’ graph thinking, which likely includes all kinds of complex services, like:
- A graph database, data format, and querying language.
- Observability, like metrics and logs.
- Search and indexing features through a search and analytics engine like Elasticsearch.
- File storage for cloud, hybrid, or on-premises deployments.
- Security and authentication, both platform-level and role-based.
- Natural language processing, machine learning, and other AI services to develop models and process data.
- Front-end/visualization tooling for your data/knowledge peers to connect nodes and develop new insights.
Only now can you return to the idea of building that we introduced earlier. Are you willing to build, own, and maintain all these interconnected systems, or are you more interested in standing up a platform quickly and abstracting away all the complexity?
The cohesive, composable graph thinking platform from GraphGrid
The shortest path to supporting your peers with technical solutions for their graph thinking journey is GraphGrid’s composable graph + AI platform. It lets you freely and flexibly develop your data model as your organization’s grasp of graph thinking evolves and ensures you invest your valuable time in an open source data format. When you give your peers access to your composable graph + AI platform, they’ll immediately start to practice graph thinking, handle data-intensive problems faster, and generate novel insights that drive your business forward.
Better yet, GraphGrid doesn’t just offload all the complexity of designing, deploying, and maintaining the composable platform that makes your peers’ graph thinking possible. Our people partner with you on your entire graph thinking journey, starting with the complex-but-necessary questions detailed above.
Give yourself a solid foothold on this journey by getting started with a free solutioning session with our experts. We’ll listen to the data-intensive challenges your peers are looking to solve and the opportunities they want to pursue, probe into the data you’re already storing, and offer our expertise on the initial design and data considerations you need to maximize the value you get from graph thinking.