How to Hunt A Cheat
Knowledge Graphs in the Multibillion-Dollar Fight Against Fraud and Noncompliance
Few entities in the world capture as much data as the United States Department of the Treasury. Hidden among the hundreds of millions of financial filings it processes each year are crooks and cheats who account for billions of dollars in fraud and noncompliance.
Fighting back by manually sifting through financial reports became untenable for one of the agency’s nine bureaus. There weren’t enough agents to do the job by hand – and 5,000 were already on the case.
Leadership turned to GraphGrid’s Connected Data Platform (CDP) to automatically connect data elements and enrich them with contextual meaning. With their newly connected data, they identified and recovered $10 million lost to fraud within 30 days of going live.
To better understand the scope of this challenge, and the magnitude of recovering so much, so quickly, consider that just this one bureau annually handles over 225 million individual reports and over 68 million cases via phone or in person. Adding more agents would never provide enough help to unearth all the potential fraud and noncompliance hiding in the bureau’s records.
That proved to be a nagging problem, because Treasury and its various bureaus process forms and filings throughout the fiscal year. Without automated tools for spotting subtle signs of trouble, the bureau had no choice but to rely on each investigator’s knowledge to make educated guesses about which investigations might pay off with a recovery.
Think of it like searching for a needle in a haystack – when the haystack is petabytes of data that includes needles of varying shapes and sizes. You might find a needle, but you’ll be hunting in a wide area and may come up with nothing. It is a tedious and time-consuming task to find them without tooling that provides an advantage. GraphGrid CDP is like having a giant magnet that quickly pulls the needles to the surface.
In this case study, we’ll review how the bureau transformed the way investigators work by connecting their data with GraphGrid CDP.
By moving the responsibility of making key connections in the data from the investigators’ heads to the collaborative knowledge graph environment, CDP improved and automated investigations while adding capacity. It also created a new R&D-like process that, to this day, allows agents to rapidly test fraud pattern ideas without compromising existing casework.
Use knowledge graphs to add context to data, revealing patterns and accelerating insight.
At first, the bureau worked with GraphGrid-trained engineers through a systems integration partner to understand the contents of the data warehouse and which among the thousands of tables therein would be most relevant for conducting more efficient investigations.
Specifically, GraphGrid and its service partner set out to load data into a shared knowledge graph that could evolve. That way, as new types of investigations were identified and prioritized, the team could add new nodes (i.e., people, places, or things), edges (i.e., the types of relationships between nodes, such as the leader of a related organization), labels (i.e., categories of nodes, such as an organization), and properties (i.e., context for actions taken) describing the particulars of each new investigation.
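To make the four building blocks concrete, here is a minimal sketch of a property graph in plain Python. It is not GraphGrid CDP's API or storage model; the class names, identifiers such as `person:1`, the `LEADER_OF` relationship, and the `UnderReview` label are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    labels: set                                  # categories, e.g. {"Organization"}
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    rel_type: str                                # relationship type, e.g. "LEADER_OF"
    properties: dict = field(default_factory=dict)


class KnowledgeGraph:
    def __init__(self):
        self.nodes = {}                          # node_id -> Node
        self.edges = []                          # list of Edge

    def add_node(self, node_id, labels, **properties):
        # Re-adding an existing id merges labels and properties,
        # which is what lets the graph evolve over time.
        node = self.nodes.setdefault(node_id, Node(node_id, set()))
        node.labels |= set(labels)
        node.properties.update(properties)
        return node

    def add_edge(self, source, target, rel_type, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        edge = Edge(source, target, rel_type, properties)
        self.edges.append(edge)
        return edge

    def nodes_with_label(self, label):
        return sorted(n.node_id for n in self.nodes.values() if label in n.labels)


graph = KnowledgeGraph()
graph.add_node("person:1", ["Person"], name="Example Exec")
graph.add_node("org:1", ["Organization"], name="Example Org")
graph.add_edge("person:1", "org:1", "LEADER_OF", since=2015)

# A new type of investigation later adds a label and a property
# to an existing node, with no table redesign required.
graph.add_node("org:1", ["UnderReview"], opened="2020-01-01")
```

The point of the sketch is that nothing about the structure is fixed in advance: a new investigation type only needs new labels, relationship types, or properties.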
The initial design focused on capturing data to help solve the most urgent cases: investigations led by 100+ agents whose job was to identify criminal fraud.
Engineers responded by moving subsets of information from the most relevant tables, transferring the data into a knowledge graph that would dramatically improve agents’ ability to search and see connections.
GraphGrid CDP made it simple for the agents to start with one clue (a piece of data), quickly navigate to other clues, and visually see patterns in data. Before, for example, they’d query the data warehouse directly for an exact match using a strict numerical identifier. With CDP, they can now see that one particular exec was at the center of many transactions and had relationships with known or suspected fraudsters.
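The difference between an exact-match lookup and clue-to-clue navigation can be sketched with a small breadth-first traversal. This is an illustration only, not how CDP is implemented; the edge list, the identifiers (`exec:A`, `txn:1`, `person:X`), and the `SUSPECTED` set are made-up sample data.

```python
from collections import defaultdict, deque

# Hypothetical sample data: (source, relationship, target).
EDGES = [
    ("exec:A", "SIGNED", "txn:1"),
    ("exec:A", "SIGNED", "txn:2"),
    ("exec:A", "SIGNED", "txn:3"),
    ("exec:A", "ASSOCIATE_OF", "person:X"),
    ("exec:B", "SIGNED", "txn:3"),
    ("person:X", "SIGNED", "txn:4"),
]
SUSPECTED = {"person:X"}  # hypothetical list of known or suspected fraudsters


def build_adjacency(edges):
    """Treat relationships as undirected so a clue can be followed either way."""
    adj = defaultdict(set)
    for source, _rel, target in edges:
        adj[source].add(target)
        adj[target].add(source)
    return adj


def neighborhood(adj, start, max_hops):
    """Breadth-first search returning {node: hops from start} within max_hops."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue
        for neighbor in adj[current]:
            if neighbor not in hops:
                hops[neighbor] = hops[current] + 1
                queue.append(neighbor)
    return hops


adjacency = build_adjacency(EDGES)

# Start from a single clue (one transaction) and look two hops out.
reachable = neighborhood(adjacency, "txn:1", max_hops=2)

# The entity with the most direct connections is a candidate "center".
most_connected = max(reachable, key=lambda n: len(adjacency[n]))

# Which of its direct neighbors are already known or suspected?
flagged = sorted(SUSPECTED & adjacency[most_connected])
```

Starting from `txn:1`, the traversal reaches `exec:A` in one hop and that executive's other transactions and associates in two, which is the kind of connection an exact-match query on a single identifier would not surface.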
The team also inserted safeguards. Instead of granting all agents full authority over the knowledge graph, engineers created compartmentalized levels of access and control. In that way, CDP mirrored the bureau’s own internal security and audit protocols to ensure confidentiality and maintain legal due process.
The net result? Patterns that previously could only be identified by comparing documents and entering information on spreadsheets were suddenly in full view of investigating agents within a few keystrokes, according to their level of clearance.
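One simple way to picture compartmentalized access is to tag each node with a minimum clearance and filter the graph before showing it to an agent. The sketch below assumes that model; the clearance levels, the `min_clearance` property, and the sample nodes are hypothetical and do not describe the bureau's actual security scheme or CDP's access controls.

```python
from enum import IntEnum


class Clearance(IntEnum):
    # Hypothetical levels, ordered from least to most privileged.
    ANALYST = 1
    INVESTIGATOR = 2
    CRIMINAL_INVESTIGATOR = 3


# Hypothetical sample data: each node carries the minimum clearance needed to see it.
NODES = {
    "org:1": {"min_clearance": Clearance.ANALYST, "name": "Example Org"},
    "case:7": {"min_clearance": Clearance.CRIMINAL_INVESTIGATOR, "status": "open"},
}
EDGES = [("org:1", "SUBJECT_OF", "case:7")]


def visible_subgraph(nodes, edges, clearance):
    """Return the node ids and edges an agent with this clearance may see."""
    visible = {
        node_id
        for node_id, props in nodes.items()
        if props["min_clearance"] <= clearance
    }
    # An edge is shown only when both of its endpoints are visible,
    # so a relationship never leaks the existence of a hidden node.
    kept_edges = [e for e in edges if e[0] in visible and e[2] in visible]
    return visible, kept_edges


analyst_view = visible_subgraph(NODES, EDGES, Clearance.ANALYST)
criminal_view = visible_subgraph(NODES, EDGES, Clearance.CRIMINAL_INVESTIGATOR)
```

Filtering edges along with nodes matters: showing a `SUBJECT_OF` relationship to an agent who cannot see the case would reveal that the case exists.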
Recover millions in unpaid funds and exponentially increase the bureau’s ability to process cases.
As the knowledge graph grew, so did its ability to serve the broader team of agents searching for instances of fraud and noncompliance. But it was the initial success that kickstarted the imagination of bureau leadership and its agents. Just 30 days after going live with GraphGrid CDP, agents identified previously undiscovered violations leading to $10 million in recovery orders.
Since then, GraphGrid CDP has expanded from serving those first 100 criminal investigators to over 5,000 of the bureau’s agents. Several use cases have emerged as a result. For example, one team has found it particularly fruitful to study how enterprises are organized globally, looking for patterns that may reveal noncompliance or fraud.
Testing new ideas for patterns of fraud has also become easier. With GraphGrid CDP, analyst teams can isolate the signals they believe may indicate new types of fraud or noncompliance and then build a temporary knowledge graph to capture the relevant contextual data. This provides additional clues and reduces the time required to determine whether fraud exists, because the investigator is no longer piecing everything together in their head. Now, teams of investigators collaborate on tests lasting weeks or months. If the hypothesis proves correct, data from the temporary knowledge graph moves into the permanent graph and becomes part of the team’s ongoing caseload.
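The temporary-then-permanent workflow can be sketched as a scratch graph that is merged into the main graph only once a hypothesis holds up. This is an assumed, simplified model using plain dictionaries; the `promote` function, the `pattern_signal` property, and the sample identifiers are hypothetical and not part of any GraphGrid interface.

```python
import copy


def new_graph():
    # nodes: node_id -> properties; edges: set of (source, relationship, target)
    return {"nodes": {}, "edges": set()}


def promote(permanent, temporary):
    """Return a new graph with the temporary graph merged in.

    Neither input is modified, so existing casework stays untouched
    until the caller chooses to adopt the merged result.
    """
    merged = copy.deepcopy(permanent)
    for node_id, props in temporary["nodes"].items():
        merged["nodes"].setdefault(node_id, {}).update(props)
    merged["edges"] |= temporary["edges"]
    return merged


permanent = new_graph()
permanent["nodes"]["org:1"] = {"name": "Example Org"}

# Scratch graph built to test one fraud-pattern hypothesis.
scratch = new_graph()
scratch["nodes"]["org:1"] = {"pattern_signal": True}
scratch["nodes"]["org:2"] = {"name": "Example Subsidiary"}
scratch["edges"].add(("org:1", "OWNS", "org:2"))

# Placeholder for the outcome of weeks or months of investigation.
hypothesis_confirmed = True

permanent_after = promote(permanent, scratch) if hypothesis_confirmed else permanent
```

If the hypothesis fails, the scratch graph is simply discarded and the permanent graph is never touched.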
It’s a staggering change: instead of struggling to identify the highest-value cases and then applying manual resources to solve them, hoping for the best, the bureau now has an active R&D mechanism for proactively hunting for new schemes – all with GraphGrid CDP at the center. In each case, what used to take six months or more now rarely takes more than two hours – keeping investigators one step ahead of fraudsters.