How a fast-growing fintech improved GDPR compliance with Atlan in hours, not months
At a Glance
- Tide, a UK-based digital financial institution with nearly 500,000 small business customers, sought to improve their compliance with GDPR’s Right to Erasure, commonly known as the “Right to be forgotten”.
- After adopting Atlan as their metadata platform, Tide’s data and legal teams collaborated to define personally identifiable information so that they could propagate those definitions and tags across their data estate.
- Tide used Atlan Playbooks (rule-based bulk automations) to automatically identify, tag, and secure personal data, turning a 50-day manual process into mere hours of work.
Tide, a mobile-first financial platform based in the UK, offers fast, intuitive service to small business customers. Data is crucial to Tide, having supported its incredible growth to nearly 500,000 customers in just eight years. But in financial services, data also presents acute risk and demands careful protection of sensitive financial information. These risks only grow as enforcement of GDPR increases, with nine-figure fines levied against offending companies in just the past few years.
Recognizing the immense opportunities presented by data, Tide’s CEO, Oliver Prill, recruited Hendrik Brackmann to build a data science team. “The ambition at that point wasn’t so much to build a data organization. It was about where we could use machine learning at Tide,” Hendrik shared, “but it quickly became clear that you can’t realize that if you don’t have a data platform.”
The journey toward data maturity was a daunting one. Initially reporting into the Finance team at Tide, the data platform team consisted of just two employees. It became Hendrik’s responsibility not just to develop an advanced data science team, but also to choose the right data platform technology, and to propose, build, and scale data and reporting teams.
“We looked very deeply into how our organization should look,” said Hendrik. “We made a lot of changes, from splitting roles between analytics engineers and analysts, to starting a data governance team.” Along with personnel growth and a more mature support model, Hendrik ensured that his team was aligned to business needs, delivering transformational solutions like a transaction monitoring system, support for revenue identification, and machine-learning-powered risk scoring.
In just four years, Hendrik grew the function to a team of 67 across data engineering, analytics, data science, and governance. It was during this time of extreme growth that Hendrik recognized room for improvement: “We grew very quickly, and we saw we weren’t as efficient as we thought.”
While Tide’s data team had matured by leaps and bounds, compliance remained a high priority for a regulated entity, and it demanded huge effort and attention. “The legal team rarely spoke with the engineering functions. It was a bit isolated,” Hendrik said.
Early Days of Data Governance
Recognizing that collaboration between legal and technical teams had to improve, Hendrik began searching for a data governance expert. He met Michal Szymanski, who would become Tide’s Data Governance Manager. “The initial idea was to hire Michal as a bridge to the privacy function,” Hendrik remarked.
Michal joined Tide as a one-man team. “My scope of responsibilities increased a lot,” said Michal. “I had to deal with a vast array of challenges, starting from understanding where data governance could help in such an organization.” He began by trying to understand his stakeholders’ needs. “I had to start by interviewing many people across different business areas to understand what they needed.”
Founded in 2016, Tide had little of the technical debt or legacy technology that typically burdens traditional financial services organizations. Their data stack consisted of dbt, Airflow, and Snowflake, with Looker downstream as their Business Intelligence (BI) layer. While Tide had invested in the right technology, Michal found that his colleagues struggled to understand how data traveled across their stack.
Hendrik saw this challenge as an opportunity for growth.
We wanted to embed data security and privacy into our working processes, rather than discussing it at the end of projects.
By combining Michal’s new governance function, an understanding of data lineage, and common definitions of data, they could achieve the collaboration they had been missing.
Hendrik and Michal began searching for a solution. Summarizing the path forward, Michal explained, “We needed to have a platform where we could put all such interesting information to help users navigate the data that we have. So my first task was to identify a data catalog.”
Adding a Context Layer
After a thorough evaluation of the market, Hendrik and Michal chose Atlan as their data catalog.
[Atlan] integrated seamlessly with all of our tools, and we felt it was very easy to use.
Starting with a few key problem statements, Tide implemented Atlan to improve data discovery, visibility, and governance in the short term, and to democratize data access and understanding in the long run. To start, Hendrik ensured that Atlan was properly integrated with their data stack and was capturing all relevant metadata.
With Atlan, technical and non-technical users could find the right data asset for their needs, quickly and intuitively, reducing the time it once took to find, explore, and use data across tools like Snowflake, Looker, and dbt. Using Atlan’s data glossary and metrics, Tide began to enjoy better context surrounding their data domains, which set the stage for standardizing classifications of sensitive data like personally identifiable information. And finally, Atlan’s automated lineage added transparency so Hendrik’s team could understand where data came from, how it transformed throughout the data pipeline, and where it was ultimately consumed, something they couldn’t do before.
Tide grew to use Atlan to support a wide array of users and business units, from Legal and Privacy, to Data Science, Engineering, Governance, and BI colleagues. With improved context, greater trust in data, and democratized access to Tide’s data, Hendrik began to consider new use cases: “We were looking to identify how we could drive process efficiencies in our analytics and engineering teams.”
With a 360-degree view of their data estate, the stage was set for Hendrik’s team to build broader, more mission-critical solutions.
The GDPR Challenge
After using Atlan to better understand their data estate, Hendrik’s team was ready to support a crucial use case.
“Like every company, we need to be compliant with GDPR,” said Michal. A key component of GDPR compliance is the right to erasure, more commonly known as the “Right to be forgotten”, which gives Tide’s customers across the European Union and the UK the right to ask for their personal data to be deleted.
Tide’s data team understood these obligations well, but the process of compliance was difficult.
Our production support team had a script, and whenever someone wanted to delete data, they would go through our back-end databases and delete personal data fields.
And while the support team’s script handled a significant amount of data deletion, manual effort was needed to find and delete data that persisted elsewhere, in secondary systems that held local projections of the personal data fields. Michal explained, “The process was not capturing data from all the new sources that kept appearing in the organization, just the key data source.”
Complicating this challenge was a lack of shared definitions of personal data, with differing opinions on what constituted personally identifiable information across organizations from Legal to IT. This meant that completing the “Right to be forgotten” process involved constantly re-litigating definitions.
While Tide was doing its best to comply with GDPR, the compliance process took only more time and effort as its technology stack and architecture grew more complicated, new products and services were launched, and its customer base grew.
Automating this process became a priority. In an ideal world, when a customer exercised their right to be forgotten, a single click of a button would automatically identify and delete or archive all data about the customer in accordance with GDPR. Immense manual effort, and the risk of delays or human error, would be eliminated.
That’s exactly what Hendrik set his team to do.
Driving Common Understanding
Before pouring resources into solving the problem, Hendrik and Michal needed to justify the effort to their colleagues. “It required detail to be presented to senior leaders in order to decide that we would invest time and money in solving such a problem,” said Michal. “That was crucial, because no one really wants to invest unless it means some increase of revenue or cost savings. We said we can avoid fines and we can make sure the company is handling personal data at a high level.”
The case was so strong that solving the problem became a team OKR. With their goal in hand, Hendrik asked his team to understand the problem in greater detail: “The very first step was to figure out where we had this kind of data, then identifying ownership.”
In his role as a bridge between the data team and its business counterparts, Michal worked with the Legal team to determine what did or didn’t constitute personal data. And to ensure the teams were collaborating smoothly, Hendrik established a cross-functional working group. “It’s just getting the right people in a room and then getting them to talk,” said Hendrik. “Our biggest contribution was bringing people together and keeping them focused.”
By bringing technical teams and domain experts together, Hendrik ensured every voice was heard and that his team remained focused on collaboratively delivering value, rather than on arcane technical concepts. Recalling an example of how strongly the team collaborated, Hendrik shared, “We had our privacy lawyer on the call when we discussed architecture. He could answer any questions that might come up immediately.”
With these definitions in hand, Hendrik and Michal began comparing them against existing documentation and processes. “There were a couple of places where different people were trying to list personal data. So the front end team did this, and the back end team did that. Some product managers did the same, and they weren’t consistent,” Michal explained.
Further, while his colleagues had a good command of their data, they often had trouble communicating the data’s definitions, a key part of good data governance. Oftentimes, column names would serve as definitions. “In many cases, it was not precise enough,” said Michal.
With clear misalignment, Tide needed more precise documentation and process. Atlan provided a straightforward way to solve this challenge. Hendrik’s team would take what they learned from their research (including new definitions of personal data, opportunities for improvement, and owners of data) and document it once and for all in their catalog.
We said: Okay, our source of truth for personal data is Atlan. We were blessed by Legal. Everyone, from now on, could start to understand personal data.
From 50 Days to 5 Hours
With their data estate integrated with and made navigable by Atlan, Tide used automated lineage to quickly and easily determine where personally identifiable data lived, and how it moved through their architecture. Starting by identifying the columns and tables where personal data persisted, the team then used Atlan to track it downstream.
Michal explained just how helpful lineage was to the team: “This was very helpful. It showed us how much data we have in our data warehouse, and then we could also extrapolate this to the upstream sources of Snowflake. We knew we had it in Snowflake because it’s coming from this and this database. So we informed the teams that they had a lot of personal data and we needed to come up with a design.”
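The downstream tracing described above can be pictured as a breadth-first walk over a lineage graph. The sketch below is illustrative only: it does not use Atlan's API, and the `LINEAGE` mapping, the asset names, and the `downstream_assets` helper are all hypothetical.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the assets listed in its value.
# Every asset name here is invented for illustration.
LINEAGE = {
    "backend.customers.email": ["snowflake.raw.customers.email"],
    "snowflake.raw.customers.email": [
        "snowflake.analytics.dim_customer.email",
        "snowflake.analytics.marketing_contacts.email",
    ],
    "snowflake.analytics.dim_customer.email": ["looker.customer_dashboard.email"],
    "snowflake.analytics.marketing_contacts.email": [],
    "looker.customer_dashboard.email": [],
}


def downstream_assets(start, lineage):
    """Return every asset reachable downstream of `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:  # guard against revisiting shared or cyclic paths
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


if __name__ == "__main__":
    for asset in downstream_assets("backend.customers.email", LINEAGE):
        print(asset)
```

Starting from one column known to hold personal data, the walk lists every copy that an erasure request would also need to reach, which is the kind of visibility the team lacked when deletion relied on a single script.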
Next, Hendrik’s team decided to properly tag personally identifiable data, and add their newly determined definitions. Assets stored in Snowflake, like account numbers, email addresses, phone numbers, and more, would be searchable, but properly secured and masked in the Atlan UI.
While valuable, the manual effort involved was daunting. Michal explained, “People would have to go into the databases and try to translate my list of personal data elements. There were 31 elements to find in our databases, and we have more than 100 schemas, each with between 10 to 20 tables. So it would be a lot of work to identify it.”
Making assumptions about which schemas might contain personally identifiable information could save time, but this wasn’t an option. The risk involved meant Michal and his team had to be precise, searching and tagging location by location, or it would prove costly.
If we were very diligent and did it for every schema, then it would probably be half a day for each schema. So half a day, 100 times.
After discussing this scope with the Atlan professional services team, Michal learned about Playbooks, a feature unique to Atlan. Instead of spending 50 days manually identifying and then tagging personally identifiable information, Tide could use Playbooks to identify, tag, and then classify the data in a single, automated workflow.
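To make the idea of rule-based bulk tagging concrete, here is a minimal sketch of matching column names against a list of patterns and attaching tags in one pass. It is not how Playbooks are implemented or configured: the `RULES` patterns, tag names, column names, and the `tag_columns` helper are assumptions made up for this example.

```python
import re

# Hypothetical rules: a tag name mapped to a column-name pattern.
# A real rule set would be derived from the agreed list of personal data elements.
RULES = {
    "PII: Email": re.compile(r"e[-_]?mail", re.IGNORECASE),
    "PII: Phone": re.compile(r"phone|mobile", re.IGNORECASE),
    "PII: Account Number": re.compile(r"account[-_]?(no|num|number)", re.IGNORECASE),
}

# Invented fully qualified column names in schema.table.column form.
COLUMNS = [
    "payments.accounts.account_number",
    "payments.accounts.created_at",
    "crm.contacts.email_address",
    "crm.contacts.mobile_phone",
]


def tag_columns(columns, rules):
    """Map each column to the sorted tags whose pattern matches its name."""
    tagged = {}
    for column in columns:
        name = column.rsplit(".", 1)[-1]  # match on the column name only
        tags = sorted(tag for tag, pattern in rules.items() if pattern.search(name))
        if tags:
            tagged[column] = tags
    return tagged


if __name__ == "__main__":
    for column, tags in tag_columns(COLUMNS, RULES).items():
        print(column, tags)
```

Matching on names alone is a simplification. As Michal noted, column names are often not precise enough to serve as definitions, so rules like these only work once the definitions have been agreed and reviewed.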
Hendrik’s team was prepared to spend 50 days of effort on a task that would make clear improvements to Tide’s risk profile. But after integrating their data estate with Atlan and driving consensus on definitions, they used Playbooks’ automation to accomplish their goal in mere hours. Michal explained, “It was basically just a few hours to discuss what we needed.”
After saving nearly 50 days of work, Tide can now make further improvements to their process, far sooner than expected.
In the months to come, the team is building a microservices-based orchestrator to handle requests from customers about their personal data. It will then be enhanced to anonymize data in accordance with GDPR standards for de-identification and Tide’s data retention obligations as a regulated business. Here, too, Atlan has helped. Tide’s engineers can build these solutions more quickly by referencing the information and lineage made possible by Hendrik’s team and Atlan.
I’d say I got great support from the Atlan team, who were with me on the whole journey. I would have never thought of Playbooks. It was suggested in the right way for the right use case.
As for Hendrik, his team’s accomplishments mean the realization of his vision from the very beginning of his time at Tide. “Over the last year, we’ve managed to move closer to the business. Being able to create this kind of organizational change is something that I feel very proud of.”
With a significant win for his team in hand, enabled by the right technology and guided by the right strategy, Hendrik shared his advice for fellow data leaders. “Focus on business value, and the actual value you’re generating in your organization rather than finding a process everyone in the industry follows and adopting the same thing. Don’t try to do governance everywhere. Identify what data sets are relevant to you, and focus on those ends.”
Learn more about Atlan’s Playbooks and other supercharged automation features from 2022.
Header photo: Dan Nelson on Unsplash