Neil Sandle, Product Management Director, Asset Control, weighs in on data lineage and how it can help financial firms improve their data management.
Data lineage is an increasingly critical capability in data management. It pinpoints the origins of data, what happens to it and how it evolves over time. The ability to track and visualise data lineage is becoming increasingly important for financial services organisations. When valuing portfolios, modelling risk or investigating referential data in regulatory reporting, for instance, firms need to trace data back to its source and demonstrate who has looked at it, what kind of quality verification it has undergone and what tests it has passed.
The need for firms to explain data lineage in this way has become an integral part of data quality frameworks deployed across the sector and is enshrined in many of the latest regulations (e.g. FRTB, TRIM and BCBS239). In addition, from an information security or content licensing perspective, lineage is also needed to connect data to licensing policies and/or access restrictions. Yet, many firms still struggle to get it right.
Scoping the Challenge
Financial services organisations often fail to track information fully, or lose track of individual data elements in the overall flow. Many neither fully record the data sources they have used to make calculations nor record parameter or business rule changes. All this contextual intelligence frequently gets lost.
The main issue firms face is that the data management applications or databases they use do not trace the data underlying the master copy, or its connections with the approved and validated data. Firms simply do not have the capability to track that context.
Added to this, many such businesses operate over a dispersed infrastructure. Different departments often retain their own data stores, including dedicated copies of data that might also be available elsewhere. That is partly because organisations have historically been worried that consolidating all their data into one system would lead to scaling issues and ultimately a deterioration in performance. It is partly down to politics, with different departments, working with dedicated budgets, wanting their own separate private data stores. Whatever the rationale behind these data silos, however, the end result is that the process by which values used in internal and external reporting are arrived at involves a journey through different applications – and information is generally lost along the way.
In summary, the lack of capability in many data management systems, together with the development of convoluted architectures that encourage the retention of multiple data copies, leads to information uncertainty and quality degradation. The trail is effectively broken and organisations cannot trace the journey from a specific data point through their own systems infrastructure.
Finding a Solution
In light of this, how can organisations start to mend their broken data chains to meet regulatory requirements and achieve strategic goals? To do this, they will need to track the contextual information around a data point. That means finding a way of pinpointing the sources of data, understanding and capturing the selection process that led to a particular source being picked, and recording the rules used to derive a piece of data from underlying data. Moreover, they need to do a better job of keeping this information close to the data and tracking changes to the rules around data sources.
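As an illustration, this contextual information could be captured in a simple lineage record kept alongside each data point. This is a hypothetical sketch, not a reference to any particular product; the field names, the vendor name and the derivation rule shown are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class LineageRecord:
    """Contextual metadata tracked alongside a single data point."""
    data_point: str           # what the value represents
    source: str               # where the value originated
    selection_rationale: str  # why this source was picked
    derivation_rule: str      # rule used to derive the value
    rule_version: str         # version of that rule, so rule changes stay auditable
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

# Example: record the context behind one derived price
record = LineageRecord(
    data_point="EUR/USD closing price",
    source="Vendor A end-of-day feed",
    selection_rationale="Primary vendor; passed staleness check",
    derivation_rule="mid = (bid + ask) / 2",
    rule_version="v1.2",
)
print(record.source)  # the origin can always be traced back
```

Because the selection rationale and the versioned derivation rule travel with the value itself, none of this context is lost when the data moves between applications.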
The second major opportunity for financial services firms in this area lies in rationalising their infrastructure and cutting back on the number of applications they run. Some of the larger banks have already announced plans for application rationalisation, driven in many cases by high maintenance costs as well as operational risks. Beyond this, financial services firms need to focus on making the infrastructure simpler. That means fewer handover points where data leaves one application and enters another. It also means bringing users closer to the data rather than transporting the data itself every time.
Technology can help firms achieve their goals in these respects, of course. It has a role to play in tracking information, putting data models in place and ensuring that the metadata captured, retained and made accessible is sufficiently rich: keeping data together with its sources, maintaining the relationships between sources and final data, and making sure that relevant rule parameters are recorded.
Technology therefore needs to track all the data, but it also needs to scale so that it remains accessible and users have a clear path to the data they need. Users who trust this access path are less inclined to build their own databases and their own duplicate copies of data. Technology needs to be easily accessible, but it also needs to support business user enablement – in this case, providing easy access to the data through browsing and enterprise search capabilities, for example.
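One simple way to picture the relationships between sources and final data is as a directed graph that can be walked upstream. The sketch below is purely illustrative; the item names and feed names are invented for the example, and a production system would of course use a proper metadata store rather than a dictionary.

```python
# Lineage as a simple directed graph: each item maps to its upstream inputs.
lineage = {
    "portfolio_value": ["bond_price", "fx_rate"],
    "bond_price": ["vendor_a_feed"],
    "fx_rate": ["vendor_b_feed"],
}

def trace_sources(item: str) -> set:
    """Walk the graph upstream to find the original sources of an item."""
    parents = lineage.get(item, [])
    if not parents:          # no upstream inputs: this item is itself a source
        return {item}
    sources = set()
    for parent in parents:
        sources |= trace_sources(parent)
    return sources

# A final reported value can be traced back through every handover point
print(sorted(trace_sources("portfolio_value")))
# ['vendor_a_feed', 'vendor_b_feed']
```

With relationships like these kept intact, a user browsing or searching the data can follow the trail back to its origin instead of rebuilding a private copy.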
Plotting the Way Forward
Data lineage is becoming more important for financial services organisations today. It is increasingly hard-wired into regulations and data quality frameworks (most notably, the ECB’s TRIM) that impact firms across the sector. Data lineage is central to firms’ growing need to deliver ‘explainability’. If a bank, for example, values something at £25 on a given day, it needs to be able to explain why it valued it at that amount, how it came to that decision, what data points it used and what else it took into account to arrive at that figure.
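The kind of explainability described above can be sketched as re-deriving a value from the inputs and rule that were recorded when it was calculated. This is a minimal, hypothetical illustration; the rule name and the bid/ask inputs are assumptions made up for the example.

```python
def explain_valuation(inputs: dict, rule: str) -> dict:
    """Re-derive a value from its recorded inputs so the result can be explained."""
    if rule == "bid_ask_mid":
        value = (inputs["bid"] + inputs["ask"]) / 2
    else:
        raise ValueError(f"unknown rule: {rule}")
    # Return the value alongside everything that went into it
    return {"value": value, "rule": rule, "inputs": inputs}

explanation = explain_valuation({"bid": 24.0, "ask": 26.0}, "bid_ask_mid")
print(explanation["value"])  # 25.0 – and every input behind it is visible
```

Because the inputs and the rule are stored with the result, the £25 figure is not just a number: the full reasoning behind it can be reproduced on demand.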
All this context increasingly needs to be tracked today. This need for financial services organisations to ‘explain’ how they arrive at certain decisions has been enshrined in recent regulations – and that is likely to continue driving the need for high-quality data lineage in the future. But data lineage is not just needed for regulatory purposes. It can also help in carrying out diagnostics to improve data quality. In content licensing, it can be key to knowing the origins of a particular piece of data – effectively, where it has been derived from. And in light of recent regulations like GDPR, it can be crucial in maintaining and managing client records in a compliant way.
Overall, the need for data lineage is making financial services firms increasingly data-centric in their approach. They have to connect all the various pieces in their infrastructure and trace the journey of data from A to B. In doing that, they need to put the right processes and systems in place. That is crucial today, and it is likely to become ever more so in the future.