Strong data management and governance are key to meeting regulatory requirements, especially with the increased reporting burdens recent reforms have introduced. DerivSource spoke with Rich Robinson, head of strategy and industry relations for symbology at Bloomberg, about the challenges of tracking data lineage, and how taking a metadata approach, such as with the Financial Instrument Global Identifier (FIGI), might be more beneficial than seeking to create a ‘golden copy’.
Q: In your article on data lineage, you write of its importance in the data management process, which has grown in significance due to the different financial reforms. Can you please explain its role?
A: Our business is based on exchange of information, and therefore the data that underlies that information is critical to proper understanding. As data moves through processes, it changes. Other industries, such as manufacturing, have been managing their data much more effectively, for much longer.
Lineage is easier to track with a physical product that’s not subject to different interpretations or contexts. A car is a car, a windshield wiper is just that; there may be different types and styles, but they are easily definable. Compare that with a person. In financial services, there isn’t really a single definition of a person, because context is more important. Someone can be a legal entity, a counterparty, a beneficiary and a trader, all at the same time. The data must be managed properly so that it represents the context in which that thing exists, that is, the roles it plays and the actions it takes that influence the other elements it interacts with.
So data management in financial services is focused on ensuring that the data we use at any point in time, in any context, is of appropriate quality and suited to the purpose we’re using it for, and that we understand not just the upstream source it comes from but also the impact it could have on downstream systems.
Q: Why is managing data quality and tracking data lineage such a challenge for professionals? What have been the main obstacles?
A: Brian Sentance of Xenomorph compares data to tap water. When you think about the lineage of water, you need to consider where it originally came from, any contaminants that could have been introduced along the way, and how much delivery costs, including costs incurred through leaking or poorly maintained pipes. Data follows the same principles. Data, like water, may look the same, but if the source changes, or any of the transformation points along the way change, it can have a great impact on what you’re actually looking at. Something like fluoride can be introduced without your knowledge to make the data better, or a stream overflows with chemicals that wash into your source, and now that data is bad.
Context also comes into play: ice and water vapour are still water, but what you can do with them is vastly different.
There are many potential obstacles, depending on the situation. The most common example is a large organisation that has gone through many acquisitions and mergers, and thus has multiple models and systems to rationalise. There are operational silos, functional silos, different firm types, and so on. But even a small firm that isn’t burdened by legacy issues can quickly find itself dealing with bad data if it doesn’t have the proper procedures in place to manage lineage and quality.
Q: In your article on data lineage you cited a recent Tabb Group report, ‘Building a Framework for Innovation and Interoperability’, which says that more than 50% of firms operate using more than one security master, and that nearly 25% of asset servicers use more than ten masters. What is the impact of this?
A: Multiple security masters usually exist because they serve different purposes, and therefore end up based on different identifiers, be it tickers, internally created codes, or other proprietary identifiers like ISINs, SEDOLs or CUSIPs.
The different databases immediately have different contextual views of the same thing. A framework is needed that can manage the transformation of one context, such as a ticker, to a different context, such as a SEDOL, which exists at a more aggregate level with a different set of data tied to it.
Further, that transformation has to be managed both ways, to handle the data loss or data creation associated with those different identifiers. The result is trade-offs among cost, quality, flexibility, functionality, and data fitness for the need.
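To make that two-way transformation problem concrete, here is a minimal, hypothetical sketch in Python. The identifier values and mappings are invented for illustration: a venue-level ticker rolls up to a more aggregate ISIN, so mapping ‘up’ discards the venue context (data loss), while mapping ‘down’ fans out to several candidates (data creation that someone has to resolve).

```python
# Hypothetical crosswalk between identifier schemes that live at
# different levels of aggregation. All identifier values are
# placeholders, not real instruments.

# Venue-level records: one ticker per (security, trading venue) pair.
VENUE_RECORDS = [
    {"ticker": "ABC LN", "venue": "XLON", "isin": "GB0000000001"},
    {"ticker": "ABC IX", "venue": "CHIX", "isin": "GB0000000001"},
]

def to_isin(ticker: str) -> str | None:
    """Map 'up' from a venue-level ticker to an aggregate ISIN.
    The venue context is lost in the result (data loss)."""
    for rec in VENUE_RECORDS:
        if rec["ticker"] == ticker:
            return rec["isin"]
    return None

def to_tickers(isin: str) -> list[str]:
    """Map 'down' from an ISIN to venue-level tickers. One input
    fans out to many candidates ('data creation'), so a business
    rule is needed to pick the right one for the context."""
    return [r["ticker"] for r in VENUE_RECORDS if r["isin"] == isin]

print(to_isin("ABC LN"))          # GB0000000001 -- venue context gone
print(to_tickers("GB0000000001")) # ['ABC LN', 'ABC IX'] -- ambiguous
```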
There is no single solution to these challenges, but taking a metadata management approach and using a framework like the Financial Instrument Global Identifier (FIGI) can be much more effective across all these dimensions than forcing a single identification standard for centralising all data in a ‘golden copy’ database.
Q: To that point, how can an instrument identification framework, and in particular FIGI, support the data management process? What are the benefits? And can you provide a bit more detail on FIGI?
A: The FIGI framework provides a flexible foundation for data management professionals to build a governance approach that identifies financial instruments in the right context, based on different functions or operational requirements.
The basis of any proper data management approach begins with the ability to uniquely identify something with a clear, unique definition, with permanence. In the data world, we commonly refer to a primary key or uniform resource identifier approach. The unique key always points to the same object, so you can always find it, wherever you go.
An analogy is part numbers for a car or a broken appliance. The part number ensures the part will fit the particular need, because it matches, regardless of the manufacturer or part maker.
FIGI is the first financial data standard that has these exact qualities: uniqueness and permanence, coupled with an open data approach. The part number would be far less helpful if customers had to pay to find out what it is, and then pay again to use it to order the part.
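To illustrate what uniqueness and permanence mean in data management terms, here is a small, hypothetical Python sketch of a registry built on a permanent primary key. It is not Bloomberg’s implementation, just the primary-key idea above in runnable form: the key is assigned once, and descriptive metadata can change around it without breaking references.

```python
# Hypothetical registry illustrating a permanent primary key: the key
# is assigned once and never reused or reassigned, while the
# descriptive metadata attached to it is free to change over time.

REGISTRY: dict[str, dict] = {}

def register(key: str, metadata: dict) -> None:
    if key in REGISTRY:
        raise ValueError(f"{key} already assigned; keys are permanent")
    REGISTRY[key] = dict(metadata)

def update_metadata(key: str, **changes) -> None:
    # A name or ticker change updates the metadata around the key;
    # the key itself never moves, so references to it never break.
    REGISTRY[key].update(changes)

register("XX000000001", {"name": "Example Corp", "ticker": "EXA"})
update_metadata("XX000000001", ticker="EXB")  # e.g. a corporate action
print(REGISTRY["XX000000001"])  # same key, updated context
```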
Being based on a metadata approach is what makes FIGI a framework; it makes FIGI extensible and enables the core structure of a contextual, self-referencing system of identification.
Firms can use FIGI in many ways, from a core data management strategy across instrument masters to a middleware and metadata layer for interoperability between legacy systems and new builds based on different identification approaches.
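As one concrete illustration of the middleware use, the sketch below calls the public OpenFIGI mapping API to translate a legacy identifier into FIGIs. The endpoint and field names follow the published v3 API, but the identifier value is a placeholder, and the snippet should be read as an illustration rather than a reference client.

```python
# Sketch of FIGI as an interoperability layer: translate a legacy
# identifier (here an ISIN) into FIGIs via the public OpenFIGI
# mapping API, then key internal records on the FIGI.
import requests

OPENFIGI_URL = "https://api.openfigi.com/v3/mapping"

def isin_to_figis(isin: str) -> list[dict]:
    jobs = [{"idType": "ID_ISIN", "idValue": isin}]
    # An X-OPENFIGI-APIKEY header can be added to raise rate limits.
    resp = requests.post(OPENFIGI_URL, json=jobs, timeout=10)
    resp.raise_for_status()
    # Each job returns either a "data" list of matches or an "error".
    return resp.json()[0].get("data", [])

for match in isin_to_figis("GB0000000001"):  # placeholder ISIN
    print(match.get("figi"), match.get("ticker"), match.get("exchCode"))
```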
Q: Although that sounds very straightforward, there’s still a lingering misunderstanding of FIGI. Could you explain why that is?
A: The biggest challenge is always preconceived notions, especially around understanding FIGI as a framework, as opposed to an identifier such as an ISIN.
What’s important to note, though, is that existing proprietary identifiers all have a role to play. They exist because at some point there was a problem that needed to be solved, and the identifier addressed that need. So there are many legacy systems based on them, and they’re not going away any time soon.