Guest Post: A Data-Centric Approach for Payments

By Adwait Joshi, CEO, DataSeers

In 2017, The Economist reported that “The world’s most valuable resource is no longer oil, but data.” Since then, we have only added more data to the universe. The statement needs little justification today, as the largest hacks continue to target valuable data. On the dark web that data is available for sale, and the more hacks that validate it, the more expensive it becomes.

According to DOMO’s research, “Over 2.5 quintillion bytes of data are created every single day, and it’s only going to grow from there. By 2020, it’s estimated that 1.7MB of data will be created every second for every person on earth.” But what does that mean for payments? As payment innovation accelerates and people adopt alternative payment methods, data will continue to be generated at very high velocity, variety, and volume. The payments industry differs from most others in that its data must be analyzed almost as quickly as it is generated, because of the fraud and compliance obligations directly attached to it.

As the reality of real-time settlement looms, this data also needs to be reconciled quickly to support better money management. These problems are overwhelming hurdles for growing enterprises, and companies and financial institutions spend millions of dollars trying to get ahead of them. The situation resembles being trapped in quicksand: the more you struggle, the worse it gets. So what is the solution? For that, we have to understand a data-driven culture. Businesses are usually sales- and market-driven (which is to be expected), but the data they generate is often not considered until it has become a problem. In other words, you find yourself in a predicament too late, struggle, and wind up in over your head. It’s time to change that narrative.

Consider this: your doctor checks your blood on a regular basis and uses the results to detect potential problems early and keep your body healthy. Similarly, data is the lifeblood of an organization, and with proper analysis it can be used to identify problems or changes in patterns. That analysis can serve as an early warning system, or even inform an organization’s direction as part of strategic planning. For this to happen, however, there has to be a data-centric approach. Data is often treated as an expense and, therefore, only looked at when there is an existing problem. In a data-centric approach, you treat data as an asset rather than an expense, which means you have a plan to protect, grow, and manage your greatest asset.

Organizations often put their data in warehouses. If you are like most people, when you hear the word “warehouse” you imagine a four-walled building with limited storage room. Data warehouses are similar: they provide a limited ability to store and harness data, and anything beyond the traditional structure requires re-engineering or re-architecting. It is difficult to rely on warehouses when data is so unpredictable and ever-changing. This is where the concept of a data lake comes in. A lake, as its name suggests, is not constrained by fixed boundaries; it shrinks and grows where and how it pleases based on volume. Similarly, in today’s world (especially with organizations putting data first) there is a growing need to consider data lakes over data warehouses.

Luckily, payments data is mostly structured or semi-structured; you rarely encounter unstructured data. However, one element consistently causes confusion in payments data: the dimension of time. Settlements can occur on differing time cycles, and this creates a challenge for enterprises trying to truly understand the timing and accuracy of settlements. These elements require out-of-the-box thinking.

Such thinking leads to a solution devised specifically to handle the complexity of time. I may sound a bit like Stephen Hawking here, but it’s important to understand what time does to payments data. There is a difference between real-time, near-real-time, and batch, and all three variations exist in payments. Some aspects of fraud need to be dealt with in real time, compliance and velocity issues need to be addressed in near-real time, whereas bank accounts settle in batch time. Although the industry is moving toward real-time settlement, it will take time for this to flow all the way back to the banking institution, mainly because of how these systems were traditionally designed. It is very important to understand that these time dimensions have to coexist simultaneously and are often interlinked. For example, a real-time fraud event may create problems with the next day’s settlement, so continuity has to be maintained in the data elements. Traditional systems cannot handle this complexity and often fail to achieve anything meaningful as a result. Modern data-focused fintechs are now diligently working in the payments space to understand these complexities and provide solutions to the issues that arise from payments innovation.
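To make that interplay concrete, here is a minimal Python sketch (entirely illustrative; the data shapes and function names are my own, not any vendor’s API) of how a fraud flag raised in the real-time layer carries forward into next-day batch settlement:

```python
from datetime import date

# Real-time layer: fraud flags recorded as transactions stream in.
fraud_flags = {}  # txn_id -> reason

def flag_fraud(txn_id, reason):
    fraud_flags[txn_id] = reason

# Batch layer: next-day settlement honors the real-time flags, so the two
# time dimensions stay interlinked rather than diverging.
def settle_batch(txns, settlement_date):
    settled = [t for t in txns if t["id"] not in fraud_flags]
    held = [t for t in txns if t["id"] in fraud_flags]
    return {
        "date": settlement_date,
        "total": sum(t["amount"] for t in settled),
        "held": held,
    }

flag_fraud("T2", "suspected stolen card")  # happens in real time
txns = [{"id": "T1", "amount": 100.0}, {"id": "T2", "amount": 250.0}]
result = settle_batch(txns, date(2020, 1, 2))  # happens overnight, in batch
# T2 is held back instead of settling, so the settlement total is 100.0
```

The point of the sketch is the shared state between layers: if the real-time event were not visible to the batch job, the settlement total would silently include the fraudulent transaction.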

Another powerful weapon in the arsenal of companies ready to harness the power of their data is artificial intelligence, although I prefer to call it machine learning. Let’s first understand the difference. True “AI” is rare in the payments industry because of the type of data involved; AI is most powerful in areas like image processing (think facial recognition at airports, or autonomous driving). The algorithms used in the payments space are more oriented toward machine learning: you teach a machine to recognize certain patterns, and it does so. Supervised and unsupervised learning both play an important role in the payments landscape. At the end of the day, the machine’s job is to detect anomalies and present them back to the user.
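As a simple illustration of the unsupervised flavor (the function names and threshold here are my own assumptions, not a DataSeers product detail), a z-score check learns an account’s “normal” from its history and flags values that deviate sharply from it:

```python
from statistics import mean, stdev

def z_score(value, baseline):
    """How many standard deviations `value` sits from the baseline's mean."""
    mu = mean(baseline)
    sigma = stdev(baseline) or 1.0  # guard against a zero-variance baseline
    return abs(value - mu) / sigma

def is_anomalous(value, baseline, threshold=3.0):
    """Flag a transaction amount that deviates sharply from past behavior."""
    return z_score(value, baseline) > threshold

normal_history = [42.0, 55.0, 38.0, 61.0, 47.0, 52.0, 44.0]
print(is_anomalous(2500.0, normal_history))  # True: far outside the norm
print(is_anomalous(50.0, normal_history))    # False: within the usual range
```

Real systems use far richer features (merchant, geography, velocity), but the principle is the same: the model presents the outliers, and the “norm” is learned rather than hand-written.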

Based on a scoring algorithm, the machine may act on its own (for example, if the risk score is greater than 80 out of 100, it may block the transaction from happening). In other cases, it may put the transaction in a human-managed queue to be evaluated and determined as fraudulent or not. Both techniques are effective and largely independent of one another. When it comes to regulatory compliance, a human element is often needed because similar scenarios can be interpreted differently. For example, two individuals may have activity that looks alike: one case may be fraud while the other is within the realm of normal. This is why baselines are critical for understanding consumer behavior; without knowing what’s normal, you cannot find what’s abnormal.
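The two dispositions can be sketched in a few lines. The block-above-80 cut-off mirrors the example above; the review threshold of 50 is my own illustrative assumption:

```python
def route_transaction(risk_score):
    """Route a scored transaction: auto-block, human review, or approve."""
    if risk_score > 80:
        return "blocked"       # high risk: the machine acts on its own
    if risk_score > 50:
        return "review_queue"  # ambiguous: handed to a human analyst
    return "approved"          # low risk: allowed through

print(route_transaction(95))  # blocked
print(route_transaction(65))  # review_queue
print(route_transaction(20))  # approved
```

The middle band is where the human element lives: the machine surfaces the case, and an analyst with context makes the final call.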

Most rule-based systems miss anomalies because they don’t take recency and seasonality into account; they ignore the fact that there even is a ‘norm.’ They simply use a go/no-go approach: if a rule is tripped, a predetermined action takes place. In modern thinking, this approach just doesn’t work; all scenarios need to be considered before making a decision. Technology has progressed far beyond the traditional and, thankfully, companies like DataSeers are enabling their clients to harness the power of their data and use it in ways that benefit their organization. We call this approach Taming the Data Demon, or simply shifting data from liability to asset.
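To close with a concrete contrast of the go/no-go limitation described above, here is a small sketch (names, thresholds, and data are all my own illustrative assumptions): a static rule trips on any large amount, while a baseline-aware check consults the customer’s history for the same weekday, so a large-but-routine Friday payroll run is not flagged:

```python
from statistics import mean

def static_rule(amount, limit=1000.0):
    """Go/no-go: trips on any amount over a fixed limit, ignoring the norm."""
    return amount > limit

def baseline_rule(amount, history_by_weekday, weekday, tolerance=1.5):
    """Seasonality-aware: compare against this customer's history for the
    same weekday before deciding the amount is abnormal."""
    history = history_by_weekday.get(weekday, [])
    if not history:
        return False  # no baseline yet; defer to other controls
    return amount > tolerance * mean(history)

fridays = {"fri": [1800.0, 2100.0, 1950.0]}   # routine payroll runs
print(static_rule(1900.0))                     # True: the fixed rule trips
print(baseline_rule(1900.0, fridays, "fri"))   # False: normal for a Friday
print(baseline_rule(9000.0, fridays, "fri"))   # True: abnormal even for a Friday
```

The static rule generates a false positive every payday; the baseline version only fires when behavior departs from that customer’s own seasonal pattern.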