AI and machine learning. These are not just tantalizing buzzwords – these are real technologies that can give businesses incredible insights into what is happening in the world. However, let’s not get ahead of ourselves. The output of these predictive tools is only as good as the data being analyzed. Good data science and predictive modelling rely on cleansed and contextualized data. If you’ve got dirty data, you’re dead in the water. It’s a big challenge: various studies and surveys have concluded that the most time-consuming portion of a data science or machine learning project is data cleansing. Data annotation for machine learning is projected to be a $2.57 billion industry by 2027¹.
Banking transaction data is notoriously difficult to wrangle. It consists of millions upon millions of rows of dirty, un-normalized, and often cryptic transaction strings. Segmint’s expertise lies in our ability to take this mess of data and transform it into a clean, normalized, and categorized set of labels called Key Lifestyle Indicators, or KLIs.
KLIs are labels describing the type of transaction or behavior that a customer or member is engaging in.
The great news is that Segmint has already solved the data cleansing and labeling problem for financial institutions. This means that, if you’re a financial institution, the most time-consuming and laborious step in a predictive modeling initiative is already done.
KLIs are ideal data labels because they provide the key characteristics a data scientist wants from a data label.
First, KLIs are extremely well-normalized. For example, millions upon millions of transaction variants for purchases at Amazon are grouped and cleanly labeled as Amazon.com.
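To make the idea of normalization concrete, here is a minimal sketch of how many raw transaction variants might collapse into one clean label. The regular-expression rules, the sample strings, and the `label_transaction` helper are illustrative assumptions, not Segmint’s actual rules or pipeline.

```python
import re

# Hypothetical pattern-to-label rules, for illustration only.
# A production rule set would be far larger and more carefully curated.
RULES = [
    (re.compile(r"\bamzn\b|\bamazon\b", re.IGNORECASE), "Amazon.com"),
    (re.compile(r"\bwells\s*fargo\b.*\b(mtg|mortgage)\b", re.IGNORECASE),
     "Wells Fargo Mortgages"),
]


def label_transaction(raw: str) -> str:
    """Return a normalized label for a raw transaction string."""
    # Replace punctuation with spaces, then collapse repeated whitespace.
    cleaned = re.sub(r"[^A-Za-z0-9 ]+", " ", raw)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # The first matching rule wins.
    for pattern, label in RULES:
        if pattern.search(cleaned):
            return label
    return "Unlabeled"


# Made-up examples of the kind of variants that appear in transaction data.
samples = [
    "AMZN Mktp US*2K4XY1 WA",
    "Amazon.com*MK1AB23",
    "WELLS FARGO HOME MTG PYMT",
    "CORNER CAFE 123",
]
labels = [label_transaction(s) for s in samples]
```

In this sketch the first two strings both resolve to the single label Amazon.com, while a string that matches no rule is left as Unlabeled rather than guessed at.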
Segmint has developed tens of thousands of KLIs, ranging from labels that describe a purchase at the brand level to labels that describe the count of products a customer or member has with an institution.
Second, KLIs are organized into a taxonomy which provides context to the meaning of the label and groups them with other KLIs that share the same characteristic. For example, the Wells Fargo Mortgages KLI is grouped with KLIs for hundreds of other mortgage providers. Machine learning and AI can key on these common characteristics to identify patterns and trends that may have predictive importance for an attrition model.
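As a rough sketch of why a taxonomy helps, the snippet below rolls label-level counts up to their parent categories, so a model can see "any mortgage provider" as a single signal. The category names, the `TAXONOMY` mapping, the "Example Mortgage Co" label, and the `category_features` helper are all hypothetical and do not reflect Segmint’s actual taxonomy.

```python
from collections import defaultdict

# Hypothetical taxonomy: each label maps to a path from broad to specific.
TAXONOMY = {
    "Wells Fargo Mortgages": ("Loans", "Mortgage Providers"),
    "Example Mortgage Co": ("Loans", "Mortgage Providers"),
    "Amazon.com": ("Retail", "Online Marketplaces"),
}


def category_features(kli_counts: dict) -> dict:
    """Roll per-label transaction counts up to every level of the taxonomy."""
    features = defaultdict(int)
    for kli, count in kli_counts.items():
        path = TAXONOMY.get(kli)
        if path is None:
            continue  # labels without a taxonomy entry are skipped
        # Credit the count to each ancestor category along the path.
        for depth in range(1, len(path) + 1):
            features[" > ".join(path[:depth])] += count
    return dict(features)


customer = {"Wells Fargo Mortgages": 2, "Example Mortgage Co": 1, "Amazon.com": 5}
rolled_up = category_features(customer)
```

Here the two mortgage labels contribute to one shared "Loans > Mortgage Providers" feature, which is the kind of common characteristic a model can key on.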
Highly normalized and contextualized data labels are a key ingredient for highly useful and accurate predictive modelling.
KLIs represent a clean, contextual, and complete set of potential variables for financial transactions. Segmint has developed an autoML AI Modeling platform specifically designed to use KLIs as model inputs. Segmint’s AI Modeling platform is designed to analyze each KLI in an unbiased manner and identify the most predictive variables to incorporate into any model.
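One simple way to picture "identifying the most predictive variables" is to score every label against the target and rank the results. The sketch below uses absolute Pearson correlation as a stand-in screening score; this is an assumption chosen for brevity, not a description of how Segmint’s AI Modeling platform works, and the label names and data are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)


def rank_klis(rows, target):
    """Rank label columns by absolute correlation with the target."""
    names = list(rows[0].keys())
    scores = {
        name: abs(pearson([row[name] for row in rows], target))
        for name in names
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up data: one row per customer, 1 if the label is present, else 0.
rows = [
    {"Competitor Mortgage": 1, "Amazon.com": 1},
    {"Competitor Mortgage": 1, "Amazon.com": 0},
    {"Competitor Mortgage": 0, "Amazon.com": 1},
    {"Competitor Mortgage": 0, "Amazon.com": 0},
]
attrited = [1, 1, 0, 0]  # hypothetical target: did the customer leave?
ranking = rank_klis(rows, attrited)
```

Every column is scored the same way, with no hand-picked favorites, and the highest-scoring labels become candidates for the model. A real system would likely use richer methods that also capture interactions between variables.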
Segmint’s new Attrition Model is the first model built on the AI Modeling platform, but any KLI can be chosen as a target for predictive model development. As new data is received from our clients, Segmint assigns KLIs and the AI Modeling platform rescores on a daily basis, minimizing the time it takes for the latest transactions to be considered by a predictive model.
The next generation of FI predictive modeling is here. Powered by KLIs.