Machine Learning Technology Stack for Banks

Willingness To Appreciate Nuances Will Serve Customers Best

Ankur Garg

September 10, 2020


By Ankur Garg | Special To Banking New England

Implementation of machine learning (ML) is often misunderstood. Yet knowledge of an ML stack, the collection of technological tools and processes that facilitate the generation of data-derived insights, is vital. There are four key components of the ML stack:

1) Sourcing data;

2) Establishing a trusted zone or “single source of truth” (SSOT);

3) Establishing modeling environments; and

4) Provisioning model outputs or insights to downstream applications.

“By understanding ML technology stack implementation, banks can leverage the benefits of data and generate programming that could transform their businesses, with early adopters more likely to see sustained success,” according to Raymond Chase, vice president for data analytics with Connecticut-based People’s United Bank. With experience in the industry spanning more than 30 years, Chase says he has seen many projects fail despite best intentions when the ML stack is not addressed.


Data sourcing includes surveying the accessible data types that are fed as inputs to the algorithm, as well as the processes and technologies needed to tap into those sources. Examples of sources include core transactions, customer-provided information, the Internet, external databases, market research data, social media, and website traffic. Once sourced, data must be curated through an SSOT, which structures the data in one consistent place, and data lineage must be established to ensure quality and to trace downstream impact. Curated data from an SSOT can then be sourced by a modeling environment that is created to implement ML algorithms.
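
To make the idea of data lineage concrete, here is a minimal sketch of a lineage catalog that can answer the question "which upstream elements feed this curated field?" The record layout, the element names, and the source-system names are all hypothetical and chosen only for illustration; a real SSOT would typically keep this metadata in a dedicated catalog tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LineageRecord:
    """One curated data element and where it came from (hypothetical schema)."""
    element: str          # name of the field in the trusted zone
    source_system: str    # system the value was taken from
    transformation: str   # plain-language description of the curation step
    loaded_on: date
    upstream: tuple = ()  # names of the elements this one was derived from


def trace(element, catalog):
    """Return every upstream element that feeds `element`, nearest first."""
    seen, queue = [], list(catalog[element].upstream)
    while queue:
        name = queue.pop(0)
        if name in seen:
            continue
        seen.append(name)
        if name in catalog:
            queue.extend(catalog[name].upstream)
    return seen


# Illustrative catalog: a modeling feature derived from a core-system field,
# which in turn comes from a raw processor feed.
catalog = {
    "monthly_spend": LineageRecord(
        "monthly_spend", "trusted_zone", "sum of card_txn_amount per month",
        date(2020, 9, 1), upstream=("card_txn_amount",)),
    "card_txn_amount": LineageRecord(
        "card_txn_amount", "core_banking", "currency normalized to USD",
        date(2020, 9, 1), upstream=("raw_card_feed",)),
    "raw_card_feed": LineageRecord(
        "raw_card_feed", "card_processor", "loaded as received",
        date(2020, 9, 1)),
}

print(trace("monthly_spend", catalog))  # ['card_txn_amount', 'raw_card_feed']
```

With a catalog like this, a change to the raw feed can be traced forward to every model input it touches, which is the "trace downstream impact" role that lineage plays.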


It is important to prove data validity and quality throughout the chain of handling. Data must be aggregated, reconciled, and validated before being consumed for ML purposes. Key attributes of a trusted zone include:

• A central repository of data, aggregated from multiple channels.

• Clearly defined and documented data elements and data lineage.

• Documentation of assumptions. For example, if a teller’s cash transaction is reported by the core system and reported by the teller transaction system, documentation must show which entry prevails and why it prevails.

• Protocol for addressing unintended exceptions. For example, if there’s a localized glitch at the branch level for an ATM that is not able to report transactions on a certain date, there should be a way to account for missing transactions and to capture them when they’re available.

• Daily reporting that matches and reconciles counts across systems.

• Architecture that expands vertically and horizontally.

• The data store that houses the trusted zone should have high availability and be resilient to failure. Lately, more data warehouses are hosted on Cloud. Cloud benefits include high availability, cost-effectiveness, and horizontal and vertical scaling. Another trend is the increasing adoption of NoSQL databases such as MongoDB. Compared with relational databases, these provide greater flexibility and better performance for storing unstructured data.
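
Two of the attributes above, documented precedence between systems and daily reconciliation of counts, can be sketched in a few lines. This is only an illustration under stated assumptions: the transaction layout, the branch codes, and the rule that the core system prevails over the teller system are hypothetical, not a description of any particular bank's setup.

```python
from collections import Counter

# Hypothetical documented assumption: when both systems report the same
# transaction, the core system's entry prevails over the teller system's.
PRECEDENCE = ("core", "teller")


def reconcile(core_txns, teller_txns):
    """Compare daily transaction counts per branch across the two systems."""
    core = Counter(t["branch"] for t in core_txns)
    teller = Counter(t["branch"] for t in teller_txns)
    report = {}
    for branch in sorted(set(core) | set(teller)):
        report[branch] = {
            "core": core[branch],
            "teller": teller[branch],
            "matched": core[branch] == teller[branch],
        }
    return report


def resolve(core_txns, teller_txns):
    """Merge both feeds by transaction id, keeping the higher-precedence entry."""
    feeds = {"core": core_txns, "teller": teller_txns}
    merged = {}
    # Apply the lowest precedence first so that higher precedence overwrites it.
    for system in reversed(PRECEDENCE):
        for txn in feeds[system]:
            merged[txn["id"]] = {**txn, "source": system}
    return merged


core_txns = [
    {"id": "T1", "branch": "B01", "amount": 100.0},
    {"id": "T2", "branch": "B01", "amount": 40.0},
]
teller_txns = [
    {"id": "T1", "branch": "B01", "amount": 100.0},
    {"id": "T3", "branch": "B02", "amount": 25.0},
]

for branch, row in reconcile(core_txns, teller_txns).items():
    print(branch, row)
print({txn_id: t["source"] for txn_id, t in resolve(core_txns, teller_txns).items()})
```

Branches whose counts do not match would go to the exception protocol, for example to be rechecked once late-arriving ATM transactions become available.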

As with all things digital, data regulation and security demand intensive effort. Data is more intimate today, and privacy and security regulations are more complicated. The data governance team should be part of any ML implementation. Data lineage that tracks data sourcing is therefore an effective way to support compliance.

Data collected and held must be protected. Security and risk management teams must be involved to initiate and monitor best practices, and to develop a security breach response plan. Investment in outsourced assistance is worthwhile for smaller institutions. If Cloud vendors are utilized, they must contractually agree that data security is their responsibility. Transmission of data from the premises to Cloud and from Cloud to premises must be part of the scope and should be carefully designed to address security risk. Encrypting data before transmittal to Cloud is valuable, even when transmission occurs over a secured virtual private network.


The objective is to facilitate the creation of models that generate meaningful insights and to present those insights in a way that passes model validation and audit requirements. There are three components: modeling infrastructure, development tools, and DevOps. Different options for ML modeling environments include:

• Ready-to-use services, such as Amazon’s Polly and IBM’s Watson.

• Automated ML, such as DataRobot.

• ML Workbench, such as Amazon’s SageMaker.

• Custom-/in-house-built ML modeling environments: All components of a modeling environment, programming tools, and DevOps tools are gathered, created, configured, and maintained by the institution.

A current trend is the movement of modeling platforms to Cloud from in-house implementations of Apache Hadoop. Hadoop-based stacks can have high upfront costs and be complicated to maintain. Moving to Cloud offers benefits including minimal upfront capital investment and flexibility: as the needs for storage and computation change, Cloud capacity can be scaled up or down to match. Think of it as “pay as you go.” Most major Cloud providers also offer ready-to-use ML services and ML workbenches that can be utilized with minimal setup requirements.

ML modeling environments should facilitate model validation and account for associated challenges. Models must be validated for bias, must be explainable, and must document parameter and method selection. Documentation must be detailed so that a third party could recreate the model without being provided source code. It is therefore important to standardize model development and validation processes.

Assessing model risk is typically required before production. Regulatory guidelines require decision-makers to understand the intent for building these models, assumptions made, and limitations. Using a model outside the scope of its initial intent should be avoided. While ML is great at modeling complicated non-linear scenarios, it is less transparent than traditional models, making ML model validation challenging. For example, while using ML for creating a credit-risk model, one must be able to explain the outcome of negative credit decisions, which is required by law. This is an especially sensitive issue during the global pandemic.
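
As a rough illustration of what "explaining a negative credit decision" can look like in code, the sketch below derives reason codes from a simple logistic model by ranking each feature's contribution relative to a reference applicant. Everything here is an assumption made for illustration: the feature names, the coefficients, and the baseline values are invented, and the method shown is one common approach for linear models, not a statement of what any regulation requires. For a linear model this attribution is exact; non-linear ML models need additional attribution techniques, which is part of why their validation is harder.

```python
import math

# Hypothetical coefficients of an already-fitted logistic model.
WEIGHTS = {"utilization": -2.0, "late_payments": -0.8, "years_on_file": 0.15}
INTERCEPT = 1.0
# Hypothetical reference applicant (for example, portfolio averages).
BASELINE = {"utilization": 0.30, "late_payments": 0.0, "years_on_file": 8.0}


def approval_probability(applicant):
    """Logistic score: probability that the application is approved."""
    z = INTERCEPT + sum(WEIGHTS[f] * applicant[f] for f in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def reason_codes(applicant, top_n=2):
    """Features that lowered the score most, relative to the baseline."""
    contributions = {
        f: WEIGHTS[f] * (applicant[f] - BASELINE[f]) for f in WEIGHTS
    }
    negative = [(f, c) for f, c in contributions.items() if c < 0]
    negative.sort(key=lambda item: item[1])  # most negative first
    return [f for f, _ in negative[:top_n]]


applicant = {"utilization": 0.90, "late_payments": 3.0, "years_on_file": 2.0}
print(round(approval_probability(applicant), 3))
print(reason_codes(applicant))  # ['late_payments', 'utilization']
```

The ranked list gives a reviewer, and ultimately the customer, a concrete statement of which inputs drove the decision.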

The selected model must have conceptual reasoning behind its development and construction. It is important to document why the model was selected, the math behind it, and the feature-selection process. Sourcing of features and data integrity are also essential and are more easily accomplished with an SSOT. Special care should be taken when utilizing AutoML because it provides pre-cooked models that must still pass review for conceptual soundness. Model validation support should be closely assessed when selecting any AutoML product.


Delivery of insights is categorized as real-time or batch. Real-time insights must be processed, generated, and delivered within short timeframes, in real time or near real time; detection of fraudulent transactions is one example. Batch insights are processed and generated in groups, typically on a schedule.

Considerations for designing and hosting the compute tier for real-time models include request frequency and load. If these are unpredictable or highly variable, hosting the compute tier in Cloud is advisable. Creating a web service-based API layer dedicated to this compute tier is also wise.

Real-time models should require registration to the API layer, which should enable applications to retrieve information on how to structure API requests and the expected structure of output. ML models differ from traditional models in that they can be continuously trained.
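
The registration idea can be sketched as a small in-memory registry that lets a calling application look up how to structure a request and what output to expect. The class names, the fraud-model example, and the use of Python types as a stand-in for a schema language are all assumptions for illustration; a production API layer would more likely publish a formal schema over a web service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """What a real-time model registers with the API layer (hypothetical)."""
    name: str
    version: str
    request_schema: dict   # field name -> expected Python type
    response_schema: dict  # field name -> Python type of the output


class ModelRegistry:
    def __init__(self):
        self._specs = {}

    def register(self, spec):
        self._specs[(spec.name, spec.version)] = spec

    def describe(self, name, version):
        """Tell a client how to structure requests and what comes back."""
        spec = self._specs[(name, version)]
        return {"request": spec.request_schema, "response": spec.response_schema}

    def validate(self, name, version, payload):
        """Return a list of problems with the payload; empty means well-formed."""
        schema = self._specs[(name, version)].request_schema
        problems = [f"missing field: {k}" for k in schema if k not in payload]
        problems += [
            f"wrong type for {k}: expected {schema[k].__name__}"
            for k in schema
            if k in payload and not isinstance(payload[k], schema[k])
        ]
        return problems


registry = ModelRegistry()
registry.register(ModelSpec(
    name="fraud_score",
    version="1.0",
    request_schema={"amount": float, "merchant_id": str},
    response_schema={"fraud_probability": float},
))

print(registry.describe("fraud_score", "1.0"))
print(registry.validate("fraud_score", "1.0", {"amount": 12.5}))
```

Keying the registry on both name and version lets a retrained model be registered alongside the one it replaces, so downstream applications can move to it deliberately.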


With today’s increased volume of big data, it is more difficult to generate insights using traditional analytics. The ability of the ML stack to significantly automate this process complements the growth of big data, especially when ML infrastructure is understood. What’s more, insight into the nuances of implementing the ML stack will improve the results that machine learning ultimately delivers and strengthen the customer’s relationship with the institution.

Conversely, an unwillingness to appreciate these nuances and to take accountability for the ML stack will compromise machine learning’s intended impact on the end user: the customer.

Ankur Garg is a full stack data science expert working as Enterprise Data Analytics Architect at People's United Bank.
