Rethinking Market Data Monitoring: Enhancing Operational Security and Reducing Costs with Prometheus at a Swiss Bank

Introduction

In 2024, we implemented a comprehensive monitoring solution for a Swiss financial client’s new central market data platform. Using the OpenShift Monitoring Stack and Prometheus, we built unified monitoring for a platform distributed across multiple clusters. The solution detects and alerts on technical and business errors and supports operational security.

The Challenge

The client had to monitor a complex, distributed system. The market data platform consisted of around 30 components, including third-party software and a central component based on the Solace Event Broker. Without central monitoring, errors risked being detected late, with potential data loss as a consequence.

  • Distributed Architecture: The market data platform was distributed across multiple Red Hat OpenShift clusters and network zones.
  • Complexity: Around 30 individual components, including third-party software, formed the overall system.
  • Lack of Monitoring: A comprehensive, central monitoring system did not exist. Monitoring was primarily assigned to application management.


The Solution

The solution implemented for the Swiss bank is a comprehensive monitoring system based on the OpenShift Monitoring Stack with Prometheus. It integrates into the client's existing OpenShift environment and addresses the challenges of monitoring a complex, distributed market data platform. Key aspects of the solution included:

  • Central Monitoring Platform: Utilized Prometheus, Alertmanager, and Grafana to create a unified monitoring system across multiple OpenShift clusters and network zones.
  • Automated Deployment: Used Helm and ArgoCD for automated and reproducible deployment of monitoring components.
  • Kubernetes & Application Monitoring: Defined specific service monitors and Prometheus rules for each component, alongside generic Kubernetes rules for infrastructure monitoring.
  • Business Monitoring: Integrated business metrics into Prometheus for monitoring business processes and generating alerts for business errors.
  • Advanced Alerting: Implemented a complex Alertmanager configuration with detailed alerting rules and mute intervals.
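
To illustrate what a mute interval does, the following minimal Python sketch models the idea under simplifying assumptions. It is not Alertmanager's implementation and not the client's configuration: the `MuteInterval` class, the weekend and nightly windows, and the use of naive local timestamps are all hypothetical. As with Alertmanager's time ranges, the start of a window is inclusive and the end is exclusive.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MuteInterval:
    """A recurring window during which notifications are suppressed (hypothetical model)."""
    weekdays: frozenset   # 0 = Monday ... 6 = Sunday
    start_minute: int     # minutes since midnight, inclusive
    end_minute: int       # minutes since midnight, exclusive (1440 = 24:00)

    def contains(self, moment: datetime) -> bool:
        minute_of_day = moment.hour * 60 + moment.minute
        return (
            moment.weekday() in self.weekdays
            and self.start_minute <= minute_of_day < self.end_minute
        )


def is_muted(moment: datetime, intervals) -> bool:
    """Return True if the moment falls into at least one mute interval."""
    return any(interval.contains(moment) for interval in intervals)


# Assumed example windows: markets closed on weekends, maintenance on weekday nights.
WEEKEND = MuteInterval(frozenset({5, 6}), 0, 24 * 60)
NIGHTLY_MAINTENANCE = MuteInterval(frozenset({0, 1, 2, 3, 4}), 22 * 60, 24 * 60)
MUTE_INTERVALS = [WEEKEND, NIGHTLY_MAINTENANCE]

if __name__ == "__main__":
    for moment in (datetime(2025, 3, 1, 10, 0), datetime(2025, 3, 3, 10, 0)):
        print(moment.isoformat(), "muted" if is_muted(moment, MUTE_INTERVALS) else "active")
```

Timestamps are assumed to be in the local time of the market already; a real configuration would also need to handle time zones and public holidays.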

Overall, this solution increased operational security, standardized the monitoring approach, enabled global integration, and reduced costs for the client.


Implementation Process

To address these challenges, we adopted a standards-based approach that integrated into the client’s existing OpenShift environment. Because Prometheus-based monitoring was already in use on the OpenShift platform, no additional licensing costs were incurred, and the open solution provided the greatest flexibility.

  • OpenShift Monitoring Stack (Prometheus): Building on the monitoring stack already integrated in OpenShift with Prometheus, Alertmanager, and Grafana, a central monitoring solution was created.
  • Helm & ArgoCD: Deployment of all monitoring components was automated and reproducible via ArgoCD and Helm.
  • Kubernetes & Application Monitoring: In addition to generic Kubernetes rules for infrastructure monitoring (CPU, memory, pod status, etc.), specific service monitors and Prometheus rules were defined for each component of the market data platform.
  • Business Monitoring: In addition to technical monitoring, business metrics were also integrated into Prometheus. This enabled monitoring of business processes and alerts for business errors.
  • AlertmanagerConfig: A complex Alertmanager configuration was implemented to cover detailed alerting rules and mute intervals. In total, about 30 Prometheus rules were defined.
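
As a rough illustration of the per-component service monitors and Prometheus rules described above, the following Python sketch builds the two Kubernetes manifests as plain dictionaries and prints them as JSON. The `ServiceMonitor` and `PrometheusRule` kinds and their fields come from the Prometheus Operator API. Everything else is an assumption made for the example: the component name `price-feed-adapter`, the namespace `marketdata`, the `app` label, the metric `marketdata_last_tick_timestamp_seconds`, and the thresholds are hypothetical and not taken from the project.

```python
import json


def service_monitor(component: str, namespace: str, port: str = "metrics") -> dict:
    """Build a ServiceMonitor that selects the Service of one component for scraping."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": f"{component}-monitor", "namespace": namespace},
        "spec": {
            # Assumes the component's Service carries the label app=<component>.
            "selector": {"matchLabels": {"app": component}},
            "endpoints": [{"port": port, "path": "/metrics", "interval": "30s"}],
        },
    }


def stale_feed_rule(component: str, namespace: str, max_age_seconds: int = 300) -> dict:
    """Build a PrometheusRule with one business alert: no fresh market data."""
    # Hypothetical metric: Unix timestamp of the last tick processed by the component.
    expr = (
        f'time() - marketdata_last_tick_timestamp_seconds{{component="{component}"}}'
        f" > {max_age_seconds}"
    )
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": f"{component}-rules", "namespace": namespace},
        "spec": {
            "groups": [
                {
                    "name": f"{component}.business",
                    "rules": [
                        {
                            "alert": "MarketDataFeedStale",
                            "expr": expr,
                            "for": "5m",
                            "labels": {"severity": "critical", "component": component},
                            "annotations": {
                                "summary": f"No new market data from {component} "
                                           f"for more than {max_age_seconds}s"
                            },
                        }
                    ],
                }
            ]
        },
    }


if __name__ == "__main__":
    manifests = [
        service_monitor("price-feed-adapter", "marketdata"),
        stale_feed_rule("price-feed-adapter", "marketdata"),
    ]
    print(json.dumps(manifests, indent=2))
```

Kubernetes accepts JSON manifests as well as YAML, so the output could be applied directly. In a Helm and ArgoCD setup, such manifests would more typically be maintained as templated YAML in a chart.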

Results Achieved

The implemented monitoring solution led to measurable improvements and qualitative advantages for the client:

  • Increased Operational Security: The comprehensive monitoring proved to be a decisive factor for the successful introduction of the market data platform and ensures its operational security.
  • Standardization: By using the OpenShift Monitoring Stack as a foundation, the solution is easily expandable and can serve as a blueprint for future monitoring requirements.
  • Central Monitoring Platform: The solution enables global integration into the existing monitoring platform.
  • Monitoring of the Entire System: The monitoring platform now monitors all technical and business components.
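  • Cost Reduction: The costs for implementing and operating the monitoring solution were significantly reduced, in part because building on the existing OpenShift Monitoring Stack incurred no additional licensing costs.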

Lessons Learned

The project confirmed that the OpenShift Monitoring Stack is an excellent foundation for application monitoring. Key insights were:

  • OpenShift Monitoring as a Basis: The OpenShift Monitoring Stack provides a solid foundation for comprehensive application monitoring in OpenShift environments.
  • Domain Knowledge for Business Monitoring: Integrating business metrics requires deep domain knowledge and close collaboration with business departments.
  • Complexity of AlertmanagerConfig: Extensive Alertmanager configurations can quickly become confusing and require expertise in maintenance and rule management.
  • Avoiding Alert Fatigue: An iterative approach to alert configuration is essential. Alerts must be actionable, intelligent, and urgent to avoid being ignored.
  • Prometheus Monitoring Stack Delivers: The Prometheus Monitoring Stack has proven to be a powerful and flexible solution.
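
One way to support the iterative approach to alert quality is a small check that flags rules which are unlikely to be actionable. The Python sketch below is only an illustration of that idea: the required `severity` label, the `summary` and `runbook_url` annotations, and the insistence on a `for` duration are assumed team conventions, not requirements of Prometheus and not rules taken from the project.

```python
# Assumed conventions for an "actionable" alert; adjust to the team's own standards.
REQUIRED_LABELS = ("severity",)
REQUIRED_ANNOTATIONS = ("summary", "runbook_url")


def lint_alert_rule(rule: dict) -> list:
    """Return a list of human-readable problems found in one alerting rule."""
    name = rule.get("alert", "<unnamed>")
    problems = []

    # Without a 'for' duration, an alert fires on the first failing evaluation.
    if not rule.get("for"):
        problems.append(f"{name}: missing 'for' duration, may fire on short spikes")

    labels = rule.get("labels", {})
    for label in REQUIRED_LABELS:
        if label not in labels:
            problems.append(f"{name}: missing label '{label}'")

    annotations = rule.get("annotations", {})
    for annotation in REQUIRED_ANNOTATIONS:
        if annotation not in annotations:
            problems.append(f"{name}: missing annotation '{annotation}'")

    return problems


if __name__ == "__main__":
    # Hypothetical rule that lacks a 'for' duration and both annotations.
    incomplete_rule = {
        "alert": "BrokerQueueBacklog",
        "expr": "queue_depth > 10000",
        "labels": {"severity": "warning"},
    }
    for problem in lint_alert_rule(incomplete_rule):
        print(problem)
```

Such a check could run in the deployment pipeline before rules reach the cluster, so that gaps are discussed while a rule is being written rather than after it has paged someone.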

tim&koko
COUNTRIES

Switzerland

Services

Cloud Engineering, Observability, Cloud Architecture

Technologies

OpenShift, Helm, Prometheus

Customer Vertical

Finance

Project Date

March 2025

SIZE OF THE COMPANY

45000

About the solution provider

Rocket Engineers