Business

Solution Monitoring Strategy

There are many approaches and methodologies for monitoring systems. The three fundamental types are technical monitoring, functional monitoring, and business process monitoring.

In this article, I will focus on high-level recommendations for functional and technical solution monitoring and explain how to set up a clear strategy for the processes around it. I will leave out business process monitoring, as it is specific to each business and is dealt with at the management level.

So why is monitoring important? Here are just a few of the most important considerations:

System Observability - it increases the visibility and accessibility of solution data streams.
Business Continuity - it enables early detection of problems and anomalies.
Team Awareness - it keeps the team aware of what is happening at all times, so that it can deal with problems independently.
Customer Experience - your customers will generally enjoy a higher level of quality, and assuming they are happy with the solution, this is likely to increase customer loyalty.

Observability and monitoring are related, but they have distinct functions. Monitoring involves gathering and displaying data so that it can be analysed further, whereas observability refers to the accessibility of that data.

Functional Monitoring

Functional monitoring looks only at the functional aspect of the solution, evaluating a use case or a group of use cases on a system. It identifies performance and availability problems at the functional level and ensures that they are visible and recorded.

Functional monitoring is usually performed automatically by executing scripted operations on a system. Robot-based monitoring is an excellent way to ensure quality of service and a good user experience.

When your solution is actively used by customers, functional monitoring is essential to ensure quality of service. Essentially, this is about testing all core user journeys and system workflows repeatedly, and then monitoring the results for any anomalies.

By testing the workflows, we continuously get information about the availability of the system. Depending on the application, these tests are run in production on a specific schedule (e.g., hourly or daily).

To illustrate functional monitoring, I will use a generic user journey of a customer placing an order.

Customer chooses a product
Selects quantity and shipment options
Selects payment options
Completes payment
Receives order confirmation on screen and via email
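
The journey above can be expressed as an ordered list of scripted steps that a robot executes and reports on. The following is only a minimal sketch: `run_journey`, `StepResult`, and the step functions are hypothetical names, and the steps are empty stand-ins (with a simulated payment failure) for what would in practice be browser automation or API calls against your own system.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class StepResult:
    name: str
    ok: bool
    seconds: float
    error: Optional[str] = None


def run_journey(steps: List[Tuple[str, Callable[[], None]]]) -> List[StepResult]:
    """Run the steps in order and stop at the first failure,
    because later steps depend on the earlier ones."""
    results: List[StepResult] = []
    for name, action in steps:
        started = time.perf_counter()
        try:
            action()
        except Exception as exc:  # record the failure instead of crashing the monitor
            results.append(StepResult(name, False, time.perf_counter() - started, str(exc)))
            break
        results.append(StepResult(name, True, time.perf_counter() - started))
    return results


# Hypothetical stand-ins for real scripted operations.
def ok_step() -> None:
    return None


def failing_payment() -> None:
    raise RuntimeError("payment provider timed out")  # simulated failure


ORDER_JOURNEY = [
    ("choose product", ok_step),
    ("select quantity and shipment", ok_step),
    ("select payment option", ok_step),
    ("complete payment", failing_payment),
    ("receive order confirmation", ok_step),
]

if __name__ == "__main__":
    for result in run_journey(ORDER_JOURNEY):
        status = "OK" if result.ok else f"FAILED ({result.error})"
        print(f"{result.name}: {status} in {result.seconds:.3f}s")
```

Recording a result per step, rather than a single pass/fail for the whole journey, shows where the journey broke and how long each step took.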

This user journey is quite common, but it is far from easy to test functionally, since it integrates with payment and possibly shipment flows provided by external services. In addition to this, users may come from various promotions and may bypass some of the steps, which further complicates the testing process.

To address this, the best strategy is to apply multi-aspect functional testing:

Monitor the user journey using a robotic implementation, e.g. a specific test which executes the steps and reports on the results.
Monitor real user activity, e.g. the number of orders started vs completed, or the number of users dropping out at a specific step. Ensure your decisions are based on data generated by multiple cases: with low amounts of traffic, potential problems may not show up in the data.
External service monitoring - collect metrics on the availability and performance of the external services required in this process, e.g. continuously check the availability of shipment or payment options provided by a 3rd party, and report on any anomalies.
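
To make the real user activity point concrete, here is a small sketch of how drop-out per step could be calculated from session counts. The function name `drop_off_rates`, the step labels, the counts, and the `min_sessions` threshold of 100 are all illustrative assumptions. The threshold reflects the caution above: when too few sessions enter a step, the rate is reported as unknown rather than as a number.

```python
from typing import Dict, List, Optional, Tuple


def drop_off_rates(
    funnel: List[Tuple[str, int]], min_sessions: int = 100
) -> Dict[str, Optional[float]]:
    """For each step after the first, return the share of sessions lost
    since the previous step, or None when the previous step had too few
    sessions to support a decision."""
    rates: Dict[str, Optional[float]] = {}
    for (_, entered), (step, reached) in zip(funnel, funnel[1:]):
        if entered < min_sessions:
            rates[step] = None  # not enough traffic to judge
        else:
            rates[step] = (entered - reached) / entered
    return rates


# Illustrative counts of sessions that reached each step.
FUNNEL = [
    ("product chosen", 1000),
    ("quantity and shipment selected", 800),
    ("payment option selected", 600),
    ("payment completed", 90),
    ("order confirmed", 81),
]

if __name__ == "__main__":
    for step, rate in drop_off_rates(FUNNEL).items():
        print(step, "insufficient data" if rate is None else f"{rate:.0%} dropped")
```

In this made-up data the large drop at "payment completed" is the anomaly worth investigating, while the last step is left unjudged because only 90 sessions reached the step before it.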

Technical Monitoring

The purpose of technical monitoring is to determine how well the software components underlying the system perform in real time. It focuses on specific technical functions of each component in isolation and may not report on the functionality of the system as a whole. It effectively reports on issues and allows operators to decide how to fix these.

It is important to stress that technical monitoring alone may not identify all problems within the system: because it looks at components in isolation, a fault in the system as a whole may not show up in technical monitoring at all.

Below, I will focus on several best practices for ensuring that your systems are effectively monitored:

Context and Critical Components Mapping

The first step is to understand the context of your system. This is where you will need to conceptualise your system and get to know all of its components. The most important thing here is to determine and document which of the areas identified in your initial evaluation of your environment are the most business critical. Here are some steps I would recommend:

Create a list of all critical components, including any standalone components, APIs, databases and custom applications.
Look outside of your application's environment. Does your application depend on any third parties? If so, consider how critical their role is and document this.
Create a list of all components which "support" your solution, such as hardware servers, auto-scaling, backups and disaster recovery implementations. This is usually attributed to infrastructure monitoring, and most of the well-known cloud providers already include tools to cover basic monitoring.
Consider the value of monitoring your development and testing/UAT environments, including continuous delivery processes. This is usually done by the development team anyway, but depending on your setup it is good practice to ensure things are running smoothly and take action on potential problems early.
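
The outcome of these steps can be documented in many ways. One option is a small machine-readable inventory, sketched below. The `Component` fields, the `Criticality` levels, and every entry in `INVENTORY` are hypothetical examples and are not taken from any real system; the helper lists components that matter to the business but are not yet monitored.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Criticality(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Component:
    name: str
    kind: str  # e.g. application, database, third party, infrastructure, delivery
    criticality: Criticality
    monitored: bool


# Hypothetical inventory for an e-commerce solution.
INVENTORY = [
    Component("storefront web app", "application", Criticality.HIGH, True),
    Component("orders database", "database", Criticality.HIGH, True),
    Component("payment provider", "third party", Criticality.HIGH, False),
    Component("nightly backups", "infrastructure", Criticality.MEDIUM, False),
    Component("UAT environment", "delivery", Criticality.LOW, False),
]


def monitoring_gaps(
    components: List[Component], at_least: Criticality = Criticality.MEDIUM
) -> List[str]:
    """Names of unmonitored components at or above the given criticality,
    most critical first."""
    gaps = [c for c in components if not c.monitored and c.criticality >= at_least]
    gaps.sort(key=lambda c: c.criticality, reverse=True)
    return [c.name for c in gaps]


if __name__ == "__main__":
    print(monitoring_gaps(INVENTORY))
```

Keeping the list in this form makes it easy to review as the system changes and to compare against what the monitoring platform actually covers.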

Select and decide on a Technical Monitoring Solution

There are many tools available on the market. The most important thing is to choose a monitoring platform that can cover and monitor most of your business critical components in a single place. Adding further tools can significantly increase complexity, time to resolution, and the amount of effort required to perform proactive performance assessment and improvement activities.

Metrics

Analyse and decide which metrics and what kind of data are important to you and your application. The monitoring tools you choose depend on the type of system you run, and may look very different for a small e-commerce environment than for a highly distributed containerized Java application.

Make sure the potential monitoring platform candidates can deliver the necessary metrics. Some tools might be able to gather the metrics you need right out of the box, while others might need significant adjustment or even code changes in your application.
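
A simple way to run this comparison is to write down the metrics you need and check each candidate against that list. The sketch below assumes made-up metric names and two placeholder candidates, "Tool A" and "Tool B"; it does not describe any real product.

```python
from typing import Dict, Set

# Hypothetical metrics this application needs.
REQUIRED_METRICS: Set[str] = {"availability", "response_time", "error_rate", "orders_completed"}

# Hypothetical candidates and the metrics each delivers out of the box.
CANDIDATES: Dict[str, Set[str]] = {
    "Tool A": {"availability", "response_time", "error_rate", "cpu_usage"},
    "Tool B": {"availability", "response_time", "error_rate", "orders_completed"},
}


def missing_metrics(required: Set[str], candidates: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each candidate, the required metrics it does not deliver."""
    return {name: required - provided for name, provided in candidates.items()}


if __name__ == "__main__":
    for name, missing in missing_metrics(REQUIRED_METRICS, CANDIDATES).items():
        print(name, "covers everything" if not missing else f"is missing {sorted(missing)}")
```

Any metric reported as missing is one that would need adjustment or code changes in your application before that candidate could deliver it.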

At a bare minimum your metrics should report on component availability, performance and critical errors in the application. This can be relatively simple to achieve as most of the available tools are able to hook into your components and start aggregating your errors quickly, as well as monitor available endpoints.
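
As an illustration of this minimum set, the sketch below summarises a batch of check results into availability, a failure count, and response-time percentiles. The names `summarise` and `Summary`, the sample values, and the choice of the nearest-rank percentile method are assumptions made for the example; monitoring tools typically compute these figures for you.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Summary:
    availability: float       # share of checks that succeeded
    failures: int             # number of failed checks
    p50_ms: Optional[float]   # median latency of successful checks
    p95_ms: Optional[float]   # 95th percentile latency of successful checks


def nearest_rank(sorted_values: List[float], percent: int) -> float:
    """Nearest-rank percentile: the value at rank ceil(percent/100 * n)."""
    rank = max(1, (percent * len(sorted_values) + 99) // 100)
    return sorted_values[rank - 1]


def summarise(samples: List[Tuple[bool, float]]) -> Summary:
    """Each sample is (succeeded, latency in milliseconds)."""
    if not samples:
        raise ValueError("no samples to summarise")
    latencies = sorted(ms for ok, ms in samples if ok)
    failures = sum(1 for ok, _ in samples if not ok)
    return Summary(
        availability=(len(samples) - failures) / len(samples),
        failures=failures,
        p50_ms=nearest_rank(latencies, 50) if latencies else None,
        p95_ms=nearest_rank(latencies, 95) if latencies else None,
    )


# Illustrative results of ten endpoint checks.
SAMPLES = [
    (True, 120.0), (True, 130.0), (True, 110.0), (False, 0.0), (True, 150.0),
    (True, 140.0), (True, 135.0), (True, 125.0), (True, 115.0), (True, 900.0),
]

if __name__ == "__main__":
    print(summarise(SAMPLES))
```

Reporting a high percentile alongside the median matters here: the single 900 ms response barely moves the median but dominates the 95th percentile, which is closer to what the slowest users experience.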

Alerting

While monitoring is basically collecting data, alerting is a proactive notification approach to monitoring, via email, SMS, ticketing system etc. While alerting is quite useful, you need to avoid "alert fatigue": when the system sends out too many alerts that require no action, monitoring teams may miss the important ones. Alert only on critical situations that DO require action. With alerting, less is more.

When setting up alerting, consider the business context of your application rather than getting too technical. For example, in an e-commerce application, monitoring and alerting on the technical metrics of a purchase workflow might prove much more valuable than collecting CPU data. Consider what is important for your end users and focus your alerting to enhance that end user experience.

It's important to keep in mind that tuning alert volume to reach an appropriate level is often an ongoing process. When starting out, refer back to the business purpose of your application. In an e-commerce application, for example, monitoring the response time of business-critical transactions such as 'add to cart' and 'checkout' is clearly important, but monitoring CPU usage on a given application server is probably unnecessary. Think more about user experience and focus on what is business critical.
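
One common way to keep alert volume down, offered here as an illustration rather than as a prescribed method, is to notify only after a metric has breached its threshold several times in a row and to stay quiet until it has recovered. The class name `ConsecutiveBreachRule`, the 2-second threshold, and the requirement of three breaches are assumptions for this sketch and would need tuning for a real application.

```python
from dataclasses import dataclass, field


@dataclass
class ConsecutiveBreachRule:
    name: str
    threshold: float
    required_breaches: int = 3
    _streak: int = field(default=0, init=False)
    _firing: bool = field(default=False, init=False)

    def observe(self, value: float) -> bool:
        """Return True only for the observation that should notify someone."""
        if value <= self.threshold:
            # Back to normal: reset so that a later incident alerts again.
            self._streak = 0
            self._firing = False
            return False
        self._streak += 1
        if self._streak >= self.required_breaches and not self._firing:
            self._firing = True  # notify once per incident, not on every sample
            return True
        return False


if __name__ == "__main__":
    rule = ConsecutiveBreachRule("checkout response time (s)", threshold=2.0)
    for seconds in [1.0, 2.5, 1.2, 2.6, 2.7, 2.8, 3.0, 1.0, 2.9]:
        if rule.observe(seconds):
            print(f"ALERT: {rule.name} at {seconds}s")
```

With these example readings the single slow response at 2.5 seconds is ignored, and the sustained slowdown produces one alert instead of four.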

Conclusion

It is critically important to develop an effective monitoring strategy in order to have a truly performant and reliable application.

The best strategy is to combine functional and technical monitoring to obtain a complete view of the system. This will ensure control, impact awareness and will facilitate an adequate level of quality of service and confidence in operations.

General contact information

Phone/Fax: +44 (0) 20 8090 0828
E-mail: info@atomate.net

Address in
United Kingdom

Atomate Limited
Ravensbourne, 6 Penrose Way
Greenwich Peninsula
London SE10 0EW
Phone +44 (0) 20 8090 0828

Address in
Moldova

Atomate Limited SRL
20 Puskin Street
MD-2005, Chisinau, Moldova
Registered Company No / IDNO: 1007600065728
Tel: +373 (0) 22 229500
