In today’s digital landscape, the ability to handle sudden traffic spikes and growing user demands is crucial for cloud applications. Horizontal scaling has emerged as a game-changer, allowing businesses to expand capacity by adding more machines to their resource pool. This approach offers enhanced performance, improved availability, and cost-effective resource management, making it an essential strategy for organizations aiming to maintain a competitive edge in cloud computing.
IONOS, a leading cloud service provider, offers robust solutions to master horizontal scaling for cloud applications. Their platform enables businesses to implement auto-scaling, optimize load balancing, and manage data effectively in horizontally scaled environments. By leveraging IONOS’s tools and best practices, companies can design scalable architectures, implement demand-based scaling policies, and ensure high availability. This article delves into the challenges of cloud application scalability and explores how IONOS’s horizontal scaling solutions can help organizations build resilient, high-performance cloud applications.
Cloud Application Scalability Challenges
As cloud applications continue to grow in complexity and user base, organizations face numerous challenges in ensuring their systems can scale effectively. These challenges encompass various aspects of cloud infrastructure and application design, requiring careful consideration and strategic planning to overcome.
Traffic Spikes
One of the most significant challenges in cloud application scalability is handling sudden and substantial increases in traffic. These traffic spikes can occur due to various reasons, such as viral content, marketing campaigns, events, or unexpected shifts in user behavior [1]. During these periods, maintaining responsiveness and accessibility is crucial for providing a positive user experience. Users expect quick load times and uninterrupted access to services, and failure to handle traffic spikes can lead to frustration and damage to brand reputation [1].
Traffic spikes often coincide with opportunities for increased revenue, such as sales events or promotions. Inadequate preparation for these surges can result in lost sales, missed opportunities, and revenue decline [1]. For instance, content can spread rapidly on social media platforms like Facebook or Instagram, driving a sudden influx of traffic. While this exposure can benefit the brand, it also puts significant pressure on servers and infrastructure, potentially causing system failures and slow response times [1].
Resource Constraints
Resource constraints pose another major challenge in scaling cloud applications. These constraints can manifest in various forms, including:
- Network limitations: Slow network speeds, congestion, and lack of redundancy can create bottlenecks that hinder the efficient distribution of data and resources [2].
- Storage constraints: Inadequate storage capacity, poor data management practices, and lack of data redundancy can limit the ability of the infrastructure to handle increasing workloads and data volumes [2].
- CPU and memory usage: Bottlenecks commonly occur in these areas, affecting the overall performance of the application [3].
To address these constraints, organizations must carefully monitor and optimize their resource utilization. This involves implementing strategies such as load balancing, which distributes incoming network traffic across multiple servers to prevent any single server from becoming overwhelmed [2]. By evenly distributing workloads, load balancing helps improve overall performance, reliability, and resource utilization [2].
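As a minimal illustration of the idea, the sketch below implements round-robin distribution, one of the simplest load balancing strategies; the server addresses and request count are hypothetical.

```python
from itertools import cycle

# Hypothetical pool of backend servers; in practice these would be
# the addresses of real VM instances behind the load balancer.
servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

# Round-robin: hand each incoming request to the next server in turn,
# so no single server absorbs the whole workload.
next_server = cycle(servers)

def route_request(request_id: int) -> str:
    server = next(next_server)
    print(f"request {request_id} -> {server}")
    return server

for i in range(6):
    route_request(i)
# Requests cycle through 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.1, ...
```

Real load balancers typically add health checks and weighted or least-connections strategies on top of this basic rotation.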
Performance Bottlenecks
Performance bottlenecks can significantly impact the scalability of cloud applications. These bottlenecks can occur at various points in the system infrastructure and manifest in different ways:
- Slow response times: During traffic spikes, increased demand can overload servers and infrastructure, leading to slower response times and longer load times for web pages or application features [1].
- Service outages: If infrastructure capacity is insufficient to handle the surge in traffic, it may lead to service outages or downtime. Users may encounter error messages, timeouts, or complete unavailability of the application [1].
- Unresponsive backend systems: High volumes of concurrent requests can overwhelm backend systems, causing them to become unresponsive or slow in processing user inputs [1].
To identify and address these performance bottlenecks, organizations can employ various strategies:
- Load testing: This involves replicating typical conditions and assessing the application’s behavior when dealing with expected concurrent user loads [3].
- Stress testing: This rigorous form of performance testing puts the application under extreme conditions to evaluate its robustness, stability, and reliability [3].
- Scalability testing: This type of performance testing evaluates the application’s ability to cope with growing amounts of work and helps identify the maximum capacity before performance starts deteriorating [3].
- Continuous profiling: This modern solution involves continuous collection of application performance data, highlighting the most resource-intensive areas of the application code [3].
By implementing these testing and monitoring strategies, organizations can proactively identify and address performance bottlenecks, ensuring their cloud applications can scale effectively to meet growing demands.
IONOS Horizontal Scaling Solutions
IONOS offers robust horizontal scaling solutions to address the challenges of cloud application scalability. These solutions include VM Auto Scaling, Managed Kubernetes Clusters, and Container Orchestration, providing businesses with the tools to efficiently manage and scale their cloud applications.
VM Auto Scaling Overview
VM Auto Scaling is a managed service that automatically adjusts the number of VM instances horizontally based on configured policies. This functionality ensures that applications have sufficient resources to handle varying workloads while optimizing costs [4]. The key components of VM Auto Scaling include:
- Auto Scaling Group: A collection of VM instances managed by VM Auto Scaling.
- Scaling Policy: Defines how the Auto Scaling Group scales based on parameters such as CPU usage or load balancing utilization.
- VM Replica Configuration: Specifies the properties of new VM replicas created during scaling.
- VM Auto Scaling Manager: Creates and manages Auto Scaling Groups, defines policies, and replicates settings for VM instances [4].
VM Auto Scaling offers several benefits:
- Improved resource utilization and cost efficiency
- Enhanced application performance and user experience
- Increased scalability to support business growth
- Reduced operational overhead through automation [4]
However, it’s important to note some limitations:
- Best suited for gradual increases in demand due to cooldown timers
- Limited by customer contract limits
- Recommended maximum of 100 VMs per Auto Scaling Group
- Potential performance issues with large-scale jobs [4]
Managed Kubernetes Clusters
IONOS Managed Kubernetes service simplifies the deployment, management, and scaling of containerized applications [5]. Key features include:
- Auto-scaling: Ensures high availability of Kubernetes deployments while optimizing costs (a minimal example follows this list).
- Precise node control: Allows users to define the initial, maximum, and minimum number of nodes within a node pool.
- Futureproof storage: Offers fully integrated persistent data storage through the IONOS Cloud ecosystem.
- Easy integration and automation: Provides various SDKs and config management tools for seamless integration with CI/CD pipelines [5].
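Because IONOS Managed Kubernetes exposes the standard Kubernetes API, in-cluster autoscaling can be sketched with the official `kubernetes` Python client. Note that this is generic Kubernetes, not an IONOS-specific API, and it scales pods rather than nodes; node-pool autoscaling is configured on the IONOS side. The kubeconfig path, deployment name, namespace, and thresholds below are assumptions.

```python
from kubernetes import client, config

# Load credentials from the kubeconfig downloaded for the cluster
# (e.g., via the IONOS DCD); the file path is a placeholder.
config.load_kube_config(config_file="~/.kube/ionos-cluster.yaml")

# autoscaling/v1 HPA: scale the hypothetical "web" Deployment between
# 2 and 10 replicas, targeting 70% average CPU utilization.
hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="web-hpa"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        min_replicas=2,
        max_replicas=10,
        target_cpu_utilization_percentage=70,
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="web"
        ),
    ),
)

client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```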
The IONOS Managed Kubernetes service stands out from competitors by focusing on simplifying the use of software containers and Kubernetes as an orchestration tool. It offers visual cluster and node pool management through the Data Center Designer (DCD) interface, allowing users to create, manage, and delete Kubernetes clusters and node pools easily [5].
Pricing for the Managed Kubernetes service is transparent, with users only paying for the computing and storage resources required by the Kubernetes nodes. This includes costs for computing (from $0.0240 per core per hour), RAM ($0.0071 per GB per hour), block storage (from $0.0533 per GB per 30 days), and outgoing data transfer (from $0.09 per GB) [5].
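Using the quoted starting rates, a rough 30-day cost for a node pool can be estimated. The sketch below assumes a hypothetical pool sizing, ignores outgoing data transfer, and is no substitute for current IONOS pricing.

```python
# Quoted starting rates from the paragraph above (USD).
CORE_PER_HOUR = 0.0240
RAM_GB_PER_HOUR = 0.0071
STORAGE_GB_PER_30_DAYS = 0.0533

HOURS_PER_30_DAYS = 24 * 30  # 720

def monthly_node_pool_cost(nodes: int, cores: int, ram_gb: int, disk_gb: int) -> float:
    compute = nodes * cores * CORE_PER_HOUR * HOURS_PER_30_DAYS
    ram = nodes * ram_gb * RAM_GB_PER_HOUR * HOURS_PER_30_DAYS
    storage = nodes * disk_gb * STORAGE_GB_PER_30_DAYS
    return compute + ram + storage

# Hypothetical pool: 3 nodes, each 4 cores / 8 GB RAM / 100 GB storage.
print(f"~${monthly_node_pool_cost(3, 4, 8, 100):,.2f} per 30 days")
# -> roughly $346 at the rates above, before data transfer
```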
Container Orchestration
IONOS offers container orchestration solutions through its Managed Kubernetes service and other container-focused tools. These solutions help businesses efficiently manage and scale their containerized applications.
One notable container orchestration tool provided by IONOS is Kubernetes, which automates the deployment, management, and scaling of containerized applications [6]. Kubernetes supports several key features:
- Automated rollout and rollback of changes
- Service discovery within the network
- Storage orchestration
- Scaling of applications and services
- Batch processing of data [6]
For projects requiring a more focused approach on the application layer, alternatives such as AWS Fargate follow the container-as-a-service model. Fargate is a “serverless” engine for containers that lets users concentrate on the application rather than infrastructure management, taking care of resource provisioning and scaling elastically [6].
To enhance horizontal scaling capabilities, IONOS provides solutions for distributing ingress network traffic across multiple Kubernetes nodes. This approach helps preserve the original client IP address and allows for scaling beyond the 2 Gbit/s throughput limit of individual nodes [7]. The process involves:
- Reserving multiple IP addresses in the IP Manager
- Creating node pools with dedicated ingress nodes
- Configuring services to use specific nodes and IP addresses
- Setting up DNS load balancing to distribute traffic across multiple ingress nodes [7]
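The last step, DNS load balancing, can be illustrated from the client’s perspective: a hostname published with one A record per ingress node resolves to several IPs, and clients spread across them. A minimal standard-library sketch follows; the hostname is hypothetical and will not resolve as written.

```python
import random
import socket

def resolve_ingress_ips(hostname: str) -> list[str]:
    # getaddrinfo returns every A record published for the name;
    # with one record per ingress node, clients spread across nodes.
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})

ips = resolve_ingress_ips("ingress.example.com")  # hypothetical name
print("published ingress IPs:", ips)
print("this client will use:", random.choice(ips))
```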
By leveraging these container orchestration and horizontal scaling solutions, businesses can effectively manage their cloud applications, ensure high availability, and optimize resource utilization while maintaining flexibility and scalability.
Designing for Horizontal Scalability
Designing applications for horizontal scalability is crucial for businesses aiming to handle growing demands and traffic spikes efficiently. This approach involves several key strategies that enable cloud applications to scale out seamlessly.
Microservices Architecture
Microservices architecture plays a vital role in achieving horizontal scalability. This design pattern involves breaking down applications into smaller, independent services that communicate through standardized APIs [8]. Each microservice operates in isolation, allowing for easier scaling and maintenance.
Key benefits of microservices architecture include:
- Independent scaling: Individual services can be scaled based on specific needs, optimizing resource utilization [9].
- Improved fault tolerance: Failure in one service doesn’t affect others, enhancing overall system reliability [10].
- Faster development and deployment: Teams can work on and deploy services independently, accelerating innovation [11].
For example, Spotify utilizes microservices to quickly react to market changes and publish new features faster. Their search suggestion feature is a self-contained microservice with a dedicated team [11].
Stateless Components
Implementing stateless components is crucial for horizontal scalability. Stateless applications don’t store data from one request to another, making them easier to scale across multiple servers [9].
Characteristics of stateless applications include:
- No continuous interaction between requests
- Session data not stored in application memory
- Each session treated as if running for the first time [9]
Benefits of stateless design:
- Improved scalability: Any server can handle any request, allowing for seamless load distribution [10].
- Cost-effectiveness: Pay only for resources used, rather than maintaining idle machines [9].
- Simplified infrastructure: Decoupled components reduce complexity and boost operational efficiency [9].
To make applications stateless, store session-related details on the client-side rather than the server-side. This approach allows sessions to pass through multiple servers interchangeably [9].
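One way to keep session data on the client, as described above, is to sign it so that any server can verify it without a shared session store. Below is a minimal sketch using only the standard library; the secret key and session payload are placeholders.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"rotate-me"  # placeholder; in practice, a managed secret shared by all servers

def issue_token(session: dict) -> str:
    payload = base64.urlsafe_b64encode(json.dumps(session).encode())
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{sig}"

def verify_token(token: str) -> dict | None:
    payload, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered, or signed with a different key
    return json.loads(base64.urlsafe_b64decode(payload))

# Any server holding SECRET can verify the token: no server-side session store.
token = issue_token({"user": "alice", "cart": [42, 7]})
print(verify_token(token))
```

Because every server can validate the token independently, requests can land on any instance, which is exactly what horizontal scaling requires.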
Distributed Caching
Distributed caching is a powerful technique for enhancing performance and scalability in horizontally scaled environments. It involves using multiple nodes to store frequently accessed data in memory [12].
Key advantages of distributed caching:
- Reduced database calls: Caching frequently accessed data in memory minimizes the need for database queries [12].
- Improved response times: Retrieving data from memory is faster than from slower storage devices [12].
- Enhanced availability: Spreading cached data across multiple nodes ensures high availability and redundancy [12].
Major companies like eBay, Amazon, and Twitter utilize distributed caching to improve their application performance and scalability [12]. For instance, Twitter’s distributed cache comprises multiple nodes across data centers, storing user session data and facilitating communication among microservices [12].
To implement effective distributed caching (a minimal code sketch follows this list):
- Identify frequently accessed data
- Choose a suitable caching solution (e.g., Redis, Memcached)
- Implement cache invalidation strategies to ensure data consistency
- Monitor cache performance and adjust as needed
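A minimal cache-aside sketch using the `redis-py` client: look in the cache first, fall back to the database on a miss, and cache the result with a TTL. The connection details, the 5-minute TTL, and the `fetch_user_from_db` stub are assumptions; a distributed deployment would point at a Redis cluster rather than a single local node.

```python
import json

import redis

# Assumed local Redis node; a distributed setup would use a cluster client.
cache = redis.Redis(host="localhost", port=6379, decode_responses=True)

def fetch_user_from_db(user_id: int) -> dict:
    # Stand-in for a real database query.
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:                    # cache hit: no database round trip
        return json.loads(cached)
    user = fetch_user_from_db(user_id)        # cache miss: query the database...
    cache.setex(key, 300, json.dumps(user))   # ...and cache for 5 min (TTL as invalidation)
    return user

print(get_user(1))  # first call misses; subsequent calls within the TTL hit the cache
```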
By incorporating these design principles – microservices architecture, stateless components, and distributed caching – organizations can create cloud applications that scale horizontally with ease, meeting the demands of growing user bases and fluctuating traffic patterns.
Implementing Auto Scaling with IONOS
IONOS offers a robust VM Auto Scaling service that enables clients to automatically adjust the number of VM instances horizontally based on configured policies. This functionality ensures that applications have sufficient resources to handle varying workloads while optimizing costs [4]. The VM Auto Scaling service comprises several key components that work together to provide efficient and flexible scaling solutions.
Creating Scaling Groups
To implement auto scaling with IONOS, the first step is to create a VM Auto Scaling Group (ASG). An ASG is a collection of VM instances managed by the VM Auto Scaling service [4]. To create an ASG, users need to follow these steps:
- Navigate to the “Configuration” tab in the “Create VM Auto Scaling Group” window.
- Provide a name for the ASG.
- Select a data center from the drop-down list, choosing either an existing Virtual Data Center (VDC) or creating a new one.
- Specify the minimum count of VMs, which serves as a reference to prevent scaling below this number.
- Set the maximum count of VMs, which acts as an upper limit for scaling [13].
It’s important to note that the creation of an ASG triggers the automatic creation of two monitoring alarms for ‘Scale-In’ and ‘Scale-Out’ operations based on the policy settings [14].
Defining Scaling Policies
Scaling policies are crucial for determining how the VM Auto Scaling Group adapts to changing demands. To define effective scaling policies, users should consider the following:
- Metric Selection: Choose a metric to monitor, such as CPU utilization average or network bytes for incoming and outgoing traffic.
- Scale In Threshold: Set a value that triggers the scale-in operation when the chosen metric falls below this threshold.
- Scale Out Threshold: Specify a value that initiates the scale-out operation when the metric exceeds this threshold [13].
For both scale-in and scale-out actions, users can configure:
- Amount Type: Choose between “Percentage” and “Absolute” to define how the number of replicas to add or remove is expressed.
- Amount: Specify the number of VM instances to be added or deleted.
- Cooldown Period: Set an interval between each auto scaling action to prevent rapid fluctuations [13].
Additionally, for scale-in operations, users can select a termination policy to determine whether the oldest or most recent replica should be deleted first [13].
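The interplay of scale-in/scale-out thresholds and the cooldown period can be illustrated with a toy control loop. The metric values and policy numbers below are arbitrary, not IONOS defaults, and the random CPU readings stand in for a real monitoring feed.

```python
import random

SCALE_OUT_ABOVE = 75.0   # % CPU: add a replica above this threshold
SCALE_IN_BELOW = 25.0    # % CPU: remove a replica below this threshold
COOLDOWN_TICKS = 3       # ticks to wait between scaling actions
MIN_VMS, MAX_VMS = 2, 10

replicas, cooldown = MIN_VMS, 0
for tick in range(12):
    cpu = random.uniform(0, 100)  # stand-in for a real monitoring metric
    action = "hold"
    if cooldown > 0:
        cooldown -= 1             # the cooldown suppresses rapid flapping
    elif cpu > SCALE_OUT_ABOVE and replicas < MAX_VMS:
        replicas, cooldown, action = replicas + 1, COOLDOWN_TICKS, "scale out"
    elif cpu < SCALE_IN_BELOW and replicas > MIN_VMS:
        replicas, cooldown, action = replicas - 1, COOLDOWN_TICKS, "scale in"
    print(f"t={tick:2d} cpu={cpu:5.1f}% replicas={replicas:2d} {action}")
```

The cooldown is why the service suits gradual demand growth: after each action the group deliberately waits before reacting again.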
Monitoring and Alerting
Effective monitoring and alerting are essential for maintaining optimal performance and resource utilization in auto-scaled environments. IONOS provides several features to enhance monitoring capabilities:
- Custom Event Notifications: Users can define their own monitoring events and set up notifications to act proactively before load peaks occur [15].
- Historical Data Analysis: The system allows access to historical data logs and defined events for up to 14 days, enabling users to process this data via an interface and optimize their monitoring strategy [15].
- Anomaly Detection: The monitoring system can easily detect deviations and anomalies in load baselines, allowing users to create alarms for abnormal load activities or potential security issues like VM hijacking [15].
- Performance Thresholds: Users can define custom load peak limits and set individual notifications, such as CPU load alerts, to add resources before an overload occurs [15].
By leveraging these monitoring and alerting features, organizations can ensure their auto-scaled environments remain responsive and cost-effective.
To maximize the benefits of VM Auto Scaling, it’s important to consider the following limitations:
- The service is best suited for gradual increases in demand due to cooldown timers.
- Scaling capabilities are limited by customer contract limits.
- It’s recommended to limit an Auto Scaling Group to 100 VMs or fewer for optimal performance.
- Large-scale jobs may encounter performance issues, so it’s advisable to limit VM creation or deletion to a maximum of five at a time [4].
Load Balancing in Horizontally Scaled Environments
Load balancing is a crucial technique in system design, used to distribute incoming network traffic across multiple servers or resources. Its primary goal is to prevent any single server from becoming overwhelmed with requests, thus avoiding performance bottlenecks and enhancing overall system reliability and availability [16].
Application Load Balancer Configuration
IONOS offers a robust Application Load Balancer (ALB) solution that optimizes traffic and provides intelligent load balancing tailored to individual applications. The ALB operates at the HTTP(S) layer, adding load balancing functions that keep the IT infrastructure performing smoothly [17].
To configure an ALB in the IONOS environment, follow these steps:
- Add an ALB element by dragging it to the workspace.
- Connect the northern interface to Internet Access and the southern interface to a target Server.
- Configure the ALB Settings by providing a name, primary IPv4 address, and additional IP addresses if needed [18].
Forwarding rules define how client traffic is distributed to targets. To add forwarding rules:
- Select the Forwarding rules tab in the Inspector pane.
- Click +Add forwarding rule and fill in the required fields, including name, protocol, listener IP, listener port, and client timeout [18].
HTTP rules are essential for properly routing incoming traffic, load balancing between multiple targets, and improving security (a toy model of the rule types follows this list). To create an HTTP rule:
- Select +Add HTTP Rule on the right side.
- Choose between Forward, Redirect, or Static options.
- Fill in the required fields for the selected option [18].
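To make the three rule types concrete, the dispatcher below mimics them in plain Python. This is only an illustrative model: real ALB rules are configured in the DCD, and the path prefixes, backend target, and redirect location here are made up.

```python
# Toy model of ALB HTTP rule types; paths and targets are hypothetical.
RULES = [
    {"prefix": "/api",    "type": "FORWARD",  "target": "10.0.1.10:8080"},
    {"prefix": "/old",    "type": "REDIRECT", "location": "https://example.com/new"},
    {"prefix": "/status", "type": "STATIC",   "code": 200, "body": "OK"},
]

def apply_rules(path: str):
    for rule in RULES:
        if not path.startswith(rule["prefix"]):
            continue
        if rule["type"] == "FORWARD":   # proxy the request to a backend target
            return ("forward", rule["target"])
        if rule["type"] == "REDIRECT":  # answer with a 3xx Location header
            return ("redirect", rule["location"])
        if rule["type"] == "STATIC":    # serve a fixed response directly
            return ("static", rule["code"], rule["body"])
    return ("static", 404, "no rule matched")

for path in ["/api/users", "/old/page", "/status", "/other"]:
    print(path, "->", apply_rules(path))
```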
Session Persistence
Session persistence, also known as session affinity or sticky sessions, is a mechanism used in load balancing to ensure that multiple requests from the same client are consistently routed to the same backend server [16]. This is particularly important for applications that require maintaining session state, such as login credentials or shopping cart contents [16].
Benefits of session persistence include:
- Improved User Experience: Ensures continuity in user interactions by preventing disruptions like lost session data or frequent re-authentication.
- Consistency in Session State: Maintains session data integrity by directing all requests associated with the same session to the same backend server.
- Optimized Caching and Resource Utilization: Improves caching efficiency and reduces redundant data retrieval or computation across multiple servers.
- Load Balancer Efficiency: Reduces overhead associated with session lookup and routing decisions [16].
To implement session persistence, techniques like cookies, IP addresses, or HTTP headers are used to identify clients and route their subsequent requests back to the same server that initially served them [16].
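A common persistence technique is hashing a client identifier so the same client always maps to the same backend. The sketch below hashes the client IP; the server names and IPs are hypothetical.

```python
import hashlib

servers = ["app-1", "app-2", "app-3"]  # hypothetical backend pool

def sticky_server(client_ip: str) -> str:
    # A stable hash of the client IP always lands on the same server,
    # so session state held there stays reachable across requests.
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]

for ip in ["203.0.113.7", "198.51.100.23", "203.0.113.7"]:
    print(ip, "->", sticky_server(ip))
# The repeated IP maps to the same server both times.
```

One caveat of plain modulo hashing: adding or removing a server remaps most clients, which is why production load balancers often use consistent hashing or cookie-based affinity instead.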
SSL Offloading
SSL offloading is a technique where the encryption and decryption of SSL/TLS traffic is handled by the load balancer instead of the backend servers. This approach offers several advantages:
- Increased Server Capacity: Backend servers are relieved of the computational load associated with SSL/TLS operations, allowing them to serve more client requests.
- Simplified Certificate Management: SSL/TLS certificates can be managed from a single location, simplifying the process and ensuring all backend servers have current certificates.
- Improved Security: Terminating traffic on the load balancer introduces options such as mTLS, where users can be authenticated based on client certificates [19].
However, SSL offloading also has some disadvantages to consider:
- Reduced Load Balancer Capacity: SSL offloading can double the load and halve the speed of the load balancer.
- Higher Latency: The additional processing requirements may result in noticeable delays, especially for geographically dispersed workloads.
- Inhibited Scalability: Moving SSL termination to a single point (the load balancer) may limit potential scalability.
- Broader Attack Surface: If compromised, the load balancer becomes a critical security endpoint, potentially exposing decrypted traffic and sensitive information [19].
Given these considerations, SSL offloading should only be implemented if the application cannot perform cryptography securely itself or if there is a compelling business or technical reason to do so [19].
Data Management in Scaled Applications
Database Scaling Techniques
Database scaling is a crucial process for managing increasing data volumes and user activity in cloud applications [20]. Two primary approaches to database scaling are vertical and horizontal scaling. Vertical scaling, also known as “scaling up,” involves adding more resources or processing power to a single machine. This method focuses on upgrading CPU, RAM, or storage capacity to enhance the processing speed or storage capabilities of a single server [20].
Horizontal scaling, or “scaling out,” takes a different approach by adding more machines to distribute the database load [20]. This method is highly sought after for its potential to handle massive amounts of data and traffic by simply adding more nodes to the infrastructure [21]. Horizontal scaling can be implemented through two common techniques:
- Sharding: This method involves dividing a large database into smaller, more manageable pieces called shards and distributing them across multiple machines [20] (see the sketch after this list).
- Replication: This technique creates multiple copies of the same database on different machines, with one designated as the primary machine where changes are made and then propagated to other replicas [20].
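A minimal hash-based sharding sketch: the shard key is hashed so rows spread evenly across shards. The shard count and keys are illustrative; production systems typically use consistent hashing so that re-sharding moves as little data as possible.

```python
import hashlib

NUM_SHARDS = 4  # illustrative; real deployments size this from data volume

def shard_for(key: str) -> int:
    # Hash the shard key so rows distribute evenly across shards.
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SHARDS

for user_id in ["alice", "bob", "carol", "dave"]:
    print(f"user {user_id!r} lives on shard {shard_for(user_id)}")
```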
Storage Considerations
When implementing horizontal scaling, storage considerations play a vital role in ensuring efficient data management. Cloud computing offers flexible options for processing and storing critical data and applications at the required scale [22]. This flexibility is particularly beneficial for organizations that are just starting up or experiencing rapid growth [22].
To address storage needs in horizontally scaled environments, businesses can leverage scalable cloud infrastructure. This enables them to reshape their systems to accommodate changing workloads and easily transform private cloud networks into hybrid cloud or multi-cloud environments [22]. Some key storage considerations include:
- Scalable storage solutions: Cloud providers offer scalable storage options that can grow with your application’s needs. For example, DigitalOcean’s block storage solution, Volumes, allows users to scale storage effortlessly by attaching additional storage to Droplets and resizing as necessary [23].
- Distributed storage systems: Implementing distributed storage systems can help manage data across multiple nodes, ensuring efficient data retrieval and storage in horizontally scaled environments.
- Caching strategies: Correct use of caching improves the performance and scalability of applications by reducing the number of database fetches and network calls required [10]. Caching frequently retrieved data in memory ensures quick and reliable access, leading to scalability advantages.
Data Consistency
Maintaining data consistency is one of the most challenging aspects of horizontal scaling, especially for SQL databases. These databases are designed with a focus on relationships and ACID properties (Atomicity, Consistency, Isolation, and Durability) to ensure reliable transactions [21]. However, maintaining these properties across a distributed system can be complex and challenging.
Some key challenges in maintaining data consistency in horizontally scaled environments include:
- Atomicity: Ensuring that each transaction is all-or-nothing across multiple nodes requires complex coordination, as the two-phase commit sketch after this list illustrates [21].
- Consistency: Maintaining predefined rules, constraints, and cascades across distributed data becomes more challenging [21].
- Isolation: Enforcing isolation for concurrent transactions without significant performance penalties is difficult in distributed databases [21].
- Durability: Guaranteeing that committed transactions remain so across all nodes, despite potential failures, adds complexity to the system [21].
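The coordination cost of cross-node atomicity is easiest to see in a heavily simplified two-phase commit sketch. Real protocols must additionally handle coordinator failure, timeouts, and recovery; the node names and transaction ID here are placeholders.

```python
class Node:
    def __init__(self, name: str, healthy: bool = True):
        self.name, self.healthy = name, healthy

    def prepare(self, txn: str) -> bool:
        # Phase 1: each participant votes on whether it can commit.
        vote = self.healthy
        print(f"{self.name}: votes {'YES' if vote else 'NO'} on {txn}")
        return vote

    def commit(self, txn: str):
        print(f"{self.name}: committed {txn}")

    def rollback(self, txn: str):
        print(f"{self.name}: rolled back {txn}")


def two_phase_commit(nodes: list[Node], txn: str) -> bool:
    votes = [node.prepare(txn) for node in nodes]  # collect every vote first
    if all(votes):
        for node in nodes:
            node.commit(txn)       # Phase 2: commit everywhere...
        return True
    for node in nodes:
        node.rollback(txn)         # ...or roll back everywhere: all-or-nothing
    return False


two_phase_commit([Node("shard-1"), Node("shard-2", healthy=False)], "txn-42")
```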
To address these challenges, organizations must carefully consider their database choice and implement appropriate strategies for maintaining data consistency. NoSQL databases, which are often designed for simple distributed operations, may offer easier horizontal scaling options compared to traditional SQL databases [10].
Testing and Optimizing Horizontal Scaling
Testing and optimizing horizontal scaling is crucial for ensuring the efficiency and reliability of cloud applications. This process involves various techniques to assess and enhance system performance under increasing workloads.
Load Testing
Load testing is a critical component of scalability testing that evaluates a system’s ability to handle anticipated loads and assesses response times under normal and peak conditions [24]. This process involves simulating gradual increases in website traffic to measure its impact on response times. By conducting load tests, development teams can identify potential bottlenecks and optimize system responsiveness under heavy workloads, contributing to overall application resilience [24].
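A minimal load-test sketch using the standard library plus `requests`: ramp up concurrent users, record latencies, and report throughput and tail latency. The target URL and step sizes are placeholders, and dedicated tools (JMeter, k6, Locust) are the usual choice in practice.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "https://example.com/"  # placeholder; only load-test systems you own

def one_request() -> float:
    start = time.perf_counter()
    requests.get(URL, timeout=10)
    return time.perf_counter() - start

for users in (5, 10, 20):  # gradual ramp-up of concurrent users
    with ThreadPoolExecutor(max_workers=users) as pool:
        t0 = time.perf_counter()
        latencies = sorted(pool.map(lambda _: one_request(), range(users * 5)))
        elapsed = time.perf_counter() - t0
    p95 = latencies[int(0.95 * (len(latencies) - 1))]
    print(f"{users:3d} users: {len(latencies)/elapsed:6.1f} req/s, "
          f"median {statistics.median(latencies)*1000:.0f} ms, p95 {p95*1000:.0f} ms")
```

Watching how median and p95 latency drift apart as concurrency grows is often the first sign of an emerging bottleneck.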
Performance Tuning
Performance tuning is essential for optimizing horizontally scaled applications. It involves monitoring and analyzing key metrics to identify areas for improvement. Some critical performance indicators include:
- Response Time: Measures the time a system takes to respond to a user’s request, helping assess how well the system maintains acceptable performance levels under varying workloads [24].
- Throughput: Tracks the number of transactions or operations processed by the system within a specific time frame, ensuring the system can sustainably process increased loads without performance decline [24].
- Resource Utilization: Monitors the percentage of available system resources (CPU, memory, disk space) utilized during operations, helping identify bottlenecks and ensure efficient allocation as demand scales [24].
To optimize performance, organizations should consider implementing the following strategies:
- Implement stateless applications: By moving session-specific data to the client-side, sessions can be processed seamlessly across all servers, simplifying horizontal scaling [25].
- Leverage microservices: Splitting services according to resource needs can help separate resource-heavy processes from lighter ones, reducing the need to scale up individual components when bandwidth demand increases [25].
- Automate scaling processes: Automation makes it easy and cost-effective to create and replicate workloads, ensuring servers scale out during demand spikes without manual intervention [25].
Scaling Simulations
Scaling simulations are crucial for testing and optimizing horizontal scaling strategies. These simulations help identify potential issues and optimize system performance under various scenarios. Key aspects of scaling simulations include:
- Testing partitioning schemes: Verify the effectiveness and efficiency of data partitioning strategies by measuring factors such as response time, throughput, and scalability [26].
- Analyzing scaling delays: Test the effects of scaling delays, since it takes time for new capacity to come online (simulated in the sketch after this list) [26].
- Identifying dependency issues: Scaling or partitioning in one area of a workload might cause performance issues on dependencies, particularly in stateful parts like databases [26].
- Stress testing: Push the system beyond its designed capacity to understand its behavior under extreme conditions and identify breaking points [24].
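The effect of scaling delay can be shown with a toy simulation: demand jumps, but ordered capacity arrives only after a provisioning lag, so requests are shed in the meantime. All numbers below are arbitrary.

```python
SCALE_DELAY = 3            # ticks between ordering a server and it coming online
CAPACITY_PER_SERVER = 100  # requests each server can absorb per tick

servers = 2
pending = []               # arrival ticks of servers ordered but not yet online
demand = [150, 150, 400, 400, 400, 400, 400, 400]  # a sudden, sustained spike

for tick, load in enumerate(demand):
    servers += sum(1 for t in pending if t == tick)   # delayed capacity arrives
    pending = [t for t in pending if t > tick]
    capacity = servers * CAPACITY_PER_SERVER
    # Order one more server per tick while projected capacity still falls short.
    if load > capacity + len(pending) * CAPACITY_PER_SERVER:
        pending.append(tick + SCALE_DELAY)
    dropped = max(0, load - capacity)
    print(f"t={tick} servers={servers} capacity={capacity} load={load} dropped={dropped}")
```

Requests are dropped for several ticks after the spike even though scaling reacts immediately, which is exactly why simulations of this kind inform headroom and pre-scaling decisions.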
By conducting thorough scaling simulations, organizations can optimize their horizontal scaling strategies and ensure their applications can handle expected workloads efficiently. Regular stress tests and performance monitoring help plan improvements in existing infrastructure and software as needed [25].
Conclusion
Mastering horizontal scaling with IONOS has a profound impact on cloud applications, enabling businesses to handle sudden traffic spikes and growing user demands effectively. By leveraging IONOS’s robust solutions, including VM Auto Scaling, Managed Kubernetes Clusters, and Container Orchestration, organizations can build resilient, high-performance cloud applications. These tools, combined with strategies like microservices architecture, stateless components, and distributed caching, allow companies to design scalable architectures and implement demand-based scaling policies.
To wrap up, the journey to optimize horizontal scaling involves careful planning, implementation, and continuous refinement. Through rigorous testing, performance tuning, and scaling simulations, businesses can ensure their applications can handle expected workloads efficiently. As the digital landscape continues to evolve, mastering horizontal scaling with IONOS equips organizations with the tools and knowledge to stay competitive and meet the ever-changing needs of their users.
FAQs
- Does cloud storage have the capability for horizontal scaling?
Horizontal scaling, which involves adding more servers, is a key feature of cloud storage. This type of scalability allows businesses to adjust their computing resources based on varying and peak demand levels. Cloud storage can also be scaled vertically by enhancing the capabilities of existing servers.
- Can cloud services expand both vertically and horizontally?
Yes, cloud services can be scaled both ways. Horizontal scaling adds more nodes to the system, whereas vertical scaling increases the power of existing machines. For example, upgrading CPUs or increasing memory, storage, or network speeds are all forms of vertical scaling.
- What is IONOS and does it provide cloud services?
IONOS is a leading European provider specializing in cloud infrastructure, cloud services, and hosting services. Known for its high performance, competitive pricing, robust security, and excellent customer support, IONOS facilitates digital transformation effectively.
References
[1] – https://www.geeksforgeeks.org/strategies-and-tips-for-dealing-with-traffic-spikes/
[2] – https://moldstud.com/articles/p-addressing-scalability-challenges-in-cloud-based-applications
[3] – https://granulate.io/blog/identifying-bottlenecks-in-modern-applications/
[4] – https://docs.ionos.com/cloud/compute-services/vm-auto-scaling/overview
[5] – https://cloud.ionos.com/managed/kubernetes
[6] – https://www.ionos.com/digitalguide/server/know-how/kubernetes-alternatives/
[7] – https://docs.ionos.com/cloud/containers/managed-kubernetes/use-cases/horizontal-scaling
[8] – https://www.ionos.com/digitalguide/websites/web-development/what-is-cloud-native/
[9] – https://www.rosehosting.com/blog/stateful-and-stateless-horizontal-scaling-for-cloud-environments/
[10] – https://deploy.equinix.com/blog/designing-applications-for-horizontal-scalability/
[11] – https://www.ionos.com/digitalguide/websites/web-development/microservice-architecture/
[12] – https://www.harperdb.io/post/distributed-cache
[13] – https://docs.ionos.com/cloud/compute-services/vm-auto-scaling/how-tos/configure-vm-auto-scaling-group
[14] – https://api.ionos.com/docs/vmautoscaling/v1.ea/
[15] – https://cloud.ionos.com/managed/monitoring-as-a-service
[16] – https://www.geeksforgeeks.org/load-balancer-session-persistence/
[17] – https://cloud.ionos.com/network/application-load-balancer
[18] – https://docs.ionos.com/cloud/network-services/application-load-balancer/how-tos/setup-alb
[19] – https://www.loadbalancer.org/blog/the-pros-and-cons-of-offloading-ssl-decryption-encryption-to-your-adcs/
[20] – https://www.mongodb.com/resources/basics/horizontal-vs-vertical-scaling
[21] – https://www.designgurus.io/blog/horizontally-scale-sql-databases
[22] – https://www.spiceworks.com/tech/cloud/articles/horizontal-vs-vertical-cloud-scaling/
[23] – https://www.digitalocean.com/resources/articles/horizontal-scaling-vs-vertical-scaling
[24] – https://www.tricentis.com/learn/a-guide-to-scalability-testing-with-examples
[25] – https://www.spiceworks.com/tech/cloud/articles/horizontal-cloud-scaling/
[26] – https://learn.microsoft.com/en-us/azure/well-architected/performance-efficiency/scale-partition