Running any production database deployment in the cloud requires careful consideration of both performance and security.
In this article, I’ll share a battle-tested approach for deploying a MariaDB Galera cluster with replicas, a ProxySQL layer in front of them, and monitoring for all these technologies via PMM. The focus will be on building solid security boundaries.
Security Requirements
We want to meet several key security requirements:
- Network Segregation: Isolate database infrastructure from application layers to minimize the attack surface;
- Protected Database Access: Enforce that all application database traffic passes through ProxySQL rather than connecting directly to database instances;
- Secure Backup Channels: Ensure backup processes have minimal access rights and secure data transfer paths;
- Secure Monitoring: Enable comprehensive monitoring without exposing database instances to unnecessary access.
Let’s explore how to address these security requirements using a real-world architecture that I’ve implemented for several production workloads.
Architecture Overview
Our architecture contains the following components:
- 3 EC2 instances running MariaDB with Galera for synchronous replication;
- 2 EC2 instances running MariaDB replicas for read scaling and backup purposes;
- 2 EC2 instances running ProxySQL as a load balancer;
- PMM3 as a monitoring solution, on a separate EC2 instance;
- Application servers in their own security group, already existing, not covered in this article;
- Nightly backups from a replica to S3.
To secure this environment properly, we’ll implement a two-VPC architecture with carefully configured security groups.
A Multi-VPC Design
I asked Claude Sonnet 3.7 to draw an ASCII diagram to illustrate the architecture:
+-------------------------------Application VPC-----------------+
|                                                               |
|  +----------------+                                           |
|  |                |                                           |
|  |  Application   |                                           |
|  |    Servers     |                                           |
|  |    (sg-app)    |                                           |
|  |                |                                           |
|  +-------+--------+                                           |
|          |                                                    |
+----------|----------------------------------------------------+
           |
           | VPC Peering Connection
           | (Only ProxySQL access permitted)
           |
+----------|--------------------Database VPC--------------------+
|          |                                                    |
|  +-------v--------------+                  +---------------+  |
|  |                      |                  |               |  |
|  |  ProxySQL Instances  | <--------------> |               |  |
|  |      (sg-proxy)      |                  |               |  |
|  |                      |                  |               |  |
|  +----------+-----------+                  |               |  |
|             |                              |               |  |
|             v                              |               |  |
|  +----------+-----------+                  |               |  |
|  |                      |                  |      PMM      |  |
|  |    Galera Cluster    | <--------------> |  Monitoring   |  |
|  |    Nodes 1, 2, 3     |                  |   (sg-pmm)    |  |
|  |     (sg-galera)      |                  |               |  |
|  |                      |                  |               |  |
|  +----------+-----------+                  |               |  |
|             |                              |               |  |
|             v                              |               |  |
|  +----------+-----------+                  |               |  |
|  |                      |                  |               |  |
|  |   MariaDB Replicas   | <--------------> |               |  |
|  |     (sg-replica)     |                  |               |  |
|  |                      |                  |               |  |
|  +----------+-----------+                  +---------------+  |
|             |                                                 |
|             |                                                 |
|             v                                                 |
|                                                               |
|         S3 Bucket                                             |
|      (Backup Storage)                                         |
|                                                               |
+---------------------------------------------------------------+
We’ll also assume that there are one or more jumphosts, even though they’re not included in the diagram above, to keep it simple. Jumphosts are assigned the jumphost security group.
VPC 1: Database VPC
Contains all database-related components:
- MariaDB Galera cluster nodes;
- MariaDB replicas;
- ProxySQL instances;
- A PMM monitoring instance.
VPC 2: Application VPC
This is the VPC that runs the application. The application runs in at least one security group that we’ll call sg-app.
The application already exists and we don’t care much about its details, as long as it’s able to connect to ProxySQL. Its queries will be sent by ProxySQL to Galera and replicas, but this must be completely transparent for the application.
Why Two VPCs?
Our architecture has different VPCs for the application and the database infrastructure. This provides important benefits:
- Defense in Depth: By isolating database infrastructure in its own VPC, we add an extra security boundary. An attacker who manages to break into the application VPC will still be unable to reach the database directly;
- Clear Access Patterns: It’s easier to enforce and audit that applications can only access databases through the proper channels (ProxySQL in our case);
- Independent Administration: Database and application teams can manage their network configurations independently, and one team’s actions won’t compromise the other team’s work;
- Framework Compliance: Many compliance frameworks recommend network segmentation, which this approach provides.
Security Group Configuration
While VPCs are great for isolating different components of our architecture, security groups remain vital. Security groups determine which inbound and outbound connections hosts are allowed to establish. Let’s look at the configuration of our security groups and their inbound/outbound rules.
Connection Flow
The desired data flow is the following:
- No direct connection is possible from applications to database servers;
- Application servers can only connect to ProxySQL;
- ProxySQL can connect to both Galera nodes and replicas;
- Galera nodes can communicate with each other;
- Replicas can connect to Galera nodes for replication;
- PMM3 can monitor ProxySQL, Galera nodes and replicas, preferably through agents running on the monitored hosts.
sg-base
This security group is assigned to all instances.
Inbound rules
Jumphost access:
- TCP 22 from jumphost
Outbound rules
- TCP 80 to * – Package updates
- TCP+UDP 53 to * – DNS
- UDP 123 to * – NTP
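As a minimal sketch, the sg-base rules could be created with the AWS CLI roughly as follows. The group IDs are hypothetical placeholders, and the script only prints the commands unless it is run with AWS=aws. It assumes NTP over UDP 123, which is the transport NTP normally uses.

```shell
# Dry run by default: commands are printed, not executed.
# Run with AWS=aws in the environment to apply them for real.
AWS="${AWS:-echo aws}"

# Hypothetical placeholder IDs: replace with your real security group IDs.
SG_BASE="sg-0base"
SG_JUMPHOST="sg-0jumphost"

base_rules() {
  # Inbound: SSH only from the jumphost security group
  $AWS ec2 authorize-security-group-ingress \
    --group-id "$SG_BASE" --protocol tcp --port 22 \
    --source-group "$SG_JUMPHOST"
  # Outbound: package updates (80), DNS (53), NTP (123) to any destination
  for rule in tcp:80 tcp:53 udp:53 udp:123; do
    proto="${rule%%:*}"
    port="${rule##*:}"
    $AWS ec2 authorize-security-group-egress \
      --group-id "$SG_BASE" \
      --ip-permissions "IpProtocol=$proto,FromPort=$port,ToPort=$port,IpRanges=[{CidrIp=0.0.0.0/0}]"
  done
}

base_rules
```

Printing the commands first makes it easy to review the rule set before anything is changed in the account.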
PMM
PMM should use agents whenever possible, so we want all our EC2 instances to be able to communicate with PMM. For more flexibility, we want to allow both push and pull methods.
- TCP 7777 (out)
- TCP 8428 (in, out)
- TCP 42000 – 51999
If you think that the listen port range is too wide, I agree with you, but we can adjust it using the pmm-agent --ports-min and --ports-max options.
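One way to act on this is to pick a narrower range and use the same numbers in both places: the pmm-agent options and the security group rule. The sketch below covers only the security group side. The range 42000-42100, the group IDs, and the assumption that PMM connects inbound to these ports from sg-pmm are illustrative, not taken from a real setup.

```shell
# Dry run by default: commands are printed, not executed.
# Run with AWS=aws in the environment to apply them for real.
AWS="${AWS:-echo aws}"

# Hypothetical placeholder IDs and an example port range.
SG_BASE="sg-0base"
SG_PMM="sg-0pmm"
PORTS_MIN=42000   # must match the value given to pmm-agent --ports-min
PORTS_MAX=42100   # must match the value given to pmm-agent --ports-max

pmm_agent_ports_rule() {
  # Allow the PMM instance to reach the narrowed agent listen range
  $AWS ec2 authorize-security-group-ingress \
    --group-id "$SG_BASE" --protocol tcp \
    --port "${PORTS_MIN}-${PORTS_MAX}" \
    --source-group "$SG_PMM"
}

pmm_agent_ports_rule
```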
sg-galera
Security group for Galera nodes.
Inbound rules
For Galera replication:
- TCP 4567 from sg-galera
- TCP 4568 from sg-galera
- TCP 4444 from sg-galera
For asynchronous replication:
- TCP 3306 from sg-replica
For queries:
- TCP 3306 from sg-proxy
Outbound rules
For Galera replication:
- TCP 4567 to sg-galera
- TCP 4568 to sg-galera
- TCP 4444 to sg-galera
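The inbound rules for sg-galera can be sketched with the AWS CLI as below. The group IDs are hypothetical placeholders and the script is a dry run unless AWS=aws is set. Referencing security groups instead of IP ranges means the rules keep working when nodes are replaced.

```shell
# Dry run by default: commands are printed, not executed.
# Run with AWS=aws in the environment to apply them for real.
AWS="${AWS:-echo aws}"

# Hypothetical placeholder IDs: replace with your real security group IDs.
SG_GALERA="sg-0galera"
SG_REPLICA="sg-0replica"
SG_PROXY="sg-0proxy"

galera_ingress_rules() {
  # Galera replication (4567), IST (4568) and SST (4444) between cluster nodes
  for port in 4567 4568 4444; do
    $AWS ec2 authorize-security-group-ingress \
      --group-id "$SG_GALERA" --protocol tcp --port "$port" \
      --source-group "$SG_GALERA"
  done
  # MariaDB protocol: replicas (asynchronous replication) and ProxySQL (queries)
  for source in "$SG_REPLICA" "$SG_PROXY"; do
    $AWS ec2 authorize-security-group-ingress \
      --group-id "$SG_GALERA" --protocol tcp --port 3306 \
      --source-group "$source"
  done
}

galera_ingress_rules
```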
sg-replica
This security group is for replicas.
Inbound rules
- TCP 3306 from sg-proxy
Outbound rules
- TCP 3306 to sg-galera
- Access to S3 endpoints
sg-proxy
This security group is for ProxySQL instances.
Inbound rules
- TCP 3306 from sg-app
Outbound rules
- TCP 3306 to sg-galera
- TCP 3306 to sg-replica
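A hedged sketch of the sg-proxy rules follows, with hypothetical group IDs and a dry run by default. Two assumptions are worth stating. First, a newly created security group comes with an allow-all outbound rule, so the sketch revokes it before adding the specific ones. Second, sg-app lives in the peered VPC: a rule can reference a peer VPC's security group only when both VPCs are in the same Region; otherwise the application VPC's CIDR block has to be used instead.

```shell
# Dry run by default: commands are printed, not executed.
# Run with AWS=aws in the environment to apply them for real.
AWS="${AWS:-echo aws}"

# Hypothetical placeholder IDs: replace with your real security group IDs.
SG_PROXY="sg-0proxy"
SG_APP="sg-0app"
SG_GALERA="sg-0galera"
SG_REPLICA="sg-0replica"

proxy_rules() {
  # Inbound: only the application security group may reach ProxySQL
  $AWS ec2 authorize-security-group-ingress \
    --group-id "$SG_PROXY" --protocol tcp --port 3306 \
    --source-group "$SG_APP"
  # Remove the default allow-all outbound rule
  $AWS ec2 revoke-security-group-egress \
    --group-id "$SG_PROXY" \
    --ip-permissions "IpProtocol=-1,IpRanges=[{CidrIp=0.0.0.0/0}]"
  # Outbound: MariaDB connections to Galera nodes and replicas only
  for target in "$SG_GALERA" "$SG_REPLICA"; do
    $AWS ec2 authorize-security-group-egress \
      --group-id "$SG_PROXY" \
      --ip-permissions "IpProtocol=tcp,FromPort=3306,ToPort=3306,UserIdGroupPairs=[{GroupId=$target}]"
  done
}

proxy_rules
```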
sg-pmm
The following rules should be both inbound and outbound:
- TCP 80 to * – HTTP, gRPC over HTTP
- TCP 443 to * – HTTPS, gRPC over HTTPS
- TCP 7771 to sg-galera, sg-replica, sg-proxy – gRPC
Outbound only
- TCP 7772 to *
sg-app
As mentioned before, we’re not going to cover this security group, as it’s outside of our domain. But it’s important that it allows outbound connections to TCP port 3306 on the ProxySQL instances.
Implementation Considerations
The architecture’s logic should be pretty clear, but some implementation details probably need to be explained.
VPC Peering
To connect the Application VPC and Database VPC, use VPC Peering:
aws ec2 create-vpc-peering-connection \
    --vpc-id vpc-app \
    --peer-vpc-id vpc-db
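Creating the peering connection is only the first step: the request has to be accepted, and each VPC needs a route to the other one. The sketch below shows these follow-up steps as a dry run. The peering ID, route table IDs and CIDR blocks are hypothetical placeholders, and the two VPCs are assumed to have non-overlapping CIDR blocks, which peering requires.

```shell
# Dry run by default: commands are printed, not executed.
# Run with AWS=aws in the environment to apply them for real.
AWS="${AWS:-echo aws}"

# Hypothetical placeholders: replace with your real IDs and CIDR blocks.
PEERING_ID="pcx-0example"
RTB_APP="rtb-app"
RTB_DB="rtb-db"
CIDR_APP="10.0.0.0/16"
CIDR_DB="10.1.0.0/16"

peering_followup() {
  # Accept the peering request on the Database VPC side
  $AWS ec2 accept-vpc-peering-connection \
    --vpc-peering-connection-id "$PEERING_ID"
  # Application VPC: send traffic for the Database VPC through the peering
  $AWS ec2 create-route --route-table-id "$RTB_APP" \
    --destination-cidr-block "$CIDR_DB" \
    --vpc-peering-connection-id "$PEERING_ID"
  # Database VPC: return route towards the Application VPC
  $AWS ec2 create-route --route-table-id "$RTB_DB" \
    --destination-cidr-block "$CIDR_APP" \
    --vpc-peering-connection-id "$PEERING_ID"
}

peering_followup
```

The routes only make the two networks reachable; what is actually allowed is still decided by the security groups.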
Subnets and NACLs
I decided not to include subnets in this setup. With subnets, this article would have probably been too long. But more importantly, they are unnecessary for many use cases.
That being said, some setups would benefit from subnets for their database infrastructure. For example, they’d allow us to easily distribute ProxySQL across multiple Availability Zones.
It is also possible to use a single-VPC architecture, but split it into multiple subnets. This would still complicate the architecture a bit.
If we add subnets, we’ll also be able to use NACLs as an extra layer of protection, in this case for connections to port 3306 of ProxySQL or MariaDB. Unlike security groups, NACLs are stateless, so return traffic must be allowed explicitly. NACLs are not an alternative to security groups: when subnets are used, both security groups and NACLs should be used.
S3 Backup Access
For backups to S3, I recommend using VPC Endpoints:
aws ec2 create-vpc-endpoint \
    --vpc-id vpc-db \
    --service-name com.amazonaws.region.s3 \
    --route-table-ids rtb-database
This will create the network path for our replicas to reach S3 without going through the internet.
Make sure to allow outbound HTTPS connections (443/TCP) from the replicas. The destination should be the prefix list ID for S3, which is what the gateway endpoint’s route points to.
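The outbound HTTPS rule can reference the S3 prefix list directly. A minimal sketch, as a dry run: the Region, the group ID and the prefix list ID are hypothetical placeholders, and the first function only shows how the real prefix list ID could be looked up.

```shell
# Dry run by default: commands are printed, not executed.
# Run with AWS=aws in the environment to apply them for real.
AWS="${AWS:-echo aws}"

# Hypothetical placeholders: replace with your real values.
REGION="eu-west-1"
SG_REPLICA="sg-0replica"
S3_PREFIX_LIST="pl-0example"

find_s3_prefix_list() {
  # Look up the ID of the AWS-managed prefix list for S3 in this Region
  $AWS ec2 describe-prefix-lists \
    --filters "Name=prefix-list-name,Values=com.amazonaws.${REGION}.s3" \
    --query "PrefixLists[0].PrefixListId" --output text
}

s3_egress_rule() {
  # Allow the replicas to open HTTPS connections to S3 only
  $AWS ec2 authorize-security-group-egress \
    --group-id "$SG_REPLICA" \
    --ip-permissions "IpProtocol=tcp,FromPort=443,ToPort=443,PrefixListIds=[{PrefixListId=$S3_PREFIX_LIST}]"
}

find_s3_prefix_list
s3_egress_rule
```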
Also make sure to create an IAM role that is assigned to the replicas and has permissions to write to S3. Here’s an example:
{
    "Version": "...",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:ListBucket",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::some-bucket",
                "arn:aws:s3:::some-bucket/*"
            ]
        }
    ]
}
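To attach such a policy to the replicas, the role has to be wrapped in an instance profile and associated with each instance. The following is a sketch under stated assumptions, not a definitive procedure: the role, profile and instance names are hypothetical, s3-backup-policy.json is assumed to contain the policy above, and ec2-trust-policy.json is assumed to contain a trust policy that lets the EC2 service assume the role.

```shell
# Dry run by default: commands are printed, not executed.
# Run with AWS=aws in the environment to apply them for real.
AWS="${AWS:-echo aws}"

# Hypothetical names: replace with your own.
ROLE_NAME="replica-backup-role"
PROFILE_NAME="replica-backup-profile"
INSTANCE_ID="i-0replica"

attach_backup_role() {
  # Role that EC2 instances are allowed to assume
  $AWS iam create-role --role-name "$ROLE_NAME" \
    --assume-role-policy-document file://ec2-trust-policy.json
  # Inline policy granting access to the backup bucket
  $AWS iam put-role-policy --role-name "$ROLE_NAME" \
    --policy-name s3-backup-access \
    --policy-document file://s3-backup-policy.json
  # Instance profile wrapping the role
  $AWS iam create-instance-profile --instance-profile-name "$PROFILE_NAME"
  $AWS iam add-role-to-instance-profile \
    --instance-profile-name "$PROFILE_NAME" --role-name "$ROLE_NAME"
  # Attach the profile to a replica
  $AWS ec2 associate-iam-instance-profile --instance-id "$INSTANCE_ID" \
    --iam-instance-profile "Name=$PROFILE_NAME"
}

attach_backup_role
```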
Conclusions
A well-designed, multi-VPC architecture with proper security groups provides a robust foundation for running MariaDB Galera clusters in AWS. By clearly separating database and application resources, and enforcing communication through ProxySQL, you can achieve a secure yet performant database environment.
This architecture provides many benefits, especially from a security perspective. I strongly recommend it when security is an important concern, particularly for sectors like healthcare or finance.
Remember that network security is just one aspect of a comprehensive database security strategy. You should complement this design with connection encryption, data-at-rest encryption, fine-grained authentication and permissions, and so on.
Federico Razzoli