This guide walks you through picking the best way to deploy Baserow to AWS, the prerequisites you will need to set up, and specific installation instructions. It is intended for those who already have experience deploying applications to AWS.
Baserow can be deployed to AWS in the following ways:
All of these deployment methods store Baserow’s data and state in RDS and S3, which makes switching between them (for example migrating to EKS later on) straightforward.
We will go through these in more detail later in the guide but to give you a quick overview this is what any AWS deployment of Baserow will need:
The baserow/baserow:1.29.2 image runs all of Baserow’s various services inside the container for ease of use.
This image is designed for single server deployments or simple deployments to horizontally scalable container deployment platforms such as AWS Fargate or Google Cloud Run.
When you scale baserow/baserow:1.29.2 containers horizontally, you by default end up with:
We will skip over configuring VPCs, security groups, IAM users, Secrets Manager and ELB specifics to keep this guide more generally applicable.
First Baserow needs an S3 bucket. Baserow will use this bucket to store files uploaded by users into tables and view/table exports for users to then download.
Baserow will then generate pre-signed S3 URLs for the user to view and download files from this bucket. These pre-signed URLs need to be accessible from the user’s browser, so depending on your setup the bucket will most likely need to allow public GET/ACLs.
We recommend setting up a separate IAM user and configuring Baserow with that user’s credentials, so it can upload into and delete from the bucket. Here is an example S3 policy for this user that grants the minimal required operations:
policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetObjectAcl",
"s3:GetObject",
"s3:DeleteObject",
"s3:PutObjectAcl"
],
"Principal": {
"AWS": "arn:aws:iam::xyz:user/YOUR_USER"
},
"Resource": [
"arn:aws:s3:::YOUR_BUCKET_NAME/*",
"arn:aws:s3:::YOUR_BUCKET_NAME"
]
}]
}
EOF
In the current version of Baserow, this bucket also needs CORS rules set for the in-tool file download button to work correctly. An example rule is provided below:
cors_rule {
allowed_headers = ["*"]
allowed_methods = ["GET"]
allowed_origins = ["REPLACE_WITH_YOUR_BASEROW_PUBLIC_URL"]
expose_headers = ["ETag"]
max_age_seconds = 3000
}
Finally, these are the public access settings we used to ensure the user’s browser can GET and download the S3 files:
block_public_acls = false
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
Baserow stores all of its non-file data in a PostgreSQL database. In AWS we recommend using an RDS Postgres cluster. You will later need to configure Baserow to be able to access this cluster.
Baserow uses PostgreSQL heavily so scaling the RDS cluster up will be needed for larger deployments.
You should not put an RDS Proxy in front of the RDS instance. This is because the proxy causes a transaction to end after a data definition language (DDL) statement completes. Baserow is therefore incompatible because it makes schema changes, and if anything goes wrong after a schema migration, the transaction isn’t rolled back, which results in data inconsistencies. More information: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/rds-proxy.howitworks.html#rds-proxy-transactions
Baserow uses Redis as a cache, for real-time collaboration over WebSockets and async background task processing.
We recommend setting up an ElastiCache Redis in non-cluster mode with TLS enabled and in AUTH mode, where you can specify an AUTH token password that Baserow will later use to connect to the Redis server.
Generally, the Redis server is not the bottleneck in Baserow deployments as they scale.
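As a rough sketch of how the AUTH token and TLS settings above end up in the Redis connection string Baserow is later given (the REDIS_URL variable), the helper below assembles one. The function name, the example hostname and the choice to percent-encode the credentials are illustrative assumptions, not part of Baserow; the "default" user and port 6379 mirror the example task definition in this guide.

```python
from urllib.parse import quote


def build_redis_url(host, password, port=6379, user="default", db=0, tls=True):
    """Assemble a Redis connection string of the shape used in this guide."""
    # rediss:// (double "s") is the TLS scheme, redis:// is plain TCP.
    scheme = "rediss" if tls else "redis"
    # Percent-encode the credentials so characters such as "@" or "/" in an
    # AUTH token cannot break the URL structure (assumption: the client
    # library decodes them again, as common Redis clients do).
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    url = f"{scheme}://{credentials}@{host}:{port}/{db}"
    if tls:
        # Ask the client to verify the server certificate.
        url += "?ssl_cert_reqs=required"
    return url


if __name__ == "__main__":
    # Hypothetical endpoint and token, for illustration only.
    print(build_redis_url("my-redis.example.cache.amazonaws.com", "p@ss/word"))
```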
Now create a target group on port 80 and an ALB ready to route traffic to the Baserow containers.
When setting up the ALB health check for the baserow/baserow:1.29.2 container, which you’ll be deploying next, choose port 80 and the health check URL /api/_health/. We recommend a long grace period of 900 seconds to account for first-time migrations being run on the first container’s startup.
Now we are ready to spin up our baserow/baserow:1.29.2 containers. See below for a full task definition and environment variables. We recommend launching the containers with 2 vCPUs and 4 GB of RAM each to start with. In short, you will want to:
- Use the baserow/baserow:1.29.2 image.
- Add a port mapping for port 80 on TCP, as this is where this image’s HTTP server is listening by default.
A full list of all environment variables available can be found here.
Env variable | Description |
---|---|
DISABLE_VOLUME_CHECK=true | Must be set to true. Needed to disable the check designed to help non-technical users who are not configuring an external Postgres and S3. Because we are configuring external services we do not need any volume mounted into the container. |
BASEROW_PUBLIC_URL | The public URL or IP that will be used to access Baserow in your user’s browser. It should always start with http:// or https://, even if accessing via an IP address. |
DATABASE_HOST | The hostname of the Postgres database Baserow will use to store its data in. |
DATABASE_USER | The username of the database user Baserow will use to connect to the database at DATABASE_HOST. |
DATABASE_PORT | The port Baserow will use when trying to connect to the Postgres database at DATABASE_HOST. |
DATABASE_NAME | The database name Baserow will use to store data. |
DATABASE_PASSWORD | The password of DATABASE_USER on the Postgres server at DATABASE_HOST. Alternatively, you can provide DATABASE_PASSWORD_FILE and set it to the file path of a secret injected into the container’s file system. |
REDIS_URL | A standard Redis connection string in the format of: redis://[redisuser]:[password]@[redishost]:[redisport]/0?ssl_cert_reqs=required (use the rediss:// scheme when TLS is enabled, as in the example task definition). |
AWS_STORAGE_BUCKET_NAME | Your AWS storage bucket name. |
AWS_ACCESS_KEY_ID | The access key for your S3 IAM AWS account. When set to anything other than empty, it switches Baserow to use an S3 compatible bucket for storing user file uploads. |
AWS_SECRET_ACCESS_KEY | The access secret key for your S3 IAM AWS account. AWS_SECRET_ACCESS_KEY_FILE can similarly be provided instead. |
DOWNLOAD_FILE_VIA_XHR | Must currently be set to 1 to work with AWS S3. It forces download links to download files via an XHR query to bypass Content-Disposition: inline. If your files are stored under another origin, you also must add CORS headers to your S3 bucket. |
BASEROW_EXTRA_ALLOWED_HOSTS | An optional comma-separated list of hostnames which will be added to Baserow’s Django backend ALLOWED_HOSTS setting. Add your ALB IP address here so the health checks it sends are allowed through, or alternatively configure the less secure value * to get things running and restrict hosts later once everything is working. |
BASEROW_JWT_SIGNING_KEY | Must be set so all the containers share the same signing key. The signing key is used to sign the content of generated tokens. For HMAC signing, this should be a random string with at least as many bits of data as is required by the signing protocol. See here for more details. BASEROW_JWT_SIGNING_KEY_FILE is also supported. |
SECRET_KEY | Must be set so all the containers share the same secret key. This is the secret key used by Django for cryptographic signing, such as generating secure password reset links and managing sessions. See here for more details. SECRET_KEY_FILE is also supported. |
EMAIL_SMTP_* | There are a number of SMTP related environment variables documented in our environment variable guide here which will also need to be set so Baserow can send invitation and password reset emails. |
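Both SECRET_KEY and BASEROW_JWT_SIGNING_KEY need to be long random strings shared by every container. One possible way to produce such values is sketched below using Python’s standard secrets module. The helper name and the 50-byte length are assumptions chosen for illustration, not a Baserow requirement.

```python
import secrets


def generate_key(num_bytes=50):
    """Return a URL-safe random string built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Generate each value once, store it in your secret store, and give the
    # same value to every Baserow container.
    print("SECRET_KEY=" + generate_key())
    print("BASEROW_JWT_SIGNING_KEY=" + generate_key())
```

Generate the two values separately and do not regenerate them on each deploy, since the containers need to keep sharing the same keys.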
Here is a full example task definition using ECS and Fargate. The logConfiguration block is optional.
container_definitions = <<DEFINITION
[
{
"name": "baserow_task",
"image": "baserow/baserow:1.29.2",
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-region" : "YOUR_REGION_NAME",
"awslogs-group" : "/ecs/baserow_log",
"awslogs-stream-prefix" : "baserow",
"awslogs-create-group": "true"
}
},
"environment": [
{
"name": "DISABLE_VOLUME_CHECK",
"value": "yes"
},
{
"name": "BASEROW_PUBLIC_URL",
"value": "<YOUR_PUBLIC_URL>"
},
{
"name": "DATABASE_HOST",
"value": "<YOUR_POSTGRES_DB_HOST>"
},
{
"name": "DATABASE_USER",
"value": "postgres"
},
{
"name": "DATABASE_PORT",
"value": "<PORT_NUMBER>"
},
{
"name": "DATABASE_NAME",
"value": "<YOUR_POSTGRES_DB_NAME>"
},
{
"name": "DATABASE_PASSWORD",
"value": "<YOUR_POSTGRES_DB_PASSWORD>"
},
{
"name": "REDIS_URL",
"value": "rediss://default:password@YOUR_REDIS_PRIMARY_ENDPOINT:6379/0?ssl_cert_reqs=required"
},
{
"name": "AWS_STORAGE_BUCKET_NAME",
"value": "<YOUR_BUCKET_NAME>"
},
{
"name": "AWS_ACCESS_KEY_ID",
"value": "<YOUR_AWS_ACCESS_KEY_ID>"
},
{
"name": "AWS_SECRET_ACCESS_KEY",
"value": "<YOUR_AWS_SECRET_ACCESS_KEY>"
},
{
"name": "DOWNLOAD_FILE_VIA_XHR",
"value": "1"
},
{
"name": "BASEROW_EXTRA_ALLOWED_HOSTS",
"value": "<YOUR_ALLOWED_HOSTS>"
},
{
"name": "BASEROW_JWT_SIGNING_KEY",
"value": "<YOUR_SIGNING_KEY>"
},
{
"name": "SECRET_KEY",
"value": "<YOUR_SECRET_KEY>"
}
],
"essential": true,
"portMappings": [
{
"containerPort": 80,
"hostPort": 80
}
],
"memory": 8192,
"cpu": 4096
}
]
DEFINITION
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
memory = 8192
cpu = 4096
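If you keep your settings in a script rather than writing the JSON by hand, the "environment" array above can be generated from a plain mapping. This is a minimal sketch: the function name and the sample values are illustrative assumptions, and the only thing it relies on is that each entry is a name/value pair whose value is a string.

```python
import json


def to_ecs_environment(settings):
    """Convert a mapping of env var names to values into name/value entries."""
    # Values are converted to strings because container definitions expect
    # string values for environment variables.
    return [{"name": name, "value": str(value)} for name, value in settings.items()]


settings = {
    "DISABLE_VOLUME_CHECK": "yes",
    "DATABASE_PORT": 5432,  # assumed PostgreSQL default port; use your own
    "DOWNLOAD_FILE_VIA_XHR": 1,
}
environment = to_ecs_environment(settings)

if __name__ == "__main__":
    print(json.dumps(environment, indent=2))
```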
Other than launching more baserow/baserow tasks and scaling up the RDS Postgres server, the baserow/baserow image has the following scaling environment variables which can help reduce the resource usage per container or allocate more resources to certain services inside the container.
- BASEROW_AMOUNT_OF_GUNICORN_WORKERS controls the number of REST API workers (the things that do most of the API work) per container. Defaults to 3. Each extra worker generally takes up around 100-200 MB of RAM.
- BASEROW_AMOUNT_OF_WORKERS controls the number of background task celery runners. These run real-time collaboration tasks, cleanup jobs, and other slow tasks like big file exports/imports. If you are scaling many of these containers you probably only need one of these background workers per container, as they will all pool together and collect background tasks submitted from any other container via Redis.
- BASEROW_RUN_MINIMAL=yes AND BASEROW_AMOUNT_OF_WORKERS=1.
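To get a feel for what raising BASEROW_AMOUNT_OF_GUNICORN_WORKERS costs, the rough figures above (a default of 3 workers and around 100-200 MB per extra worker) can be turned into a quick estimate. The function below is only back-of-the-envelope arithmetic on those figures; its name is an illustrative assumption and real usage depends on your workload.

```python
DEFAULT_GUNICORN_WORKERS = 3      # default stated in this guide
EXTRA_WORKER_RAM_MB = (100, 200)  # rough per-worker range stated in this guide


def extra_ram_mb(gunicorn_workers):
    """Return (low, high) extra RAM in MB compared with the default worker count."""
    extra = max(gunicorn_workers - DEFAULT_GUNICORN_WORKERS, 0)
    low, high = EXTRA_WORKER_RAM_MB
    return extra * low, extra * high


if __name__ == "__main__":
    # Two workers above the default: roughly 200 to 400 MB more per container.
    print(extra_ram_mb(5))
```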
You should now have a fully running Baserow cluster. The first user to sign up becomes the first “staff” instance-wide admin user. This user can then configure Baserow’s in-tool settings, activate enterprise licenses, promote other users to staff, and so on.
The baserow/backend:1.29.2 and baserow/web-frontend:1.29.2 images allow you to run Baserow’s various services as separate containers.
These images are used by the community Helm chart, our various docker-compose.yml example setups and are best for production environments where you want full control and flexibility managing Baserow.
Follow steps 1, 2 and 3 from the Option 1 guide above as we need the exact same S3 bucket, RDS and Redis setup first.
First you will need to make 3 separate target groups with a target type of IP:
- backend-asgi on port 8000/HTTP with a health check using URL /api/_health/ for our websocket service.
- backend-wsgi on port 8000/HTTP with a health check using URL /api/_health/ for our backend API service.
- web-frontend on port 3000/HTTP with a health check using URL /_health/ for our frontend service.
The trailing slash on the health check endpoints is required!
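Once the services are running, it can help to probe these health check endpoints yourself from a machine that can reach the tasks. The sketch below uses only the Python standard library. The function names and the example address are assumptions for illustration, while the paths are the ones listed above, trailing slash included.

```python
import urllib.request

# Health check paths for each target group, as listed above.
HEALTH_PATHS = {
    "backend-asgi": "/api/_health/",
    "backend-wsgi": "/api/_health/",
    "web-frontend": "/_health/",
}


def health_url(base_url, service):
    """Join a base URL such as http://10.0.0.5:8000 with the service's path."""
    return base_url.rstrip("/") + HEALTH_PATHS[service]


def is_healthy(base_url, service, timeout=5):
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url, service), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection errors, timeouts and non-2xx HTTP responses.
        return False


if __name__ == "__main__":
    # Hypothetical private task address, for illustration only.
    print(is_healthy("http://10.0.0.5:8000", "backend-wsgi"))
```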
Now make the ALB with an HTTP port 80 listener routing to the web-frontend group. Once it is made, go to this listener and configure it with three different rules routing to each of the separate groups:
- A default rule which forwards to the web-frontend group.
- A path rule for /ws/* which forwards to the backend-asgi group.
- A path rule for /api/* which forwards to the backend-wsgi group.
Later on the Baserow web-frontend service will need to be able to communicate with the backend-wsgi service through a load balancer. You can use this same load balancer to balance both external requests and these inter-service requests if you want. However, make sure you’ve appropriately set up the security groups to allow communication between the ECS tasks and the ALB.
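The effect of these listener rules can be pictured as a small lookup: a request path is checked against the path patterns in order and falls back to the default group when nothing matches. The sketch below only models that behaviour for illustration and is not how the ALB itself is configured. It uses Python’s fnmatch as a stand-in for the ALB’s wildcard matching, which is an approximation.

```python
from fnmatch import fnmatchcase

# (path pattern, target group) pairs, checked in order like rule priorities.
RULES = [
    ("/ws/*", "backend-asgi"),
    ("/api/*", "backend-wsgi"),
]
DEFAULT_TARGET_GROUP = "web-frontend"


def target_group_for(path):
    """Return the target group a request path would be forwarded to."""
    for pattern, group in RULES:
        if fnmatchcase(path, pattern):
            return group
    # No path rule matched, so the default rule applies.
    return DEFAULT_TARGET_GROUP


if __name__ == "__main__":
    for path in ("/ws/core/", "/api/_health/", "/dashboard"):
        print(path, "->", target_group_for(path))
```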
Let’s now deploy each of Baserow’s individual services to Fargate/ECS. Make a new cluster for Baserow and then proceed to make the following task definitions.
If you are familiar with K8S then this sample config gives an overview of the services.
Alternatively this docker-compose can also be used as a reference.
This service is our HTTP REST API service. When creating the task definition you should:
- Use the baserow/backend:1.29.2 image.
- Set gunicorn-wsgi,--timeout,60 as the Command.
We recommend setting the timeout of each HTTP API request to 60 seconds in the command above, as the default of 30 seconds can be too short for very large Baserow tables.
- Add a port mapping for 8000/TCP with the HTTP App protocol.
Env variable | Description |
---|---|
BASEROW_PUBLIC_URL | The public URL or IP that will be used to access Baserow in your user’s browser. It should always start with http:// or https://, even if accessing via an IP address. |
DATABASE_HOST | The hostname of the Postgres database Baserow will use to store its data in. |
DATABASE_USER | The username of the database user Baserow will use to connect to the database at DATABASE_HOST. |
DATABASE_PORT | The port Baserow will use when trying to connect to the Postgres database at DATABASE_HOST. |
DATABASE_NAME | The database name Baserow will use to store data. |
DATABASE_PASSWORD | The password of DATABASE_USER on the Postgres server at DATABASE_HOST. Alternatively, you can provide DATABASE_PASSWORD_FILE and set it to the file path of a secret injected into the container’s file system. |
REDIS_URL | A standard Redis connection string in the format of: redis://[redisuser]:[password]@[redishost]:[redisport]/0?ssl_cert_reqs=required (use the rediss:// scheme when TLS is enabled). |
AWS_STORAGE_BUCKET_NAME | Your AWS storage bucket name. |
AWS_ACCESS_KEY_ID | The access key for your S3 IAM AWS account. When set to anything other than empty, it switches Baserow to use an S3 compatible bucket for storing user file uploads. |
AWS_SECRET_ACCESS_KEY | The access secret key for your S3 IAM AWS account. AWS_SECRET_ACCESS_KEY_FILE can similarly be provided instead. |
BASEROW_EXTRA_ALLOWED_HOSTS | An optional comma-separated list of hostnames which will be added to Baserow’s Django backend ALLOWED_HOSTS setting. Add your ALB IP address here so the health checks it sends are allowed through, or alternatively configure the less secure value * to get things running and restrict hosts later once everything is working. |
BASEROW_JWT_SIGNING_KEY | Must be set so all the containers share the same signing key. The signing key is used to sign the content of generated tokens. For HMAC signing, this should be a random string with at least as many bits of data as is required by the signing protocol. See here for more details. BASEROW_JWT_SIGNING_KEY_FILE is also supported. |
SECRET_KEY | Must be set so all the containers share the same secret key. This is the secret key used by Django for cryptographic signing, such as generating secure password reset links and managing sessions. See here for more details. SECRET_KEY_FILE is also supported. |
EMAIL_SMTP_* | There are a number of SMTP related environment variables documented in our environment variable guide here which will also need to be set so Baserow can send invitation and password reset emails. |
It is possible to just use the ASGI service and not make a separate ASGI and WSGI service and then route all HTTP and Websocket requests to the single ASGI service. However, the ASGI service has degraded performance handling normal HTTP requests compared to the service in WSGI mode above. Also being able to scale them separately is nice as often only a few ASGI services are needed to handle the websocket load.
This service is our WebSocket API service. When configuring the task definition you should:
- Use the baserow/backend:1.29.2 image.
- Set gunicorn as the Command.
- Add a port mapping for 8000/TCP.
This service is our asynchronous high priority task worker queue used for realtime collaboration and sending emails.
- Use the baserow/backend:1.29.2 image with celery-worker as the image command.
- Set celery-worker as the Command.
This service is our asynchronous slow/low priority task worker queue for batch processes and for running potentially slow operations for users, such as table exports and imports.
- Use the baserow/backend:1.29.2 image.
- Set celery-exportworker as the Command.
This service is our CRON task scheduler that can have multiple replicas deployed.
- Use the baserow/backend:1.29.2 image.
- Set celery-beat as the Command.
Finally, this service is used for server side rendering and serving the frontend of Baserow.
- Use the baserow/web-frontend:1.29.2 image with no arguments needed.
- Add a port mapping for port 3000.
- BASEROW_PUBLIC_URL: The public URL or IP that will be used to access Baserow in your user’s browser. It should always start with http:// or https://, even if accessing via an IP address.
- PRIVATE_BACKEND_URL: The web-frontend containers need to be able to send HTTP requests to the backend-wsgi containers running the REST API. Unless you have set BASEROW_EXTRA_ALLOWED_HOSTS=* on the backend-wsgi and backend-asgi containers, make sure to add the value you’ve set for this env var to BASEROW_EXTRA_ALLOWED_HOSTS for those services, otherwise they will not allow connections from the web-frontend.
- DOWNLOAD_FILE_VIA_XHR: Must currently be set to 1 to work with AWS S3. It forces download links to download files via an XHR query to bypass Content-Disposition: inline. If your files are stored under another origin, you also must add CORS headers to your S3 bucket.
Now make sure to go back and create the ECS services for the task definitions you just made. Remember to set 900 second grace periods in the health check when connecting things up.
Alternatively you can make a single ECS service for all of Baserow’s tasks, however you will then need to use the API/CLI to connect multiple target groups to this single ECS service as this is not possible via the AWS UI currently. You might also need to ensure that backend-asgi and backend-wsgi are exposed on different ports for the target groups to route properly to each individually. Set the BASEROW_BACKEND_PORT env var on both of these services to different values to get them binding to different ports inside the container.
- backend-wsgi service connected to the backend-wsgi target group
- backend-asgi service connected to the backend-asgi target group
- web-frontend service connected to the web-frontend target group
- celery-worker service
- celery-exportworker service
- celery-beat service
Most of the time scaling up your backend-wsgi tasks and RDS Postgres will be your first port of call for handling more requests. If your realtime collaboration is slowing down you can scale up the backend-asgi and celery-worker services. Finally, if you are having to wait a long time for jobs to start (they will have progress bars in the Baserow UI stuck at 0%) then you can add more celery-exportworker services.
There are also the following env vars that can change the number of worker processes launched inside each container to vertically scale each one:
- BASEROW_AMOUNT_OF_GUNICORN_WORKERS controls the number of REST API workers (the things that do most of the API work) per gunicorn-wsgi or gunicorn container. Defaults to 3. Each extra worker generally takes up around 100-200 MB of RAM.
- BASEROW_AMOUNT_OF_WORKERS controls the number of background task celery runners in the celery-worker and celery-exportworker containers.
You should now have a fully running Baserow cluster. This deployment method is more complex to get working, so if you need any help please post in our community forums.
The first user to sign up becomes the first “staff” instance-wide admin user. This user can then configure Baserow’s in-tool settings, activate enterprise licenses, promote other users to staff, and so on.
Upgrading an ECS/Fargate deployment of Baserow can be done by:
- The first baserow/baserow or baserow/backend-wsgi/asgi container to start up will automatically apply any required database migrations and upgrades.
CROSSSLOT Keys in request don't hash to the same slot error
If you see this error in your logs it means you have launched Baserow with a Redis which is in cluster mode. Baserow uses libraries that do not support Redis in cluster mode, so you will need to provision a new Redis in non-cluster mode for things to work. Non-cluster mode Redis can still be scaled and made multi-zone. Additionally, Baserow does not generally end up with Redis as the bottleneck for requests.
The first-time migration after a first time deploy or upgrade might take some time so you might want to try increasing the grace period.
Secondly, make sure the ELB can connect to the container. Please note that the container has its own internal health check script which also calls the health check endpoint, so the presence of logs showing health check 200 responses doesn’t mean your ELB is the one triggering them.
CORS errors when downloading a file from a Baserow file field
Your CORS was not set up properly on the S3 bucket. Please see the example CORS config above or contact us for support.
Secure Redis scheme specified (rediss) with no ssl options, defaulting to insecure SSL behavior warnings
Make sure you have added ?ssl_cert_reqs=required onto the end of your REDIS_URL env var.
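A quick way to catch this before deploying is to check the REDIS_URL value for the option. The sketch below is an illustrative pre-flight check using only the Python standard library. The function name and the wording of its messages are assumptions, not something Baserow provides.

```python
from urllib.parse import parse_qs, urlsplit


def check_redis_url(url):
    """Return a list of problems found in a REDIS_URL value (empty if none)."""
    problems = []
    parts = urlsplit(url)
    if parts.scheme not in ("redis", "rediss"):
        problems.append(f"unexpected scheme: {parts.scheme!r}")
    if parts.scheme == "rediss":
        # A TLS URL without ssl_cert_reqs is what triggers the warning above.
        options = parse_qs(parts.query)
        if "ssl_cert_reqs" not in options:
            problems.append("rediss:// URL is missing ?ssl_cert_reqs=required")
    return problems


if __name__ == "__main__":
    # Placeholder credentials and host, for illustration only.
    print(check_redis_url("rediss://default:password@redis.example.internal:6379/0"))
```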