
How Baserow lets users generate Django models on the fly


Here at Baserow we’re using the Django ORM in a unique way: we generate dynamic models and use them to mutate user data. With Baserow, non-technical users can create their own no-code database. Think of it as a hybrid relational database with a slick UI. To do so safely, we try to use as much of the Django ORM as possible. In this blog post, we’ll explore how we’ve pushed the ORM to its limits whilst building Baserow’s key backend features.


Django Models

Models are one of Django’s most powerful features: they let you represent your database schema in Python and also create and migrate that schema from the models themselves. If, for example, you want to store projects in a SQL table, your model would look like this:

class Project(models.Model):
    name = models.CharField(max_length=255)

To create the table in the database, you first need to generate and execute the migrations using the makemigrations and migrate management commands. The first command detects the changes in your models and generates a Python file describing them; the second command executes that file against the database.

Baserow tables

We use the same approach for our tables and schema changes. With Baserow, however, users can create their own relational database without any knowledge of Python, Django or PostgreSQL. To avoid confusion: a PostgreSQL table is a table created in the PostgreSQL database, and a Baserow table is one created in Baserow by a user via the web frontend or the REST API. Every Baserow table is backed by a real PostgreSQL table in the database.

Dynamically generating models

How do we build a Django app that lets non-technical users essentially create and migrate their own models? The first step was realizing that Django models could also be generated on the fly using the Python type function and that we could use the schema editor to make schema changes just like the migrations. Generating the project model looks like this:

from django.db import models

Project = type(
    "Project",
    (models.Model,),
    {
        "name": models.CharField(max_length=255),
        "Meta": type(
            "Meta",
            (),
            {"app_label": "test"}
        ),
        "__module__": "database.models"
    }
)
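For readers who have not used the three-argument form of type before, here is a minimal plain-Python sketch (no Django involved) showing that it builds the same kind of class as a class statement. The Greeter names are made up for illustration only.

```python
# A class written with the usual class statement.
class StaticGreeter:
    greeting = "hello"

    def greet(self, name):
        return f"{self.greeting}, {name}"


def greet(self, name):
    return f"{self.greeting}, {name}"


# The same class built at runtime: type(name, bases, attributes).
DynamicGreeter = type(
    "DynamicGreeter",
    (object,),
    {"greeting": "hello", "greet": greet},
)

static_result = StaticGreeter().greet("Baserow")
dynamic_result = DynamicGreeter().greet("Baserow")
```

Both calls return the same string. The Django version above works the same way, except that the base class is models.Model and the attributes are model fields.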

It can happen that you need to generate the same model, or a model with the same name, a second time. Django then complains that the model is already registered: it emits a RuntimeWarning when the name and module match, and raises a RuntimeError when the definitions conflict.

At Baserow we regenerate the model every time we need it, for example when a row is updated or requested. We do this because the table schema might have changed and because one Baserow instance could have millions of tables. Registering all of them could quickly result in running out of memory.

Registration can be prevented by extending the AppConfig. We identify generated models by adding a _generated_table_model property to them.

# apps.py
from django.apps import AppConfig


class DatabaseConfig(AppConfig):
    name = "database"

    def ready(self):
        original_register_model = self.apps.register_model

        def register_model(app_label, model):
            if not hasattr(model, "_generated_table_model"):
                original_register_model(app_label, model)
            else:
                self.apps.do_pending_operations(model)
                self.apps.clear_cache()

        self.apps.register_model = register_model
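The wrapping pattern used in ready() can be tried out without Django. The sketch below is an illustration only: Registry is a hypothetical stand-in for Django's app registry, and its duplicate check is simplified to always raise an error.

```python
class Registry:
    """Hypothetical stand-in for Django's app registry."""

    def __init__(self):
        self.models = {}

    def register_model(self, name, model):
        if name in self.models:
            raise RuntimeError(f"{name} is already registered")
        self.models[name] = model


registry = Registry()
original_register_model = registry.register_model


def register_model(name, model):
    # Only classes without the marker attribute reach the real registry.
    if not hasattr(model, "_generated_table_model"):
        original_register_model(name, model)


# Replace the bound method on this instance with the filtering wrapper.
registry.register_model = register_model

Static = type("Static", (), {})
registry.register_model("static", Static)

# Generated classes can be rebuilt under the same name without an error.
for _ in range(3):
    Generated = type("Table1", (), {"_generated_table_model": True})
    registry.register_model("table1", Generated)

registered_names = sorted(registry.models)
```

Only the hand-written class ends up in the registry, so the memory used by the registry does not grow with the number of generated classes.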

Making schema changes

After generating the model, you can’t create new records yet because the table has not been created in the PostgreSQL database. Normally, the schema change is made by executing the migration with the migrate management command. When you apply a migration file in Django, it uses the schema_editor under the hood to make the change. The schema editor can also be used with generated models. If, for example, you want to create the project table and then add a record, you could do this:

from django.db import connection, models

# The model as described in the previous example.
Project = type(...)

with connection.schema_editor() as schema_editor:
    schema_editor.create_model(Project)

Project.objects.create(name="Baserow")

The schema editor offers everything you need to make schema changes, from add_field to delete_model.

How Baserow generates models

Baserow works similarly to the approach described above. The only difference is that we generate our models dynamically from metadata tables which describe what the user table looks like. Two of these key metadata tables are the Table table, which has a row per user table, and the Field table, which has a row per user-created field in a table:

Table

id  name
1   Project

Field

id  table_id  name         type
1   1         name         text
2   1         description  text

If we want to generate the Project model, we have to query the table and field metadata tables first and then use that data.

from django.db import models


class Table(models.Model):
    name = models.CharField(max_length=255)


class Field(models.Model):
    table = models.ForeignKey(Table, on_delete=models.CASCADE)
    name = models.CharField(max_length=255)
    type = models.CharField(max_length=32)


# Load the metadata that describes the user's table.
table = Table.objects.get(pk=1)
fields = Field.objects.filter(table=table)
attrs = {
    "Meta": type(
        "Meta",
        (),
        {"app_label": "test"}
    ),
    "__module__": "database.models"
}

for field in fields:
    attrs[field.name] = models.CharField(max_length=255)

GeneratedModel = type(
    f"Table{table.id}",
    (models.Model,),
    attrs
)

# Assuming the PostgreSQL table has already been created using the schema editor.
GeneratedModel.objects.all()
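The loop above turns every field into a CharField and ignores the type column. One possible way to let the stored type pick the attribute is a lookup table, sketched here in plain Python so it runs without a database. FieldRow, FIELD_TYPE_DEFAULTS and build_class are hypothetical names, and plain default values stand in for Django model fields; this is not how Baserow itself implements field types.

```python
from dataclasses import dataclass


@dataclass
class FieldRow:
    """Hypothetical stand-in for one row of the Field metadata table."""
    name: str
    type: str


# Maps the stored type name to a factory for the class attribute. In Django the
# values would create model fields; here they create plain default values.
FIELD_TYPE_DEFAULTS = {
    "text": lambda: "",
    "number": lambda: 0,
    "boolean": lambda: False,
}


def build_class(table_id, field_rows):
    attrs = {}
    for row in field_rows:
        try:
            factory = FIELD_TYPE_DEFAULTS[row.type]
        except KeyError:
            raise ValueError(f"Unsupported field type: {row.type}") from None
        attrs[row.name] = factory()
    # Same naming scheme as the generated model above.
    return type(f"Table{table_id}", (), attrs)


rows = [FieldRow("name", "text"), FieldRow("done", "boolean")]
GeneratedClass = build_class(1, rows)
```

Rejecting unknown type names explicitly keeps a typo in the metadata from silently producing a class with a missing attribute.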

Of course, we do a whole bunch of other things in Baserow, like supporting different field types. We have wrapped up all the model generation code into the single get_model method shown below. For example, if you want to create a new row in a Baserow table you can do it in four lines of code:

from baserow.contrib.database.table.models import Table

table = Table.objects.get(pk=YOUR_TABLE_ID)
model = table.get_model()
model.objects.create()

If you create a Baserow plugin, you can easily fetch data from your Baserow table this way.

By using the Django ORM to manage our Baserow tables, we avoid security mistakes like SQL injection, write clean and easy-to-understand Django ORM code when working with user data, and provide a great API for Baserow plugins.

Caching models

Some Baserow tables consist of 100+ fields and the model needs to be generated frequently. Fetching 100 fields and generating the corresponding model can take a significant amount of time, especially if you have to do it on every single request. To improve performance, we cache the models by storing the field attributes in a Redis cache. Simplified, it looks a bit like this:

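from django.core.cache import cache

# `table` and `attrs` (with Meta and __module__) come from the previous example.
# On a cache miss: build the field attributes from the metadata and store them.
fields = Field.objects.filter(table=table)
for field in fields:
    attrs[field.name] = models.CharField(max_length=255)

cache.set(f"table_model_cache_{table.id}", attrs, timeout=None)

# On later requests: read the attributes back instead of querying the fields.
attrs = cache.get(f"table_model_cache_{table.id}")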
GeneratedModel = type(
    f"Table{table.id}",
    (models.Model,),
    attrs
)
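The get-or-build flow behind this can be sketched without Redis or Django. In the illustration below, a plain dict stands in for the cache, load_field_names is a hypothetical stand-in for the metadata query, and the invalidate helper is an assumption about how stale entries could be dropped after a schema change, not a description of Baserow's actual mechanism.

```python
# A plain dict standing in for the Redis cache (assumption for this sketch).
fake_cache = {}
query_log = []


def load_field_names(table_id):
    """Hypothetical stand-in for the metadata query; records each call."""
    query_log.append(table_id)
    return ["name", "description"]


def get_attrs(table_id):
    key = f"table_model_cache_{table_id}"
    attrs = fake_cache.get(key)
    if attrs is None:
        # Cache miss: do the expensive work once and store the result.
        attrs = {field_name: "" for field_name in load_field_names(table_id)}
        fake_cache[key] = attrs
    return attrs


def invalidate(table_id):
    # Call this whenever the table's fields change.
    fake_cache.pop(f"table_model_cache_{table_id}", None)


first = get_attrs(1)
second = get_attrs(1)  # served from the cache, no metadata query
invalidate(1)
third = get_attrs(1)   # rebuilt after invalidation
```

The metadata query runs only on the first call and again after the invalidation, which is where the time saving for wide tables comes from.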

Sponsoring Django to support open-source projects

Thanks to Django, we at Baserow save tons of time: we build features faster, write cleaner code, and avoid many common security mistakes, to name just a few benefits. Now it’s our turn to give back. We decided to become a sponsor, and Baserow is now an official Corporate Member of the Django Software Foundation! This is the least we can do to support this important open-source project.

Dive deeper into how we use Django at Baserow

All the use cases described above are very simplified. If you want to learn more about how we generate models, apply schema migrations and cache models, click here.

One last thing: Baserow is also an open-source project, so everyone interested is welcome to contribute, whether through code, documentation or bug reports!

Visit the Baserow website: https://baserow.io/

Check out the Baserow repo: https://gitlab.com/bramw/baserow
