# Architecture In the end, web applications all have the same basic needs. Of course, it needs to be adapted for the needs of the application, however it is assumed as a common base for all the following infrastructure. ![Model W Infrastructure Schema](_static/infrastructure.png) We'll explain the rationale behind main choices here. ## Framework stack Since the requirements explained here stem from the choice of frameworks, we'll dive first into those. ### Back-end — Django At the core of the core lies Django. It's a fairly old, slow and un-sexy framework in today's world full of revolutionary async noSQL blazing-fast world-class industry-standard frameworks. How did we land there? There are the 3 main features of Django (for our use case): 1. The ORM 2. The ORM 3. The ORM We'll cover the choice of SQL and PostgreSQL later on, but the fact that Django has such a powerful ORM enables several productivity boosts: - The ORM has a Git-friendly migrations system - Most of the time, this allows for zero-downtime deployments (worst-case scenario, some requests fail, but if the app is well done the consequences are very small) - The migrations can be merged between branches, allowing for teamwork and complex release cycles that match real-life requirements - It allows easy introspection of models - The `admin` app can auto-generate full CRUD admin interfaces within minutes of work, which saved our ass so many times - The Django REST Framework can lay the base of the API by simply looking at models (then it's up to you to add the specific behaviors you need, but a _lot_ of work is done automatically) Any alternative framework should at least offer the same guarantees. So far, none really exist as such on the market and Django is entirely satisfying, so we stick with it. Then it has many perks about security, abstractions for common services (storage, emails, etc), internationalisation and localization, etc. Those are important albeit secondary. ### Front-end — Nuxt The underlying choice is: which JS framework to use? It wasn't always obvious, but some of the key names are React, Vue, Angular and Svelte. Out of those, we want to eliminate those that are not actually reactive (React) and those that are made for "enterprise" development (Angular). This leaves us with Vue and Svelte. Out of those two, we need an SSR framework that will handle under the hood many things that developers don't even realize anymore: data store, metadata of the page, HTTP requests, etc. For these reasons, we have chosen to use Vue and Nuxt ## Storage Django and Nuxt both depend on external services for persistence. Those services must come from the cloud provider, as they are a massive pain in the ass to manage yourself. Commonly considered providers in our case include DigitalOcean, AWS, Scaleway and OVH. ### Structured data — PostgreSQL As explained before, we store most of structured data in a PostgreSQL database. This is because: - It is extremely viable: - It has a great output performance-wise. Most of the time when a NoSQL solution claims to be faster it is because it forgoes "slow" features of SQL like joints, which are absolutely essential for most applications. In other words, SQL storage allows you to do the same as NoSQL but comes with a lot more features. - Clustering and scaling and so forth is pretty hard to do yourself, but why would we care since it's the job of the cloud provider. - If a product is so successful that it brings a PostgreSQL instances to its knees, we'll have plenty of resources to optimize SQL queries, shard the data and so forth. - If you need to do some NoSQL things like storing JSON, PostgreSQL has very powerful JSON management, allows indexing and querying of JSON and the query language is Turing-complete with recursive queries. You can query _a lot_ of things. - And finally, the PostGIS extension for geographical data is available in all providers and is extremely powerful in managing geo queries. - It has fantastic support in the Django ORM This overall makes it the best choice for structured data storage. ### Cache — Redis Redis is very fast, very easy to manage and available in all cloud providers. That's also a solution that is natively supported by Django, which makes it a good choice for our stack. ### Queue — Also Redis Natively Celery would rather use RabbitMQ, which is also a trustable piece of software. However, the availability of RabbitMQ in various cloud providers is quite reduced. On the other hand, Redis provides a good option and works perfectly well. So that's what we'll go with. ### User-generated content — S3-like Everything that is user-generated (images mostly, but also PDF documents, videos and so forth) should be stored in a S3-like bucket. It's easy to use, there are the right abstractions in Django, and it's available everywhere. Given also that the original S3 remains unequaled in feature-completeness, it is a service easily bought from AWS even if the rest of the infrastructure is on a different provider. ## External communications The stack above will require to be able to send emails and/or SMS to clients. In order to do so, the abstraction to be used is the [Wailer](https://github.com/withagency/wailer) system, which provides a higher-level abstraction than the Django email backend (or the `django-sms` package, which it uses internally). The goal of Wailer is to enforce patterns and best practices for sending emails which are convenient for the developers and avoid the common bugs by asking the right questions. It still needs a backing service to send its messages. The recommended one for both emails and SMSes is Mailjet. Its configuration is explained in the [Django preset](https://modelw-django-preset.readthedocs.io/en/latest/env-vars.html#external-communications) documentation. ## PaaS platform It's expected that the app code will run in a PaaS platform, or at least some kind of orchestrator like Kubernetes. There should be 3 basic services (conceptually, in practice there are more): - Nuxt for the front-end - Django (through Uvicorn) for the back-end - Celery for the Django tasks Those 3 parts are expected to be distributed as some form of images that respect the 12 factors and can be spanned as needed. The PaaS platform should be provided by the cloud provider. To be honest, the target here is mainly the PaaS from DigitalOcean, which is by far the simplest one to manipulate. ## HTTP Routing HTTP routing can be a complex topic, however our recommendation goes as follows. There shall be two "endpoints": - A load balancer for all the processes of the PaaS - A CDN for the UGC and other content from the bucket Both should be on separate domains for security and simplicity reasons. The source needs to be found back, but basically it was advice from a Google Engineering blog post. On top of that, the requirements are quite different. The load-balancer's job will be to direct HTTP requests either to the back-end or the front-end. The easiest way to do this is to decide that everything that is below the `/back` path goes to the back-end while the rest goes to the front-end. Then both receive unmodified queries and can reply to the user while keeping it transparent for the browser. This is especially important for the cookies, which would be hard to manage if requests weren't made to the same domain for front and back. This way when you request: - `https://my-app.com/yolo` you get the front-end page "Yolo" - `https://my-app.com/back/api/thing/42/` you get the "thing" with ID 42 from API - `https://my-app.com/back/api/login/` you login against the back-end and this cookie will be forwarded to any page within this domain - Meaning that if you go back to `https://my-app.com/yolo` and you render it server-side, it will still receive the session cookie, allowing to forward it to back-end requests from the frontend (which Nuxt/Axios does automatically) Then for user-generated content, which is inherently static, it makes more sense to have a CDN in front of the bucket that will cache and serve this content on edge. ## Repo structure The rule of thumb is that all which is shipped together should sit in the same repo. Otherwise, you just end up doing the same conceptual commit in all the different components of the app. Let's just acknowledge that we work with monoliths! At the root of the repo should be found the different components: - `api` — The Django backend - `front` — The Nuxt front-end - `bot` — The BERNARD project (if any) - Any other component that you need ### API The API should follow the [API template](https://github.com/ModelW/project-maker/tree/develop/src/model_w/project_maker/template/api). Amongst many things: - Configuration comes 100% from environment variables - The project depends on the `modelw-preset-django` package at the current Model W version See [the Transform an Existing Project](./transform-project.md) section for more details. ### Front Same as the API, it should follow the [front template](https://github.com/ModelW/project-maker/tree/develop/src/model_w/project_maker/template/front). ## Extra While the stack is explained above, some peripheral tools deserve to be mentioned below. ### Sentry Sentry is a powerful exception reporting tool. It will let you aggregate: - Different exceptions with a lot of details as to which user had it, on which environment, how many times, etc - Detailed performance metrics including the list of SQL queries and their specific run time Overall, that's the first tool to look into for insights about any project.