15 sessions - 2h
What you will be able to do:
=> which in turn should bring good health, fortune and happiness
Advanced Database Design: understanding of normalization and denormalization concepts to optimize database structure
Views and Stored Functions:
Transactions and Concurrency Control:
Database Security: Introduce database security concepts,
Maintenance and Monitoring:
Triggers and procedures
CTEs, Window functions
Real datasets!
Just a few questions to know more about you
[forms.gle/VrpvjaPGEKCUji4Q9](https://forms.gle/VrpvjaPGEKCUji4Q9)
Goals
Postgres tutorials
The best resource is the excellent documentation
organize complex structured data
Discussion : Why use a database instead of just … files like json, or excel / csv ?
Sometimes we hear the word database used to describe an Excel file.
So I asked my friend GPT-4 to give me a definition of a database:
In simple terms: "A database is like a smart notebook or filing system that helps you keep track of lots of information and find exactly what you need in no time."
which definitely includes CSV files, Excel files, JSON files, XML files, and many other simple file-based formats.
From the Encyclopedia Britannica, we get:
database, any collection of data or information specifically organized for rapid search by computer. Databases are structured to facilitate storage, retrieval, modification, and deletion of data.
See also the Database article on Wikipedia.
a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and analyze the data.
No longer just talking about quickly finding information (the search part) but also about:
This is where a simple spreadsheet file no longer meets the objective.
A database management system (DBMS) is distinguished from a simple spreadsheet by several essential functionalities.
| Feature | Description | Excel | DBMS |
|---|---|---|---|
| Data Storage and Retrieval | Stores data in an organized manner and retrieves it as needed. | ✅ | ✅ |
| Data Manipulation | Allows adding, modifying, or deleting data. | ✅ | ✅ |
| Data Querying | Allows asking complex questions (queries) about the data. | ✔️ | ✅ |
| Data Organization | Structures data in formats such as tables, documents, or graphs to facilitate management. | ✅ | ✅ |
| Data Sharing | Allows multiple users or applications to use the database simultaneously. | ✅ | ✅ |
| Data Security | Protects data against unauthorized access or corruption. | ✔️ | ✅ |
| Concurrency Control | Manages multiple users modifying data at the same time without conflicts. | ✅ | |
| Backup and Recovery | Ensures that data is not lost and can be restored in case of failure. | ✔️ | ✅ |
| Data Integrity | Ensures that data remains accurate, consistent, and reliable. | ✅ | |
| Performance Optimization | Provides tools to optimize the speed and efficiency of data retrievals and updates. | ✅ | |
| Support for Transactions | Ensures that a group of operations (transactions) is completed entirely or not at all. ACID compliance | ✅ |
1991: Object-Oriented databases gain attention.
Most OODBs from the 90s are no longer used. But they influenced the evolution of both SQL and NoSQL databases.
1995: MySQL is released as open source
And in 2009, 2 new NoSQL databases are launched:
Mongodb is web scale
That was 15 years ago back in Aug 27, 2010.
The rise of the world wide web (myspace - 2003 😍, youtube - 2005) and the massive increase in application scale by several orders of magnitude.
Suddenly, we have millions of people simultaneously trying to access and modify Terabytes of data in milliseconds.
Relational databases cannot keep up with the scale of applications, the chaotic nature of unstructured data, and the speed requirements.
The promise of NoSQL is volume and speed.
and meanwhile, in 2013, Docker containers revolutionize database deployment
Vector Databases 🌶️🌶️🌶️ (e.g., Pinecone, Weaviate, Qdrant, Milvus, Faiss, ...):
and also:
Vector search is booming. Vector search capabilities are being integrated into most existing DBMS, including PostgreSQL, MongoDB, and Neo4j.
In Brief
Relational databases
There are other types of databases than relational databases :
Vector database for LLMs
Document – JSON-like docs (e.g. MongoDB, Couchbase)
Key-Value – Fast lookup via key (e.g. Redis, DynamoDB)
Wide-Column – Columns grouped in families (e.g. Cassandra, HBase)
Graph – Nodes + edges (e.g. Neo4j, ArangoDB)
Time-Series – Optimized for time data (e.g. InfluxDB, TimescaleDB)
Multi-model – Mix of types (e.g. ArangoDB, OrientDB)
Vector: for LLMs, text transformed as a vector, optimized to retrieve similar vectors (Weaviate, Pinecone, Milvus, ..)
We have many databases to choose from. It all depends on scale, application nature, budget, etc.
| Database Type | Purpose | Examples | Application |
|---|---|---|---|
| Relational - SQL | Fixed schema | PostgreSQL, MySQL, Oracle | Transactions, normalization |
| Document Stores | Flexible schema, JSON-like documents | MongoDB, CouchDB | Web applications, content management |
| Graph Databases | Relationship-centered data | Neo4j, ArangoDB | Social networks, recommendation engines |
| Key-Value Stores | Simple and fast lookups | Redis, DynamoDB | Caching, session management |
| Vector Databases | Similarity search, AI embeddings | Pinecone, Weaviate | AI applications, semantic search |
| Column-Family Stores | Wide-column data, high scalability | Cassandra, HBase | Time-series, big data applications |
| Time-Series Databases | Time-ordered data | InfluxDB, TimescaleDB | IoT, monitoring systems |
Check the ranking of all databases at https://db-engines.com/en/ranking
Trends: https://db-engines.com/en/ranking_trend
source: https://www.generativevalue.com/p/a-primer-on-databases
Also see this interactive map that lists all players in 2023.

Advantages
✅ Free to use
✅ Transparent (view + audit code)
✅ Customizable
✅ Strong community support
✅ No vendor lock-in
Disadvantages
❌ May need more setup & tuning
❌ Support = community or paid 3rd party
❌ Fewer enterprise features (sometimes)
Advantages
✅ Enterprise-grade support
✅ Robust tooling & integrations
✅ Optimized performance at scale
✅ Security/compliance features
Disadvantages
❌ Expensive licensing
❌ Closed codebase
❌ Vendor lock-in risk
❌ Limited customization
In terms of relational databases compare
https://opensource.com/article/19/1/open-source-databases
PostgreSQL's most mentioned advantage is the efficiency of its central algorithm, which means it outperforms many databases that are advertised as more advanced.
One of the most flexible open source databases around;
You can write functions in a wide range of server-side languages: Python, Perl, Java, Ruby, C, and R.
SQLite is arguably the most implemented database engine in the world,
unlike many other databases it is not a client-server engine; rather, the full software is embedded into each implementation
to build and implement a small database, SQLite is arguably the best way to go. It is extremely small,
also very Fast
Extensions : add-ons that enhance the functionality of a PostgreSQL database.
The Guardian: from MongoDB to postgresQL
Take a few minutes to read the article
https://www.theguardian.com/info/2018/nov/30/bye-bye-mongo-hello-postgres
from 2018, so no hasty conclusion. the article underlines the difficulty of choosing the right database. MongoDB has evolved a lot since then.
Reasons for Moving from MongoDB to Migration Process: PostgreSQL:
Operational Challenges: The Guardian faced significant issues with MongoDB's OpsManager, including time-consuming upgrades, lack of effective support during outages, and the need for extensive custom scripting and management.
Cost and Efficiency: The high cost of MongoDB's support contract combined with the ongoing operational burden led them to seek a more manageable and cost-effective solution.
Feature Limitations: Alternatives like DynamoDB were considered but lacked essential features like encryption at rest, which Postgres on AWS RDS provided.
Parallel APIs: They created a new API using PostgreSQL and ran it in parallel with the old MongoDB API to ensure a smooth transition.
Data Migration: Content was migrated using a script that compared and validated data between the two databases.
Proxy Usage: A proxy was employed to replicate traffic to both databases, ensuring consistency and allowing for real-time testing.
Gradual Switchover: The team gradually shifted traffic to the new Postgres API, eventually decommissioning MongoDB without causing downtime.