A view in PostgreSQL is a virtual table that is created by a SELECT query.
It abstracts or simplifies complex queries.
It acts as a stored query that you can treat like a regular table.
Let's consider the normalized version of the trees table.

To see all the trees given a certain name, genre, species or variety you have to write a query with several joins
SELECT t.idbase, t.circumference, t.height, t.diameter, t.remarkable,
tn.name, tg.genre, ts.species, tv.variety
FROM trees t
JOIN taxonomy tax ON t.taxonomy_id = tax.id
JOIN tree_names tn ON tax.name_id = tn.id
JOIN tree_genres tg ON tax.genre_id = tg.id
JOIN tree_species ts ON tax.species_id = ts.id
JOIN tree_varieties tv ON tax.variety_id = tv.id
WHERE tv.variety = 'October Glory';
If you have to write this query a lot, it makes sense to create a view with
CREATE VIEW taxonomy_view AS
SELECT
t.idbase,
t.id_location_legacy,
t.circumference,
t.height,
t.diameter,
t.remarkable,
t.domain_id,
t.stage_id,
t.location_id,
tn.name AS tree_name,
tg.genre,
ts.species,
tv.variety
FROM
trees t
JOIN taxonomy tax ON t.taxonomy_id = tax.id
JOIN tree_names tn ON tax.name_id = tn.id
JOIN tree_genres tg ON tax.genre_id = tg.id
JOIN tree_species ts ON tax.species_id = ts.id
JOIN tree_varieties tv ON tax.variety_id = tv.id;
Then you only have to query the taxonomy_view view to get your trees
select * from taxonomy_view where variety = 'October Glory';
To modify a view you can either drop and recreate the view, or simply use the same CREATE OR REPlACE statement as it handles both creation and modification in one statement.
Let's say for instance that we want to add some conditional logic to the view so that we only get the trees in one of Paris Arrondissement and not outside of Paris.
We just rewrite the view:
CREATE OR REPLACE VIEW taxonomy_view AS
SELECT
t.idbase,
t.id_location_legacy,
t.circumference,
t.height,
t.diameter,
t.remarkable,
t.domain_id,
t.stage_id,
t.location_id,
tn.name AS tree_name,
tg.genre,
ts.species,
tv.variety,
loc.arrondissement
FROM
trees t
JOIN taxonomy tax ON t.taxonomy_id = tax.id
JOIN tree_names tn ON tax.name_id = tn.id
JOIN tree_genres tg ON tax.genre_id = tg.id
JOIN tree_species ts ON tax.species_id = ts.id
JOIN tree_varieties tv ON tax.variety_id = tv.id
-- add condition on locations
JOIN locations loc on loc.id = t.location_id
where loc.arrondissement like 'PARIS%';
You cannot just add or remove columns from a view using the REPLACE statement.
You have to drop the view and recreate it from scratch.
drop view taxonomy_view;
And then redefine the
CREATE OR REPLACE VIEW taxonomy_view AS
SELECT
tn.name AS tree_name,
tg.genre,
ts.species,
tv.variety,
loc.arrondissement,
t.height,
t.diameter,
t.remarkable
FROM
trees t
JOIN taxonomy tax ON t.taxonomy_id = tax.id
JOIN tree_names tn ON tax.name_id = tn.id
JOIN tree_genres tg ON tax.genre_id = tg.id
JOIN tree_species ts ON tax.species_id = ts.id
JOIN tree_varieties tv ON tax.variety_id = tv.id
-- add condition on locations
JOIN locations loc on loc.id = t.location_id
where loc.arrondissement like 'PARIS%';
As you would with a normal stored table, you can use the view in a query.
For instance let's get all the Erables (Maple trees) in Paris :
select * from taxonomy_view where tree_name = 'Erable';
A view does not store data itself.
When you query a view, PostgreSQL executes the underlying SELECT statement and returns the result set as if it were a table.
Remember : we saw that Everything is a relation
A strong use case for VIEWS is to weed out all the noise and anomalies from a messy dataset and working on good quality records only. For instance, we could add conditions in the VIEW on the presence of values for dimensions, stage, domains etc ... (NOT NULL), so that a dashboard only reflects the high quality data.
Views can restrict access to specific data. For instance, you can create a view that only exposes certain columns of a table, thereby limiting what data users can see.
Views can be updated or dropped without affecting the underlying tables. However, changing the structure of underlying tables might require updating the associated views.
Since views are not stored with the data but are generated on the fly, complex views with multiple joins or subqueries can lead to performance issues, especially with large datasets.
In a data processing pipeline (ETL), where should data transformation happen ?
Connect on your local to the normalized version of the treesdb (treesdb_v03).
PostgreSQL also supports materialized views, which store the result of the SELECT query physically on disk.
This is useful
materialized views are worth considering when there are no other better solutions.
see https://www.postgresql.org/docs/current/rules-materializedviews.html and https://www.postgresqltutorial.com/postgresql-views/postgresql-materialized-views/
To create a materialized view in PostgreSQL, you use the CREATE MATERIALIZED VIEW statement, which is similar to creating a regular view but with the MATERIALIZED keyword added.
CREATE MATERIALIZED VIEW view_name AS
SELECT columns
FROM table_name
WHERE conditions;
To load data into a materialized view, you use the REFRESH MATERIALIZED VIEW statement:
REFRESH MATERIALIZED VIEW view_name;
However, when you refresh data for a materialized view, PostgreSQL locks the underlying tables. Consequently, you will not be able to retrieve data from underlying tables while data is loading into the view.
To avoid this, you can use the CONCURRENTLY option.
REFRESH MATERIALIZED VIEW CONCURRENTLY view_name;
With the CONCURRENTLY option, PostgreSQL creates a temporary updated version of the materialized view, compares two versions, and performs INSERT and UPDATE only the differences.
To make the process automatic, we can
REFRESH MATERIALIZED VIEW view_name; queryMaterialized views are a powerful feature in PostgreSQL, providing a way to store the results of a query physically on disk, which can significantly improve performance for certain types of queries. However, they also come with trade-offs.
Here's an overview of the pros and cons:
| Pros | Cons |
|---|---|
| Performance Improvement - Faster query execution - Reduced computation time | Data Staleness - Potential for outdated information |
| Data Aggregation and Summarization - Efficient pre-aggregated data storage | Maintenance Overhead - Requires explicit, potentially resource-intensive refreshing |
| Offloading Complex Queries - Handles complex calculations, joins, and subqueries | Storage Space - Consumes additional disk space |
| Data Snapshot - Provides stable dataset for historical analysis | Data Consistency Challenges - Complexity in maintaining consistency with base tables |
| Ease of Use - Simplifies query logic in application layer | Write Performance Impact - Possible slowdown for insert/update/delete operations |
| Implementation Complexity - May require complex refresh strategies (e.g., triggers, scheduled jobs) |
Materialized views are particularly useful in scenarios where data doesn't change frequently or where the cost of occasional refreshes is outweighed by the performance benefits.
and