\d t: ERROR: XX000: cache lookup failed for relation

Lists: pgsql-hackers
From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: \d t: ERROR: XX000: cache lookup failed for relation
Date: 2018-06-03 00:42:51
Message-ID: [email protected]

Resending to -hackers
https://www.postgresql.org/message-id/20180527022401.GA20949%40telsasoft.com

Is that considered an actionable problem?

Encountered consistently while trying to reproduce the vacuum full
pg_statistic/toast_2619 bug; while running a loop around VAC FULL and more in
another session:

[1]- Running { time sh -ec 'while :; do psql --port 5678 postgres -qc "VACUUM FULL pg_toast.pg_toast_2619"; psql --port 5678 postgres -qc "VACUUM FULL pg_statistic"; done'; date; } &
[2]+ Running time while :; do
psql postgres --port 5678 -c "INSERT INTO t SELECT i FROM generate_series(1,999999) i"; sleep 1; for a in `seq 999`;
do
psql postgres --port 5678 -c "ALTER TABLE t ALTER i TYPE int USING i::int"; sleep 1; psql postgres --port 5678 -c "ALTER TABLE t ALTER i TYPE bigint"; sleep 1;
done; psql postgres --port 5678 -c "TRUNCATE t"; sleep 1;
done &

$ psql --port 5678 postgres -x
psql (11beta1)
...
postgres=# \set VERBOSITY verbose
postgres=# \d t
ERROR: XX000: cache lookup failed for relation 8096742
LOCATION: flatten_reloptions, ruleutils.c:11065

Justin


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d t: ERROR: XX000: cache lookup failed for relation
Date: 2018-06-04 16:12:53
Message-ID: [email protected]


> Is that considered an actionable problem?

I think so, but I'm not able to reproduce it. I wrote a script to simplify
the steps, but it doesn't reproduce the problem either.

And how long should I wait for it to reproduce? I waited for one hour.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
1.sh application/x-shellscript 464 bytes

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d t: ERROR: XX000: cache lookup failed for relation
Date: 2018-06-04 16:40:33
Message-ID: [email protected]

On Mon, Jun 04, 2018 at 07:12:53PM +0300, Teodor Sigaev wrote:
>
> >Is that considered an actionable problem?
>
>
> I think so. but I'm not able to reproduce that, I wrote a script to simplify

The failure is triggered by running "\d t" in (yet) another session - sorry if
that was unclear. It fails very consistently, probably over 75% of the time.

Also note that my "INSERT" was run in a separate loop, concurrent with the
VACUUM and ALTER, but yours is running consecutively.

Justin


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d t: ERROR: XX000: cache lookup failed for relation
Date: 2018-06-04 17:01:41
Message-ID: [email protected]

> The failure is triggered by running "\d t" in (yet) another session - sorry if
> that was unclear. It fails very consistently, probably over 75% of the time.
No, no, I understood that. I ran \d in yet another session.

>
> Also note that my "INSERT" was run in a separate loop, concurrent with the
> VACUUM and ALTER, but yours is running consecutively.

Both loops run in the background. I tried running two scripts and got a lot
of deadlocks, but did not reproduce the problem.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d t: ERROR: XX000: cache lookup failed for relation
Date: 2018-06-04 17:34:10
Message-ID: [email protected]

On Mon, Jun 04, 2018 at 08:01:41PM +0300, Teodor Sigaev wrote:
> >Also note that my "INSERT" was run in a separate loop, concurrent with the
> >VACUUM and ALTER, but yours is running consecutively.
>
> both loops run in background. I tried to run two scripts - and got a lot of
> deadlocks but not a problem reproduction.

Ah, I think this is the missing, essential component:
CREATE INDEX ON t(right(i::text,1)) WHERE i::text LIKE '%1';

I can reproduce it running just this loop:

time while :; do for a in `seq 999`; do psql postgres --port 5678 -c "ALTER TABLE t ALTER i TYPE int USING i::int"; done; done

Justin
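
For readers following along, the steps reported so far can be collected into
one script. This is only a sketch under assumptions: the helper names
(run_sql, setup, alter_loop, describe_loop), the PSQL/PORT/DB variables, and
the initial definition of t as a single int column are not from the thread;
the SQL statements and port 5678 are.

```shell
#!/bin/sh
# Sketch only: consolidates the reproduction steps quoted in this thread.
# Assumptions (not from the thread): the helper names below, and that t
# starts out as a table with one int column named i.
PORT="${PORT:-5678}"   # port used in the thread's examples
DB="${DB:-postgres}"
PSQL="${PSQL:-psql}"

# Run one SQL statement or one psql meta-command in a fresh session.
run_sql() {
    "$PSQL" --port "$PORT" "$DB" -qc "$1"
}

# Create and fill the table, then add the expression/partial index that
# was reported as the missing, essential component.
setup() {
    run_sql "DROP TABLE IF EXISTS t"
    run_sql "CREATE TABLE t (i int)"
    run_sql "INSERT INTO t SELECT i FROM generate_series(1,999999) i"
    run_sql "CREATE INDEX ON t(right(i::text,1)) WHERE i::text LIKE '%1'"
}

# Session A: run the ALTER from the thread N times.
alter_loop() {
    n=$1
    while [ "$n" -gt 0 ]; do
        run_sql "ALTER TABLE t ALTER i TYPE int USING i::int"
        n=$((n - 1))
    done
}

# Session B: describe the table N times and print how many attempts
# reported the cache lookup error.
describe_loop() {
    n=$1
    failures=0
    while [ "$n" -gt 0 ]; do
        if run_sql '\d t' 2>&1 | grep -q 'cache lookup failed for relation'; then
            failures=$((failures + 1))
        fi
        n=$((n - 1))
    done
    echo "$failures"
}

# Possible usage, with the two loops running concurrently:
#   setup; alter_loop 999 & describe_loop 100; wait
```

The functions are only defined here, not called, so the two loops can be
started from separate terminals if that is more convenient.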


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d t: ERROR: XX000: cache lookup failed for relation
Date: 2018-06-05 12:20:35
Message-ID: [email protected]

> Ah, I think this is the missing, essential component:
> CREATE INDEX ON t(right(i::text,1)) WHERE i::text LIKE '%1';
Finally, I reproduced it with the attached script.

INSERT 0 999999 <- first insertion
ERROR: cache lookup failed for relation 1032219
ALTER TABLE
ERROR: cache lookup failed for relation 1033478
ALTER TABLE
ERROR: cache lookup failed for relation 1034073
ALTER TABLE
ERROR: cache lookup failed for relation 1034650
ALTER TABLE
ERROR: cache lookup failed for relation 1035238
ALTER TABLE
ERROR: cache lookup failed for relation 1035837

Will investigate.
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
1.sh application/x-shellscript 548 bytes

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d t: ERROR: XX000: cache lookup failed for relation
Date: 2018-06-05 21:53:47
Message-ID: [email protected]

Teodor Sigaev wrote:
>> Ah, I think this is the missing, essential component:
>> CREATE INDEX ON t(right(i::text,1)) WHERE i::text LIKE '%1';
> Finally, I reproduce it with attached script.
Attached is a simplified version of the script. psql uses an ordinary SQL
query to get information about the index, with the usual transaction
isolation/MVCC. To build the description of the index, that query calls
pg_get_indexdef(), which does not use the transaction snapshot; it uses a
catalog snapshot, because it accesses the catalog through the system
catalog cache. So the difference is in the snapshot used by the ordinary
SQL query and the one used by pg_get_indexdef(). I'm not sure whether this
is easy to fix, or whether it should be fixed at all.

Simplified query:
SELECT c2.relname, i.indexrelid,
       pg_catalog.pg_get_indexdef(i.indexrelid, 0, true)
FROM pg_catalog.pg_class c, pg_catalog.pg_class c2,
     pg_catalog.pg_index i
WHERE c.relname = 't' AND c.oid = i.indrelid AND i.indexrelid = c2.oid
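
One way to watch this from the shell is to run the simplified query
repeatedly while the ALTER TABLE loop is active and collect the OIDs named
in the errors. The following is a sketch, not the attached script: the
helper names (run_simplified_query, failed_oids, probe) and the
PSQL/PORT/DB variables are assumptions; the SQL is the query above and the
port is the one used earlier in the thread.

```shell
#!/bin/sh
# Sketch only. Assumptions (not from the thread): the helper names and
# the PSQL/PORT/DB variables. The SQL text is the simplified query.
PORT="${PORT:-5678}"
DB="${DB:-postgres}"
PSQL="${PSQL:-psql}"

# Run the simplified query once in a fresh session; stderr is merged
# into stdout so that error messages can be filtered afterwards.
run_simplified_query() {
    "$PSQL" --port "$PORT" "$DB" -qAt 2>&1 <<'SQL'
SELECT c2.relname, i.indexrelid,
       pg_catalog.pg_get_indexdef(i.indexrelid, 0, true)
FROM pg_catalog.pg_class c, pg_catalog.pg_class c2,
     pg_catalog.pg_index i
WHERE c.relname = 't' AND c.oid = i.indrelid AND i.indexrelid = c2.oid;
SQL
}

# Read psql output on stdin and print the OID from every
# "cache lookup failed for relation" message, one per line.
failed_oids() {
    sed -n 's/.*cache lookup failed for relation \([0-9][0-9]*\).*/\1/p'
}

# Run the query N times and print each OID that could not be looked up.
probe() {
    n=$1
    while [ "$n" -gt 0 ]; do
        run_simplified_query | failed_oids
        n=$((n - 1))
    done
}

# Possible usage, while the ALTER TABLE loop runs in another session:
#   probe 100
```

In the output posted earlier in the thread, each error named a different
relation OID, so the list printed by probe would be expected to vary from
line to line rather than repeat a single value.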
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
1.sh application/x-shellscript 1.4 KB