Re: Hot standby, slot ids and stuff

Lists: pgsql-hackers
From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Hot standby, slot ids and stuff
Date: 2009-01-08 18:30:37
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

I've mentioned before that I don't like the slotid stuff. From an
architectural point of view, we should try to keep the extra
communication between primary and standby to a minimum, for the sake of
robustness. The primary shouldn't have such a big role in keeping track
of which xids it has already emitted in WAL records, which subxids have
been marked, and so on.

Since I've been whining about that for some time, I figured I have to
put my money where my mouth is, so here's a patch based on v6a that
eliminates the concept of slotids, as well as the new xl_info2 field in
XLogRecord. This seems much simpler to me. I haven't given it much
testing, but seems to work. There's a whole bunch of comments marked
with XXX that need resolving, though.

Attached is a patch agains CVS HEAD (hs-noslotid-1.patch) as well as an
incremental patch against your v6a (hs-noslotid-against-v6a.patch).
Sorry if you were just working on some big changes and I joggled your
elbow. I also didn't check what changes you had in v7 yet.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
hs-noslotid-1.patch.gz application/x-gzip 81.7 KB
hs-noslotid-against-v6a.patch.gz application/x-gzip 13.9 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-08 19:09:05
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers


On Thu, 2009-01-08 at 20:30 +0200, Heikki Linnakangas wrote:

> I've mentioned before that I don't like the slotid stuff. From an
> architectural point of view, we should try to keep the extra
> communication between primary and standby to a minimum, for the sake of
> robustness. The primary shouldn't have such a big role in keeping track
> of which xids it has already emitted in WAL records, which subxids have
> been marked, and so on.
>
> Since I've been whining about that for some time, I figured I have to
> put my money where my mouth is

Yes, you do, but I think not like this. Turning up with a patch and
saying "my way is better" but not actually saying what that way *is*
just confuses things.

If you want to do things a different way you need to say what you want
to do and what effects those changes will have. Are there tradeoffs? If
so what are they? ISTM easier to discuss them on list first than to
write a load of code and ask us all to read, understand and comment.

> , so here's a patch based on v6a that
> eliminates the concept of slotids, as well as the new xl_info2 field in
> XLogRecord. This seems much simpler to me. I haven't given it much
> testing, but seems to work. There's a whole bunch of comments marked
> with XXX that need resolving, though.
>
> Attached is a patch agains CVS HEAD (hs-noslotid-1.patch) as well as an
> incremental patch against your v6a (hs-noslotid-against-v6a.patch).
> Sorry if you were just working on some big changes and I joggled your
> elbow. I also didn't check what changes you had in v7 yet.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-08 19:50:41
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers


On Thu, 2009-01-08 at 20:30 +0200, Heikki Linnakangas wrote:

> I've mentioned before that I don't like the slotid stuff. From an
> architectural point of view, we should try to keep the extra
> communication between primary and standby to a minimum, for the sake of
> robustness. The primary shouldn't have such a big role in keeping track
> of which xids it has already emitted in WAL records, which subxids have
> been marked, and so on.

Removing slotid does improve the code, I agree. It was never there for
reasons other than the functions and features it provides.

Effects of removing slotid are at least these

* if FATAL errors occur, yet we have long running transactions then we
have no way of removing those entries from the recovery procs. Since we
have a fixed pool of recovery transactions we won't have anywhere to
store that data. Snapshot sizes are fixed maximum with max_connections,
so eventually we would not be able to take a snapshot at all, and we'd
need to have a "ERROR: unable to take valid snapshot".

* if FATAL errors occur while holding AccessExclusiveLock then we have
no way of releasing those locks until the recovery proc is stale, which
might be some time. Not sure if your patch releases those?

* xid->proc lookup is now very slow and needs to be called more
frequently; will still be slower even with the further optimisations you
mention. Possibly a minor point with right tuning.

* slotid removed from xlrec header; makes no difference with 64-bit
systems because of alignment issues.

...perhaps more, not sure yet.

All we need to do is decide if we want these things or not. If not, then
we can remove the slotid stuff.

If it wasn't clear before, this was, for me, never a discussion about a
particular way of coding things. I care little for that and would
probably go with your suggestions more often than not. It's just simply
about what we want it to do. If you want to argue in favour of
restrictions to make things simpler, that's OK and I could probably live
with them quite happily myself. I'm trying to implement the project
philosophy of It Just Works, no restrictions or caveats - and I know if
I had not then the patch would have been rejected on that basis, as has
happened many times before.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-08 20:31:28
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> * if FATAL errors occur, yet we have long running transactions then we
> have no way of removing those entries from the recovery procs. Since we
> have a fixed pool of recovery transactions we won't have anywhere to
> store that data. Snapshot sizes are fixed maximum with max_connections,
> so eventually we would not be able to take a snapshot at all, and we'd
> need to have a "ERROR: unable to take valid snapshot".

When a backend dies with FATAL, it writes an abort record before exiting.

(I was under the impression it doesn't until few minutes ago myself,
when I actually read the shutdown code :-))

> * if FATAL errors occur while holding AccessExclusiveLock then we have
> no way of releasing those locks until the recovery proc is stale, which
> might be some time. Not sure if your patch releases those?

I don't think so, that needs to be fixed.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-08 20:31:44
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> If you want to do things a different way you need to say what you want
> to do and what effects those changes will have.

I want to reduce the coupling between the primary and the master. The
less they need to communicate, the better. I want to get rid of slotid,
and as many of the other extra information carried in WAL records that I
can. I believe that will make the patch both simpler and more robust.

> Are there tradeoffs? If so what are they?

I don't think there's any big difference in user-visible behavior.
RecordKnownAssignedTransactionId now needs to be called for every xlog
record as opposed to just the first record where an xid appears, because
I eliminated the hint flag in WAL to mark those records. And it needs to
look up the recover proc by xid, instead of using the slot id. But I
don't think that will have a significant impact on performance.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-08 20:37:52
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers


On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > * if FATAL errors occur, yet we have long running transactions then we
> > have no way of removing those entries from the recovery procs. Since we
> > have a fixed pool of recovery transactions we won't have anywhere to
> > store that data. Snapshot sizes are fixed maximum with max_connections,
> > so eventually we would not be able to take a snapshot at all, and we'd
> > need to have a "ERROR: unable to take valid snapshot".
>
> When a backend dies with FATAL, it writes an abort record before exiting.
>
> (I was under the impression it doesn't until few minutes ago myself,
> when I actually read the shutdown code :-))

Not in all cases; keep reading :-)

The good thing is we have a choice now as to whether we care about that,
or not.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-08 20:44:00
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote:
>> When a backend dies with FATAL, it writes an abort record before exiting.
>>
>> (I was under the impression it doesn't until few minutes ago myself,
>> when I actually read the shutdown code :-))
>
> Not in all cases; keep reading :-)

Want to give a clue? ShutdownPostgres is registered as an on_shmem_exit
hook in InitPostgres, and ShutdownPostgres calls AbortOutOfAnyTransaction.

PANIC is another story, but in that case the primary will go down, and
will write a new checkpoint soon after it starts up again.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-08 20:50:00
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote:
>> When a backend dies with FATAL, it writes an abort record before exiting.
>>
>> (I was under the impression it doesn't until few minutes ago myself,
>> when I actually read the shutdown code :-))

> Not in all cases; keep reading :-)

If it doesn't, that's a bug. A FATAL exit is not supposed to leave the
shared state corrupted, it's only supposed to be a forcible session
termination. Any open transaction should be rolled back.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-09 09:10:40
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers


On Thu, 2009-01-08 at 15:50 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> > On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote:
> >> When a backend dies with FATAL, it writes an abort record before exiting.
> >>
> >> (I was under the impression it doesn't until few minutes ago myself,
> >> when I actually read the shutdown code :-))
>
> > Not in all cases; keep reading :-)
>
> If it doesn't, that's a bug. A FATAL exit is not supposed to leave the
> shared state corrupted, it's only supposed to be a forcible session
> termination. Any open transaction should be rolled back.

Please look back at the earlier discussion we had on this exact point:
http://archives.postgresql.org/pgsql-hackers/2008-09/msg01809.php

Heikki, perhaps now you understand my continued opposition to your ideas
for simplification: I had already thought of them and was forced to rule
them out, not through my own choice. Tom, note that I listen to what you
say and try to write code that meets those requirements.

>From here, I don't mind which way we go. I want code that is as simple
as possible to "do the job", whatever we agree that to be.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-09 10:33:04
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> On Thu, 2009-01-08 at 15:50 -0500, Tom Lane wrote:
>> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>>> On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote:
>>>> When a backend dies with FATAL, it writes an abort record before exiting.
>>>>
>>>> (I was under the impression it doesn't until few minutes ago myself,
>>>> when I actually read the shutdown code :-))
>>> Not in all cases; keep reading :-)
>> If it doesn't, that's a bug. A FATAL exit is not supposed to leave the
>> shared state corrupted, it's only supposed to be a forcible session
>> termination. Any open transaction should be rolled back.
>
> Please look back at the earlier discussion we had on this exact point:
> http://archives.postgresql.org/pgsql-hackers/2008-09/msg01809.php

I think the running-xacts list we dump to WAL at every checkpoint is
enough to handle that. Just treat the dead transaction as in-progress
until the next running-xacts record. It's presumably extremely rare to
have a process die with FATAL, and not write an abort record.

A related issue is that currently the recovery PANICs if it runs out of
recovery procs. I think that's not acceptable, regardless of whether we
use slotids or some other method to avoid it in normal operation,
because it means you can't recover at all if you set max_connections too
low in the standby (or in the primary, and you have to recover from
crash), or you run out of recovery procs because of an abort failed in
the primary like discussed on that thread. The standby should just
fast-forward to the next running-xacts record in that case.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-09 11:20:40
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers


On Fri, 2009-01-09 at 12:33 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2009-01-08 at 15:50 -0500, Tom Lane wrote:
> >> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> >>> On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote:
> >>>> When a backend dies with FATAL, it writes an abort record before exiting.
> >>>>
> >>>> (I was under the impression it doesn't until few minutes ago myself,
> >>>> when I actually read the shutdown code :-))
> >>> Not in all cases; keep reading :-)
> >> If it doesn't, that's a bug. A FATAL exit is not supposed to leave the
> >> shared state corrupted, it's only supposed to be a forcible session
> >> termination. Any open transaction should be rolled back.
> >
> > Please look back at the earlier discussion we had on this exact point:
> > http://archives.postgresql.org/pgsql-hackers/2008-09/msg01809.php
>
> I think the running-xacts list we dump to WAL at every checkpoint is
> enough to handle that. Just treat the dead transaction as in-progress
> until the next running-xacts record. It's presumably extremely rare to
> have a process die with FATAL, and not write an abort record.

I agree, but I'll wait for Tom to speak further.

> A related issue is that currently the recovery PANICs if it runs out of
> recovery procs. I think that's not acceptable, regardless of whether we
> use slotids or some other method to avoid it in normal operation,
> because it means you can't recover at all if you set max_connections too
> low in the standby (or in the primary, and you have to recover from
> crash), or you run out of recovery procs because of an abort failed in
> the primary like discussed on that thread.

> The standby should just
> fast-forward to the next running-xacts record in that case.

What do you mean by "fast forward"?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-09 11:23:35
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> On Fri, 2009-01-09 at 12:33 +0200, Heikki Linnakangas wrote:
>> A related issue is that currently the recovery PANICs if it runs out of
>> recovery procs. I think that's not acceptable, regardless of whether we
>> use slotids or some other method to avoid it in normal operation,
>> because it means you can't recover at all if you set max_connections too
>> low in the standby (or in the primary, and you have to recover from
>> crash), or you run out of recovery procs because of an abort failed in
>> the primary like discussed on that thread.
>
>> The standby should just
>> fast-forward to the next running-xacts record in that case.
>
> What do you mean by "fast forward"?

I mean the standby should stop trying to track the in progress
transactions in recovery procs, and apply the WAL records like it does
before the consistent state is reached.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-09 12:21:39
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers


On Fri, 2009-01-09 at 13:23 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Fri, 2009-01-09 at 12:33 +0200, Heikki Linnakangas wrote:
> >> A related issue is that currently the recovery PANICs if it runs out of
> >> recovery procs. I think that's not acceptable, regardless of whether we
> >> use slotids or some other method to avoid it in normal operation,
> >> because it means you can't recover at all if you set max_connections too
> >> low in the standby (or in the primary, and you have to recover from
> >> crash), or you run out of recovery procs because of an abort failed in
> >> the primary like discussed on that thread.
> >
> >> The standby should just
> >> fast-forward to the next running-xacts record in that case.
> >
> > What do you mean by "fast forward"?
>
> I mean the standby should stop trying to track the in progress
> transactions in recovery procs, and apply the WAL records like it does
> before the consistent state is reached.

If you say something is not acceptable you need to say what behaviour
you do find acceptable in its place. It's good to come up with new
ideas, but it's not good to ignore the problems the new ideas have. This
is a general point that applies both here and to your proposals with
synch rep. It's much harder to say how it should work in a way that
covers all the requirements and points others have made, otherwise its
just handwaving.

So, if we don't PANIC, how should we behave?

Without full information on running-xacts we would be unable to take a
snapshot, so should:
* backends be forcibly disconnected?
* backends hang waiting for snapshot info to be re-available again in X
minutes worth of WAL time?
* backends throw an ERROR: unable to provide snapshot at this time,
DETAIL: retry your statement later.
...other alternatives

and possibly prevent new connections.

If max_connections is higher on primary then the standby will *never* be
available for querying. Should we have multiple ERRORs depending upon
whether the situation is hopefully-temporary or looks-permanent?

Don't assume I want the PANIC. That clearly needs to be revisited if we
change slotids. I just want to make a balanced judgement based upon a
full consideration of the options.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-09 12:38:05
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> On Fri, 2009-01-09 at 13:23 +0200, Heikki Linnakangas wrote:
>> I mean the standby should stop trying to track the in progress
>> transactions in recovery procs, and apply the WAL records like it does
>> before the consistent state is reached.
>
> ...
>
> So, if we don't PANIC, how should we behave?
>
> Without full information on running-xacts we would be unable to take a
> snapshot, so should:
> * backends be forcibly disconnected?
> * backends hang waiting for snapshot info to be re-available again in X
> minutes worth of WAL time?
> * backends throw an ERROR: unable to provide snapshot at this time,
> DETAIL: retry your statement later.
> ...other alternatives
>
> and possibly prevent new connections.

All of those seem reasonable to me. The 2nd option seems nicest, "X
minutes" should probably be controlled by max_standby_delay, after which
you can throw an error.

If we care enough, we could also keep tracking the transactions in
backend-private memory of the startup process, until there's enough room
in proc array. That would make the outage shorter, because you wouldn't
have to wait until the next running-xacts record, but only until enough
transactions have finished that they all fit in proc array again.

But whatever is the simplest, really.

> If max_connections is higher on primary then the standby will *never* be
> available for querying. Should we have multiple ERRORs depending upon
> whether the situation is hopefully-temporary or looks-permanent?
>
> Don't assume I want the PANIC. That clearly needs to be revisited if we
> change slotids.

It needs to be revisited whether we change slotids or not, IMHO.

Note that with slotids, you have a problem as soon as any of the slots
that don't exist on standby are used, regardless of how many concurrent
transactions there actually is. Without slots you only have a problem if
you really have more than standby's max_connections concurrent
transactions. That makes a big difference in practice.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-09 13:08:32
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers


On Fri, 2009-01-09 at 14:38 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Fri, 2009-01-09 at 13:23 +0200, Heikki Linnakangas wrote:
> >> I mean the standby should stop trying to track the in progress
> >> transactions in recovery procs, and apply the WAL records like it does
> >> before the consistent state is reached.
> >
> > ...
> >
> > So, if we don't PANIC, how should we behave?
> >
> > Without full information on running-xacts we would be unable to take a
> > snapshot, so should:
> > * backends be forcibly disconnected?
> > * backends hang waiting for snapshot info to be re-available again in X
> > minutes worth of WAL time?
> > * backends throw an ERROR: unable to provide snapshot at this time,
> > DETAIL: retry your statement later.
> > ...other alternatives
> >
> > and possibly prevent new connections.
>
> All of those seem reasonable to me. The 2nd option seems nicest, "X
> minutes" should probably be controlled by max_standby_delay, after which
> you can throw an error.

Hmm, we use the recovery procs to track transactions that have
TransactionIds assigned. That means we will overflow only if we have
approach 100% write transactions at any time, or if we have more write
transactions in progress than we have max_connections on standby.

So it sounds like the overflow situation would probably be both rare
and, if it did occur, may not occur for long periods.

> If we care enough, we could also keep tracking the transactions in
> backend-private memory of the startup process, until there's enough room
> in proc array. That would make the outage shorter, because you wouldn't
> have to wait until the next running-xacts record, but only until enough
> transactions have finished that they all fit in proc array again.
>
> But whatever is the simplest, really.

The above does sound best since it would allow us to have the snapshot
hang for a short period. But at this stage of the game, more complex.

For now though, since it looks like it would happen fairly rarely, I'd
opt for the simplest: throw an ERROR.

> > If max_connections is higher on primary then the standby will *never* be
> > available for querying. Should we have multiple ERRORs depending upon
> > whether the situation is hopefully-temporary or looks-permanent?
> >
> > Don't assume I want the PANIC. That clearly needs to be revisited if we
> > change slotids.
>
> It needs to be revisited whether we change slotids or not, IMHO.
>
> Note that with slotids, you have a problem as soon as any of the slots
> that don't exist on standby are used, regardless of how many concurrent
> transactions there actually is. Without slots you only have a problem if
> you really have more than standby's max_connections concurrent
> transactions. That makes a big difference in practice.

Sometimes, but mostly people set max_connections higher because they
intend to use those extra connections. So no real advantage there
against the slotid approach :-)

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-09 19:45:07
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers


On Fri, 2009-01-09 at 11:20 +0000, Simon Riggs wrote:
> On Fri, 2009-01-09 at 12:33 +0200, Heikki Linnakangas wrote:
> > Simon Riggs wrote:
> > > On Thu, 2009-01-08 at 15:50 -0500, Tom Lane wrote:
> > >> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> > >>> On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote:
> > >>>> When a backend dies with FATAL, it writes an abort record before exiting.
> > >>>>
> > >>>> (I was under the impression it doesn't until few minutes ago myself,
> > >>>> when I actually read the shutdown code :-))
> > >>> Not in all cases; keep reading :-)
> > >> If it doesn't, that's a bug. A FATAL exit is not supposed to leave the
> > >> shared state corrupted, it's only supposed to be a forcible session
> > >> termination. Any open transaction should be rolled back.
> > >
> > > Please look back at the earlier discussion we had on this exact point:
> > > http://archives.postgresql.org/pgsql-hackers/2008-09/msg01809.php
> >
> > I think the running-xacts list we dump to WAL at every checkpoint is
> > enough to handle that. Just treat the dead transaction as in-progress
> > until the next running-xacts record. It's presumably extremely rare to
> > have a process die with FATAL, and not write an abort record.
>
> I agree, but I'll wait for Tom to speak further.

OK, will proceed without this. Time is pressing.

Heikki and I both agree that FATAL errors that fail to write abort
records are rare and an acceptable problem in real usage. If they do
occur, their nuisance factor is short-lived because of measures taken
within the patch.

Hot Standby does not *rely* upon there always an abort record for FATAL
errors, so we cannot reasonably say the current design would be
"unacceptably fragile" as I had once thought.

So based upon that, out comes the slotid concept and luckily much of the
cruftier aspects of the patch. Less code, probably fewer bugs. Good
thing.

I will produce a new v7 with those changes and merge the changes for v6b
also, so we can begin review again from there.

Hi ho, hi ho...

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-10 09:40:56
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers


On Thu, 2009-01-08 at 20:30 +0200, Heikki Linnakangas wrote:

> Since I've been whining about that for some time, I figured I have to
> put my money where my mouth is, so here's a patch based on v6a that
> eliminates the concept of slotids, as well as the new xl_info2 field in
> XLogRecord. This seems much simpler to me. I haven't given it much
> testing, but seems to work. There's a whole bunch of comments marked
> with XXX that need resolving, though.

There's a confusion in the patch between top level xid and parent xid.
The xl field is named parentxid but actually contains top level. That
distinction is important because the code now uses the top level xid to
locate the recovery proc, formerly the role of the slotid.

This leads to an error when we SubTransSetParent(child_xid, top_xid);
since this assumes that the top_xid is the parent, which it is not.
Mostly you wouldn't notice unless you were looking up the subtrans
status for an xid that had committed but was the child of an aborted
subtransaction, with the top level xid having > 64 subtransactions. It's
possible the confusion leads to other bugs in UnobservedXid processing,
but I didn't look too hard at that.

AFAICS we need both parent and top xids. Or put another way, we need the
parent xid and other info that allows us to uniquely determine the proc
we need to update. Now the "other info..." could be top xid or it could
also be slotid, which then avoids later zig-zagging to look up the proc.

I'm wasn't looking for ways to reintroduce slotid, but it seems more
logical to keep slotid in light of the above. However, you will probably
view this as intransigence, so I will await your input.

I'm very happy that GetStandbyInfoForTransaction() and all the XLR2
flags have bitten the dust and will sleep for eternity.

For xl_rel_lock you add a field called xid and then ask
/* xid of the *parent* transaction. XXX why parent? */.
You've done this because it replaced slotid. But looking at that, I
think the 6a patch had a bug there: a subtransaction abort record would
release locks for the whole top level xact. So we need to pass both top
level xid (or slotid) and xid for each lock, then release using the
actual xid only.

You also ask: Shouldn't we call StartupSUBTRANS() and the other startup
functions like we do below, before letting anyone in?

My answer is that the current role of StartupSUBTRANS and friends is not
appropriate at that point, since they zero out those structures. I left
those routines in place thinking "startup" meant "moving to normal
running". If we did have a "startupsubtrans" at the point you note, it
would currently be empty: we don't keep track of the latest page during
recovery. Perhaps we should, but then we'd need to do the equivalent of
ExtendSubtrans etc, which it seemed easier to avoid.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-10 11:45:52
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers


On Sat, 2009-01-10 at 09:40 +0000, Simon Riggs wrote:

> This leads to an error when we SubTransSetParent(child_xid, top_xid);
> since this assumes that the top_xid is the parent, which it is not.
> Mostly you wouldn't notice unless you were looking up the subtrans
> status for an xid that had committed but was the child of an aborted
> subtransaction, with the top level xid having > 64 subtransactions.
> It's possible the confusion leads to other bugs in UnobservedXid
> processing, but I didn't look too hard at that.
>
> AFAICS we need both parent and top xids.

I wonder if its possible to derive the parent by looking at the
highest/most newly assigned xid? Abort records would remove aborted
subtransactions and AFAIK we currently assign a new subtransaction only
ever from the latest current subtransaction. (This wouldn't be
necessarily true if supported true branch-anywhere subtransactions, but
we don't). Sounds correct, but not really sure.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-11 03:03:24
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers


On Sat, 2009-01-10 at 11:45 +0000, Simon Riggs wrote:
> On Sat, 2009-01-10 at 09:40 +0000, Simon Riggs wrote:
>
> > This leads to an error when we SubTransSetParent(child_xid, top_xid);
> > since this assumes that the top_xid is the parent, which it is not.
> > Mostly you wouldn't notice unless you were looking up the subtrans
> > status for an xid that had committed but was the child of an aborted
> > subtransaction, with the top level xid having > 64 subtransactions.
> > It's possible the confusion leads to other bugs in UnobservedXid
> > processing, but I didn't look too hard at that.
> >
> > AFAICS we need both parent and top xids.
>
> I wonder if its possible to derive the parent by looking at the
> highest/most newly assigned xid? Abort records would remove aborted
> subtransactions and AFAIK we currently assign a new subtransaction only
> ever from the latest current subtransaction. (This wouldn't be
> necessarily true if supported true branch-anywhere subtransactions, but
> we don't). Sounds correct, but not really sure.

Starting to sound like a do-me-later-if-ever optimisation and certainly
nothing I want to rely on in court.

I'm progressing with parent_xid added to the xlog record header, for
now.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-11 08:41:12
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> There's a confusion in the patch between top level xid and parent xid.
> The xl field is named parentxid but actually contains top level. That
> distinction is important because the code now uses the top level xid to
> locate the recovery proc, formerly the role of the slotid.

True, I changed the meaning halfway through. My thinking was that we can
get away by only tracking the top-level xact of each subtransaction, not
the whole tree of subtransactions.

> This leads to an error when we SubTransSetParent(child_xid, top_xid);
> since this assumes that the top_xid is the parent, which it is not.
> Mostly you wouldn't notice unless you were looking up the subtrans
> status for an xid that had committed but was the child of an aborted
> subtransaction, with the top level xid having > 64 subtransactions.

Hmm. When a subtransaction aborts, RecordTransactionAbort is called and
clog is updated to show the subtransaction and all it's subcommitted
children as aborted. I think we're safe, though it wouldn't hurt to
double-check.

It's an important point that needs documenting, though. I completely
neglected that.

> I'm wasn't looking for ways to reintroduce slotid, but it seems more
> logical to keep slotid in light of the above. However, you will probably
> view this as intransigence, so I will await your input.

Yeah, it sure does seem like intransigence :-)

> For xl_rel_lock you add a field called xid and then ask
> /* xid of the *parent* transaction. XXX why parent? */.
> You've done this because it replaced slotid. But looking at that, I
> think the 6a patch had a bug there: a subtransaction abort record would
> release locks for the whole top level xact. So we need to pass both top
> level xid (or slotid) and xid for each lock, then release using the
> actual xid only.

Hmm, would just the subtransaction xid be enough? If we use that to
release, what do we need the top-level xid for?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-11 11:55:21
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers


On Sun, 2009-01-11 at 10:41 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > There's a confusion in the patch between top level xid and parent xid.
> > The xl field is named parentxid but actually contains top level. That
> > distinction is important because the code now uses the top level xid to
> > locate the recovery proc, formerly the role of the slotid.
>
> True, I changed the meaning halfway through. My thinking was that we can
> get away by only tracking the top-level xact of each subtransaction, not
> the whole tree of subtransactions.
>
> > This leads to an error when we SubTransSetParent(child_xid, top_xid);
> > since this assumes that the top_xid is the parent, which it is not.
> > Mostly you wouldn't notice unless you were looking up the subtrans
> > status for an xid that had committed but was the child of an aborted
> > subtransaction, with the top level xid having > 64 subtransactions.
>
> Hmm. When a subtransaction aborts, RecordTransactionAbort is called and
> clog is updated to show the subtransaction and all it's subcommitted
> children as aborted. I think we're safe, though it wouldn't hurt to
> double-check.

That part of your argument is correct but we are not safe. But please
look at XactLockTableWait() and see what you think. According to
comments this would lead to different lock release behaviour.

Beauty, thy name is not subtransaction.

If you agree that we need the parent xid then we have further problems.
Adding another xid onto the header of each WAL record will add 8 bytes
onto each record because of alignment. This was the other reason for
slotid, since I was able to fit that into just 2 bytes and avoid the 8
byte loss. Moving swiftly on, I have this morning though of an
alternative, so at least we now have some choice. Rather than store the
parent xid itself we store the difference between the current xid and
the parent xid. Typically this will be less than 65535; when it is not
we set it to zero but issue an xid assignment xlog record.

However, I think XactLockTableWait() doesn't need to know the parent
either. (This feels more like wishful thinking, but here goes anyway).
We release locks *after* TransactionIdAbortTree() has fully executed, so
the test for TransactionIdIsInProgress(xid) will always see the abort
status, if set. Notice that if we are awake at all it is because the
top-level transaction is complete or our subxid is aborted. So there is
never any need to look at the status of the parent, nor in fact any need
to look at the procarray at all, which is always a waste of effort. If
you believe that, you'll want to commit the attached patch (or something
similar with comments refactored etc).

> For xl_rel_lock you add a field called xid and then ask
> > /* xid of the *parent* transaction. XXX why parent? */.
> > You've done this because it replaced slotid. But looking at that, I
> > think the 6a patch had a bug there: a subtransaction abort record would
> > release locks for the whole top level xact. So we need to pass both top
> > level xid (or slotid) and xid for each lock, then release using the
> > actual xid only.
>
> Hmm, would just the subtransaction xid be enough? If we use that to
> release, what do we need the top-level xid for?

So we can release all locks taken by subxids at once, when we learn that
a top level xid has disappeared. A minor point.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

Attachment Content-Type Size
subtrans_direct_to_top.v1.patch text/x-patch 1.5 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-12 11:50:48
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Heikki, can I get your feedback on this urgently please? I want to
respond positively to your review comments and complete something you
will find acceptable. But I need to know what that is, please.

On Sun, 2009-01-11 at 11:55 +0000, Simon Riggs wrote:
> On Sun, 2009-01-11 at 10:41 +0200, Heikki Linnakangas wrote:
> > Simon Riggs wrote:
> > > There's a confusion in the patch between top level xid and parent xid.
> > > The xl field is named parentxid but actually contains top level. That
> > > distinction is important because the code now uses the top level xid to
> > > locate the recovery proc, formerly the role of the slotid.
> >
> > True, I changed the meaning halfway through. My thinking was that we can
> > get away by only tracking the top-level xact of each subtransaction, not
> > the whole tree of subtransactions.
> >
> > > This leads to an error when we SubTransSetParent(child_xid, top_xid);
> > > since this assumes that the top_xid is the parent, which it is not.
> > > Mostly you wouldn't notice unless you were looking up the subtrans
> > > status for an xid that had committed but was the child of an aborted
> > > subtransaction, with the top level xid having > 64 subtransactions.
> >
> > Hmm. When a subtransaction aborts, RecordTransactionAbort is called and
> > clog is updated to show the subtransaction and all it's subcommitted
> > children as aborted. I think we're safe, though it wouldn't hurt to
> > double-check.
>
> That part of your argument is correct but we are not safe. But please
> look at XactLockTableWait() and see what you think. According to
> comments this would lead to different lock release behaviour.
>
> Beauty, thy name is not subtransaction.
>
> If you agree that we need the parent xid then we have further problems.
> Adding another xid onto the header of each WAL record will add 8 bytes
> onto each record because of alignment. This was the other reason for
> slotid, since I was able to fit that into just 2 bytes and avoid the 8
> byte loss. Moving swiftly on, I have this morning though of an
> alternative, so at least we now have some choice. Rather than store the
> parent xid itself we store the difference between the current xid and
> the parent xid. Typically this will be less than 65535; when it is not
> we set it to zero but issue an xid assignment xlog record.
>
> However, I think XactLockTableWait() doesn't need to know the parent
> either. (This feels more like wishful thinking, but here goes anyway).
> We release locks *after* TransactionIdAbortTree() has fully executed, so
> the test for TransactionIdIsInProgress(xid) will always see the abort
> status, if set. Notice that if we are awake at all it is because the
> top-level transaction is complete or our subxid is aborted. So there is
> never any need to look at the status of the parent, nor in fact any need
> to look at the procarray at all, which is always a waste of effort. If
> you believe that, you'll want to commit the attached patch (or something
> similar with comments refactored etc).
>
> > For xl_rel_lock you add a field called xid and then ask
> > > /* xid of the *parent* transaction. XXX why parent? */.
> > > You've done this because it replaced slotid. But looking at that, I
> > > think the 6a patch had a bug there: a subtransaction abort record would
> > > release locks for the whole top level xact. So we need to pass both top
> > > level xid (or slotid) and xid for each lock, then release using the
> > > actual xid only.
> >
> > Hmm, would just the subtransaction xid be enough? If we use that to
> > release, what do we need the top-level xid for?
>
> So we can release all locks taken by subxids at once, when we learn that
> a top level xid has disappeared. A minor point.
>
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-12 12:10:13
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> Rather than store the
> parent xid itself we store the difference between the current xid and
> the parent xid. Typically this will be less than 65535; when it is not
> we set it to zero but issue an xid assignment xlog record.

That sounds pretty hacky.

> However, I think XactLockTableWait() doesn't need to know the parent
> either. (This feels more like wishful thinking, but here goes anyway).
> We release locks *after* TransactionIdAbortTree() has fully executed, so
> the test for TransactionIdIsInProgress(xid) will always see the abort
> status, if set. Notice that if we are awake at all it is because the
> top-level transaction is complete or our subxid is aborted. So there is
> never any need to look at the status of the parent, nor in fact any need
> to look at the procarray at all, which is always a waste of effort.

Right, we don't currently write a WAL record at subtransaction commit,
only at subtransaction abort or top-level commit. So the problem
described in the comment at XactLockTableWait() can't arise in the standby.

Actually, I wonder if we should write a WAL record at subtransaction
commit too, to save on shared memory in the standby as well.

> If
> you believe that, you'll want to commit the attached patch (or something
> similar with comments refactored etc).

Umm, we still need the SubTransGetParent() call in normal operation.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-12 13:04:23
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers


On Mon, 2009-01-12 at 14:10 +0200, Heikki Linnakangas wrote:

> > However, I think XactLockTableWait() doesn't need to know the parent
> > either. (This feels more like wishful thinking, but here goes anyway).
> > We release locks *after* TransactionIdAbortTree() has fully executed, so
> > the test for TransactionIdIsInProgress(xid) will always see the abort
> > status, if set. Notice that if we are awake at all it is because the
> > top-level transaction is complete or our subxid is aborted. So there is
> > never any need to look at the status of the parent, nor in fact any need
> > to look at the procarray at all, which is always a waste of effort.
>
> Right, we don't currently write a WAL record at subtransaction commit,
> only at subtransaction abort or top-level commit. So the problem
> described in the comment at XactLockTableWait() can't arise in the standby.

Good. So we can exclude parent_xid and just include top level xid. So we
are now officially back inside the current xlog record padding: we don't
need to increase WAL record length and therefore we can lose slotid
(yay!).

> Actually, I wonder if we should write a WAL record at subtransaction
> commit too, to save on shared memory in the standby as well.

You understand that the current design never performs
XactLockTableInsert() for individual top/sub transactions? There would
be no memory overhead to save.

The only locks allowed are AccessShareLocks which never conflict with
anything except for AccessExclusiveLocks. If an AEL conflicts with an
ASL we blow away the ASL holders. If an ASL requestor is blocked by a an
AEL the wait is performed on the single lock table entry for the Startup
process, which acts as a proxy for all emulated AEL lock holders.

> > If
> > you believe that, you'll want to commit the attached patch (or something
> > similar with comments refactored etc).
>
> Umm, we still need the SubTransGetParent() call in normal operation.

OK, that was really just a thought experiment anyway to prove the point
one way or the other.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-13 19:10:09
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers


On Sat, 2009-01-10 at 09:40 +0000, Simon Riggs wrote:

> But looking at that, I
> think the 6a patch had a bug there: a subtransaction abort record
> would release locks for the whole top level xact.

Fixed.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-15 22:10:24
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers


On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > If you want to do things a different way you need to say what you want
> > to do and what effects those changes will have.
>
> I want to reduce the coupling between the primary and the master. The
> less they need to communicate, the better. I want to get rid of slotid,
> and as many of the other extra information carried in WAL records that I
> can. I believe that will make the patch both simpler and more robust.
>
> > Are there tradeoffs? If so what are they?
>
> I don't think there's any big difference in user-visible behavior.

I notice that we are no longer able to record the databaseId for a
connection, so query conflict resolution cannot be isolated to
individual databases any longer.

We might sometimes infer a transaction's databaseId from certain WAL
records but that is only going to be possible within each rmgr, which is
fairly strange.

I'll leave all of the databaseId stuff in there in case we think of
anything good.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot standby, slot ids and stuff
Date: 2009-01-16 00:55:51
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers


On Thu, 2009-01-15 at 22:10 +0000, Simon Riggs wrote:

> I notice that we are no longer able to record the databaseId for a
> connection, so query conflict resolution cannot be isolated to
> individual databases any longer.

Wrong way round. Incoming WAL records from dbOid on them, so we can
still check for conflicts against the db of the standby backends.
Phew!

> I'll leave all of the databaseId stuff in there in case we think of
> anything good.

Ripping out now as final part of earlier refactoring.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support