Lists: | pgsql-committerspgsql-hackers |
---|
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | pgsql-committers(at)postgresql(dot)org |
Subject: | pgsql: Add isolationtester spec for old heapam.c bug |
Date: | 2016-02-26 20:12:12 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-committers pgsql-hackers |
Add isolationtester spec for old heapam.c bug
In 0e5680f4737a, I fixed a bug in heapam that caused spurious deadlocks
when multiple updates concurrently attempted to modify the old version
of an updated tuple whose new version was key-share locked. I proposed
an isolationtester spec file that reproduced the bug, but back then
isolationtester wasn't mature enough to be able to run it. Now that
38f8bdcac498 is in the tree, we can have this spec file too.
Branch
------
master
Modified Files
--------------
src/test/isolation/expected/tuplelock-update.out | 24 ++++++++++++++++++++
src/test/isolation/isolation_schedule | 1 +
src/test/isolation/specs/tuplelock-update.spec | 28 ++++++++++++++++++++++++
3 files changed, 53 insertions(+)
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | pgsql-committers(at)postgresql(dot)org |
Subject: | Re: pgsql: Add isolationtester spec for old heapam.c bug |
Date: | 2016-02-27 05:39:47 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-committers pgsql-hackers |
Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> Add isolationtester spec for old heapam.c bug
Hmmm ....
This failure looks a lot like the timing-related problems I was chasing
last week with deadlock-hard on the CLOBBER_CACHE_ALWAYS critters.
spoonbill isn't CLOBBER_CACHE_ALWAYS, but it uses some weird malloc debug
stuff that slows it down by similar orders of magnitude. You seem to need
to think about how to make this test less timing-dependent.
regards, tom lane
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | pgsql-committers(at)postgresql(dot)org |
Subject: | Re: pgsql: Add isolationtester spec for old heapam.c bug |
Date: | 2016-02-29 17:06:56 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-committers pgsql-hackers |
I wrote:
> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
>> Add isolationtester spec for old heapam.c bug
> Hmmm ....
> This failure looks a lot like the timing-related problems I was chasing
> last week with deadlock-hard on the CLOBBER_CACHE_ALWAYS critters.
> spoonbill isn't CLOBBER_CACHE_ALWAYS, but it uses some weird malloc debug
> stuff that slows it down by similar orders of magnitude. You seem to need
> to think about how to make this test less timing-dependent.
The plot thickens:
guaibasaurus is not a particularly slow machine, and it's not using any
special build flags AFAICT. So I'm not sure what to make of this case,
except that it proves the timing problem can manifest on normal builds.
regards, tom lane
From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-committers(at)postgresql(dot)org |
Subject: | Re: pgsql: Add isolationtester spec for old heapam.c bug |
Date: | 2016-02-29 17:44:06 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-committers pgsql-hackers |
Tom Lane wrote:
> I wrote:
> > Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> >> Add isolationtester spec for old heapam.c bug
>
> > Hmmm ....
>
> > http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=spoonbill&dt=2016-02-27%2000%3A00%3A06
>
> > This failure looks a lot like the timing-related problems I was chasing
> > last week with deadlock-hard on the CLOBBER_CACHE_ALWAYS critters.
> > spoonbill isn't CLOBBER_CACHE_ALWAYS, but it uses some weird malloc debug
> > stuff that slows it down by similar orders of magnitude. You seem to need
> > to think about how to make this test less timing-dependent.
>
> The plot thickens:
>
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=guaibasaurus&dt=2016-02-29%2016%3A17%3A01
>
> guaibasaurus is not a particularly slow machine, and it's not using any
> special build flags AFAICT. So I'm not sure what to make of this case,
> except that it proves the timing problem can manifest on normal builds.
Hmm, I suppose I could fix this by using three different advisory locks
rather than a single one. (My assumption is that the timing dependency
is the order in which the backends are awakened when the advisory lock
is released.) I would release the locks one by one rather than all
together.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | pgsql-committers(at)postgresql(dot)org |
Subject: | Re: pgsql: Add isolationtester spec for old heapam.c bug |
Date: | 2016-02-29 18:08:42 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-committers pgsql-hackers |
Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Tom Lane wrote:
>> guaibasaurus is not a particularly slow machine, and it's not using any
>> special build flags AFAICT. So I'm not sure what to make of this case,
>> except that it proves the timing problem can manifest on normal builds.
> Hmm, I suppose I could fix this by using three different advisory locks
> rather than a single one. (My assumption is that the timing dependency
> is the order in which the backends are awakened when the advisory lock
> is released.) I would release the locks one by one rather than all
> together.
Sounds plausible. You would probably need several seconds' pg_sleep() in
between the lock releases to ensure that even on slow/overloaded machines,
there's enough time for all wakened backends to do what they're supposed
to do.
regards, tom lane
From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [COMMITTERS] pgsql: Add isolationtester spec for old heapam.c bug |
Date: | 2016-02-29 19:42:09 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-committers pgsql-hackers |
Tom Lane wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> > Tom Lane wrote:
> >> guaibasaurus is not a particularly slow machine, and it's not using any
> >> special build flags AFAICT. So I'm not sure what to make of this case,
> >> except that it proves the timing problem can manifest on normal builds.
>
> > Hmm, I suppose I could fix this by using three different advisory locks
> > rather than a single one. (My assumption is that the timing dependency
> > is the order in which the backends are awakened when the advisory lock
> > is released.) I would release the locks one by one rather than all
> > together.
>
> Sounds plausible.
Pushed. We'll see how they behave now.
> You would probably need several seconds' pg_sleep() in between the
> lock releases to ensure that even on slow/overloaded machines, there's
> enough time for all wakened backends to do what they're supposed to
> do.
I don't really like sleeps in tests, because instead of 0.01 seconds in
the normal case it now takes 10.01 seconds. But fixing it is more
trouble than it's probably worth, so I added 5s sleeps.
If someone wants to get rid of that idle time (i.e. have a mechanism
that causes it to stop sleeping when the update is done) I would be
happy to give it a look.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services