Re: Repeating Append operation

Lists: pgsql-hackers
From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Repeating Append operation
Date: 2010-03-19 18:09:50
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hi all,

explain
select v from (
select array(
select 1
union all
select 2) as v
from (select 1) ) as s
where v is not null;

The plan looks like:

QUERY PLAN
--------------------------------------------------------
Result (cost=0.08..0.10 rows=1 width=0)
One-Time Filter: ($1 IS NOT NULL)
InitPlan
-> Append (cost=0.00..0.04 rows=2 width=0)
-> Result (cost=0.00..0.01 rows=1 width=0)
-> Result (cost=0.00..0.01 rows=1 width=0)
-> Append (cost=0.00..0.04 rows=2 width=0)
-> Result (cost=0.00..0.01 rows=1 width=0)
-> Result (cost=0.00..0.01 rows=1 width=0)
-> Result (cost=0.00..0.01 rows=1 width=0)
(10 rows)

It seems that that the UNION ALL part of the query will be executed
twice. If I remove the WHERE clause the I see only one Append operation. I
had a suspicion that its just the display of the plan that showed the same
plan twice, but then I noticed that the overall cost of the query also drops
making me think that this UNION ALL will actually be executed twice.

The plan without the WHERE clause is:
QUERY PLAN
----------------------------------------------------------------------
Subquery Scan __unnamed_subquery_0 (cost=0.04..0.06 rows=1 width=0)
InitPlan
-> Append (cost=0.00..0.04 rows=2 width=0)
-> Result (cost=0.00..0.01 rows=1 width=0)
-> Result (cost=0.00..0.01 rows=1 width=0)
-> Result (cost=0.00..0.01 rows=1 width=0)
(6 rows)

I had seen this with a bigger query on actual tables, and this is just a
reproducible test case. In the original query I see 'SubPlan' instead of the
'InitPlan' seen here.

Head of plan with WHERE clause:
Seq Scan on "Person" p (cost=0.00..280486580881.10 rows=1373415 width=4)

Head of plan without WHERE clause:
Seq Scan on "Person" p (cost=0.00..140594841889.03 rows=1380317 width=4)

Is there a way to avoid this double evaluation?

Thanks in advance.
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.enterprisedb.com

singh(dot)gurjeet(at){ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
Cc: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Repeating Append operation
Date: 2010-03-21 20:29:53
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 19, 2010 at 2:09 PM, Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> wrote:
>     Is there a way to avoid this double evaluation?

Maybe with a CTE?

WITH x AS (...) SELECT ...

It does look like surprising behavior.

...Robert


From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Repeating Append operation
Date: 2010-03-23 18:09:34
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Mar 21, 2010 at 4:29 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Mar 19, 2010 at 2:09 PM, Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
> wrote:
> > Is there a way to avoid this double evaluation?
>
> Maybe with a CTE?
>
> WITH x AS (...) SELECT ...
>
> It does look like surprising behavior.
>

It was discussed on the IRC that same day, and RhodiumToad (Andrew) pointed
out that this behaviour is because of subquery un-nesting. Putting an OFFSET
0 clause (hint) in the inline view prevents it from being merged with the
outer query:

explain
select v from (
select array(
select 1
union all
select 2) as v
from (select 1) offset 0) as s
where v is not null;
QUERY PLAN
----------------------------------------------------------------------------------
Subquery Scan s (cost=0.04..0.07 rows=1 width=32)
Filter: (v IS NOT NULL)
-> Limit (cost=0.04..0.06 rows=1 width=0)
InitPlan
-> Append (cost=0.00..0.04 rows=2 width=0)
-> Result (cost=0.00..0.01 rows=1 width=0)
-> Result (cost=0.00..0.01 rows=1 width=0)
-> Subquery Scan __unnamed_subquery_0 (cost=0.00..0.02 rows=1
width=0)
-> Result (cost=0.00..0.01 rows=1 width=0)
(9 rows)

This raises the point that we do subquery un-nesting purely on
heuristics, and not on cost basis. I guess we should be be doing a cost
comparison too. I think that this un-nesting happens quite before we start
generating alternative plans for cost comparisons, and that we might not
have costs to compare at this stage, but IMHO we should somehow incorporate
cost comparisons too.

Regards,

--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.enterprisedb.com

singh(dot)gurjeet(at){ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
Cc: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Repeating Append operation
Date: 2010-03-23 18:55:42
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 23, 2010 at 2:09 PM, Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> wrote:
> On Sun, Mar 21, 2010 at 4:29 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>
>> On Fri, Mar 19, 2010 at 2:09 PM, Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
>> wrote:
>> >     Is there a way to avoid this double evaluation?
>>
>> Maybe with a CTE?
>>
>> WITH x AS (...) SELECT ...
>>
>> It does look like surprising behavior.
>
> It was discussed on the IRC that same day, and RhodiumToad (Andrew) pointed
> out that this behaviour is because of subquery un-nesting. Putting an OFFSET
> 0 clause (hint) in the inline view prevents it from being merged with the
> outer query:
>
> explain
> select v from (
> select array(
>         select 1
>         union all
>         select 2) as v
> from (select 1) offset 0) as s
> where v is not null;
>                                     QUERY PLAN
> ----------------------------------------------------------------------------------
>  Subquery Scan s  (cost=0.04..0.07 rows=1 width=32)
>    Filter: (v IS NOT NULL)
>    ->  Limit  (cost=0.04..0.06 rows=1 width=0)
>          InitPlan
>            ->  Append  (cost=0.00..0.04 rows=2 width=0)
>                  ->  Result  (cost=0.00..0.01 rows=1 width=0)
>                  ->  Result  (cost=0.00..0.01 rows=1 width=0)
>          ->  Subquery Scan __unnamed_subquery_0  (cost=0.00..0.02 rows=1
> width=0)
>                ->  Result  (cost=0.00..0.01 rows=1 width=0)
> (9 rows)
>
>    This raises the point that we do subquery un-nesting purely on
> heuristics, and not on cost basis. I guess we should be be doing a cost
> comparison too. I think that this un-nesting happens quite before we start
> generating alternative plans for cost comparisons, and that we might not
> have costs to compare at this stage, but IMHO we should somehow incorporate
> cost comparisons too.

I don't think this is right. Flattening the subquery doesn't prevent
the join from being implemented a nested loop, which is essentially
what happens when it's treated as an initplan. It just allows other
options also.

...Robert