Lists: | pgsql-committers pgsql-hackers |
---|
From: | Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> |
---|---|
To: | pgsql-committers(at)postgresql(dot)org |
Subject: | pgsql: In COPY, insert tuples to the heap in batches. |
Date: | 2011-11-09 09:06:59 |
Message-ID: | [email protected] |
In COPY, insert tuples to the heap in batches.
This greatly reduces the WAL volume, especially when the table is narrow.
The overhead of locking the heap page is also reduced. Reduced WAL traffic
also makes it scale a lot better, if you run multiple COPY processes at
the same time.
Branch
------
master
Modified Files
--------------
src/backend/access/heap/heapam.c | 484 ++++++++++++++++++++++++++++++++++----
src/backend/commands/copy.c | 166 ++++++++++++-
src/backend/postmaster/pgstat.c | 6 +-
src/include/access/heapam.h | 2 +
src/include/access/htup.h | 31 +++
src/include/pgstat.h | 2 +-
6 files changed, 629 insertions(+), 62 deletions(-)
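The WAL saving described in the commit message can be illustrated with a toy cost model. This is only a sketch: the constants and the function names (wal_bytes_single, wal_bytes_batched) are hypothetical and do not reflect PostgreSQL's actual WAL record layout. It assumes each WAL record pays a fixed header cost, and that a batched record pays that cost once plus a small per-tuple descriptor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes, chosen only for illustration; not PostgreSQL's real values. */
#define TOY_RECORD_HEADER_BYTES 32  /* fixed cost paid once per WAL record */
#define TOY_PER_TUPLE_META_BYTES 4  /* per-tuple descriptor inside a batched record */

/* WAL bytes when every tuple is logged in its own record. */
static size_t wal_bytes_single(size_t ntuples, size_t tuple_bytes)
{
    return ntuples * (TOY_RECORD_HEADER_BYTES + tuple_bytes);
}

/* WAL bytes when up to batch_size tuples share one record. */
static size_t wal_bytes_batched(size_t ntuples, size_t tuple_bytes,
                                size_t batch_size)
{
    assert(batch_size > 0);

    /* Number of records needed, rounded up. */
    size_t nrecords = (ntuples + batch_size - 1) / batch_size;

    return nrecords * TOY_RECORD_HEADER_BYTES
         + ntuples * (TOY_PER_TUPLE_META_BYTES + tuple_bytes);
}
```

Under these made-up constants, 1000 tuples of 8 bytes cost 40000 bytes one record at a time and 12320 bytes in batches of 100. With 200-byte tuples the figures are 232000 and 204320, so the relative saving shrinks as rows get wider, which matches the remark about narrow tables.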
From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [COMMITTERS] pgsql: In COPY, insert tuples to the heap in batches. |
Date: | 2011-11-09 13:25:46 |
Message-ID: | CA+U5nMLeWrdDJK32AhCzdpshrHhDNdew1ppBiu+AOpwnCnNBPw@mail.gmail.com |
Lists: | pgsql-committers pgsql-hackers |
On Wed, Nov 9, 2011 at 9:06 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)iki(dot)fi> wrote:
> In COPY, insert tuples to the heap in batches.
>
> This greatly reduces the WAL volume, especially when the table is narrow.
> The overhead of locking the heap page is also reduced. Reduced WAL traffic
> also makes it scale a lot better, if you run multiple COPY processes at
> the same time.
Sounds good.
I can't see where this applies backup blocks. If it does, can you
document why/where/how it differs from other WAL records?
There's no need for conflict processing on replay with this new WAL
record type. But you should document that and alter the comments that
say it is necessary. Search "conflict".
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
---|---|
To: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [COMMITTERS] pgsql: In COPY, insert tuples to the heap in batches. |
Date: | 2011-11-09 18:42:53 |
Message-ID: | [email protected] |
Lists: | pgsql-committers pgsql-hackers |
On 09.11.2011 15:25, Simon Riggs wrote:
> On Wed, Nov 9, 2011 at 9:06 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)iki(dot)fi> wrote:
>> In COPY, insert tuples to the heap in batches.
>>
>> This greatly reduces the WAL volume, especially when the table is narrow.
>> The overhead of locking the heap page is also reduced. Reduced WAL traffic
>> also makes it scale a lot better, if you run multiple COPY processes at
>> the same time.
>
> Sounds good.
>
> I can't see where this applies backup blocks. If it does, can you
> document why/where/how it differs from other WAL records?
Good catch, I missed that. I copied the redo function from normal
insertion, but missed that heap_redo() takes care of backup blocks for
you, while heap2_redo() does not.
I'll go fix that.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
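The kind of omission discussed in this exchange can be sketched with a toy redo dispatcher. This is a simplified model, not PostgreSQL source: ToyPage, ToyRecord, toy_insert_handler, toy_redo_central and toy_redo_per_handler are all hypothetical names. It assumes a handler that skips its work when the record carries a full-page image ("backup block"), trusting that the dispatcher has already restored that image. Moving such a handler under a dispatcher that does not restore images silently loses the change.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PAGE_SIZE 16  /* hypothetical page size for the sketch */

/* Toy heap page: a byte array plus a fill counter. */
typedef struct ToyPage {
    char data[TOY_PAGE_SIZE];
    int  used;
} ToyPage;

/* Toy WAL record: an incremental change, optionally with a full-page image. */
typedef struct ToyRecord {
    bool    has_backup_block;  /* true if backup_image is valid */
    ToyPage backup_image;      /* page contents after the change */
    char    tuple_byte;        /* incremental change: one byte to append */
} ToyRecord;

/* Handler written on the assumption that the caller restored any backup block. */
static void toy_insert_handler(ToyPage *page, const ToyRecord *rec)
{
    if (rec->has_backup_block)
        return;  /* assumes the full-page image was already applied */

    assert(page->used < TOY_PAGE_SIZE);
    page->data[page->used++] = rec->tuple_byte;
}

/* Dispatcher that restores backup blocks centrally before calling the handler. */
static void toy_redo_central(ToyPage *page, const ToyRecord *rec)
{
    if (rec->has_backup_block)
        *page = rec->backup_image;
    toy_insert_handler(page, rec);
}

/* Dispatcher that leaves restoration to each handler; this handler never does it. */
static void toy_redo_per_handler(ToyPage *page, const ToyRecord *rec)
{
    toy_insert_handler(page, rec);  /* nothing restores the image here */
}
```

Replaying a record that carries a backup block through toy_redo_central leaves the page with the inserted byte, while the same record through toy_redo_per_handler leaves the page unchanged. In this model the fix is for the handler itself to restore the image when its dispatcher does not.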