Lists: | pgsql-committerspgsql-hackers |
---|
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-committers(at)lists(dot)postgresql(dot)org |
Subject: | pgsql: Use perfect hashing, instead of binary search, for keyword looku |
Date: | 2019-01-10 00:48:01 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-committers pgsql-hackers |
Use perfect hashing, instead of binary search, for keyword lookup.
We've been speculating for a long time that hash-based keyword lookup
ought to be faster than binary search, but up to now we hadn't found
a suitable tool for generating the hash function. Joerg Sonnenberger
provided the inspiration, and sample code, to show us that rolling our
own generator wasn't a ridiculous idea. Hence, do that.
The method used here requires a lookup table of approximately 4 bytes
per keyword, but that's less than what we saved in the predecessor commit
afb0d0712, so it's not a big problem. The time savings is indeed
significant: preliminary testing suggests that the total time for raw
parsing (flex + bison phases) drops by ~20%.
Patch by me, but it owes its existence to Joerg Sonnenberger;
thanks also to John Naylor for review.
Discussion: https://postgr.es/m/[email protected]
Branch
------
master
Modified Files
--------------
src/common/Makefile | 9 +-
src/common/kwlookup.c | 73 +++---
src/include/common/kwlookup.h | 4 +
src/include/parser/kwlist.h | 3 +-
src/interfaces/ecpg/preproc/Makefile | 13 +-
src/interfaces/ecpg/preproc/c_keywords.c | 51 ++--
src/interfaces/ecpg/preproc/c_kwlist.h | 3 +-
src/interfaces/ecpg/preproc/ecpg_kwlist.h | 3 +-
src/pl/plpgsql/src/Makefile | 13 +-
src/pl/plpgsql/src/pl_reserved_kwlist.h | 5 +-
src/pl/plpgsql/src/pl_unreserved_kwlist.h | 7 +-
src/tools/PerfectHash.pm | 376 ++++++++++++++++++++++++++++++
src/tools/gen_keywordlist.pl | 53 ++++-
src/tools/msvc/Solution.pm | 10 +-
14 files changed, 516 insertions(+), 107 deletions(-)
From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: pgsql: Use perfect hashing, instead of binary search, for keyword looku |
Date: | 2019-01-10 02:11:20 |
Message-ID: | CA+TgmoaSku2c9JQf2fLMJEcFwQb+FnR+NkNaSu=fR4kqHohwqg@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-committers pgsql-hackers |
On Wed, Jan 9, 2019 at 7:48 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Use perfect hashing, instead of binary search, for keyword lookup.
>
> We've been speculating for a long time that hash-based keyword lookup
> ought to be faster than binary search, but up to now we hadn't found
> a suitable tool for generating the hash function. Joerg Sonnenberger
> provided the inspiration, and sample code, to show us that rolling our
> own generator wasn't a ridiculous idea. Hence, do that.
>
> The method used here requires a lookup table of approximately 4 bytes
> per keyword, but that's less than what we saved in the predecessor commit
> afb0d0712, so it's not a big problem. The time savings is indeed
> significant: preliminary testing suggests that the total time for raw
> parsing (flex + bison phases) drops by ~20%.
>
> Patch by me, but it owes its existence to Joerg Sonnenberger;
> thanks also to John Naylor for review.
Wow. That is a VERY significant improvement.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company