Re: BUG #14441: trim function bug (hit a keyword somewhere internally)

Lists: pgsql-bugs
From: mark(dot)pether(at)gmail(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #14441: trim function bug (hit a keyword somewhere internally)
Date: 2016-11-30 01:17:10
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 14441
Logged by: mark pether
Email address: mark(dot)pether(at)gmail(dot)com
PostgreSQL version: 9.5.4
Operating system: macos
Description:

select TRIM(LEADING 'EXCLUDE' FROM 'C001');

Will incorrectly strip off the 'C' leaving '001'.

select TRIM(LEADING 'exclude' FROM 'C001');

Will not strip trim anything as expected.

Cheers


From: David Gould <daveg(at)sonic(dot)net>
To: mark(dot)pether(at)gmail(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #14441: trim function bug (hit a keyword somewhere internally)
Date: 2016-11-30 06:09:32
Message-ID: 20161129220932.123e13a1@engels
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-bugs

On Wed, 30 Nov 2016 01:17:10 +0000
mark(dot)pether(at)gmail(dot)com wrote:

> The following bug has been logged on the website:
>
> Bug reference: 14441
> Logged by: mark pether
> Email address: mark(dot)pether(at)gmail(dot)com
> PostgreSQL version: 9.5.4
> Operating system: macos
> Description:
>
> select TRIM(LEADING 'EXCLUDE' FROM 'C001');
>
> Will incorrectly strip off the 'C' leaving '001'.

Perhaps I'm misreading the doc, but that looks like the correct result.
That is, any character in 'EXCLUDE' will be stripped from the leading part of
'C001', ie, the 'C'. Why do you think this is a bug?

-dg

--
David Gould daveg(at)sonic(dot)net
If simplicity worked, the world would be overrun with insects.


From: mark pether <mark(dot)pether(at)gmail(dot)com>
To: David Gould <daveg(at)sonic(dot)net>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #14441: trim function bug (hit a keyword somewhere internally)
Date: 2016-11-30 06:16:16
Message-ID: CAEnjbrTZ+GMF_XCV1r4Q42TLv3mR_EAeutUGTE8=Fn3nTrTxRw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-bugs

Sorry, I misread the document it's not a bug, doco was unclear.

It replaces the largest match which may be all characters or simply
individual characters. I realised this after I created the bug.

I have switched to REGEXP_REPLACE to do the correct operation.

Regards

On Wed, Nov 30, 2016 at 5:09 PM, David Gould <daveg(at)sonic(dot)net> wrote:

> On Wed, 30 Nov 2016 01:17:10 +0000
> mark(dot)pether(at)gmail(dot)com wrote:
>
> > The following bug has been logged on the website:
> >
> > Bug reference: 14441
> > Logged by: mark pether
> > Email address: mark(dot)pether(at)gmail(dot)com
> > PostgreSQL version: 9.5.4
> > Operating system: macos
> > Description:
> >
> > select TRIM(LEADING 'EXCLUDE' FROM 'C001');
> >
> > Will incorrectly strip off the 'C' leaving '001'.
>
>
> Perhaps I'm misreading the doc, but that looks like the correct result.
> That is, any character in 'EXCLUDE' will be stripped from the leading part
> of
> 'C001', ie, the 'C'. Why do you think this is a bug?
>
> -dg
>
>
>
> --
> David Gould daveg(at)sonic(dot)net
> If simplicity worked, the world would be overrun with insects.
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: mark(dot)pether(at)gmail(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #14441: trim function bug (hit a keyword somewhere internally)
Date: 2016-11-30 06:24:04
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-bugs

mark(dot)pether(at)gmail(dot)com writes:
> select TRIM(LEADING 'EXCLUDE' FROM 'C001');
> Will incorrectly strip off the 'C' leaving '001'.

The code appears to believe that the first argument is a set of characters
that are to be removed from the front of the string argument until none
remain. For instance:

regression=# select TRIM(LEADING 'ABC' FROM 'CAABC001');
ltrim
-------
001
(1 row)

So the behavior you mention is expected.

I think what you're expecting is to strip off one or more occurrences of a
substring, but that's not what our version of TRIM does.

A look at the SQL standard indicates that it requires the first argument
to be *exactly one* character in length, and then TRIM strips leading
occurrence(s) of that character. So our existing implementation is a
plausible extension of that ... as is your expectation, but there's
certainly no reason to favor yours over what's there. Also:
* the PG code has acted this way since it was introduced in 1997.
* the code comments claim, and a bit of googling seems to confirm,
that this matches Oracle's behavior.
So I do not see us changing this.

You can easily build the behavior you want out of other spare parts.
Regexes would be my first weapon of choice:

regression=# select substring('C001' from '^(?:EXCLUDE)*(.*)$');
substring
-----------
C001
(1 row)

regression=# select substring('EXCLUDEC001' from '^(?:EXCLUDE)*(.*)$');
substring
-----------
C001
(1 row)

regression=# select substring('EXCLUDEEXCLUDEC001' from '^(?:EXCLUDE)*(.*)$');
substring
-----------
C001
(1 row)

See
https://www.postgresql.org/docs/current/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: mark pether <mark(dot)pether(at)gmail(dot)com>
Cc: David Gould <daveg(at)sonic(dot)net>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #14441: trim function bug (hit a keyword somewhere internally)
Date: 2016-11-30 06:51:36
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-bugs

mark pether <mark(dot)pether(at)gmail(dot)com> writes:
> Sorry, I misread the document it's not a bug, doco was unclear.
> It replaces the largest match which may be all characters or simply
> individual characters. I realised this after I created the bug.

Hmm, yeah, docs fail to make clear what happens with a multicharacter
trim spec. I think we could fix this with a better example.
Will do something about it.

regards, tom lane