String methods, functions and operators

Cast to String

Function HTSQL SQL Javascript Python
Cast to String string(value) CAST(value AS TEXT) String(value) str(value)

Signatures:

  • htsql:string(Any value) -> String

Type resolution: Untyped => String.

Question: Should we use more sophisticated cast for date/time related functions to make the output uniform across different database engines? Should the output be modeled after the string representation of corresponding Python types then?

TODO: Demonstrate a string representation for each type.

String comparison

Issues with the string comparison (no longer relevant).

The following comparison operators could be useful:

  • normal comparison with: NULL = NULL => NULL, NULL != NULL => NULL
  • case insensitive comparison, with the same NULL semantics, for strings only
  • comparison with different NULL semantics: NULL = NULL => TRUE, NULL != NULL => FALSE

The current comparison semantics:

  • the operators =, !=: the normal comparison for all domains except Strings; for Strings, it is a case-insensitive comparison.
  • the operators ==, !==: for Strings only, case-sensitive comparison.
  • no way to express NULL = NULL is TRUE semantics.

Problems:

  • x=y is true while x<=y is false?! Since we have two classes of the equality operator, shouldn't we be able to perform > and < comparisons in both case-sensitive and case-insensitive manner?
  • no uniform "is equal to" and "is not equal to" operator. you need to use == for String and = for everything else.
  • no "is identical to" and "is not identical to" operator, like IS NOT DISTINCT FROM/IS DISTINCT FROM in PostgreSQL, is/is not in Python and '==='/!== in Javascript.

Solution:

  • ???
  • make =/!= be a normal comparison operator (???)
  • add another, case insensitive comparison operator (symbol? PostgreSQL uses ~* and !~* for case insensitive regular expression search. Maybe =*, !=*, <*, <=*, >*, >=*?)
  • make ==/!== be an "is identical to" operator (in JS: ===/!== ???)
  • ???

Basic string operations

Function HTSQL SQL Javascript Python
Number of characters s.length() CHARACTER_LENGTH(s) s.length len(s)
The k-th character s[k] SUBSTRING(s FROM k+1 FOR 1) s[k] s[k]
Extract a substring (left, size) s.substr(l[,sz]) SUBSTRING(s FROM l+1 FOR sz) s.substr(l[, sz]) --
Extract a substring (left, right) s[l:r] -- s.slice(l[, r]) (NB s.substring(l[, r])) s[l:r]
Replace a substring (left, size) s.overlay(new,l[,sz]) OVERLAY(s PLACING new FROM l+1 FOR sz) -- --
Concatenation s1+s2 s1 || s2 s1 + s2, s1.concat(s2, ...) s1 + s2
Repetition s.repeat(n) REPEAT(s, n) -- s * n

Signatures:

  • String . htsql:length() -> Number
  • htsql:operator:"[.]"(String s, Number k) -> String
  • String . htsql:substr(Number l[, Number sz]) -> String
  • htsql:operator:"[.:.]"(String s, Number l, Number r) -> String
  • htsql:operator:"[.:]"(String s, Number l) -> String
  • htsql:operator:"[:.]"(String s, Number r) -> String
  • htsql:operator:"[:]"(String s) -> String
  • String . htsql:overlay(String new, Number l[, Number sz])
  • htsql:operator:"+"(String a, String b) -> String
  • String . htsql:repeat(Number n) -> String

Note: In HTSQL, we count characters from 0 (Python, JS), rather than from 1 (SQL).

Filtering and formatting functions

Function HTSQL SQL Javascript Python
Strip leading spaces/other characters s.ltrim([chars]) TRIM(LEADING [chars] FROM s) -- s.lstrip([chars])
Strip trailing spaces/other characters s.rtrim([chars]) TRIM(TRAILING [chars] FROM s) -- s.rstrip([chars])
Strip both leading and trailing spaces s.trim([chars]) TRIM(BOTH [chars] FROM s) -- s.strip([chars])
Left-justify a string s.lpad(w[,fillchar]) LPAD(s, w[, fillchar]) -- s.ljust(w[, fillchar])
Right-justify a string s.rpad(w[,fillchar]) RPAD(s, w[, fillchar]) -- s.rjust(w[, fillchar])
Convert all characters to lowercase s.lower() LOWER(s) s.toLowerCase() s.lower()
Convert all characters to uppercase s.upper() UPPER(s) s.toUpperCase() s.upper()
Translate (apply a map for) characters s.translate(fromchars,tochars) TRANSLATE(s, fromchars, tochars) -- s.translate(fromtotable, delchars)

Signatures:

  • String . htsql:ltrim([String chars]) -> String
  • String . htsql:rtrim([String chars]) -> String
  • String . htsql:trim([String chars]) -> String
  • String . htsql:lpad(Number w[, String fillchar]) -> String
  • String . htsql:rpad(Number w[, String fillchar]) -> String
  • String . htsql:lower() -> String
  • String . htsql:upper() -> String
  • String . htsql:translate(String fromchars, String tochars) -> String

Note that padding/justifying functions in SQL (but not in Python) truncate the original string if it is too long.

Search and replace functions

Function HTSQL SQL Javascript Python
Check if a string contains a substring s.contains(sub) POSITION(sub IN s) != 0 s.indexOf(sub[, l]) != -1 sub in s
A position of a substring s.position(sub) POSITION(sub IN s) - 1 s.indexOf(sub[, l]) s.find(sub[, l[, r]])
The number of occurences of a substring -- -- -- s.count(sub[, l[, r]])
Replace a substring s.replace(old,new) REPLACE(s, old, new) s.replace(old, new) s.replace(old, new[, count])

Signatures:

  • String . htsql:contains(String sub) -> Boolean
  • String . htsql:position(String sub) -> Number
  • String . htsql:replace(String old, String new) -> String

Regular expressions functions and operators

Function HTSQL SQL Javascript Python
Check a string against a pattern (case-insensitive) s~pat, s!~pat s ~* pat, s !~* pat (/pat/i).test(s) re.search(pat, s, re.I) is not None
Check a string against a pattern (case-sensitive) s~~pat, s!~~pat s ~ pat, s !~ pat (/pat/).test(s) re.search(pat, s) is not None
Find a position of a matching substring -- -- s.search(/pat/) re.search(pat, s).start()
Find a matching substring s.match(pat) SUBSTRING(s FROM pat) s.match(/pat/)[0] re.search(pat, s).group()
Replace all occurences of a pattern with an argument s.substitute(pat, new) REGEXP_REPLACE(s, pat, new[, flags]) s.replace(/pat/, new[, count]) re.sub(pat, new, s[, count])

Signatures:

  • htsql:operation:"~"(String s, String pat) -> Boolean
  • htsql:operation:"!~"(String s, String pat) -> Boolean
  • htsql:operation:"~~"(String s, String pat) -> Boolean
  • htsql:operation:"!~~"(String s, String pat) -> Boolean
  • String . htsql:match(String pat) -> Boolean
  • String . htsql:substitute(String pat, String new) -> Boolean

FIXME: Unicode issues?