String methods, functions and operators
Cast to String
| Function | HTSQL | SQL | Javascript | Python |
| Cast to String | string(value) | CAST(value AS TEXT) | String(value) | str(value) |
Signatures:
- htsql:string(Any value) -> String
Type resolution: Untyped => String.
Question: Should we use more sophisticated cast for date/time related functions to make the output uniform across different database engines? Should the output be modeled after the string representation of corresponding Python types then?
TODO: Demonstrate a string representation for each type.
String comparison
Issues with the string comparison (no longer relevant).
The following comparison operators could be useful:
- normal comparison with: NULL = NULL => NULL, NULL != NULL => NULL
- case insensitive comparison, with the same NULL semantics, for strings only
- comparison with different NULL semantics: NULL = NULL => TRUE, NULL != NULL => FALSE
The current comparison semantics:
- the operators =, !=: the normal comparison for all domains except Strings; for Strings, it is a case-insensitive comparison.
- the operators ==, !==: for Strings only, case-sensitive comparison.
- no way to express NULL = NULL is TRUE semantics.
Problems:
- x=y is true while x<=y is false?! Since we have two classes of the equality operator, shouldn't we be able to perform > and < comparisons in both case-sensitive and case-insensitive manner?
- no uniform "is equal to" and "is not equal to" operator. you need to use == for String and = for everything else.
- no "is identical to" and "is not identical to" operator, like IS NOT DISTINCT FROM/IS DISTINCT FROM in PostgreSQL, is/is not in Python and '==='/!== in Javascript.
Solution:
- ???
- make =/!= be a normal comparison operator (???)
- add another, case insensitive comparison operator (symbol? PostgreSQL uses ~* and !~* for case insensitive regular expression search. Maybe =*, !=*, <*, <=*, >*, >=*?)
- make ==/!== be an "is identical to" operator (in JS: ===/!== ???)
- ???
Basic string operations
| Function | HTSQL | SQL | Javascript | Python |
| Number of characters | s.length() | CHARACTER_LENGTH(s) | s.length | len(s) |
| The k-th character | s[k] | SUBSTRING(s FROM k+1 FOR 1) | s[k] | s[k] |
| Extract a substring (left, size) | s.substr(l[,sz]) | SUBSTRING(s FROM l+1 FOR sz) | s.substr(l[, sz]) | -- |
| Extract a substring (left, right) | s[l:r] | -- | s.slice(l[, r]) (NB s.substring(l[, r])) | s[l:r] |
| Replace a substring (left, size) | s.overlay(new,l[,sz]) | OVERLAY(s PLACING new FROM l+1 FOR sz) | -- | -- |
| Concatenation | s1+s2 | s1 || s2 | s1 + s2, s1.concat(s2, ...) | s1 + s2 |
| Repetition | s.repeat(n) | REPEAT(s, n) | -- | s * n |
Signatures:
- String . htsql:length() -> Number
- htsql:operator:"[.]"(String s, Number k) -> String
- String . htsql:substr(Number l[, Number sz]) -> String
- htsql:operator:"[.:.]"(String s, Number l, Number r) -> String
- htsql:operator:"[.:]"(String s, Number l) -> String
- htsql:operator:"[:.]"(String s, Number r) -> String
- htsql:operator:"[:]"(String s) -> String
- String . htsql:overlay(String new, Number l[, Number sz])
- htsql:operator:"+"(String a, String b) -> String
- String . htsql:repeat(Number n) -> String
Note: In HTSQL, we count characters from 0 (Python, JS), rather than from 1 (SQL).
Filtering and formatting functions
| Function | HTSQL | SQL | Javascript | Python |
| Strip leading spaces/other characters | s.ltrim([chars]) | TRIM(LEADING [chars] FROM s) | -- | s.lstrip([chars]) |
| Strip trailing spaces/other characters | s.rtrim([chars]) | TRIM(TRAILING [chars] FROM s) | -- | s.rstrip([chars]) |
| Strip both leading and trailing spaces | s.trim([chars]) | TRIM(BOTH [chars] FROM s) | -- | s.strip([chars]) |
| Left-justify a string | s.lpad(w[,fillchar]) | LPAD(s, w[, fillchar]) | -- | s.ljust(w[, fillchar]) |
| Right-justify a string | s.rpad(w[,fillchar]) | RPAD(s, w[, fillchar]) | -- | s.rjust(w[, fillchar]) |
| Convert all characters to lowercase | s.lower() | LOWER(s) | s.toLowerCase() | s.lower() |
| Convert all characters to uppercase | s.upper() | UPPER(s) | s.toUpperCase() | s.upper() |
| Translate (apply a map for) characters | s.translate(fromchars,tochars) | TRANSLATE(s, fromchars, tochars) | -- | s.translate(fromtotable, delchars) |
Signatures:
- String . htsql:ltrim([String chars]) -> String
- String . htsql:rtrim([String chars]) -> String
- String . htsql:trim([String chars]) -> String
- String . htsql:lpad(Number w[, String fillchar]) -> String
- String . htsql:rpad(Number w[, String fillchar]) -> String
- String . htsql:lower() -> String
- String . htsql:upper() -> String
- String . htsql:translate(String fromchars, String tochars) -> String
Note that padding/justifying functions in SQL (but not in Python) truncate the original string if it is too long.
Search and replace functions
| Function | HTSQL | SQL | Javascript | Python |
| Check if a string contains a substring | s.contains(sub) | POSITION(sub IN s) != 0 | s.indexOf(sub[, l]) != -1 | sub in s |
| A position of a substring | s.position(sub) | POSITION(sub IN s) - 1 | s.indexOf(sub[, l]) | s.find(sub[, l[, r]]) |
| | -- | -- | -- | s.count(sub[, l[, r]]) |
| Replace a substring | s.replace(old,new) | REPLACE(s, old, new) | s.replace(old, new) | s.replace(old, new[, count]) |
Signatures:
- String . htsql:contains(String sub) -> Boolean
- String . htsql:position(String sub) -> Number
- String . htsql:replace(String old, String new) -> String
Regular expressions functions and operators
| Function | HTSQL | SQL | Javascript | Python |
| Check a string against a pattern (case-insensitive) | s~pat, s!~pat | s ~* pat, s !~* pat | (/pat/i).test(s) | re.search(pat, s, re.I) is not None |
| Check a string against a pattern (case-sensitive) | s~~pat, s!~~pat | s ~ pat, s !~ pat | (/pat/).test(s) | re.search(pat, s) is not None |
| | -- | -- | s.search(/pat/) | re.search(pat, s).start() |
| Find a matching substring | s.match(pat) | SUBSTRING(s FROM pat) | s.match(/pat/)[0] | re.search(pat, s).group() |
| Replace all occurences of a pattern with an argument | s.substitute(pat, new) | REGEXP_REPLACE(s, pat, new[, flags]) | s.replace(/pat/, new[, count]) | re.sub(pat, new, s[, count]) |
Signatures:
- htsql:operation:"~"(String s, String pat) -> Boolean
- htsql:operation:"!~"(String s, String pat) -> Boolean
- htsql:operation:"~~"(String s, String pat) -> Boolean
- htsql:operation:"!~~"(String s, String pat) -> Boolean
- String . htsql:match(String pat) -> Boolean
- String . htsql:substitute(String pat, String new) -> Boolean
FIXME: Unicode issues?
![(please configure the [header_logo] section in trac.ini)](/chrome/site/your_project_logo.png)