Table of Contents
Installing HTSQL creates a command-line application htsql-ctl:
$ htsql-ctl
The htsql-ctl script is a collection of subcommands called routines. The command-line syntax of htsql-ctl is
$ htsql-ctl <routine> [options] [arguments]
To get a list of routines, run:
$ htsql-ctl help
To describe a specific routine, run:
$ htsql-ctl help <routine>
Many routines require a connection URI parameter, which specifies how to connect to a database. The connection URI has the form:
<engine>://<user>:<pass>@<host>:<port>/<database>
For SQLite, <user>:<pass> and <host>:<port> are omitted, and <database> specifies the path to the database file. Thus, to connect to SQLite database htsql_demo.db located in the current directory, use the URI:
sqlite:htsql_demo.db
For PostgreSQL, if user:pass is omitted, the credentials of the current user are used; if host:port is omitted, the server is assumed to run on the local machine. Thus, to connect to a database htsql_demo running on the same host under credentials of the current user, use the URI:
pgsql:htsql_demo
Other database servers use similar conventions.
You can use option -p to prompt for a password if you do not want to specify the database password in a command line.
To start a command-line HTSQL shell, run:
$ htsql-ctl shell <DBURI>
That starts an interactive HTSQL shell, where you could type and execute HTSQL queries against the specified database.
For example, to start the shell on a PostgreSQL database htsql_demo, run:
$ htsql-ctl shell pgsql:htsql_demo
Interactive HTSQL Shell
Type 'help' for more information, 'exit' to quit the shell.
htsql_demo$
For more details on the shell routine, run:
$ htsql-ctl help shell
To start a demonstration web server running HTSQL, run:
$ htsql-ctl server <DBURI> [<HOST> [<PORT>]]
That starts an HTTP server on the address <HOST>:<PORT>. If <HOST> and <PORT> are omitted, the server is started on *:8080.
For example, to start the HTSQL web server against PostgreSQL database htsql_demo on localhost:3128, run:
$ htsql-ctl server pgsql:htsql_demo localhost 3128
Starting an HTSQL server on localhost:3128 over htsql_demo
If database connection configuration is provided by -C, you could use - as a place holder for the mandatory database URI parameter so that you could provide a HOST and PORT. For example, to run the server on localhost:80 you would write:
# htsql-ctl serve - localhost 80 -C demo-config.yaml
For more details on the server routine, run:
$ htsql-ctl help server
HTSQL has an extensive addon system that can be used to override almost every aspect of server operation or query construction with an adapter. Extensions can live in third party modules or be included in the HTSQL distribution as part of our supported “tweaks”. To list supported extensions, you could type:
$ htsql-ctl extension
To find out more about an extension, such as tweak.autolimit, write:
$ htsql-ctl extension tweak.autolimit
An extension can be enabled using -E parameter on the htsql-ctl command line. For example, to enable the tweak.meta addon on the HTSQL demo database, you’d write:
$ htsql-ctl shell pgsql:htsql_demo -E tweak.meta
Then, you could use the /meta() command registered by this addon:
Interactive HTSQL Shell
Type 'help' for more information, 'exit' to quit the shell.
htsql_demo$ /meta(/table)
Some addons have parameters which can be added to the command line. For example, the tweak.autolimit extension truncates output at limit number of rows. The default is 10k, but this value can be changed:
$ htsql-ctl shell pgsql:htsql_demo -E tweak.autolimit:limit=10
If more than one parameter is possible, use ”,” to separate them:
$ htsql-ctl serve pgsql:htsql_demo \
-E "tweak.shell:server-root='https://demo.htsql.org',limit=10"
HTSQL plugins are found using Python’s entry point feature. When a Python package is installed, it can register itself as an htsql.addon extension so that it could be loaded in this manner.
Extension configuration can be provided with a YAML (or JSON) file using -C on the command line. The top level of this file is a dictionary listing the plugins that are enabled. The second nesting level are plugin parameters, if any.
# demo-config.yaml
htsql:
db:
engine: pgsql
database: htsql_demo
username: htsql_demo
password: secret
host: localhost
port: 5432
tweak.autolimit:
limit: 1000
tweak.shell.default:
In this example, there are three plugins enabled, htsql (which is a mandatory plugin), tweak.autolimit and tweak.shell.default. The htsql plugin has one argument, db which has sub-structure providing connection information. You could then use this configuration file using -C:
# htsql-ctl shell -C demo-config.yaml
If both -E and -C are used, explicit command line options override values provided in the configuration file. This permits a configuration file to be used as a default perhaps using a different database URI.
A configuration parameter may be loaded from another file using !include tag. In the following example, definitions of custom functions for tweak.override addon are loaded from globals.yaml.
# demo-config.yaml
htsql:
db:
engine: pgsql
database: htsql_demo
tweak.override:
globals: !include "globals.yaml"
# globals.yaml
num_school: (count(@school))
trunc_month($d): (date(year($d), month($d), 1))
The HTSQL distribution ships with several built-in extensions which we describe here.
The HTSQL core that provides the SQL translator and an HTTP server is implemented in form of an addon htsql. This addon is always included.
Parameters:
The parameter db specifies how HTSQL connects to the database. It could be written in a short or an expanded form. In the short form, the parameter is an URI:
htsql:
db: pgsql://htsql_demo:secret@localhost:5432/htsql_demo
In the expanded form, a mapping notation is used:
htsql:
db:
engine: pgsql
username: htsql_demo
password: secret
host: localhost
port: 5432
database: htsql_demo
Every component except engine and database is optional.
The parameter password allows you to override the database password keeping the other connection parameters intact.
When debug option is set, text output, as well as some error messages will contain a respective SQL query.
Extensions in the engine namespace implement database backends. An appropriate extension is loaded automatically based on the engine parameter of the connection URI.
HTSQL supports the following database servers:
This addon truncates query output to a given number of rows (10,000 by default). It could be used to prevent accidental denial of service caused by queries returning a large number of rows.
The addon adds LIMIT <limit> to every generated SQL query.
Parameters:
tweak.autolimit:
limit: 1000
This addon adds CORS support to permit AJAX requests to the HTSQL server by web pages hosted on a different domain.
To prevent data leaks, web browsers do not allow AJAX requests to cross domain boundaries. The CORS (Cross-Origin Resource Sharing) specification defines a way for a server to provide a list of domains which are permitted to make AJAX requests.
CORS relies on browser support and may not work with older web browsers.
Parameters:
The origin parameter is a list of domains which are allowed to access the server. The value must either be * (means any, which is the default) or a space-separated list of host names:
http[s]://domain[:port]
Warning
The default settings permit HTSQL queries from any domain. Do not use the default settings with non-public data!
For example, to enable AJAX requests to the HTSQL demo server (located at https://demo.htsql.org/) from domains http://htsql.org/ and http://htsql.com/, we could write:
tweak.cors:
origin: http://htsql.org http://htsql.com
This extension provides protection against cross-site request forgery (CSRF) attacks.
A CSRF attack tricks the user to visit the attacker’s website, which then submits database queries to the HTSQL service from the user’s account. Even though the browser would not permit the malicious website to read the output of the queries, this form of attack can be used for denial of service or changing the data in the database. For background on CSRF, see http://en.wikipedia.org/wiki/Cross-site_request_forgery.
This addon requires all HTSQL requests to submit a secret token in two forms:
If the token is not submitted, the addon prevents the request from reading or updating any data in the database.
Parameters:
tweak.csrf:
allow_cs_read: true
allow_cs_write: false
This extension provides integration with Django web framework. It replaces built-in database introspection and connection handling with facilities provided by Django.
When using tweak.django addon, do not specify the connection URI as it is determined from the Django project database settings.
Parameters:
tweak.django:
settings: mysite.settings
This extension provides the following commands:
tweak.etl:
Warning
This extension is work-in-progress. The interface and implementation of commands may change in the future. Currently only PostgreSQL is supported.
This extension makes a database from a set of CSV files. Each source CSV file becomes a table in the database. The name of the table is derived from the file name; the column names are taken from the first row of the CSV file. The remaining rows become the records in the table.
The database is realized as an in-memory SQLite database. Use optional parameter cache-file to specify a persistent storage for the database.
Parameters:
List of entries describing the source files; each entry has the following fields:
tweak.filedb:
sources:
- file: school.csv
- file: department.csv
- file: program.csv
- file: course.csv
cache-file: cache.sqlite
This extensions allows you to create a gateway to another database and execute HTSQL queries against it.
Parameters:
A mapping of names to datebase configuration.
Each mapping entry creates a function which takes a query as a parameter and execute it against the gateway database.
Database configuration could be either connection URI or nested HTSQL configuration.
tweak.gateway:
gateways:
sqlite_gw: sqlite:database.sqlite
remote_gw:
htsql:
db:
engine: pgsql
database: remote_db
The gateway name becomes a command which executes a query against the gateway database:
/table/:sqlite_gw
This extension provides a meta database describing tables, columns and links of the primary database.
The tweak.meta addon has no parameters.
tweak.meta:
The meta database is composed of the following tables:
Use function meta() to make a query against the meta database.
To get a list of tables:
name |
---|
appointment |
class |
classification |
confidential |
Mapping call notation is also permitted:
name |
---|
appointment |
class |
classification |
confidential |
To list all columns of a given table in the output order:
table_name | name | domain | is_mandatory |
---|---|---|---|
course | department_code | text | true |
course | no | integer | true |
course | title | text | true |
course | credits | integer | false |
To get all links to and from a specific table:
table_name | name | is_singular | target_name | reverse_name |
---|---|---|---|---|
appointment | department | true | department | appointment |
course | department | true | department | course |
department | appointment | false | appointment | department |
department | course | false | course | department |
To describe the meta database itself, apply meta() twice:
name |
---|
column |
field |
link |
table |
tweak.meta:
This extension provides several ways to adjust database metadata. It allows the user to restrict access to specific tables and columns, specify additional database constraints, change the generated names for tables, columns and links, and define calculated attributes.
Parameters:
To restrict access to a specific set of tables, use parameters included-tables and excluded-tables. Parameter included-tables is a list of tables allowed to be used by HTSQL. If this parameter is provided, any table not in this list is completely hidden from the HTSQL processor. Parameter excluded-tables allows you to forbid access to a set of tables.
To forbid use of table confidential:
tweak.override:
excluded-tables: [confidential]
To allow access only to tables in ad and ed schemas:
tweak.override:
included-tables: [ad.*, ed.*]
We could also use block form of a sequence:
tweak.override:
included-tables:
- ad.*
- ed.*
In general, the table name may have the form <table> or <schema>.<table> and could include * meta-character to indicate any number of characters. Table names are case-insensitive and normalized: any non-alphanumeric character is replaced with _.
Similarly, to restrict access to a specific set of columns, use parameters included-columns and excluded-columns. Parameter exclude-columns is a list of column forbidden for use by the HTSQL processor.
To exclude column SSN of table confidential, write:
tweak.override:
excluded-columns: [confidential.ssn]
The column name may have the form <column>, <table>.<column>, or <schema>.<table>.<column> and could include * meta-character.
Note that columns listed in excluded-columns are removed together with all associated key constraints. If you want to hide a column from output, but keep associated primary and foreign keys, use the unlabeled-columns parameter.
HTSQL discovers database constraints from the schema definition. If some constraints are not explicitly defined in the schema, you may provide them using parameters not-nulls, unique-keys and foreign-keys.
Warning
When specifying additional constraints, make sure they are respected by the data; otherwise, the output produced by HTSQL may be invalid.
Parameter not-nulls is a list of columns with NOT NULL constraints.
To indicate that all columns named code and id, as well as column student.full_name do not contain NULL value, write:
tweak.override:
not-nulls: ["*.code", "*.id", student.full_name]
Note that we need to put the column patterns into quotes since YAML syntax does not permit * character at the beginning of a scalar value.
Parameter unique-keys is a list of key specifications of the form <table>(<column>,...)[!]. The trailing symbol ! indicates a PRIMARY KEY constraint. All columns in a primary key are marked as NOT NULL.
To indicate that school.code is a primary key and school.name is unique, write:
tweak.override:
unique-keys:
- school(code)!
- school(name)
Parameter foreign-keys is a list of foreign key specifications, which have the form <origin>(<column>,...) -> <target>(<column>,...). Target columns could be omitted when they coincide with the target primary key.
To define two foreign keys on table program, write:
tweak.override:
foreign-keys:
- program(school_code) -> school(code)
- program(school_code, part_of_code) -> program
In HTSQL, database tables, columns and links have a label, an identifier by which they are referred in HTSQL queries. Normally, entity labels coincide with their names, but parameters class-labels and field-labels allow you to assign them arbitrary labels. In addition, these parameters allow you to assign a label to an arbitrary HTSQL expression.
Use parameter class-labels to assign custom labels to tables and top-level HTSQL expressions. Parameter class-labels is a mapping; each key is a label, the corresponding value is either a table name or an HTSQL expression enclosed in parentheses.
To rename table classification to c14n and to assign a label to expression school^campus, write:
tweak.override:
class-labels:
c14n: classification
campus: (school^campus)
Calculated classes defined by class-labels may accept parameters. For instance, to add a class students_by_year() which takes the year of admission as an argument, write:
tweak.override:
class-labels:
students_by_year($year): (student?year(start_date)=$year)
Use parameter field-labels to assign custom labels to table fields. This parameter is a mapping; each key has a form <table>.<field>, where <table> is the table label, <field> is the field label to define. The corresponding value is one of:
A link specification is a comma-separated list of patterns <origin>(<column>,...) -> <target>(<column>,...). Each pattern must match a foreign key or a reverse foreign key. Column lists could be omitted if the foreign key could be determined uniquely.
To rename a column student.name to full_name, write:
tweak.override:
field-labels:
student.full_name: name
To add a many-to-many link between student and class via enrollment table, write:
tweak.override:
field-labels:
student.class: student -> enrollment, enrollment -> class
class.student: class -> enrollment, enrollment -> student
Note that link specifier student -> enrollment uniquely matches foreign key enrollment(student_id) -> student(id) while enrollment -> class matches foreign key enrollment(class_seq) -> class(class_seq) so we do not need to provide column lists.
The self-referential link from program to all included programs is called, by default, program.program_via_part_of. To assign a different label to this link, write:
tweak.override:
field-labels:
program.includes:
program(school_code, code) -> program(school_code, part_of_code)
To define a calculated field student.avg_grade, write:
tweak.override:
field-labels:
student.avg_grade: (avg(enrollment.grade))
Calculated fields may accept a parameter. To define a calculated field department.students_by_year() accepting the year of admission as a parameter, write:
tweak.override:
field-labels:
department.students_by_year($year): (student?year(start_date)=$year)
By default, when an HTSQL query does not contain a selector expression, all table columns are displayed. To set a custom list of fields for this case, use parameter field-orders.
tweak.override:
field-orders:
program: [code, title, degree]
Parameter unlabeled-tables is a list of tables without an assigned labels, which effectively hides the tables from the users. The tables could still be used in SQL generated by the HTSQL translator.
tweak.override:
unlabeled-tables: [enrollment]
Parameter unlabeled-columns is a list of columns without an assigned tables. Unlabeled columns are hidden from the users, but could be used in SQL generated by the HTSQL translator.
To hide all id columns, write:
tweak.override:
unlabeled-columns: [id, "*_id"]
Use parameter globals define global attributes and functions. This parameter is a mapping: each key is the attribute name with an optional list of parameters, the value is an HTSQL expression.
tweak.override:
globals:
num_school: (count(@school))
trunc_month($d): (date(year($d), month($d), 1))
This addons caches open database connections so that the same connection could be reused to execute more than one query. Use this addon with backends where opening a database connection is an expensive operation.
tweak.pool:
This extension adds a mechanism for serving static files via HTTP. This mechanism is used by other extensions to provide access to static resources such as Javascript and CSS files.
Parameters:
Usually you don’t need to enable this extension explicitly as it is done by extensions with static resources.
This extension adds an in-browser HTSQL editor called the HTSQL shell. The shell provides a visual query editor (based on CodeMirror) with support for syntax highlighting and code completion.
The shell is invoked by command /shell(), which takes an optional query to edit.
Parameters:
tweak.shell:
server-root: https://demo.htsql.org
limit: 100
Enable addon tweak.shell.default to make the shell the default output format.
tweak.shell.default:
This extension provides integration with SQLAlchemy toolkit. It replaces built-in HTSQL database introspection and connection handling with SQLAlchemy facilities.
When using tweak.sqlalchemy addon, do not specify the connection URI as it is determined from the SQLAlchemy engine settings.
Parameters:
The value must have the form <module>.<attr> or <package>.<module>.<attr>.
tweak.sqlalchemy:
engine: sademo.engine
metadata: sademo.metadata
This extension adds access to system catalog tables.
Currently, only PostgreSQL backend is supported.
This extension limits query execution to a given amount of time (1 minute by default). Use it to ensure against accidental denial of service caused by complex queries.
Parameters:
tweak.timeout:
timeout: 300
Currently, this addon is only supported with PostgreSQL.