Table of Contents
HTSQL is not a full-fledged database system. As opposed to regular data stores, it does not include a storage layer, but relies on a SQL database server to physically store and retrieve data.
HTSQL is designed to work on top of existing relational databases and does not impose any restrictions on how information is modeled and stored there. At the same time, HTSQL works best when the data in the database is highly normalized.
Even though HTSQL wraps a relational database, it does not expose the relational model directly to the users. Instead it derives HTSQL data model from the underlying database and uses this model when presenting data to the users and interpreting user queries. HTSQL data model is very close to traditional network data model utilized by CODASYL and various OODBMS and ORM systems.
In the next sections, we describe HTSQL data model and how it is inferred from the relational model of the underlying database.
When describing how information is represented by HTSQL, we differentiate between a database model and a database instance.
A database model specifies the structure of the database: what types of business entities are represented and how the entities may relate to each other. A database instance is the actual data in the database and must satisfy the constraints imposed by the model. The difference between a model and an instance is the difference between the shape of data and data itself.
Let’s consider the model of a student enrollment system in a fictional university. This model may contain schools, programs administered by a school, departments associated with a school, and courses offered by a department. A concrete instance of this model may contain a school of Engineering with an associated department of Computer Science, which offers a Database Theory course, etc.:
HTSQL represents a database model as a directed graph, or a collection of nodes connected by arrows.
A model graph may contain several types of nodes:
Arrows in the model graph are categorized by the type of nodes they connect:
Each arrow has a name. The arrow name must be unique among all arrows with the same origin node.
The following diagram shows a fragment of the model graph for the student enrollment database (most of the domain nodes and attribute arrows are omitted for clarity).
A navigation in the model graph is a path, or a sequence of arrows, where the first arrow starts at the unit node and the target of each arrow coincides with the origin of the next arrow. We denote a navigation using arrow names separated by a period (.). For example, in the diagram above, the selected navigation is denoted by school.department.name.
A navigation expresses a respective HTSQL query:
We will routinely use the navigational notation to refer to various components of the model graph. Since there is a one-to-one correspondence between class arrows and class nodes, we can use the arrow name to refer to the target class; for example, we say “class school” referring to the target of the class arrow called school. In the same manner, we will often say “link school.department” or “attribute department.name” referring to the last component of the navigation.
As we focus from the database model to a specific instance, nodes are populated with values and entities, and each arrow splits into connections between individual node elements.
A domain node is populated with all values of the respective type. Thus, boolean domain acquires two values: true and false, integer domain is filled with all integer numbers, and so on.
A class node becomes a set of entities of the respective class; e.g. school class becomes a set of university schools, department a set of departments, etc.
The unit node contains a single value, which is called a unit and denoted by @.
In HTSQL, we can only observe entity attributes, but not the entities themselves. When we need to refer to a specific entity in writing, we enclose in brackets the value of an entity attribute which uniquely identifies the entity. For example, attribute school.code uniquely identifies school entities, therefore we may say that [eng], [ns], [sc] are respectively entities representing schools of Engineering, of Natural Sciences, and of Continuing Studies.
An arrow between two nodes splits into a binary relation between elements of these nodes:
The following diagram visualizes the navigation school.department.name on a specific database instance.
Arrows may enforce constraints on connections between elements. We recognize the following constraints: singularity, totality and uniqueness.
Note that arrow constraints are defined on the database model and applied to all instances of the model.
An arrow is called singular if any element of the origin node is connected to no more than one element of the target node. Otherwise, the arrow is called plural.
The following diagram visualises a singular link program.school and a plural link school.department.
An arrow is called total if each element of the origin node is connected to at least one element of the target node. Otherwise, the arrow is called partial.
The following diagram shows a total link program.school and a partial attribute school.campus.
An arrow is called unique if any element of the target node is connected to no more than one element of the origin node. Otherwise, the arrow is non-unique.
The following diagram shows a unique attribute department.name and a non-unique link department.school.
In this section, we explain how underlying relation database model is translated to HTSQL data model.
For the most part, translation of relational structure to HTSQL model is straightforward. SQL data types become domain nodes, SQL tables become class nodes, table columns become attributes. Links between classes are inferred from FOREIGN KEY constraints.
HTSQL allows the administrator to restrict access to specific tables and columns, configure additional database constraints, and rename links. Here we describe how HTSQL creates a database model from the given SQL database in the absence of any configuration.
A name in HTSQL is a sequence of letters, digits and _ characters which does not start with a digit. When an HTSQL name is generated from a SQL name which contains non-alphanumeric characters, those are replaced with an underscore (_).
Each SQL table induces a class node, which, in general, borrows its name from the table.
Some SQL database servers support a notion of schemas, or collections of tables. Tables in the same schema must have unique names, but two or more tables in different schemas may share the same name, in which case HTSQL cannot use the name directly. This naming conflict is resolved as follows:
Each table column induces a class attribute with the same name.
When the column has a FOREIGN KEY constraint, the column name is also used to refer to the respective link. In this case, whether the name refers to an attribute or a link is determined from context; compare
Column constraints are trivially translated to properties of the respective attribute arrows.
Consider, for example, the following fragment of an SQL schema:
CREATE SCHEMA ad; CREATE TABLE ad.school ( code VARCHAR(16) NOT NULL, name VARCHAR(64) NOT NULL, campus VARCHAR(5), CONSTRAINT school_pk PRIMARY KEY (code), CONSTRAINT school_name_uk UNIQUE (name), CONSTRAINT school_campus_ck CHECK (campus IN ('old', 'north', 'south')) ); CREATE TABLE ad.department ( code VARCHAR(16) NOT NULL, name VARCHAR(64) NOT NULL, school_code VARCHAR(16), CONSTRAINT department_pk PRIMARY KEY (code), CONSTRAINT department_name_uk UNIQUE (name), CONSTRAINT department_school_fk FOREIGN KEY (school_code) REFERENCES ad.school(code) ); CREATE TABLE ad.program ( school_code VARCHAR(16) NOT NULL, code VARCHAR(16) NOT NULL, title VARCHAR(64) NOT NULL, degree CHAR(2), part_of_code VARCHAR(16), CONSTRAINT program_pk PRIMARY KEY (school_code, code), CONSTRAINT program_title_uk UNIQUE (title), CONSTRAINT program_degree_ck CHECK (degree IN ('bs', 'pb', 'ma', 'ba', 'ct', 'ms', 'ph')), CONSTRAINT program_school_fk FOREIGN KEY (school_code) REFERENCES ad.school(code), CONSTRAINT program_part_of_fk FOREIGN KEY (school_code, part_of_code) REFERENCES ad.program(school_code, code) ); CREATE TABLE ad.course ( department_code VARCHAR(16) NOT NULL, no INTEGER NOT NULL, title VARCHAR(64) NOT NULL, credits INTEGER, description TEXT, CONSTRAINT course_pk PRIMARY KEY (department_code, no), CONSTRAINT course_title_uk UNIQUE (title), CONSTRAINT course_dept_fk FOREIGN KEY (department_code) REFERENCES ad.department(code) );
In this schema, four tables ad.school, ad.department, ad.program, ad.course generate four classes:
|art||School of Art & Design||old|
|bus||School of Business||south|
|edu||College of Education||old|
|art||gart||Post Baccalaureate in Art History||pb|
|art||uhist||Bachelor of Arts in Art History||ba|
|art||ustudio||Bachelor of Arts in Studio Art||ba|
|acc||200||Introduction to Accounting||3||The initial course in the theory and practice of financial accounting. Topics emphasized include the preparation, reporting, and analysis of financial data.|
|acc||234||Accounting Information Systems||3||This course bridges the gap between two disciplines critical to business operations. This course of study teaches students to design and deploy information technology to improve the accounting systems of an organization.|
Foreign key constraints department_school_fk, program_school_fk, course_dept_fk generate three direct and three reverse links:
A foreign key program_part_of_fk induces two self-referential links on program: