The Learning Center Store
Home
Chapter 3:
Collections, Tables, and Indexes

Excerpted from SQL/400 Developer's Guide, (c) Paul Conte and Mike Cravitz, 29th Street Press, 2000.

Chapter Overview

SQL/400 Developer's Guide
click here
for more information
.

In this chapter, you'll learn how to use several kinds of Create statements to create the basic building blocks of an SQL database: collections, tables, distinct types, indexes, and aliases. We'll also look at the Alter Table statement, which lets you change the definition of a table. In addition, the chapter covers various Drop statements to delete SQL objects.

Creating a Collection

In SQL/400, a collection is an object that contains other database objects, such as tables, views, distinct types, indexes, and stored procedures.1 An SQL/400 collection is implemented as an OS/400 library object, which means that a collection can contain programs, message files, and other OS/400 objects in addition to SQL database objects. (See Chapter 1 if you need to refresh your memory about OS/400 libraries.)

You create a collection using a statement such as the following:2

Create Collection AppDta

This Data Definition Language (DDL) statement

  • creates an OS/400 library named AppDta
  • creates in the collection a set of SQL views that reference the system catalog tables (discussed in Chapter 2)
  • creates an OS/400 journal named QSqJrn and a journal receiver named QSqJrn0001 in the collection. A journal is an AS/400 object that keeps track of all database changes (inserts, updates, and deletes). By default, changes to SQL base tables in the collection are automatically journaled to this journal.

Once you create a collection, you can create SQL base tables, views, and indexes, as well as other types of OS/400 objects (e.g., non-SQL files and programs) in the collection.3

The Create Collection statement has an optional In ASP clause that lets you designate which AS/400 auxiliary storage pool (ASP) OS/400 should use to store any objects created in the collection. If no ASP is specified, the default system ASP is used. ASPs are an AS/400 feature that the AS/400 system administrator can use to set up named groups of disk units. For information about setting up ASPs, see the manual OS/400 Backup and Recovery.

Creating a Table

In a relational database, a base table contains the actual data. To create a table with SQL/400, you enter a Create Table statement that specifies

  • the collection to contain the table (this is optional; if you don't specify a collection, SQL/400 uses an implicit collection4)
  • the table name, which must be different from the name of any other base table, view, index, or non-SQL file in the same collection
  • specifications for one or more columns
  • an optional primary key constraint
  • optionally, one or more unique and/or foreign key constraints
  • optionally, one or more check constraints

Figure 3.1 shows the Create Table statement for a more complete Customer table than the one presented in Chapter 2.

 

Figure 3.1
Create Table Statement for Customer Base Table

Create Table AppDta.Customer
     ( CustID    Dec(    7, 0 ) Not Null,
       Name      Char(  30    ) Not Null,
       ShipLine1 Char( 100    ) Not Null,
       ShipLine2 Char( 100    ) Not Null,
       ShipCity  Char(  30    ) Not Null,
       ShipState Char(   2    ) Not Null,
       ShipPsCd1 Char(  10    ) Not Null,
       ShipPsCd2 Char(  10    ) Not Null,
       ShipCntry Char(  30    ) Not Null,
       PhnVoice  Char(  15    ) Not Null,
       PhnFax    Char(  15    ) Not Null,
       Status    Char(   1    ) Not Null
                                With Default ' ',
       CrdLimit  Dec(    7, 0 ) With Default Null,
       EntDate   Date           Not Null,
Primary Key( CustID ) )

When this statement is executed, SQL/400 creates an externally described physical file object (see Chapter 1 for a discussion of physical files) in the AppDta collection with the following attributes:

  • The file is tagged as an SQL table.
  • It has a maximum of one member.
  • It has no maximum size.
  • When a new record is inserted into the file, UDB/400 reuses record locations marked internally as "deleted" (if any).
  • Both before and after record images are journaled.
  • The file's record format has the same name as the table, and the file has one member with the same name as the table.
  • The file has a uniquely keyed access path.5 As part of the physical file object, UDB/400 creates an internal index on the CustID column, which is the table's primary key.

When the statement is executed, UDB/400 also automatically starts journaling row inserts, updates, and deletes in the table to the QSqJrn journal in the same collection.6 As you can see in Figure 3.1, the Create Table statement first lists the name of the table to be created and then lists the column and constraint definitions, separated by commas and enclosed in a set of parentheses. In the following sections of this chapter, we explain how to code column definitions. Chapter 4 covers constraints, including primary key constraints, such as the one used on Figure 3.1. In brief, this constraint specifies that no two rows will have the same CustID column value; this column thus serves as a unique identifier for each row.

SQL is a free-format language, and you can use multiple lines for a statement as well as blanks between words. The example shows a coding style that puts each column definition and constraint on a separate line and aligns the similar parts of each column definition. Although this columnar style isn't required, it makes the statement much easier to read than an unaligned stream of text.

SQL isn't case sensitive: "Create Table," "CREATE TABLE", and "CrEaTe TaBlE" are all correct. Be aware, however, that string literals are case sensitive, and 'x' isn't generally treated the same as 'X'. Also, when SQL/400 creates OS/400 file and field names for tables, the names are generally stored in OS/400 system dictionary files (see Chapter 2) as upper case. Thus, if you display a list of the tables in the AppDta collection, the table created by the statement in Figure 3.1 will be listed as CUSTOMER. However, you can still refer to the table as "Customer" or "customer" in SQL statements.

SQL/400 Naming Options

The table name used in Figure 3.1 is a qualified name, which includes the collection name (AppDta) before the unqualified table name (Customer). SQL/400 has two alternative approaches to names, including different syntax rules for qualified names:

  • the SQL naming option
  • the system naming option

The SQL naming option conforms closely to the naming conventions established in several official SQL standards (e.g., the ANSI standard), while the system naming option generally follows the rules that apply to OS/400 objects. When you use Interactive SQL (ISQL), execute a RunSqlStm (Run SQL Statement) command, or create an HLL program with embedded SQL, you specify which naming option to use. For example, when you execute a RunSqlStm command, you specify either Naming(*Sql) or Naming(*Sys); if you specify neither, the default is system naming.

With SQL names, you use a period (.) as the separator:

AppDta.Customer

With system names, you use a slash (/) between the collection name and the table (or other object) name:

AppDta/Customer

When you specify an unqualified name, SQL/400 determines the implicit collection that qualifies the object based on the naming convention in effect (SQL or system), the type of SQL statement being executed, and several other options you can specify when you execute the statement. Chapter 13 provides more information about the SQL naming convention, which is the one we follow in this book.7

Also be aware that SQL has many reserved words, such as Create, Table, and Order, that have special meaning. If you want to use one of these reserved words as the name of a table, column, or other SQL object, you must use quotation marks (") around the name when it appears in an SQL statement. The following example shows how you would code an SQL Select statement that retrieves rows from a table named Order:

Select * 
  From "Order"
  Where CustID = 499320

Column Definitions

On the Create Table statement, you code one or more column definitions after the new table name. Each column definition specifies the column name and a data type. Figure 3.2 shows the data types available with SQL/400. (The "Column Data Types" section later in this chapter discusses these data types in more detail.)

 

Figure 3.2
SQL Column Data Types

Column data typeDescription
String
Char( length )Fixed-length character string with a length value from 1 to 32766. If length is omitted, it defaults to 1.
Graphic( length )Fixed-length graphic string with a length value from 1 to 16383. If length is omitted, it defaults to 1.
VarChar( max-length )Variable-length character string with a maximum length (max-length ) from 1 to 32740.
VarGraphic( max-length )Variable-length graphic string with a maximum length from 1 to 16370.
Long VarCharVariable-length character string with a maximum length determined by the amount of space available in the row.
Long VarGraphicVariable-length graphic string with a maximum length determined by the amount of space available in the row.
Blob( max-length ) Binary large object string with a maximum length from 1 to 15728640 bytes. If max-length is omitted, it defaults to 1048576 (1 megabyte). The maximum length can be specified as an integer in the allowable range, as an integer from 1 to 15360 followed by K (increments of 1,024 bytes), or by an integer from 1 to 15 followed by M (increments of 1,048,576 bytes). For example, Blob(2M) is equivalent to Blob(2048K) or Blob(2097152).
Clob( max-length ) Character large object string with a maximum length from 1 to 15728640 bytes. If max-length is omitted, it defaults to 1048576 (1 megabyte). The maximum length can be specified in kilobytes (K) or megabytes (M), as described for the Blob data type.
DbClob( max-length ) Double-byte character large object string with a maximum length from 1 to 7864320 bytes. If max-length is omitted, it defaults to 1048576 (1 megabyte). The maximum length can be specified in kilobytes (K) or megabytes (M), as described for the Blob data type; however, the maximum length is 7680K or 7M.
Column data typeDescription
Numeric
Decimal( precision, scale )Packed-decimal number. The precision value specifies the number of digits and can range from 1 to 31. The scale value specifies the number of digits to the right of the decimal point and can range from 0 to the value specified for precision . You can use Dec(p ) for Dec(p , 0). You can also use Dec by itself for Dec(5, 0); however, always using an explicit precision with Dec provides clearer documentation.
Numeric( precision, scale )Zoned-decimal number. The precision value specifies the number of digits and can range from 1 to 31. The scale value specifies the number of digits to the right of the decimal point and can range from 0 to the value specified for precision . You can use Numeric(p ) for Numeric(p , 0). You can also use Numeric by itself for Numeric(5, 0); however, always using an explicit precision with Numeric provides clearer documentation.
SmallIntTwo-byte, binary integer
IntegerFour-byte, binary integer
BigIntEight-byte binary integer
RealSingle-precision floating-point number
DoubleDouble-precision floating-point number
Float( precision )Floating-point number; precision specifies the number of digits and can range from 1 to 53. The values 1 through 24 specify single-precision, and the values 25 through 53 specify double-precision. You can use Float by itself for a double-precision number.
Column data typeDescription
Date, Time, and Timestamp
DateDate
TimeTime
TimestampTimestamp
Column data typeDescription
DataLink
DataLink( length )A datalink that references a nondatabase file. The value length must be from 1 to 32718; if omitted, it defaults to 200. A datalink value is an internal representation of a Universal Resource Locator (URL) and an optional comment.

       SQL/400 allows the following synonyms for data-type names:

Column data typeSynonym
CharCharacter
VarCharChar Varying
 Character Varying
VarGraphicGraphic Varying
BlobBinary Large Object
ClobChar Large Object
 Character Large Object
DecimalDec
IntegerInt
DoubleDouble Precision

For the VarChar, VarGraphic, Blob, Clob, and DbClob data types, you can optionally specify an Allocate clause after the data type and length. The Allocate clause specifies how many bytes should be reserved for data in the fixed portion of a row's data storage. For example, the column definition

PartDesc VarChar( 500 ) Allocate( 50 )

specifies that the PartDesc column can contain character strings up to 500 bytes long and that the first 50 bytes are stored in the fixed portion of the row. Values longer than 50 bytes have their first 50 bytes stored in the fixed portion of the row and additional data stored in the variable portion of the row.

Following the data type, you can optionally code either or both of the following clauses:8

  • Not Null
  • Default default-value

In database terminology, null means "no value" or "not known," and a column that is null-capable lets you set the column to null, rather than an actual value, as a placeholder to represent one of these meanings. (Chapters 18 and 19 discuss the concept of null in greater detail.) If you specify Not Null when you define a column, the column is not null-capable; otherwise, it is. UDB/400 keeps track of whether a null-capable column is null or not null by setting an associated hidden bit. When the column is null, this hidden bit is 1; when the column has a normal value, this bit is 0. SQL statements can test whether a column is null using the Is Null test described in Chapter 5, as well as set a column null. In Figure 3.1, only the CrdLimit column is defined as null-capable.

The Default clause specifies a default column value, which is used for the column when you insert a row through a view that doesn't include all the underlying table's columns (we discuss views in Chapter 6) or when an explicit value is not provided on an Insert statement (covered in Chapter 7). UDB/400 puts a default value in each column that isn't in the view or the Insert statement's column list. In Figure 3.1, the Status column has a default of blank, and the CrdLimit column has a default of null.

In SQL/400, you can specify the Default keyword with or without an explicit value. The following examples show both alternatives.

PersonName  Char ( 30 ) Not Null 
                        Default 'no name', ... 

PersonName  Char ( 30 ) Not Null
                        Default, ...

The table in Figure 3.3 shows the default values SQL/400 assigns to a column depending on the column type, whether a Default clause was specified (with or without an explicit default value), and whether the Not Null clause was specified. Note that SQL/400 does not assign any default for a column that specifies the Not Null clause but does not specify a Default clause.

Column Data Types

When a column is stored on disk or in memory, the AS/400 uses one or more eight-bit bytes to represent the value. For example, to represent characters, the AS/400 normally uses the Extended Binary-Coded Decimal Interchange Code (EBCDIC) representation, as we describe later. The AS/400 also has several encoding schemes for numeric values, each offering advantages and disadvantages. Although it's useful to know how data is encoded on the AS/400, the more important aspect to understand about SQL/400 column data types is the purpose and relative advantages and disadvantages of each data type. SQL provides data types for character and binary strings, numbers, date and time values, and Universal Resource Locators (URLs) that identify nondatabase files. In the following sections, we look at each of these data-type categories in more detail.

String Data Types
SQL/400 supports both character and binary string data types. Character string columns are typically used for textual data in applications — names of things, addresses, descriptions, and so on. Many applications also use short character codes in place of longer descriptions; for example, a customer status value of A might mean "active, in good standing." In some cases, codes can save significant disk storage and simplify data entry. The Customer table defined in Figure 3.1 has several character columns.

As we mentioned, the AS/400 normally uses an EBCDIC single-byte character set (SBCS) encoding to store characters.9 Because an SBCS encoding uses one byte of storage for each character, there are 256 possible characters, including the uppercase and lowercase letters, digits, punctuation marks, and an assortment of graphical symbols and control characters. For example, the letter A is represented in the default EBCDIC character set by the eight bits 11000001.

UDB/400 also supports international applications with features for non-English, single-byte character sets and double-byte character sets (DBCS), which use two bytes for each character. One of the supported DBCS character sets is the widely used International Standards Organization (ISO) Universal Multiple-Octet Coded Character Set 2 (UCS-2). You select the character set for a table or specific column by specifying one of the available Coded Character Set Identifiers (CCSID).10

The Char column type defines a fixed-length column. Whatever length you specify for the column is the actual length of the character string that's stored in the database, including trailing blanks necessary to pad out shorter values. For example, if you define a column as Char(10) and you attempt to store a value of ABC, the actual value stored will be

The Graphic column type defines a fixed-length double-byte (i.e., DBCS) character column.

The VarChar column type defines a variable-length character column. For a VarChar column, you specify the maximum-length string that the column can hold. A column defined as VarChar(100) can hold strings up to 100 characters long. If you were to store the value ABC in this column, only three characters would be stored. For VarChar columns, UDB/400 also stores the current string length for the column's value — 3 in this example. UDB doesn't store additional blanks to pad VarChar column values.

VarGraphic columns are variable-length character strings that use one of the DBCS encodings.11

When you use VarChar or VarGraphic, it's usually a good idea to specify the Allocate keyword. As we mentioned earlier, this keyword specifies how much space is reserved within the row for the column, with excess data for any string that's longer than the Allocate value being placed in an overflow area. You can improve SQL performance, particularly for larger tables, by choosing an Allocate length that is greater than the length of most of the values that will be stored in the column. For example, suppose you have a description column defined as VarChar(200) in which roughly 80 percent of your descriptions are 100 or fewer characters in length. In this case, specifying Allocate(100) would be a good choice.

Tip:
Don't over-economize when specifying the maximum length for a VarChar or VarGraphic column. Although it's not advisable to greatly exceed the expected maximum size when specifying a length for a VarChar or VarGraphic column, it's better to err on the side of too large rather than too small. Remember, the Allocate keyword (or its default value of 0) is what determines how much space is set aside in each row in auxiliary storage for this type of column. Unless the row is approaching the 32 KByte size limit, you gain little by specifying too small a maximum length.
      For example, if you expect most values to be no more than 100 bytes, a few values to be between 100 and 400 bytes, and only an occasional value to be longer than 400 bytes and no more than 500 bytes, you should define the column as VarChar(500) Allocate (100) rather than VarChar(400) Allocate (100). With either alternative, the column will take up 100 bytes of storage in each row, plus any overflow for strings longer than 100 bytes. But only the VarChar(500) alternative lets you store strings longer than 400 bytes without truncating them.

For data that's represented by a string of noncharacter bytes (e.g., a graphical image), you can add the For Bit Data clause to a Char or VarChar column:

Photo VarChar( 20000 ) For Bit Data ...

The For Bit Data clause tells UDB/400 that the contents of a column should never be interpreted as character data. Among other things, this instruction prevents UDB/400 from attempting to convert the data when it's transmitted to other systems. Note that Char and VarChar columns are limited to a maximum of 32 KBytes, so you may need to use one of the large object data types (discussed next) for binary data.

With SQL/400, you can define columns to hold strings longer than the 32 KByte maximum of Char and VarChar columns. The strings can be binary (the Blob data type), single-byte character (the Clob data type), or double-byte character (the DbClob data type). Large objects can be as much as 15 MBytes in size, making them suitable for large text documents, graphical images, audio data, and other such information. Note that you use the Clob or DbClob data type for character text and the Blob data type for noncharacter data, such as images and audio. When defining a large object column, you specify the maximum length, as in the following example:

EmploymentApplication Clob( 1000000 ) ...

Although you can specify the Allocate keyword for large object columns, in most cases it's best to omit this keyword and let the column assume the default allocation value of 0. With the default allocation, all the object's data is stored in the overflow area. This generally is suitable for the type of data stored in large object columns.

Choosing the Correct String Data Type

To store single-byte character data, use a Char, VarChar, or Clob data type. To store double-byte character data, use Graphic, VarGraphic, or DbClob. And to store binary (noncharacter) data, use Char or VarChar with a For Bit Data clause or use Blob.

To store strings that may be longer than the Char, VarChar, Graphic, or VarGraphic limit (generally 32 KBytes, as shown in Figure 3.2), you must use one of the large object data types (Blob, Clob, or DbClob). As an alternative to storing the data in the UDB/400 database, you can store it in an integrated file system (IFS) file and store a datalink (discussed later) in the database.

For strings that will fit in a Char, VarChar, Graphic, or VarGraphic column, you must decide whether to use a fixed-length string (Char or Graphic) or a variable-length string (VarChar or VarGraphic). The easy category includes those strings that naturally have a fixed length, such as two-character state codes, five-character zip codes, and four-character zip-plus codes.

Other columns are naturally varying in length; name columns and description columns are two examples. If a great deal of variation exists in the lengths of values, variable-length strings may save some disk space, but this often is not a major issue for strings with a relatively small maximum length. Variable-length columns add some processing overhead — how much overhead depends on the number of values that are longer than the allocated space and thus are partially stored in the overflow area. Variable-length columns are more convenient for some string manipulations (as you'll see in Chapter 5) because you don't have to deal with the extra blanks that pad a fixed-length string. The guideline (admittedly, a somewhat arbitrary one) that we generally follow in this book is to specify Char or Graphic for columns whose maximum length is at most 100 and to specify VarChar or VarGraphic for columns whose maximum length is greater than 100.

Numeric Data Types

For numeric columns, you have a choice of several data types:

  • Decimal (or Dec)
  • Numeric
  • SmallInt
  • Integer (or Int)
  • BigInt
  • Real
  • Double
  • Float

Each data type uses a different scheme to store numbers as a series of bytes in the table, and each has its advantages and disadvantages.

Each numeric data type has a precision and a scale. The precision is the maximum number of digits (including digits to the right of the decimal point). The scale is the number of decimal places. For some column data types (e.g., Decimal), you can explicitly define the precision and scale, while for others (e.g., Integer), the precision and scale are completely determined by the data type.

A Decimal column specifies that numbers are stored in packed-decimal format. When specifying Decimal, you generally specify the precision and the scale (if not 0). So to specify a packed-decimal column with a precision of 7 and a scale of 2, you would code Decimal(7, 2) or Dec(7, 2).

A packed-decimal column is implemented by using a half-byte (four bits) for each digit and the low-order half-byte (rightmost four bits) of the low-order byte for the sign. The half-byte 1111 is used for positive values, and the half-byte 1101 is used for negative values. Thus, a four-byte packed-decimal field can hold a number up to seven digits long. The number +1234567 would be represented in a Dec(7, 0) column as

Note that this also would be how +12345.67 would be represented in a Dec(7, 2) column because the decimal point is implicit, rather than actually stored in the data.

A six-digit number would also require a four-byte packed-decimal field because a field must be an integral number of bytes; in a six-digit packed-decimal field, one half-byte would be unused. Thus, the number -123456 would be represented in a Dec(7, 0) column as

In general, you should declare packed-decimal fields with an odd number for the precision.

Packed decimal has been around for a long time on IBM systems and is both compact and fast to process (on the AS/400). Its main disadvantage is that it's not found on many other systems or as a built-in data type in some widely used languages, such as C/C++ and Java. (IBM's C and C++ compilers for the AS/400 do support packed-decimal fields, and Java has the BigDecimal class. In both languages, however, support for decimal data types is slower than for integer and floating point.) If portability or cross-system data access is a concern, be sure to check on the language and data translation capabilities that will be used.

A Numeric column specifies that numbers are stored in zoned-decimal format. The precision and scale are specified the same way as for Decimal columns. Zoned decimal represents a number as one byte (eight bits) per digit. Thus, to hold a seven-digit number, you need a zoned-decimal column that occupies seven bytes. Each byte is simply the EBCDIC character representation of the digit, except the last byte has the high-order half-byte (leftmost four bits) set to indicate the number's sign (1111 for positive, 1101 for negative, or 1110 for unsigned). The number +123 would be represented in a Numeric(4, 0) column as the following four bytes:

The number -123 would be represented in the same column as

Zoned decimal is a legacy of computer punch card days and is not as compact or efficient to process as packed decimal or binary. In general, you should avoid the Numeric column data type.

A SmallInt, Integer (Int), or BigInt column specifies a binary numeric format. A SmallInt column is two bytes and can hold integer values from -32,768 to +32,767. An Int column is four bytes and can hold integer values from
-2,147,483,648 to +2,147,483,647. A BigInt column is eight bytes and can hold integer values from
-9,223,372,036,854,775,808 to +9,223,372,036,854,775,807. You can't specify a precision or scale for SmallInt, Int, and BigInt columns — both of these attributes are implicit in the data type.

In general, the binary format of integers represents a number as a base 2 value, with one bit used for the number's sign.12 The value +1234 would be represented in a SmallInt column as

0000 0100 1101 0010

On the AS/400, negative numbers are represented in binary format using "two's complement" encoding. This approach flips bits for negative numbers and then adds 1, so that -1 is represented by the bit pattern 11111111. This encoding lets the processor perform faster arithmetic. Thus, the value -1234 would be represented in a SmallInt column as

1111 1011 0010 1110

An Int column would represent +1234 as

0000 0000 0000 0000 0000 0100 1101 0010

and -1234 as

1111 1111 1111 1111 1111 1011 0010 1110

Binary fields are compact, and most computer systems and languages support binary values. As we said, SQL/400 BigInt columns can hold values up to approximately ±9 sextillion (a maximum of 19 digits), whereas Decimal columns can hold numbers up to 31 digits long. Also, you can specify a scale for a Decimal column but not for a SmallInt, Int, or BigInt column. Thus, for business applications, the Decimal column data type is a good choice for large numeric fields or those that require precise decimal fraction values (as is often the case in financial applications). The SmallInt, Int, and BigInt column data types are preferable for integers within the supported ranges.

The Real, Double, and Float numeric data types specify a floating-point numeric format. Floating point is generally used for scientific or mathematical applications (and by C, C++, and Java programs) and is not very common in AS/400 business applications, although RPG IV and ILE COBOL/400 now include support for floating point. SQL/400 supports single-precision (Float(24) or Real) and double-precision (Float(53) or Double) floating-point columns. A single-precision column occupies four bytes, and its values can range from approximately 1.2 × 10-38 to approximately 3.4 ×10+38. A double-precision column occupies eight bytes, and its values can range from approximately 2.2 × 10-308 to approximately 1.8 × 10+308. Although SQL/400 has many variations on how you may specify floating-point types for columns, it's sufficient to use Float(24) for single-precision and Double for double-precision columns.

Date, Time, and Timestamp Data Types

UDB/400 supports special data types for date, time, and timestamp values. For these three data types, you specify the Date, Time, or Timestamp type in the column definition but not a length (or number of decimal positions) — the system determines the length of these types of fields.

SQL/400 supports date arithmetic on date columns. You'll learn more about this in Chapter 5. Date columns and date arithmetic can simplify the way many common business applications handle dates.

UDB/400 stores all date fields in the same four-byte internal format on disk. When you execute SQL statements or compile a program that contains SQL statements, you specify an external format (and in some cases, the separator character) to use. SQL/400 uses this information to translate internal values when they're displayed or returned to your program. For example, with ISQL, the StrSql (Start SQL) command's DatFmt and DatSep parameters specify the date format that's used. Figure 3.4 shows the external formats available for date columns in UDB/400.

For several of the date formats, you can also specify a separator character. The allowable values are slash
('/'), dash ('-'), period ('.'), comma (','), and blank (' ').

Time columns work similarly to date columns. UDB/400 stores time values in three-byte fields internally. Figure 3.5 shows the time formats available in UDB/400.

For some time formats, you can specify a separator, which can be a colon (':'), period ('.'), or blank (' '). The *USA time format does not include seconds (even though the internal storage of the field does).

Timestamp columns use 10 bytes internally. The timestamp data type has a system-defined external format that looks like this:

yyyy-mm-dd-hh.mm.ss.uuuuuu

where uuuuuu is millionths of a second. Use timestamps when you need to maintain a precise record of the sequence in which events occur.

Datalink Data Types

Datalink columns are used to hold references to nondatabase files stored in the local integrated file system (IFS), in a remote AS/400's IFS, or in the file system of any attached system (e.g., Windows 2000 Server) that has IBM's DataLink Manager installed.13 In other words, instead of storing the data in the column itself, you can use a datalink to specify a file location for the value of the column. Columns defined as type DataLink contain a reference to the file (in the form of a URL) and, optionally, a comment. When defining a column as DataLink, you should specify a length large enough to hold the largest expected URL plus any comment you may want to associate with the link. Here is an example:

ProductImage  Datalink( 500 ) ...

SQL and System Table Names

As we said in Chapter 1, OS/400 object names have a maximum length of 10 characters. SQL table names, however, can be up to 128 characters. Because SQL tables are implemented as OS/400 physical files, each table always has an associated 10-character system name. If you specify a table name of 10 characters or less in the Create Table statement, the table's system name is the same as its SQL name. However, if you specify a table name longer than 10 characters, UDB/400 assigns a 10-character system name using the first five characters of your longer name followed by a five-digit sequence number. For example, if you create a table named PartSupplier, UDB/400 will assign a system name such as PartS00001. An SQL statement can refer to a table using either the table's full SQL name or its (possibly shorter) system name.

Because you may work with OS/400 file objects using languages other than SQL (e.g., using CL to perform backup operations), it's preferable to have meaningful system names for all tables. After you create an SQL table (or view or index, as we discuss later), you can use the SQL Rename statement to change the system name. Here's an example of the statements you might use:

Create Table PartSupplier ... 

Rename Table PartSupplier 
  To System Name PartSuplr

You can also use the Rename statement to change the SQL name, as in the following example:

Rename Table Acct
   To  Account

When you create a new table for which you want an SQL name longer than 10 characters as well as a meaningful system name, the best practice is to first create the table with the system name and then use a Rename statement to change the SQL name:

Create Table PartSuplr ...

Rename Table PartSuplr
   To  PartSupplier
   For System Name PartSuplr

The advantage of this approach is that PartSuplr is used as the name for the physical file, the physical file member, and the file's record format. Creating the table with the SQL name and then using Rename to change the system name leaves the file member and record format names as their system-generated names (e.g., PartS00001). Be sure to include the Rename statement's For System Name clause when you follow this practice. Without this clause, the Rename statement will change the system name to a generated name, such as PartS00001.

SQL and System Column Names

In Figure 3.1 , we used column names that conformed to the 10-character limit that UDB/400 has for database file field names. As with tables, columns have an SQL name and a system name. If you specify a column name longer than 10 characters, UDB/400 assigns a 10-character system name using the first five characters of the longer name followed by a five-digit sequence number. And, as with table names, it's preferable to use meaningful system names for all columns.14 The Create Table statement makes it quite easy to explicitly assign system names for one or more columns, using the For clause as shown in Figure 3.6.

 

Figure 3.6
Create Table with Long Column Names

Create Table Customer
     ( CustID                           Dec(    7, 0 ) Not Null,
       CustName            For Name     Char(  30    ) Not Null,
       CustShipLine1       For ShpLine1 Char( 100    ) Not Null,
       CustShipLine2       For ShpLine2 Char( 100    ) Not Null,
       CustShipCity        For ShpCity  Char(  30    ) Not Null,
       CustShipState       For ShpState Char(   2    ) Not Null,
       CustShipPostalCode1 For ShpPsCd1 Char(  10    ) Not Null,
       CustShipPostalCode2 For ShpPsCd2 Char(  10    ) Not Null,
       CustShipCountry     For ShpCntry Char(  30    ) Not Null,
       CustPhoneVoice      For PhnVoice Char(  15    ) Not Null,
       CustPhoneFax        For PhnFax   Char(  15    ) Not Null,
       CustStatus          For Status   Char(   1    ) Not Null
                                                       With Default ' ',
       CustCreditLimit     For CrdLimit Dec(    7, 0 ) With Default Null,
       CustEntryDate       For EntDate  Date           Not Null,
  Primary Key( CustID ) )/pre>

For columns with an SQL name of 10 characters or less, you don't have to specify a For clause if you want identical SQL and system names. In SQL statements, you can refer to a column by either the SQL name or the system name.

Descriptive Text and Column Headings

After you create a table, you can define descriptive text and longer comments for the table and its columns. Descriptive text is saved in the SQL catalog and is therefore available to be queried. Moreover, this text shows up with various AS/400 utilities. For example, you see it when prompting (pressing the F4 key) in ISQL. You can also define column headings. ISQL uses column headings when displaying the results of interactive queries.

A Label On statement stores up to 50 characters of descriptive text for a table. The following example adds descriptive text for the Customer table:

Label On Table Customer 
  Is 'Current and past customers'

To add descriptive text for columns, you use the form of the Label On statement shown in Figure 3.7.

 

Figure 3.7
Defining Descriptive Text for Columns

Label On Customer
 ( CustID              Text Is 'Customer ID',
   CustName            Text Is 'Customer name',
   CustShipLine1       Text Is 'Customer shipping address line 1',
   CustShipLine2       Text Is 'Customer shipping address line 2',
   CustShipCity        Text Is 'Customer shipping address city',
   CustShipState       Text Is 'Customer shipping address state',
   CustShipPostalCode1 Text Is 'Customer shipping address postal code 1',
   CustShipPostalCode2 Text Is 'Customer shipping address postal code 2',
   CustShipCountry     Text Is 'Customer shipping address country',
   CustPhoneVoice      Text Is 'Customer voice phone number',
   CustPhoneFax        Text Is 'Customer Fax phone number',
   CustStatus          Text Is 'Customer status',
   CustCreditLimit     Text Is 'Customer credit limit',
   CustEntryDate       Text Is 'Customer info entry date' )

After the table name, there are one or more entries for the column text, separated by commas. Each column text entry has the form

column-name Text Is 'text-string'

You can also define column headings with SQL/400 as shown in Figure 3.8.

 
Figure 3.8
Defining Column Headings
Label On Customer
  ( CustID              Is 'Cust.               ID',
    CustName            Is 'Customer            Name',
    CustShipLine1       Is 'Customer            Shipping            Line 1',
    CustShipLine2       Is 'Customer            Shipping            Line 2',
    CustShipCity        Is 'Customer            Shipping            City',
    CustShipState       Is 'Customer            Shipping            State',
    CustShipPostalCode1 Is 'Customer            Shipping            Postal code 1',
    CustShipPostalCode2 Is 'Customer            Shipping            Postal code 2',
    CustShipCountry     Is 'Customer            Shipping            Country',
    CustPhoneVoice      Is 'Customer            Phone',
    CustPhoneFax        Is 'Customer            Fax',
    CustStatus          Is 'Cust.               Status',
    CustCreditLimit     Is 'Cust.               Credit              Limit',
    CustEntryDate       Is 'Cust.               Entry               Date' )

The Text keyword is not used when defining column headings, and the text string is treated as three 20-character segments. The first 20 characters are used for the first line of the heading; characters 21 through 40 (if present) are the second heading line, and characters 41 through 60 (if present) are the third heading line. You must carefully align the contents of each string to get the right column headings.

SQL also lets you specify longer comments — up to 2,000 characters — for tables, views, columns, and many other SQL objects and their components. Such comments are stored in the SQL catalog, along with descriptive information supplied by Label On statements. To define a comment for a table, you use a Comment On statement such as the following:

Comment On Table Customer 
        Is 'Customer master file ...'

The following examples show how to add comments for a distinct type (covered in the next section) and an index:

Comment On Distinct Type CustIdType 
  Is 'Customer ID Data Type' 

Comment On Index CustNameIdx 
  Is 'Index on Customer Name'

Figure 3.9 shows a Comment On statement for column comments.

 
Figure 3.9
Defining Long Comments for Columns
Comment On Customer
( CustID              Is 'Customer ID',
  CustName            Is 'Customer name',
  CustShipLine1       Is 'Customer shipping address line 1',
  CustShipLine2       Is 'Customer shipping address line 2',
  CustShipCity        Is 'Customer shipping address city',
  CustShipState       Is 'Customer shipping address state',
  CustShipPostalCode1 Is 'Customer shipping address postal code 1',
  CustShipPostalCode2 Is 'Customer shipping address postal code 2',
  CustShipCountry     Is 'Customer shipping address country',
  CustPhoneVoice      Is 'Customer voice phone number',
  CustPhoneFax        Is 'Customer Fax phone number',
  CustStatus          Is 'Customer status',
  CustCreditLimit     Is 'Customer credit limit',
  CustEntryDate       Is 'Customer info entry date' )

Notice the similarity to the Label On statement in Figure 3.7 — except the statement name is different and the Comment On statement doesn't use the Text keyword in the column entries. (This Comment On example uses the same text as the Label On statement in Figure 3.7, but remember that long comments can be up to 2,000 characters while descriptive text is limited to 50 characters.)

Creating and Using User-Defined Types

SQL/400 lets you create a new column data type, known as a distinct type, based on one of the built-in data types listed in Figure 3.2.15The following statement creates the distinct type CustIdType as equivalent to a Dec(7, 0) data type:

Create Distinct Type CustIDType 
  As Dec( 7, 0 ) 
  With Comparisons

Once you create a distinct type, you can use it in column definitions:

CustID  CustIDType Not Null ...

When you use a distinct type to define a column, SQL/400 creates the column using the based-on type — Dec(7, 0) in this example. The Create Distinct Type statement's With Comparisons clause (the default in SQL/400) specifies that you can use equality (i.e., =) and inequality (e.g., <) tests between two columns defined with the same distinct type.16 SQL/400 generally lets you assign a column with the based-on type to a column with the distinct type — for example, assigning a Dec(7, 0) column to a CustIdType column — and vice versa. However, you can't automatically compare a distinct type to any other type of column except for the same distinct type. For example, you couldn't compare a Dec(7, 0) column to a CustIdType column unless you created your own user-defined function to do so. (Chapters 8 and 13 discuss user-defined functions.) Nor can you use operations such as addition for numbers or string concatenation for a distinct type, unless you create the appropriate function. (An SQL technique known as "casting" also enables comparison, concatenation, and arithmetic or other operations on distinct types; we cover casting in Chapter 8.)

On the Create Distinct Type statement, you can specify a built-in type and a length, precision, or scale, if appropriate for the built-in type. You can also specify CCSID, For Bit Data, For SBCS Data, or For Mixed Data for any of the character data types (e.g., Char, VarChar, Clob). You can't specify Allocate, Not Null, Default, or any other column attributes; these must be explicitly specified when you code a column definition on the Create Table statement. The implicit default value (as discussed earlier) for a distinct type is the same as for the based-on type.

Distinct types are often referred to as user-defined types (UDTs) and serve two main purposes. First, a UDT provides a convenient way to define columns that contain the same kind of data. In the example we've been using, every column that should hold a customer ID can be defined with CustIdType. This use of a UDT improves consistency as well. Second, because of the restrictions in using columns defined with a distinct type, you have tighter control over which operations are valid for columns. For example, SQL/400 won't let you add two columns that are defined with CustIdType — a sensible restriction because customer IDs don't represent quantitative values. For UDTs that do need arithmetic or other operations, you can use user-defined functions, as we mentioned earlier, or casting.

Distinct types provide some of the capabilities of domains as described in Chapter 18. However, because they don't let you restrict which values can be stored in a column, they fall short of implementing the full domain concept. Also, a distinct type can't be based on another distinct type — only on one of the built-in data types. These limitations diminish the usefulness of distinct types; however, distinct types can help standardize the definition of primary key columns, and, when used in conjunction with user-defined functions, they can provide additional flexibility in your database definitions.

Using Standard Data Types

You can improve the consistency of your database and simplify application development by deciding on a set of standard data types to be used for column definitions. For example, you might decide to use

VarChar(500) Allocate(100) 
Not Null 
Default

as the standard data type for all columns containing descriptions (e.g., of products, courses). You might also decide to use

Char(50) 
Not Null 
Default

for columns that store one part of a person's name (e.g., first or last name). To keep track of standard data types, you should establish a data dictionary, either manually or by using a CASE tool17. Chapters 19 and 21 provide additional suggestions for using data dictionaries.

For some or all of your standard data types, you might also choose to create SQL distinct types, as described above. However, keep in mind the limitations of distinct types (e.g., you can't specify Not Null) and the fact that you can standardize the data types you use regardless of whether or not you also use distinct types.

Adding, Dropping, or Changing Table Columns

After you create a table, you may need to add or drop (remove) a column or change a column definition. You use the Alter Table statement to do so. The following example shows how to add a column to the Customer table created in Figure 3.1:

Alter Table Customer 
  Add Discount Dec( 5, 3 ) Not Null 
                           Default 0

If you use the Not Null clause when you add a new column, you must also specify a Default clause so that UDB/400 can assign the default value for the column in all existing rows. If you don't specify Not Null or a Default clause, the column is set to null in all existing rows. Be aware that under the covers, UDB/400 actually creates a new OS/400 physical file object for the table, copies data from the old file, and then deletes the old file. All related objects, such as views and indexes, are disassociated from the old file and associated with the new file. This process is all handled by UDB/400, but it may take a while and the table may not be available for use during some of this time. (The same process occurs when you drop a column and for many situations in which you change a column definition.)

Here's how you can drop a column:

Alter Table Customer 
  Drop Discount

To change a column definition, you specify any of the column attributes (e.g., data type) that you want to change:

Alter Table Customer 
  Alter Discount 
       Set Default 0.01

Any column attributes that you don't specify remain unchanged, and, of course, both new and unchanged attributes must be compatible (e.g., the column's data type and default value).

Creating an Index

Although you don't specify a particular order of rows on either the Create Table or the Create View (discussed in Chapter 6) statement, UDB/400 does use internal indexes for efficient row selection and ordering. UDB/400 automatically selects which index (or indexes) to use when a Data Manipulation Language (DML) statement, such as Select, is executed or when an SQL cursor embedded in an HLL program is opened. When you specify a primary, unique, or foreign key constraint (discussed in Chapter 4), UDB/400 creates or shares an internal index as part of the physical file created for a base table. You can create additional indexes using the SQL Create Index statement.

The Create Index statement is fairly simple. You code an index name (which can't be the same name as any base table, view, alias, other index, or non-SQL file in the same collection) and then specify a single base table and the columns over which the index should be created.18 The following example creates an index over the ShipCity and CrdLimit columns of the Customer table:

Create Index AppDta.CustCtyX01 
  On Customer 
   ( ShipCity, 
     CrdLimit Desc )

When you create an SQL index, UDB/400 creates an OS/400 logical file with a keyed access path.19 You cannot access an SQL index directly with any SQL DML statement. In SQL, indexes are solely for the internal use of UDB/400, and their purpose is generally related to performance.

As the above example shows, you can optionally add the Desc keyword after any column name to specify a descending order. You also can optionally specify Unique or Unique Where Not Null in an index definition:

Create Unique Index AppDta.BldRoomX 
  On Building 
   ( BldName, 
     RoomNbr )

The Unique keyword enforces unique key values, including null, the same way as explained earlier for a base table primary key constraint. If you specify Unique Where Not Null, UDB/400 enforces unique values, except for null, as explained in Chapter 4 for unique constraints. In general, you should use a primary key or unique constraint on the Create Table statement for a base table — which causes UDB/400 to create an internal index — rather than create a separate index with Unique or Unique Where Not Null. In UDB/400, independent indexes are primarily useful for two cases:

  • unique indexes in addition to the primary key of a base table
  • non-unique indexes (as a performance aid)

Note that you can specify only a single primary key constraint for each base table, so if you need to enforce a second fully unique key (that is, including nulls) for the table, you need a Create Unique Index statement. Because you can specify multiple unique constraints (which exclude nulls) on a base table, there's little reason to use a Create Unique Where Not Null Index statement in SQL/400. The type of indexes built by the examples discussed so far are called binary radix tree, or b-tree, indexes. As the name suggests, these indexes use a tree structure to store key values and associated record locations.

You can create another type of index, known as an Encoded Vector Index (EVI). This type of index uses an internal table that maps key values to binary codes and a vector of these codes (one per row in the database table) to identify which records have a particular key value. EVIs can dramatically speed certain types of queries.20 Here's a Create Index statement for an EVI:

Create Encoded Vector Index PartWarehouseIdx 
 On Part 
  ( WarehouseID )

Indexes affect database performance both positively and negatively. In brief, indexes can speed data retrieval, but they may slow updates due to the time UDB/400 spends updating the entries in the index (or indexes) over a base table as rows are inserted, deleted, or updated in the table. UDB/400 is fairly efficient at maintaining indexes, so you don't generally need to worry if you have five or fewer indexes over an active table or 10 to 20 over a less frequently modified table.21 And, for tables that are primarily read-only, with little update activity, there's not likely to be any significant overhead from having numerous indexes. However, for actively updated tables, you want to avoid an excessive amount of index maintenance, or you may slow down your application's database updates.

Creating an Alias

You can create an alternative name, called an alias, for any SQL/400 table or view.22 Here's an example that creates the PartSuplr alias for a PartSupplier table:

Create Alias PartSuplr 
   For PartSupplier

You might use an alias when you rename an object and still have existing applications that refer to the old object name. You'll also find an alias helpful when you want to use SQL to access an existing OS/400 database file object that has multiple members (as explained in Chapter 1). The following statement creates the SaleJune alias that can be used to reference the June member of the Sale multimember file — without an alias, there's no way in SQL to directly access a file member other than the first member:

Create Alias SaleJune
   For Sale ( June )

The optional member name is placed in parentheses immediately after the table (or file) name. SQL DML statements can use an alias anywhere that the corresponding table or view name can be used. For example, the following statement retrieves all rows from the June member of the Sale file, using the SaleJune alias created above:

Select * 
  From SaleJune

Although an alias can be invaluable in certain circumstances, you shouldn't use aliases just to create a variety of different names for the same object. In general, you should use one consistent name for each database table and view.

Dropping Collections, Tables, Distinct Types, Indexes, and Aliases

To delete any of the database objects we've looked at in this chapter, you use one of the following statements:

  • Drop Collection collection-name
  • Drop Table table-name
  • Drop Distinct Type type-name
  • Drop Index index-name
  • Drop Alias alias-name

Caution:
Be careful with these statements! Unless you specify the optional Restrict keyword on the Drop Collection statement, all objects in the collection are also deleted. Similarly, without the Restrict keyword, when you drop a table or view, all dependent objects, including views, indexes, and foreign key constraints that reference a table being dropped are also deleted. Here's an example of coding a "safe" Drop Collection statement:

Drop Collection AppDta Restrict
Before this statement can be executed successfully, you must have dropped all objects from the AppDta collection.

Guidelines for SQL/400 Names

A good naming convention for SQL/400 objects goes a long way toward standardizing your database definitions. Table, view, index, column, and other names should meet three criteria:

  • They must be syntactically correct.
  • They should be consistently formed.
  • They should communicate as clearly as possible.

Names should begin with a letter (A—Z) and should use only letters and digits (0—9) in the rest of the name. Avoid national characters (e.g., #, $, and @ in the United States) and the underscore (_) because of international character set conflicts and HLL syntax restrictions. Don't use SQL reserved words (these are listed in an appendix of the DB2 UDB for AS/400 SQL Reference).

It's much easier to work with a database if its names follow a regular pattern and use consistent and clear abbreviations. The value of these two principles is easy to see if you consider the problems that arise when names aren't formed this way. If you came across the column name CAL1, would you have any idea that it meant "customer address, line 1"? Even if you could guess what the C, A, and L stood for, would you know whether it was a shipping or a billing address? Although a more meaningful name such as ShipLine1 is not completely unambiguous, you have a reasonable chance of comprehending it when you first encounter it. In addition, once you've had a brief period to familiarize yourself with the table definition, you'll find it very easy to follow programs that use this column.

Consistency aids comprehension as well — especially when names must be formed with abbreviations. Consider the frequently occurring need to have a column that stores a count of something (e.g., how many of an item are ordered). If some columns use different abbreviations for "count" — Cnt, Cou, Cn — and others use a variety of abbreviations for "number (of)" — Nbr, No, Num — it becomes difficult to quickly recognize the meaning of a column name, especially when you're working with code that uses the columns. You can avoid this problem by following a simple naming convention, such as:

The term "count" and its abbreviation Cnt are used for columns that store an integral count of some item.

This convention specifies both the term that will be used and its abbreviation.

The people who designed the OS/400 interface studied abbreviations extensively and came up with a simple and effective scheme for creating abbreviations. The two main principles they followed are:

  • Whenever possible, use three-letter abbreviations, except for items, such as ID, that have a well-established shorter abbreviation.
  • Form a three-letter abbreviation with the first letter of the term and the two consonants that are most prominent in pronouncing the word (e.g., for "number" use Nbr, not Num).

Because system object and column names are limited to 10 characters, you'll inevitably have to squeeze one or more of the abbreviations in a system name. Thus, it's a good idea to develop a list of terms that appear in your applications and two- and three-character abbreviations for each of them. Figure 3.10 provides a partial list of terms that occur in many applications.

Create and maintain your own list of terms using a word-processor document or a table. Here are some further naming guidelines:

  • If you have only one or two terms in a name and the name can be formed using the full spelling of one or both terms, you can use one or both full terms: Name, Status, ShipState, FileName.
  • Within a single table's definition, don't use the full spelling of a term in some column names and its abbreviation in others: not PhoneDay and PhnEvening, but PhnDay and PhnEvening or PhoneDay and PhoneEvn.
  • Try to avoid a mix of two- and three-character abbreviations for the same term in the same table: BilLine1 and BilPosCod1, not BillLine1 and BilPosCod1, in the same table.
  • When you create a new table, consider the consistency among the column names as a group. Although you may have to vary from your general conventions slightly, it's more important that the column names within a table be as descriptive as possible and consistent among themselves.

Also, most application-oriented names shouldn't include any indication of the table that contains them. For example, in Figure 3.1, we use Name instead of CustName. When Name is defined in the Customer table, it's obvious that Name means "customer's name." If there were a separate Employee table that also had a Name column, it would be clear that Name in that table meant "employee's name."

There's one case in which the column name should incorporate some indication of the containing table — a column that serves as a table's primary key. Notice in Figure 3.1 that we followed this convention with the CustID column. The reason for including Cust in the field name is not that ID alone would be an inadequate name within the Customer table itself. Rather, the reason for this explicit indication of the containing file is that a CustID column may also be defined in other tables that have some relationship to the Customer file. For example, a Sale table would normally have a CustID column to contain the customer ID of the customer placing the order. Obviously, it wouldn't be a good idea to use just ID as the name of the customer ID column in the Sale table. (What name would you give the column that contained the order ID?) Because CustID (or a similar name) is the best choice for a column in a table that's related to the Customer table, it's good practice to use the same name in the Customer table itself.

Coding Suggestions

  • Store most DDL statements to create production collections, tables, views, indexes, and other SQL/400 objects in source file members, and execute them using the RunSqlStm command.
  • Source files are created using a CrtSrcPf (Create Source Physical File) command, such as the following:
  • CrtSrcPf AppDta/SqlClcSrc
    (Note that you use the slash, or /, character as a qualifier with OS/400 object names.)
  • Use a separate source file for different types of SQL objects, and use meaningful source file names, such as the following:
    • SqlClcSrc — SQL statements to create collections
    • SqlTblSrc — SQL statements to create tables and associated indexes
    • SqlViewSrc — SQL statements to create views
  • Use the name of the main SQL object (e.g., the table name) you're creating as the source member name.
  • Enter a source member description that's the description of the SQL object.
  • Place comments at the beginning of your SQL source to describe the object being created.
  • Use a consistent order for statements such as Create, Label On, and Comment On in a source member.
  • Use spaces, blank lines, and separator line comments to improve the readability of your source code.
  • Align column names, data types, compound search conditions, and so on in multiline statements for readability.
  • Establish and follow a good naming convention for all SQL names. Review the "Guidelines for SQL/400 Names" section of this chapter for guidelines to follow when setting up a naming standard.
  • Use Label On and Comment On statements to add descriptive text and column headings for tables, views, and columns.
  • Create tables, indexes, and other SQL objects using meaningful SQL names. If a name is larger than 10 characters, use Rename to create a meaningful system name of 10 characters or less.
  • Use the For clause in column definitions as needed to define meaningful system names for columns.
  • Use standard column definitions for the same type of column — for example, use Char(50) for all columns that contain descriptive text.
  • On the Create Table statement, code the Not Null and Default options for all columns, except columns that should be null-capable.
  • Consider update, as well as retrieval, performance implications to decide which indexes to create. In general, indexes speed retrieval while slowing update.
  • To simplify object management, create each index in the same collection as the table on which it's based.

Chapter Summary

The Create Collection statement creates an SQL collection, which is an OS/400 library object, to contain SQL tables, views, indexes, packages, and other SQL and non-SQL objects.

The Create Table statement creates a base table, which is a UDB/400 externally described, physical file object. A base table contains the actual data for an application. Data in SQL tables is stored as rows. Each table definition specifies one or more columns, which represent different properties (e.g., customer name) of an application object. Each column has an SQL name and a system name, a data type, and an explicit or implicit size. A column may optionally have an explicit default value. A column can be defined to allow or prohibit the null placeholder.

The Label On and Comment On statements can be used to specify descriptive text for tables, indexes, columns, and other SQL objects and components. You can also use Label On to specify column headings of up to three lines.

The Create Distinct Type statement lets you create a user-defined data type (UDT) based on one of the SQL/400 built-in data types. You can use a distinct type in a column definition.

The Alter Table statement lets you add, drop, or change a base table column.

The Create Index statement creates an SQL index, which is a logical file object with a keyed access path. An SQL index is created over a single base table and specifies one or more key columns. You can also create Encoded Vector Indexes (EVIs) to speed up certain types of queries. In general, UDB/400 uses indexes to improve performance, and SQL DML statements cannot directly use an SQL index to access data.

The Drop Collection, Drop Table, Drop Distinct Type, Drop Index, and Drop Alias statements delete the respective SQL objects and (optionally) all dependent or contained objects.


Key Terms

alias
base table
binary
binary radix tree (b-tree) index
binary string data type
character string data type
Coded Character Set Identifiers (CCSID)
collection
column definition
column name
data type
database
datalink
date
distinct type
double precision
double-byte character set (DBCS)
Encoded Vector Index (EVI)
Extended Binary-Coded Decimal
Interchange Code (EBCDIC)

floating point
index
naming convention
null-capable
numeric column
packed decimal
precision
scale
single precision
single-byte character set (SBCS)
SQL naming option
standard data type
system name
system naming option
time
timestamp
Universal Multiple-Octet Coded Character
Set 2 (UCS-2)
user-defined type (UDT)
zoned decimal


Exercises

  1. Describe the purpose of the following five SQL database objects:


    • collection
    • table
    • distinct type
    • index
    • alias

  2. Describe the type of OS/400 object (or objects) created for a collection, a table, and an index.


  3. Give one example of the appropriate Create statement to create each of the five SQL database objects listed in Exercise 1.


  4. Code the Create Table, Label On, and Comment On statements for a base table to hold the following employee information:


    • identification number
    • name (last, first, middle initial)
    • birth date
    • address (street, city, state, postal code)
    • identification number of employee's manager
    • annual salary
    • health-insurance plan (coded)

    Be sure to consider the following:

    • appropriate table and column names
    • column data types and lengths
    • default values
    • the table's primary key
    • good documentation

  5. Code the SQL statement to create a distinct type, and then code a Create Table statement that uses this distinct type in a column definition.


  6. Explain the relationship among the following three database items:


    • a base table primary key constraint (specified with the Create Table or Alter Table statement)
    • an SQL Unique index (specified with the Create Index statement)
    • an SQL Unique When Not Null index (specified with the Create Index statement)

  7. Code the Create Index statement for an index that SQL/400 can use to improve performance when retrieving rows from the Customer table (Figure 3.1) in order by state (major key) and city (minor key).


  8. Give one example of the appropriate Drop statement to delete each of the five SQL database objects listed in Exercise 1.

 


Footnotes

  1. SQL also has the concept of a database object. With SQL/400, each AS/400 system is a distinct database; that is, on any AS/400, there’s always just one SQL database — the entire local system. You can assign the name of this database using the AddRdbDirE (Add Relational Database Directory Entry) command. It follows that an SQL/400 database is a set of collections — the collections that exist on the AS/400.

    Note that with some other relational database management systems (RDBMSs), such as DB2 UDB for Windows 2000 Server, a “database” is more like an SQL/400 collection. That is, multiple databases can exist on a single system, and each database can contain tables, views, stored procedures, and other database objects. With these RDBMSs, nondatabase objects are typically stored in the operating system’s file system, outside the database (or databases). In addition, RDBMSs such as these generally don’t support a collection object. Don’t be tripped up by the two ways that the “database” construct is used in different RDBMSs.


  2. You can also create a collection using the Create Schema statement. Essentially, the Create Schema statement just combines a Create Collection statement and one or more Create Table, Create View, Create Index, Comment On, Label On, Grant, or other statements into a single statement. This can be an efficient way to transmit a single (usually program-generated) SQL statement to create an entire collection and all the SQL objects in it on a remote system. You can execute the Create Schema statement only by placing it in a source file member and executing the RunSqlStm (Run SQL Statement) command.


  3. You can create any type of UDB/400 physical or logical file — for example, using Data Description Specifications (DDS) and the CrtPf (Create Physical File) or CrtLf (Create Logical File) CL command — in a collection as long as you don’t specify the optional With Data Dictionary clause on the Create Collection statement. The With Data Dictionary clause is provided strictly for compatibility with earlier releases of SQL/400 and generally should not be used because it limits the types of files that can be placed in a collection to physical files with one member or source physical files.

    You can also create SQL tables, views, and indexes in a noncollection library created with the CrtLib (Create Library) command. Noncollection libraries don’t have the set of views over the system catalog files, however. You also must explicitly create a journal and journal receiver if you want to journal tables in a noncollection library.


  4. Later in this chapter, we discuss explicit and implicit qualification of SQL names.


  5. The physical file for a base table with no primary key or other constraint does not have a keyed access path, and the file object doesn’t have an internal index.


  6. If the journal exists. The Create Collection statement automatically creates the QSqJrn journal, but you can delete the journal, or you can specify a noncollection library (without a QSqJrn journal) on the Create Table statement. To control journaling explicitly on SQL/400 tables, you can use the StrJrnPf (Start Journaling Physical File) and EndJrnPf (End Journaling Physical File) commands.


  7. The manual DB2 UDB for AS/400 SQL Reference documents the rules for implicit qualifiers for all SQL statements.


  8. The following optional keywords can also be specified in Char, VarChar, and Clob column definitions to indicate the type of character data contained in the column: For SBCS Data, For Mixed Data, and CCSID. See the DB2 UDB for AS/400 SQL Reference for details about these keywords.


  9. Many other systems, such as Unix and DOS on PCs, use the American National Standard Code for Information Interchange, or ASCII, set of character codes. The AS/400 also provides ASCII support both within UDB/400 and in the integrated file system.


  10. DB2 UDB for AS/400 SQL Reference lists CCSID values and the associated character sets. This book generally assumes the default USA EBCDIC encoding.


  11. The Long VarChar and Long VarGraphic types are alternatives to VarChar and VarGraphic; for these alternatives, the system calculates the maximum length based on the definition of other columns in the table. In general, Long VarChar and Long VarGraphic are not recommended.


  12. Our normal numbering system is base 10 — from right to left, the positions in a number represent the number of 1s, 10s, 100s, and so on. In base 2, the positions correspond to powers of 2 — 1, 2, 4, 8, and so on. Both systems, of course, can represent the full range of integers — 1, 2, 3, and so on.


  13. See Chapter 1 for an explanation of the IFS and the various file systems available on the AS/400. IBM offers DataLink File Managers on all systems for which a version of DB2 Universal Database (DB2 UDB) is available.


  14. Among other considerations, non-SQL statements in ILE RPG programs must refer to columns by their system names.


  15. UDB/400 creates an OS/400 SQL User-Defined Type (*SqlUdt) object for an SQL distinct type.


  16. IBM recommends you code this clause explicitly for compatibility with DB2 implementations on other platforms.


  17. CASE stands for Computer Aided Software Engineering. CASE tools are often used to define databases logically and sometimes physically (as described in Part III).


  18. You can also create an SQL/400 index over a non-SQL, externally described physical file.


  19. For performance tuning, you may want to know that the logical file for an SQL/400 index has attributes that correspond to the MaxMbrs(1) and Maint(*Immed) parameters on the CrtLf command. The logical file has the same record format as the physical file over which it’s created. Note that you can change the index maintenance of non-unique indexes to delayed (*Dly) or rebuild (*Rebld) by using the ChgLf (Change Logical File) command.


  20. In particular, queries with search conditions where each predicate uses only equal predicates and where the left side specifies column names (e.g., PartColor = 'RED' and PartWhseLoc = 56). See Chapter 5 for a discussion of search conditions and predicates. See the manual DB2 UDB for AS/400 SQL Programming for more details about EVIs.


  21. Keep in mind that these are very approximate rules of thumb. Tuning a database implementation requires an in-depth knowledge of the operating characteristics of the particular release of UDB/400 you’re using (performance characteristics can change with each new release) and careful measurement.


  22. UDB/400 creates a Distributed Data Management (DDM) file object for an SQL alias
.


You are at a pentontech.com site.
If you have a question about our products or about an order you have placed,
please contact the customer service department at (800) 650-1804 or (970) 203-2914 or via e-mail.
If you have a technical question about this site, please contact us.
Terms & Conditions | Privacy Policy
Copyright © 2009 Penton Technology Media – Loveland.