SQL 101: A Simple Way to Add Data to the DB: the Insert Statement

SQL
Typography
  • Smaller Small Medium Big Bigger
  • Default Helvetica Segoe Georgia Times

You've learned how to query the database with Select; the next logical step is to learn how to add data to your tables. You can do that with RPG and DFU, but SQL's Insert statement provides an interesting alternative.

 

RPG and DFU provide ways to add new records to your files. RPG forces you to write a program, regardless of how simple or small the data you want to add is, and DFU is a not the friendliest tool to manipulate data. It can be useful to quickly change a couple of records, but it's not adequate for massive data insertion. SQL's Insert provides an interesting functionality, somewhere between the flexibility-yet-slowness of the RPG program and the quickness-yet-unfriendliness of the DFU tool. Let's take a look this SQL instruction.

 

The Insert SQL instruction works in one of two ways: you can either specify the values for the fields that you want to add to the table, issuing an INSERT statement for each new row, or you can insert rows via a Select statement; think of this as a selective CPYF CL command. Let's start with the row-by-row Insert; its syntax is as follows:

 

INSERT INTO <table>

(column1, column2, ..., columnN)

VALUES (value-for-column1, value-for-column2, ..., value-for-columnN )

 

Just like the SELECT statement from a couple of articles ago (SQL 101: The Simplicity and Flexibility of the SELECT Statement) , this statement also has some mandatory reserved words. In this case, INSERT INTO must precede the table in which the data will be inserted and, in this first syntax, VALUE is required to signal the list of values to insert. Note that column1, column2, etc. represent actual column names; value-for-column1, value-for-column2, etc. are the values to insert into the respective columns, and they can be as simple as constant literals or as complex as expressions that include operators and functions. If the value-for-column is not compatible with the respective column data type, the Insert statement ends in error. If you're providing values for all the table's columns you can omit the (column1, column2, ..., columnN) part of the statement. However, I prefer to avoid ambiguity and specify all the column names. You might want to reuse the statement later, and if a column was added in the meantime things might not work as you expect. If you're only specifying a few values, name those columns and nothing else. You have to be aware that the columns for which you're not specifying a value will be filled with the respective default values, blanks or zeros for non-nullable columns (the default for a physical file) or null for nullable columns (SQL table's default).

 

Let's analyze an example; for simplicity sake, I'll use an oversimplified Item Master table named ItmMst. This table has two columns, ItemID, a CHAR(15), and ItemDesc, a CHAR(40). The following statement inserts a new row in the ItmMst table:

 

INSERT INTO ItmMst

(ItemID, ItemDesc)

VALUES ('B52', 'Brand new fictitious item from ACME')

 

That was easy, right? The method has pros and cons; while you retain full control over what's inserted, you also have to issue a statement for each record you want to insert. The alternative syntax of the Insert instruction opens the possibility of inserting multiples lines at a time, but it comes with a cost: you may lose some control over what's inserted. This alternative syntax is as follows:

 

INSERT INTO <table>

(column1, column2, ..., columnN)

SELECT              <other_column 1>

                    [, <other_column 2>, …., <other_columnN>

FROM                <other_table>

[WHERE              <condition 1> <logical operator> <condition 2>, etc]

 

No, it's not a copy/paste error: Insert's alternative syntax uses a SELECT statement to specify the values to insert. It may sound a bit strange, but it's extremely useful. Picture the following scenario: you need to add items to the inventory master file I mentioned in the previous article via a cargo manifest file you just received, but you need to create the item descriptions that don't exist in the ItmMst table first. To make things easier, imagine that the cargo manifest file, named CargoM, also has columns named ItemID and ItemDesc; the first step is to check if there are item IDs in the manifest file that don't match the IDs in the ItmMst table:

 

SELECT       DISTINCT ItemID, ItemDesc

FROM         CargoM

WHERE        ItemID NOT IN (Select ItemID From ItmMst)

 

Don't panic. I'll explain this step by step: remember the IN operator I presented earlier in this series' second article? As you can see here, it can also be used to search the results of a SELECT statement; by the way, a SELECT inside another instruction is commonly called subselect. You can have as many as you want, but this can seriously affect performance and lead to unexpected results, so use it carefully and scarcely. The subselect is processed first and returns a list. Each of CargoM's unique IDs is then compared with this list, using the IN predicate. Why the unique and not all? Because of the DISTINCT keyword; this acts kind of like a GROUP BY clause, returning unique values or, if you prefer, discarding duplicates. This statement produces a two-column list of all the item ids and descriptions that exist in the cargo manifest file but don't exist in the ItmMst table. Now I can use the statement to insert those items in the ItmMst table. Here's how:

 

INSERT INTO  ItmMst ItmMst

(ItmMst.ItemID, ItmMst.ItemDesc)

SELECT       DISTINCT CM.ItemID, CM.ItemDesc

FROM         CargoM CM

WHERE        CM.ItemID NOT IN (Select ItemID From ItmMst)

 

The only possible problem here is that, even though I can reuse this statement, it's not immediately obvious what data will be inserted into the ItmMst table. This is what I meant earlier when I said that you may lose some control over what's inserted when this alternative syntax is used.

 

This is the basic information about Insert. Go ahead and try it yourself. Just try not to clutter your files with useless test data! The next article will continue to focus on DML, explaining the Update statement, a real timesaver that can potentially free you from DFU. Until then, share your thoughts, doubts, and suggestions in the comments section.

BLOG COMMENTS POWERED BY DISQUS