18
Tue, Jun
2 New Articles

# Programming in ILE RPG - Creating and Using Files: Data Types and Data Storage

###### Typography
• Smaller Small Medium Big Bigger
• Default Helvetica Segoe Georgia Times

You must assign a data type to each column in a table.

By Brian Meyers and Jim Buck

The column’s data type determines how its values are stored, how much storage the column occupies, and what kinds of operations can be performed on the column when it is used within a program. Commonly used data types fall into three general categories:

• Character
• Numeric
• Date

To understand the various data types, let’s first examine how the computer stores data. You probably know that any numeric value can be converted from its familiar decimal, or base 10, value to a corresponding value in binary, or base 2, notation. The computer stores all data in this binary format, with all data represented as a combination of 1s and 0s. At first glance, data representation should be a simple matter of converting values from one base to another. Many characters and values that you want to represent to the computer are not numbers, though— instead, they are letters of the alphabet, for example, or special characters such as \$ and {.

A bit is a binary digit that can have a value of 1 or 0 (sometimes called on or off). A string of eight bits is called a byte. A byte can have up to 256 different possible combinations of bit states; the system can use these combinations to represent up to 256 different characters.

IBM developed a coding scheme to allow a data character—numeric or non-numeric—to be represented to the computer. This coding scheme, based on the English alphabet, is called Extended Binary Coded Decimal Interchange Code, or EBCDIC (generally pronounced as eb-si-dik). EBCDIC assigns a unique eight-bit binary pattern to each representable character. Capital A, for example, is 11000001 in EBCDIC, and the digit 1 is represented as 11110001. The leftmost four bits are often called zone or high-order bits, and the rightmost four bits are digit or low-order bits. Because eight bits constitute a byte, it takes one byte of storage to store each character in EBCDIC. IBM i stores all non-numeric, or character, data values this way: one byte represents a character.

Instead of reading long strings of 1s and 0s, most programmers find it easier to represent byte values by using hexadecimal (base 16) notation for the high and low order bits, according to the following table:

Using this table, you can see that the bit pattern 11000001 can be referred to as hex character C1 (usually noted as X'C1').

All computers use some kind of an encoding system to represent characters as bit patterns, but not all systems use EBCDIC, which is unique to some IBM systems. You may already be familiar with the encoding system called ASCII (American Standard Code for Information Interchange), generally pronounced as as-kee. EBCDIC was developed separately from ASCII. Consequently, the bit patterns in ASCII are not the same as those in EBCDIC. For example, the letter A in EBCDIC is represented by a hex C1, but in ASCII it is a hex 41. In addition, the two schemes use different collating sequences; that is, they do not sort all the characters in the same sequence. In EBCDIC, lowercase characters sort before uppercase characters, and letters sort before numbers (e.g., a, b, c, … A, B, C, … 7, 8, 9). In ASCII, the opposite is true (e.g., 1, 2, 3, … A, B, C, … x, y, z). Figure 3.5 shows some commonly used characters and their equivalent EBCDIC and ASCII encoding (in EBCDIC sequence).

Figure 3.5: EBCDIC and ASCII coding

Character Data Types

Character data can be any symbol that the system supports. Alphabetic letters, numbers, and special characters (e.g., punctuation marks, currency symbols) are all character data. A character string is a sequence of characters. An RPG program typically processes character data by assigning values from one character variable to another, by concatenating (joining) character strings, or by converting character data to another data type.

CHAR

When SQL defines a column as CHAR (or CHARACTER), it is defining a fixed-length character string. You specify the length attribute in parentheses. All values in the column have the same length in storage. Values that don’t fill the entire length are padded with blanks. The following column definition defines Address, a fixed-length character column 35 characters (bytes) long:

Most character columns in an IBM i database are fixed length. When you don’t specify a length, the column is one byte—the maximum length is 32,766.

VARCHAR

Specifying a data type of VARCHAR (or CHAR VARYING or CHARACTER VARYING) defines a column as a variable-length character string. You specify the maximum size (up to 32,740) in parentheses. The following column definition defines Email, a variable-length character column 1–256 characters long:

You use a variable-length character string when the length is uncertain. Typically, the value in a variable-length column is not padded with blanks to fill the entire allotted storage space. Instead, the same column in different rows contains different lengths, depending upon the column value. Use CHAR when most values in a column are likely to be the same size, and use VARCHAR when you anticipate the values will vary considerably in size.

Numeric Data Types

Numeric data is limited to numeric digits 0–9. Numeric variables also incorporate a sign embedded in the rightmost position of the variable. An RPG program typically processes numeric data by assigning values from one numeric variable to another, by evaluating arithmetic expressions, or by converting numeric data to another data type.

SQL defines numeric data with a precision and a scale. Precision refers to the total number of digits available in the number without regard to any sign. Scale indicates how many of those digits are fractional (i.e., to the right of the decimal point).

Numeric data types that SQL supports include numeric, decimal, and integer (as well as other less-frequently used types). Why are there different data types for the same class of values? RPG does not differentiate among numeric data types in determining the kinds of operations that you can perform on a field or the kinds of output editing that are possible; however, the data type of a numeric field—zoned numeric, packed decimal, or integer— determines how that field is represented and stored within the database.

To understand the various data representations, recall the EBCDIC coding scheme used to represent characters, in this case, numbers. The following table shows the EBCDIC codes for the digits 0–9:

Notice that the high-order bits (zone) of all digits are identical: 1111. This means that the zone portion is redundant for numeric data; that is, if the database already knows that the data is numeric, it knows that the zones of the data are all 1111, or F. The database can take advantage of this knowledge to store numeric data efficiently.

NUMERIC

Specifying a data type of NUMERIC (or NUM) defines a column as a zoned (or signed) decimal number. Zoned decimal numbers take a full byte to store each digit of a numeric value. Thus, a three-digit number occupies three bytes. The zone of the rightmost digit stores the sign of the data: 1111 (F) represents a positive value, whereas 1101 (D) represents a negative value. Zoned representation, then, is nearly identical to character representation except that the sign is embedded in the rightmost digit’s byte:

The following column definition defines CredLimit, a zoned decimal numeric column with a precision (total length) of nine digits and a scale (number of decimal places) of two digits:

Accordingly, CredLimit can hold a value up to 9,999,999.99. The maximum precision for a numeric column is 63 digits, and the scale can be any number from 0 to the precision of the number. If you omit scale, it is assumed to be 0; if you omit both precision and scale— specifying simply NUMERIC—SQL assumes NUMERIC(5,0).

Zoned decimal values can be useful when a program needs to present numeric values as text. For positive numeric values, the zoned numbers render as the correct characters (136, in the preceding example). For negative numbers, however, the last digit renders as an alphabetic character J–R (X'D1'–X'D9').

DECIMAL

Specifying a data type of DECIMAL (or DEC) defines a column as a packed decimal number. Packed numbers take advantage of the redundancy built into digit representation by simply not bothering to store the zones of numbers. In packed format, only the digit, or low order, bits of a number are stored, with the sign of the number represented by an additional four bits. These sign bits always occupy a packed decimal value’s last four bit positions:

Data in packed format takes just over half the amount of storage necessary to store the same number in zoned decimal format. A three-digit number occupies two bytes. This may not seem like a great difference, but larger numbers generate larger storage savings: a 63-digit number in packed format occupies only 32 bytes.

The following column definition defines the same CredLimit column in the earlier example, but this time it is a packed decimal number:

If you omit scale, it is assumed to be 0; if you exclude both precision and scale—specifying simply DECIMAL—SQL assumes DECIMAL(5,0). Because of the method packed numbers use to store numeric values, programmers usually define packed decimal columns with an odd precision so that the complete value (with the sign added) occupies whole bytes.

Notice that the character representation for packed numbers is not generally easily translatable to the numeric value. The hexadecimal representation usually makes sense, but the bit patterns for packed numbers render as nondisplayable or unrelated EBCDIC characters.

Most numeric columns in the typical IBM i database are packed decimal numbers. Although some programmers prefer to define numeric columns as zoned—because it’s easier to print or view the raw data in this format—the computer works more efficiently with numbers stored in packed decimal format. For most numeric processes, RPG implicitly converts numeric values to packed decimal format before processing them.

INTEGER, SMALLINT, BIGINT

You’ve seen that packed decimal numbers are more compact than zoned decimal numbers. Integer representation is even more compact than packed decimal representation. Integers store numeric values in a binary format. Each bit in a string is assigned a decimal value, and the values of the on bits are simply added together to determine the number’s signed value. SQL supports three integer-related data types. Which one you use will depend upon the range of values that the numbers must hold:

The following column definition defines the column Terms, an integer:

Notice that there is no need to indicate a precision or scale. The precision (10) is set by the INT data type, and the scale for all integers is 0.

Although the use of integers is common in RPG programs, its use in database objects is less so. Most numeric database columns are DECIMAL or NUMERIC data types, which have more flexible precision and scale options than Integer data types. If, however, the anticipated data values fit into the allowed range and scale for any of the Integer data types, they are an efficient and useful data type for numeric data.

Date and Time Data Types

IBM i supports several date and time data types, which represent dates, times, and time- stamps. An RPG program typically processes dates by assigning one date variable to another, by perform date arithmetic and comparisons, or by converting a date to another data type. The date and time data types are neither character nor numeric data types; they are distinct data types that the computer can easily recognize, process, and manipulate as dates without requiring special coding or conversion. They are sometimes called native dates, as opposed to legacy dates, which are character strings or numbers that have valid date or time values. Legacy dates require special processing to convert them to dates or to use them as dates.

DATE

Specifying a data type of DATE defines a column as a date. A date is a three-part value, incorporating a month, day, and year between 0001–9999. When defining a date, you do not need to specify a length because the system determines the length automatically. The following column definition defines the column Renewal, a date:

An ILE RPG program can retrieve a date value and process it. The presentation of the date in the program conforms to any of eight different formats in use by the program. Chapter 8 covers the date data type in more detail.

TIME

Specifying a data type of TIME defines a column as a time of day, integrating hours, minutes, and seconds. Time data follows many of the same principles as the date data type. When defining a time, you do not need to specify a length. The following column definition defines the column CallTime, a time:

An ILE RPG program can retrieve a time value and process it. As with dates, the presentation of the time in the program conforms to any of five different time formats in use by the program, which Chapter 8 details.

TIMESTAMP

A timestamp is a combination of a date and time in a single variable. Specifying a data type of TIMESTAMP defines a timestamp column. Timestamps have six or seven parts: year, month, day, hour, minute, second, and optionally fractional seconds. The following column definition defines the column LastUpdate, a timestamp with microseconds included:

By default, timestamps contain microseconds (six digits to represent fractional seconds):

If you need a different scale, the column definition can include a length 0–12 to represent the

desired number of fractional seconds’ digits. For example,

omits all fractional seconds:

Next time: Creating Views with SQL.  Can't wait? Want to learn more about Programming in ILE RPG?  Pick up the book in the MC Bookstore today!

Jim Buck's career in IT has spanned more than 35 years, primarily in the college education, manufacturing, and healthcare industries. Past president (13 years) of the Wisconsin Midrange Computer Professional Association, he has served on several teams developing IBM and COMMON certification tests. Jim has co-authored several IBM i textbooks with Bryan Meyers that are used by many companies and in colleges worldwide. Other accomplishments include: recipient of the 2007 IBM System i Innovation - Education Excellence Award, 2014 COMMON President's Award, and 2013/2016/2017 IBM Champion - Power Systems.

Jim is the president and founder of imPower Technologies, where he provides professional IBM i training and consulting services. He is active in the IBM i community, working to help companies train their employees in the latest IBM technologies and develop the next generation of IBM i professionals.

MC Press books written by Jim Buck available now on the MC Press Bookstore.

 Control Language Programming for IBM i Master the A-Z of CL, including features such as structured programming, file processing enhancements, and ILE. List Price \$79.95 Now On Sale Mastering IBM i Get the must-have guide to the tools and concepts needed to work with today's IBM i. List Price \$85.95 Now On Sale Programming in ILE RPG Get the definitive guide to the RPG programming language. List Price \$95.95 Now On Sale Programming in RPG IV Understand the essentials of business programming using RPG IV. List Price \$79.95 Now On Sale

\$0.00 Raised:
\$

### Resource Center

• Have you been wondering about Node.js? Our free Node.js Webinar Series takes you from total beginner to creating a fully-functional IBM i Node.js business application. You can find Part 1 here. In Part 2 of our free Node.js Webinar Series, Brian May teaches you the different tooling options available for writing code, debugging, and using Git for version control. Brian will briefly discuss the different tools available, and demonstrate his preferred setup for Node development on IBM i or any platform. Attend this webinar to learn:

• More than ever, there is a demand for IT to deliver innovation. Your IBM i has been an essential part of your business operations for years. However, your organization may struggle to maintain the current system and implement new projects. The thousands of customers we've worked with and surveyed state that expectations regarding the digital footprint and vision of the company are not aligned with the current IT environment.

• IBM announced the E1080 servers using the latest Power10 processor in September 2021. The most powerful processor from IBM to date, Power10 is designed to handle the demands of doing business in today’s high-tech atmosphere, including running cloud applications, supporting big data, and managing AI workloads. But what does Power10 mean for your data center? In this recorded webinar, IBMers Dan Sundt and Dylan Boday join IBM Power Champion Tom Huntington for a discussion on why Power10 technology is the right strategic investment if you run IBM i, AIX, or Linux. In this action-packed hour, Tom will share trends from the IBM i and AIX user communities while Dan and Dylan dive into the tech specs for key hardware, including:

• TRY the one package that solves all your document design and printing challenges on all your platforms. Produce bar code labels, electronic forms, ad hoc reports, and RFID tags – without programming! MarkMagic is the only document design and print solution that combines report writing, WYSIWYG label and forms design, and conditional printing in one integrated product. Make sure your data survives when catastrophe hits. Request your trial now!  Request Now.

• Forms of ransomware has been around for over 30 years, and with more and more organizations suffering attacks each year, it continues to endure. What has made ransomware such a durable threat and what is the best way to combat it? In order to prevent ransomware, organizations must first understand how it works.

• IT security is a top priority for businesses around the world, but most IBM i pros don’t know where to begin—and most cybersecurity experts don’t know IBM i. In this session, Robin Tatam explores the business impact of lax IBM i security, the top vulnerabilities putting IBM i at risk, and the steps you can take to protect your organization. If you’re looking to avoid unexpected downtime or corrupted data, you don’t want to miss this session.

• Can you trust all of your users all of the time? A typical end user receives 16 malicious emails each month, but only 17 percent of these phishing campaigns are reported to IT. Once an attack is underway, most organizations won’t discover the breach until six months later. A staggering amount of damage can occur in that time. Despite these risks, 93 percent of organizations are leaving their IBM i systems vulnerable to cybercrime. In this on-demand webinar, IBM i security experts Robin Tatam and Sandi Moore will reveal:

• Disaster protection is vital to every business. Yet, it often consists of patched together procedures that are prone to error. From automatic backups to data encryption to media management, Robot automates the routine (yet often complex) tasks of iSeries backup and recovery, saving you time and money and making the process safer and more reliable. Automate your backups with the Robot Backup and Recovery Solution. Key features include:

• Managing messages on your IBM i can be more than a full-time job if you have to do it manually. Messages need a response and resources must be monitored—often over multiple systems and across platforms. How can you be sure you won’t miss important system events? Automate your message center with the Robot Message Management Solution. Key features include:

• The thought of printing, distributing, and storing iSeries reports manually may reduce you to tears. Paper and labor costs associated with report generation can spiral out of control. Mountains of paper threaten to swamp your files. Robot automates report bursting, distribution, bundling, and archiving, and offers secure, selective online report viewing. Manage your reports with the Robot Report Management Solution. Key features include:

• For over 30 years, Robot has been a leader in systems management for IBM i. With batch job creation and scheduling at its core, the Robot Job Scheduling Solution reduces the opportunity for human error and helps you maintain service levels, automating even the biggest, most complex runbooks. Manage your job schedule with the Robot Job Scheduling Solution. Key features include:

• Business users want new applications now. Market and regulatory pressures require faster application updates and delivery into production. Your IBM i developers may be approaching retirement, and you see no sure way to fill their positions with experienced developers. In addition, you may be caught between maintaining your existing applications and the uncertainty of moving to something new.

• When it comes to creating your business applications, there are hundreds of coding platforms and programming languages to choose from. These options range from very complex traditional programming languages to Low-Code platforms where sometimes no traditional coding experience is needed. Download our whitepaper, The Power of Writing Code in a Low-Code Solution, and:

• Supply Chain is becoming increasingly complex and unpredictable. From raw materials for manufacturing to food supply chains, the journey from source to production to delivery to consumers is marred with inefficiencies, manual processes, shortages, recalls, counterfeits, and scandals. In this webinar, we discuss how:

• The MC Resource Centers bring you the widest selection of white papers, trial software, and on-demand webcasts for you to choose from. >> Review the list of White Papers, Trial Software or On-Demand Webcast at the MC Press Resource Center. >> Add the items to yru Cart and complet he checkout process and submit

• Have you been wondering about Node.js? Our free Node.js Webinar Series takes you from total beginner to creating a fully-functional IBM i Node.js business application.