Introduction to Sockets

In the new era of AS/400 openness, there are many options for accessing programs and data on an AS/400 and for reaching other systems from one. One option we've discussed in MC a great deal is Open Database Connectivity (ODBC), an open standard for accessing data on a wide variety of relational database management systems (RDBMSs). Sometimes, though, we need to access not only data but also another program on a system. As with ODBC, there is an open standard available to AS/400 programs for that purpose. It's called Sockets.

Some people call Advanced Program-to-Program Communications (APPC) an open standard because you can suggest a change to the protocol to IBM and they may implement it. You have to go through IBM's committees, prove to IBM that it will make them money, and then they will consider implementing it in a few years. That really is a proprietary standard.

The more traditional meaning of an open standard is that no one company holds the rights to define the standard. Sockets truly is an open standard in that respect. If you think a change is needed in how Sockets works, there is a well-documented method for how you submit that kind of suggestion and how it will be considered.

There is a standards group for Sockets that can, in some situations, evaluate and implement a proposed solution quickly. That makes it more likely that a given suggestion will make it into the Sockets standard, and sooner than it would with a proprietary standard.

What I hope to do in this article is introduce you to Sockets, which may be new to AS/400 programmers but is often old hat for UNIX programmers. I don't mean to imply that Sockets is only for UNIX programmers. If you are looking for an open standard to communicate with a program on another system (whether that system is running OS/400, a version of UNIX, or Windows NT), Sockets is something you ought to consider. This article will be an overview to help you understand Sockets and be able to decide whether it fits into your environment. Future articles will detail how to implement Sockets in an AS/400 program.

What Is Sockets?

Over a decade ago, the Berkeley Software Distribution (BSD) of UNIX defined an API called Sockets to allow programs on a single computer to communicate with each other. This is called local Interprocess Communication (IPC). Over the years, the Sockets standard was modified to support non-local IPC. This allowed programs on different systems to communicate with each other over a communications link. That is the basic genesis of Sockets as we know it today.

Sockets provides a way for two programs on different systems to communicate in both directions. The relationship between the two programs is normally a client/server relationship. However, as is true in many other communications standards, that relationship can change through negotiation. In the same connection, each computer can at different times be the server and the client.

Sockets is a standard that is, to some extent, protocol independent. When a programmer with some previous exposure to Sockets thinks of writing a program, he usually envisions that program running over TCP/IP. That doesn't have to be true. A carefully crafted Sockets implementation can run over anything from DECnet to IPX/SPX. The Sockets standard is protocol-independent, but an implementation is often protocol-dependent.

When you write a server program that uses Sockets, that program "listens" at an endpoint. I'll go into the details of what the client and server programs need to do shortly, but that endpoint (or socket) is an agreed-upon meeting place for the client and server application.

There are two common ways an AS/400 Sockets conversation can take place. One is called a datagram (or UDP) Socket, and the other is called a stream (or TCP) Socket. Other Sockets implementations include what is called raw Sockets, but that is a low-level implementation that is not often used in applications programming. Datagram and stream Sockets have different strengths and weaknesses. The appropriate implementation is application dependent.
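In C, the choice between the two is made when the Socket is created. The following is only a minimal sketch, assuming a POSIX-style Sockets library; the function name open_and_query_type() is hypothetical and exists only for illustration. It opens a Socket of the requested type and asks the system, through getsockopt(), which type it recorded.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: opens an AF_INET socket of the requested type
   (SOCK_STREAM for stream/TCP, SOCK_DGRAM for datagram/UDP) and returns
   the type the system reports for it, or -1 on failure. */
int open_and_query_type(int requested_type)
{
    int sd;
    int actual = -1;
    socklen_t len = sizeof actual;

    /* Only the two types discussed in the article are expected here */
    assert(requested_type == SOCK_STREAM || requested_type == SOCK_DGRAM);

    sd = socket(AF_INET, requested_type, 0);   /* 0 = default protocol */
    if (sd < 0)
        return -1;

    /* SO_TYPE reads back the socket type recorded by the system */
    if (getsockopt(sd, SOL_SOCKET, SO_TYPE, &actual, &len) != 0)
        actual = -1;

    close(sd);
    return actual;
}
```

Passing a protocol of 0 lets the system pick the default for the type, which for AF_INET is normally TCP for SOCK_STREAM and UDP for SOCK_DGRAM.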

The datagram Socket is the simpler of the two types. The primary difference between them is that datagram Sockets are unreliable; that means, for example, that your programs need to track whether a packet of data sent from the server ever arrived at the client. Data sent using datagram Sockets may also arrive out of order. These are the kinds of things your programs need to handle if you use datagram Sockets, but not if you use stream Sockets.

You may wonder why you might ever use datagram Sockets if you have to write all of that additional code in your programs. The answer is that stream Sockets don't retain record boundaries. That causes a problem in a situation, for example, in which your program sends two records of 25 bytes of data each.

Suppose your program is written to pick up data in 10-byte increments. The program using stream Sockets will pick up 10 bytes of data from the first record. When performing the second read, the program will pick up the next 10 bytes of data from the first record, not the data from the second record. This pattern of transferring 10-byte blocks of data continues through the first record and into the second.
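The effect is easy to reproduce. The sketch below assumes a POSIX-style system that offers socketpair() for local stream Sockets; the names read_chunk() and third_chunk() are hypothetical. It sends one 25-byte record of A's and one of B's, then picks up data 10 bytes at a time. The third chunk straddles the two records.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define CHUNK  10   /* bytes picked up per read    */
#define RECORD 25   /* bytes in each record sent   */

/* Picks up exactly CHUNK bytes from a stream socket. The loop is needed
   because one recv() call may return fewer bytes than requested.
   Returns 0 on success, -1 on error or early end of data. */
int read_chunk(int sd, char *buf)
{
    size_t got = 0;
    while (got < CHUNK) {
        ssize_t n = recv(sd, buf + got, CHUNK - got, 0);
        if (n <= 0)
            return -1;
        got += (size_t)n;
    }
    return 0;
}

/* Sends two 25-byte records over a local stream socket pair and leaves
   the third 10-byte chunk in out as a NUL-terminated string.
   Returns 0 on success, -1 on failure. */
int third_chunk(char out[CHUNK + 1])
{
    int sd[2], i, rc = 0;
    char rec[RECORD];

    assert(out != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sd) != 0)
        return -1;

    memset(rec, 'A', RECORD);                       /* first record  */
    if (send(sd[0], rec, RECORD, 0) != RECORD)
        rc = -1;
    memset(rec, 'B', RECORD);                       /* second record */
    if (rc == 0 && send(sd[0], rec, RECORD, 0) != RECORD)
        rc = -1;

    for (i = 0; rc == 0 && i < 3; i++)              /* three reads   */
        rc = read_chunk(sd[1], out);
    out[CHUNK] = '\0';

    close(sd[0]);
    close(sd[1]);
    return rc;
}
```

After a successful call, out holds "AAAAABBBBB": the last five bytes of the first record followed by the first five bytes of the second. Nothing in the stream marks where one record ended and the next began.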

Having heard that, you may be wondering why you would use stream Sockets. They are appropriate in situations in which you are going to pass fixed-length records. If you always know that you are going to pass 80-byte records in a terminal emulation program, for example, stream Sockets are quite appropriate.

Just as there are two common methods for passing data, there are also two common ways to address a Socket. They are called address families. The two families the AS/400 supports are called AF_UNIX and AF_INET. The names come from their entries in a C header file.

AF_UNIX (or UNIX domain) Sockets allow two programs on the same physical machine to communicate. AF_INET uses IP addresses to reference another host system running a Sockets application. It is important to understand that Sockets (other than raw Sockets) are independent of the transport mechanism. With the use of AnyNet, TCP/IP traffic can be carried over an SNA network, for example.

When using AF_INET, there is a second part to the address called a port. Ports allow multiple Sockets applications to run on a single IP address. When you write a Sockets application, you decide what port to use unless you are going to write an application that uses one of the common servers.

For example, there is an application that can run on the AS/400 (and other systems) called a Telnet server. The Telnet server is what allows you to run terminal emulation. (For more information on configuring the AS/400 Telnet server, see "AS/400 Telnet" elsewhere in this issue.) A Telnet client application is a Sockets application that talks to a standard port number. The same is true of other common servers such as FTP and Finger.
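To make the two-part AF_INET address concrete, here is a small sketch that fills in a sockaddr_in structure from a dotted IP address and a port. It assumes a Sockets library that provides inet_pton(); older libraries may offer only inet_addr(). The helper name make_inet_address() is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: builds an AF_INET address from a dotted IP
   string and a port number. Returns 0 on success, -1 if the string is
   not a valid IPv4 address. */
int make_inet_address(const char *dotted_ip, unsigned short port,
                      struct sockaddr_in *out)
{
    assert(dotted_ip != NULL && out != NULL);

    memset(out, 0, sizeof *out);
    out->sin_family = AF_INET;      /* the address family             */
    out->sin_port = htons(port);    /* the port, in network order     */

    /* inet_pton() returns 1 when the text is a valid IPv4 address */
    return inet_pton(AF_INET, dotted_ip, &out->sin_addr) == 1 ? 0 : -1;
}
```

A Telnet client, for instance, would pass the server's IP address and port 23, the standard Telnet port, and then hand the resulting structure to connect().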

Heterogeneous Considerations

Earlier, I said that Sockets is a standard and that most implementations follow the BSD standard. There are, however, a number of variations available on different systems. Almost every vendor who implements the BSD standard in his operating system finds a reason to create extensions.

For example, if you write an application that uses Sockets under Microsoft Windows, chances are you are going to use WinSock. WinSock is Microsoft's implementation of Sockets under Windows. WinSock supports the functionality of BSD Sockets but uses different names for some functions and has additional functions because of the characteristics of the Windows environment.

Another issue that comes up because of the heterogeneous nature of Sockets is the byte-ordering schemes of different computer systems. If you're passing data between a system using an Intel processor and a Macintosh, you will have a problem with byte ordering, because the two store the bytes of an integer in opposite orders. The hex value 123F (which might refer to a specific port), when passed unconverted from Windows 3.X to a Macintosh, would be read as 3F12.

Some functions to help you with this kind of problem should be available under any implementation. They are htons() and ntohs(), which convert unsigned short integers from host to network order and from network to host order, respectively. The htonl() and ntohl() functions do the same for unsigned long integers. (The difference between a short and long integer is dependent on the system you are working on.)

The actual implementation of these functions depends on the platform you're working on. On the Macintosh, data is stored in the same format the network is expecting. Because of that, when the htons(), ntohs(), htonl(), or ntohl() functions are run, the data isn't changed. On an Intel system, the byte ordering of the host is different from the standard network byte ordering, so the same functions swap the bytes. The point is this: no matter what system you are writing on, always use the network byte-ordering conversion routines.
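A short sketch shows why the conversion routines make the question of host byte order disappear. It converts a 16-bit value with htons() and copies out the two bytes exactly as they sit in memory, which is how they would go out on the wire. The function name wire_bytes16() is hypothetical, and the sketch assumes a system where htons() is declared in arpa/inet.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical helper: converts a 16-bit value to network byte order
   and copies its two bytes, in memory order, into out. */
void wire_bytes16(uint16_t host_value, unsigned char out[2])
{
    uint16_t net_value = htons(host_value);

    assert(out != NULL);
    memcpy(out, &net_value, sizeof net_value);
}
```

For the value 0x123F, out holds 0x12 followed by 0x3F on any host, whether Intel or Macintosh. The receiving program runs ntohs() on the two bytes and gets 0x123F back in its own native order.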

Why Use Sockets?

MIS shops are increasingly becoming mixed vendor environments. We can spend a lot of time arguing the merits of this situation, but it is reality. It is rare to find a non-IBM system that implements CPI-C. However, almost every system that uses TCP/IP applications (including Telnet and FTP) uses Sockets at some level. So Sockets is a way of accessing other systems without regard to whether the vendor is IBM or not.

Even if you are an all-IBM shop right now, consider the following scenario. Your company has either bought another company or been bought by another company. The senior management wants to reduce MIS costs. One of the things they may do is try to evaluate which staff is more flexible. The other company is using those lousy UNIX boxes that are difficult to support and program.

Your company is using AS/400s, which are easy to support and program, but the new management doesn't know that. They know only that they need to have the applications from your company and the other company work together. Management will ask the UNIX programmers how much they would need to change their application to work against an AS/400.

The UNIX programmers ask if the AS/400 supports Sockets. When they find out that the AS/400 does, their answer is that they can access the AS/400 with very few changes. Then the new management asks you what you will need to do to access the UNIX box. You've written your applications using APPC or CPI-C, which in this case the UNIX box doesn't support. Your programs would have to undergo major changes to work with the data on the UNIX boxes.

So, the UNIX programmers go to work making the minor changes, and you go to work polishing your resume. I admit, this is an extreme oversimplification. However, the point that I hope I've made is that Sockets programming, in certain situations, can lend a level of flexibility and adaptability to your company and your MIS department, both of which are helpful in competitive situations.

Using Sockets for your program-to-program communication requirements also provides a level of protocol independence. You have the option of using the existing protocols, or, in a sense, you can create your own protocol. If you use raw Sockets, for example, you have to handle many of the protocol-level issues that are taken care of for you with stream and datagram Sockets. This allows you to create an implementation that may be more appropriate for your environment.

Here's an example of a situation in which you might use Sockets: a company that I previously worked for developed software for hospitals. The software needed to interface with other computers to gather data such as lab results. These interfaces were written using APPC or asynch communications, a perfect situation for using Sockets instead.

AS/400 Sockets Implementation

Sockets on the AS/400 are implemented just like most of the other APIs. However, you can access Sockets only from an ILE C program. As with almost any C program, you're going to need to include some header files supplied by IBM in your project.

There are about 20 different header files supplied by IBM, and you will find them in the library QSYSINC. The two files that you need to include in almost every project are types.h and socket.h. Depending on what your program will do, you may need to include additional header files in your project. Figure 1 describes the Sockets-related header files available in QSYSINC and each file's purpose.

There is a fairly common pattern that you use each time you write a Sockets program. Depending on whether you are writing a client or server program, the function calls happen in a specific order. Figure 2 is a flow chart that describes one typical way of creating a Sockets connection.

The socket() function creates a communications endpoint. It returns a Socket descriptor. A Socket descriptor is a way of referencing the communications endpoint and will be used in other places in the program.

The bind() function assigns a name (an address) to a Socket descriptor. It returns zero when successful and -1 otherwise, with the error code available in errno. This function links the Socket to a local network address, the place where other systems can find it.

The listen() function is used in the server program. It tells the communications layer that the program is ready to accept connections from other systems. Until the listen() function is run, any system trying to attach to your program will be refused. The server then calls accept() to pick up each incoming connection.

The connect() function is used in the client program. That is how the client program attaches to a specific Socket. Once the listen()/accept() and connect() function pairs are run on the server and the client, the Sockets conversation is ready to happen.

The conversation takes place using the send() and recv() functions. When either the server or the client is ready to end the connection, the close() function is used.
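The whole sequence can be sketched in a few dozen lines. The example below is an illustration only, not AS/400-specific code: it assumes a POSIX-style Sockets library, runs the server and the client in a single process over the loopback address, and uses a hypothetical function name, loopback_conversation(). Asking for port 0 and then calling getsockname() is simply a convenient way to get a free port for a demonstration.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical demo: the client sends msg, the server receives it into
   reply (NUL-terminated). Returns bytes received, or -1 on failure. */
int loopback_conversation(const char *msg, char *reply, size_t cap)
{
    struct sockaddr_in addr;
    socklen_t addr_len = sizeof addr;
    int server = -1, client = -1, conn = -1;
    ssize_t n = -1;
    size_t msg_len;

    assert(msg != NULL && reply != NULL && cap > 0);
    msg_len = strlen(msg);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);   /* 0 asks the system to pick a free port */

    /* Server side: socket(), bind(), listen() */
    server = socket(AF_INET, SOCK_STREAM, 0);
    if (server < 0) goto done;
    if (bind(server, (struct sockaddr *)&addr, sizeof addr) != 0) goto done;
    if (listen(server, 1) != 0) goto done;

    /* Learn which port was assigned so the client can connect to it */
    if (getsockname(server, (struct sockaddr *)&addr, &addr_len) != 0)
        goto done;

    /* Client side: socket(), connect() */
    client = socket(AF_INET, SOCK_STREAM, 0);
    if (client < 0) goto done;
    if (connect(client, (struct sockaddr *)&addr, sizeof addr) != 0)
        goto done;

    /* Server side: accept() returns a new descriptor for this client */
    conn = accept(server, NULL, NULL);
    if (conn < 0) goto done;

    /* The conversation: the client sends, the server receives */
    if (send(client, msg, msg_len, 0) != (ssize_t)msg_len) goto done;
    n = recv(conn, reply, cap - 1, 0);
    if (n >= 0)
        reply[n] = '\0';

done:
    /* Either side ends the connection with close() */
    if (conn >= 0) close(conn);
    if (client >= 0) close(client);
    if (server >= 0) close(server);
    return (int)n;
}
```

In a real application the server and client would be separate programs, usually on separate systems, and the server would loop on accept() to handle one client after another. A single recv() is enough here only because the message is tiny; a stream Socket may deliver a longer message in pieces.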

It is important to understand that the function names may vary depending on the implementation. For example, under WinSock, the close() function is actually called closesocket(). The functionality is always available, though the name may change.
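One common way to hide such naming differences is a thin wrapper selected at compile time. The sketch below is an assumption about how you might structure it, not a prescribed method; the names socket_handle and close_socket() are hypothetical. It relies on the _WIN32 macro that Windows compilers define and on WinSock's closesocket() function.

```c
#include <assert.h>

#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET socket_handle;

/* WinSock: descriptors are closed with closesocket() */
int close_socket(socket_handle s)
{
    return closesocket(s);
}
#else
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
typedef int socket_handle;

/* BSD-style systems: descriptors are closed with close() */
int close_socket(socket_handle s)
{
    assert(s >= 0);   /* a negative descriptor is never valid here */
    return close(s);
}
#endif
```

The rest of the program calls close_socket() everywhere and never needs to know which platform it was compiled for.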

CloseArticle()

Sockets are a tremendously powerful addition to OS/400. In a future article, we will give you a working example of a Sockets application between an AS/400 and a Windows program. I hope you've seen that, especially in a heterogeneous environment, Sockets is a key technology. Sockets is not only open, but also often the fastest communications method.

I've recently seen the results of tests comparing Sockets, Named Pipes, and NetBIOS. The test results showed that, almost without regard to platform, asynchronous Sockets were the fastest at every transfer size used. So, just as with ODBC, Sockets again proves my belief that just because a standard is open doesn't mean it is slow.

Hopefully, IBM will come to appreciate the importance of Sockets beyond convincing UNIX sites to migrate to the AS/400. If that happens, we may see the ability to use Sockets from ILE RPG and other ILE languages. That would be a big step in making the AS/400 the "open" server of choice.

Using Sockets also opens your applications up for use over the Internet. IBM announced recently that it will offer a Web server for the AS/400. The options to connect your AS/400 to the Internet are increasing all the time.

If you would like to learn more about Sockets in the meantime, I would encourage you to look at the references provided below. They will help you understand more about this new tool in our AS/400 open arsenal.

Jim Hoopes is a senior technical editor for Midrange Computing.

REFERENCES

AS/400 Sockets Programming (SC41-3422, CD-ROM QBKANN00).

Dumas, Arthur. Programming WinSock. Indianapolis: Sams Publishing, 1995.

Treadwell, David. "Developing Transport-Independent Applications Using the Windows Sockets Interface." Microsoft Development Library [CD-ROM]. Microsoft Corporation, 1995.


Figure 1: Include Files Used for Sockets Programming

Include File   File Supplied In   Description
socket.h       SYS                Defines the basic variables, structures, and functions.
types.h        SYS                Defines data types used in Sockets programs.
un.h           SYS                Defines variables used in AF_UNIX Sockets programs.
ioctl.h        SYS                Defines macros for the ioctl() function.
time.h         SYS                Defines macros needed for the select() function.
param.h        SYS                Defines size limit variables.
uio.h          SYS                Defines macros needed for the readv() and writev() functions.
in.h           NETINET            Defines macros needed for AF_INET Sockets.
ip_icmp.h      NETINET            Defines macros needed for ICMP.
ip.h           NETINET            Defines macros needed for the IP protocol.
tcp.h          NETINET            Defines macros needed for the TCP protocol.
udp.h          NETINET            Defines macros, variables, and structures for the UDP protocol.
netdb.h        H                  Defines macros and functions used for address resolution.
errno.h        H                  Defines errors that happen in Sockets programs.
fcntl.h        H                  Defines macros needed for the fcntl() function.
unistd.h       H                  Defines macros needed by the read() and write() functions.
resolv.h       H                  Defines macros needed by the res*() functions.
if.h           NET                Defines macros needed when accessing interface data.
route.h        NET                Defines functions and macros needed to use route entries.
nameser.h      ARPA               Defines the resolver functions.
inet.h         ARPA               Defines the INET functions.
Figure 2: Typical Sockets Flow Chart

