IBM is constantly
enhancing the HTTP Server (powered by Apache) and has released a series of PTFs
that repair some problems, add new functions, and extend the functions of the
GUI interfaces and forms supported in the administration server. The GUIs now
work as designed. In addition to the PTFs, a major refresh of the server was
released in December 2001, and that brought IBM's implementation of the Apache
Software Foundation's Tomcat Java Application Server (Version 3.2) to the
iSeries. Therefore, IBM's HTTP Server (powered by Apache) capability continues
to increase over time, and it is becoming a more powerful server with every
update.
In this article, I will discuss the virtual hosting features of
OS/400 Apache and show you how to configure a single HTTP Server (powered by
Apache) server instance to run several virtual hosts. By using these
techniques, you can cut down on the number of server instances you need to
configure as well as decrease the OS/400 overhead needed to run multiple
Apache-based Web sites on a single iSeries or AS/400.
Getting the Right Start in Configuring Apache
The primary difference between the Apache-based server
and the original HTTP server is the control you have over the server's features.
This control is implemented by more options and parameters for each server
directive. It is extremely important to understand what each directive is doing
and what options are available for you to use. Train yourself to configure this
server using a text editor (e.g., Notepad or WordPad) and learn the directives
before using the GUIs. To implement many of the advanced functions
available within the Apache-based server, you need to thoroughly understand what
each directive does. The Apache Software Foundation's Web site
(www.apache.org) is the best source for
this documentation. In addition, IBM has published two online PDF documents
containing useful Apache-based information:
What Is a Virtual Server, and Why Do I Need One?
For each server instance,
OS/400 Apache runs a series of jobs (usually four or more jobs) in the QHTTPSVR
subsystem. You can view these jobs by performing a Work with Subsystem Jobs
(WRKSBSJOB) command, as follows:
WRKSBSJOB SBS(QHTTPSVR)
To find
your server jobs in this subsystem, look for the name of your server instance as
the OS/400 job name. A server daemon receives all TCP/IP requests from users.
The daemon locates a thread or causes a thread to be dispatched in one of the
helper (or processing) jobs and passes the request to the job and thread for
processing. Each server instance consumes a significant amount of resources and
introduces overhead on your system.
If you need to support multiple Web
sites in OS/400 Apache, you may create a server instance with its own
configuration file for each Web site uniform resource identifier (URI). (Note:
URI is often used interchangeably with URL, but it is the more accurate term
because it covers a broader scope than the HTTP addresses usually associated
with URLs.) This approach works well, but it requires that each server instance
be configured to use a unique port number or a unique IP address. Separate
server instances also add overhead to your system, as I just described. That's
where virtual servers and virtual hosts come in.
Before You Virtual Host
Before exploring virtual servers, it is important to
understand how the server connects to IP addresses and ports. All TCP/IP servers
connect to the communications subsystem by creating a programming device called
a socket. The socket is the interface between an application program (our
OS/400 Apache-based server) and the telecommunications subsystem (TCP/IP).
When a server starts, it issues a BIND command with various options that control
which IP address(es) and port(s) the server will listen to for messages from the
network.
When you configure the Apache-based server with the Listen 80
directive, you are telling the server to create a socket that will receive
messages addressed to port 80 for all of the IP addresses defined to your
system. An iSeries machine running OS/400 V5R1 can support up to eight
communications adapters with 2000 IP addresses defined for each adapter, up to a
maximum of 16,000 IP addresses. With the Listen 80 directive, the server accepts
traffic addressed to port 80 on any of those addresses. This may be desirable
sometimes, but it is not what most users want. If you want to configure your
server to accept only messages addressed to port 80 on the 10.130.39.223 IP
address, you have two options. You can enter a Listen directive for
10.130.39.223 that explicitly
specifies the IP address and port number to listen for, as
follows:
Listen 10.130.39.223:80
Or you can enter a generic Listen
directive for the 10.130.39.223 address followed by a Port directive that
specifies the port numbers to listen to. This configuration would look like the
following example:
Listen 10.130.39.223
Port 80
It's important
to note that the Port directive in the second configuration is only valid when
there are no port numbers specified in your Listen directive. If you use the
port number parameter in a Listen directive, the Port directive for any IP
addresses that are specified in the Listen directive is ignored.
The IBM
HTTP Server (powered by Apache) offers extreme flexibility in choosing network
connection options. Unlike the original HTTP Server for AS/400, the Apache-based
server allows you to specify lists of IP addresses that you want your server to
listen on. To specify that the server should listen to ports 80 and 8081 on
three separate IP addresses, enter this group of directives in your server
instance:
Listen 10.130.39.223
Listen 10.130.39.225
Listen 10.130.39.227
Port 80,8081
Using these directives, each of these IP
addresses would act as a virtual server for your Apache-based server instance,
and your server would answer any HTTP requests on your network for files at
these particular IP addresses and port numbers. If you code these directives in
any of the sample configurations that were listed in my earlier article
"OS/400 Apache Has Arrived,"
then a request from a browser targeting any of the supported IP addresses will
receive exactly the same pages from the same directory structure.
Why Use Virtual Hosting?
Virtual hosting is used to share a single server
instance (one daemon job and one set of request processing jobs) among several
URIs. It provides a technique for allowing the instance to support multiple,
individual Web sites, each with its own domain name, IP address, and content.
This approach reduces the number of configuration files you must maintain, and
it reduces OS/400 system overhead because each site shares the same OS/400
processing jobs. The HTTP Server (powered by Apache) supports four types of
virtual hosting:
- IP-based virtual hosting
- Name-based virtual hosting
- Mixed-mode virtual hosting
- Dynamically configured mass virtual hosting
IP-Based Virtual Hosting
An IP-based virtual host allows a single server
instance to be bound to a list of IP addresses and allows you to control the
directories, CGI library paths, and other resources that can be accessed by the
server associated with the virtual host/IP address combination. For several
years, IP-based virtual hosting was popular with many ISPs that hosted Web
sites. To their clients, it appeared as if each client had its own isolated Web
server. If a reverse lookup on the IP address of the client's Web site were
performed, it would resolve to the domain name of the Web site. With a shortage
of IP addresses and serious limitations on the number of IP addresses that an
ISP will assign to a shop, this form of virtual hosting is becoming limited.
Today, it is all but impossible for most individual companies to acquire their
own block of public IP addresses. Your ISP generally provides as few as it
possibly can. A
commercial DSL connection from AT&T comes with 29 usable addresses. A few
ISPs are allocating a full 256-address class-C block, but this is rare. As a
result, IP-based virtual hosting is on the decline and is not as significant as
it was five years ago.
For more information on IP-based virtual hosting, see the
Apache IP-based Virtual Host Support Web site.
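As a minimal sketch of the idea, assuming two IP addresses are already defined
to the system, an IP-based configuration might look like the following. The host
names and directory paths are hypothetical placeholders, and the directives are
the standard Apache ones; check IBM's directive reference before relying on
them for OS/400.

```apache
# Hypothetical IP-based layout: one IP address per Web site.
Listen 10.130.39.223:80
Listen 10.130.39.225:80

# Requests arriving on 10.130.39.223 are served from the first site.
<VirtualHost 10.130.39.223>
    ServerName www.clienta.example
    DocumentRoot /www/clienta/htdocs
</VirtualHost>

# Requests arriving on 10.130.39.225 are served from the second site.
<VirtualHost 10.130.39.225>
    ServerName www.clientb.example
    DocumentRoot /www/clientb/htdocs
</VirtualHost>
```

Because each container is tied to its own address, the server selects the
virtual host from the IP address on which the request arrived, and no
NameVirtualHost directive is involved.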
Name-Based Virtual Hosting
Name-based virtual hosts were introduced with the
HTTP 1.1 specification and were also supported through a special update to
HTTP 1.0. Name-based virtual hosts exploit a new request header field, HOST,
defined in HTTP 1.1 and in the patched HTTP 1.0. This header specifies the host
name typed by the user or picked up from a hyperlink on an HTML page. If you
typed www.w3.org/pub/WWW/ in your browser's address bar, an HTTP 1.1 or patched
HTTP 1.0 browser would send the following message to the server:
GET /pub/WWW/ HTTP/1.1
Host: www.w3.org
The host parameter is passed to the Apache-based server and
parsed according to the rules defined in the server configuration
file.
HTTP 1.1 headers are discussed in a document published on the World
Wide Web Consortium's (W3C) Web site,
"Hypertext Transfer Protocol - HTTP/1.1".
The HOST header is documented in section 14.23.
Some older browsers do not support the HOST extension of HTTP 1.0 or
HTTP 1.1. If an older browser attempts to access a server that supports
multiple name-based hosts, it will be given the first virtual host defined to
the server. That first virtual host therefore acts as the default server.
Mixed-Mode Virtual Hosting
Unlike the original HTTP Server that supports virtual
hosting using either name-based hosting or IP-based virtual hosting, the
Apache-based server can support a mixture of both methods at the same time.
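A mixed-mode setup can be sketched as follows, assuming two addresses are
defined to the system. The secure.myco.com host name and the /www/secure path
are hypothetical, and the fragment uses standard Apache directives rather than
anything confirmed to be specific to OS/400.

```apache
Listen 10.130.39.223:80
Listen 10.130.39.225:80

# Name-based hosts: both share 10.130.39.223 and are told apart
# by the host name the browser sends.
NameVirtualHost 10.130.39.223

<VirtualHost 10.130.39.223>
    ServerName devtest.myco.com
    DocumentRoot /www/devtest/htdocs
</VirtualHost>

<VirtualHost 10.130.39.223>
    ServerName devtest2.myco.com
    DocumentRoot /www/devtest2/htdocs
</VirtualHost>

# IP-based host: 10.130.39.225 has no NameVirtualHost directive,
# so this container answers every request sent to that address.
<VirtualHost 10.130.39.225>
    ServerName secure.myco.com
    DocumentRoot /www/secure/htdocs
</VirtualHost>
```

The only thing that makes an address name-based is the NameVirtualHost
directive that names it; any other address behaves as an IP-based host.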
Dynamically Configured Mass Virtual Hosting
The cross-platform--generally UNIX--version of the
Apache-based server is the choice of ISPs and companies that host Web sites
because it is free. Most ISP-hosted Web sites are low activity sites and can be
hosted on a single machine. The three methods of virtual hosting--IP-based,
name-based, and mixed-mode--require custom configuration of the Apache
configuration file and the stopping and starting of the server to implement a
new virtual host. ISPs wanted a method in which new servers could be added to
existing systems without requiring the stopping, starting, and configuration
modification of the existing systems. The Apache Software Foundation developed a
remarkable technique called dynamically configured mass virtual hosting.
Once it is configured and implemented, you can add virtual hosts without
stopping and starting your server.
For more information on dynamically configured mass
virtual hosting, see
Apache's Dynamically Configured Mass Virtual Hosting Web site.
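In cross-platform Apache, this technique is typically provided by the
mod_vhost_alias module. The fragment below is a sketch based on that module's
documentation; the /www/hosts path is hypothetical, and whether your OS/400
release accepts these directives unchanged is an assumption you should verify
against IBM's directive reference.

```apache
# Build self-referencing URLs from the host name the browser sent.
UseCanonicalName Off

# %0 is replaced by the full requested host name, so a request for
# www.other.yourco.com is served from /www/hosts/www.other.yourco.com/htdocs.
VirtualDocumentRoot /www/hosts/%0/htdocs
```

With a layout like this, adding a site means creating a new directory and a
DNS entry rather than editing the configuration file.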
Which Method Works Best with OS/400 Apache?
The following paragraphs provide step-by-step
instructions to implement name-based virtual hosting. While the other methods
described earlier are certainly supported by IBM's HTTP Server (powered by
Apache), I am describing name-based hosting since it is the most likely choice
to be used, for several reasons. IP addresses are now at a premium, and few
shops have as many IP addresses as IP-based hosting would require. While
dynamically configured mass virtual hosting sounds like a neat solution, there
are some significant limitations regarding authentication, security, and CGI.
This might be a great solution if all you are doing is hosting a large number of
static page Web sites, but it's not a good solution for Web sites with
e-business requirements.
Configuring a Virtual Host
Before I begin to explain the configuration of
name-based virtual hosts and the directives required to implement two hosts in
one server instance, I want to discuss the Domain Name System (DNS) and the DNS
records necessary to support name-based virtual hosts.
DNS Considerations
Your DNS, either hosted by your company or by your
ISP, needs a type-A record for each domain name that you want to translate to an
IP address. If you use an NSLOOKUP tool while looking up the type-A record for
the URL www.iseries.ibm.com, you will get a reply that looks something like the
following:
www.iseries.ibm.com
    type = CNAME, class = 1, ttl = 38097, dlen = 18
    alias = as400.rochester.ibm.com
as400.rochester.ibm.com
    type = A, class = 1, ttl = 42289, dlen = 4
    IP address = 208.222.150.11
You will see that the URL www.iseries.ibm.com is a Canonical Name (CNAME), or
alias, for as400.rochester.ibm.com on address 208.222.150.11. CNAMEs, or
separate domain names, point to the same IP address for virtual-named hosting.
Your DNS
administrator, your ISP, or you will need to implement your company's domain
name (e.g., www.yourco.com) in your company's DNS. A CNAME must be defined for
each alternate name within the same domain. For example, you might want to
create the following URLs:
- www.yourco.com
- www.importantproduct.yourco.com
- www.other.yourco.com
You can register other domains that you
might use, such as www.mybigproduct.com. Many companies register separate
domains for their well-known products. Motion picture studios and television
networks often create Web sites for each of their specific movies or television
shows.
Figure 1 below shows my sample configuration, in which two host names
(devtest.myco.com and devtest2.myco.com) point to 10.130.39.223. Because I did
not code the UseCanonicalName off directive, I required a CNAME for both
devtest and devtest2. If I had used the UseCanonicalName off directive, I could
have avoided implementing a CNAME for devtest.myco.com.
The UseCanonicalName directive controls how the Apache-based server
constructs a self-referencing URL. A self-referencing URL is the name by
which the server refers to itself, and it is required when translating
abbreviated names to full names. When the UseCanonicalName on directive is used,
the server will build a URL for itself from its configured server name and port
number. When the UseCanonicalName off directive is coded, the server will use
the HTTP 1.1 HOST name field and the port number to build its self-referencing
URL. For more information on the DNS limitations and constraints that affect
the Apache-based server, I recommend that you read the Apache Software
Foundation's online article,
"Issues Regarding Apache and DNS."
Configuring a Name-Based Virtual Host
Figure 1 illustrates a name-based virtual host
configuration that supports two virtual hosts, devtest.myco.com and
devtest2.myco.com. (Note: Because this example specifies the IP address
explicitly, my server will start and run regardless of my DNS server's status.)
01 # Configuration originally created by Apache Setup Wizard Tue Mar 27
02 00:12:08 GMT+00:00 2001
03 ServerName devtest.myco.com
04 Listen 10.130.39.223:80
05 #DocumentRoot /www/devtest/htdocs
06 DefaultType text/plain
07 HostNameLookups Off
08 ErrorLog logs/basic_error_log
09 LogLevel warn
10 Options ExecCGI FollowSymLinks SymLinksIfOwnerMatch Indexes MultiViews
11 RuleCaseSense Off
12 DirectoryIndex index.htm
13 LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
14 LogFormat "%{User-agent}i" agent
15 LogFormat "%{Referer}i -> %U" referer
16 LogFormat "%h %l %u %t \"%r\" %>s %b" common
17 CustomLog logs/access_log combined
18 BrowserMatch "Mozilla/2" nokeepalive
19 BrowserMatch "JDK/1.0" force-response-1.0
20 BrowserMatch "Java/1.0" force-response-1.0
21 BrowserMatch "RealPlayer 4.0" force-response-1.0
22 BrowserMatch "MSIE 4.0b2;" nokeepalive downgrade-1.0 force-response-1.0
23 AddHandler server-parsed .htm .html
24 ReWriteEngine On
25
26 NameVirtualHost 10.130.39.223
27
28 <VirtualHost 10.130.39.223>
29 DocumentRoot /www/devtest/htdocs
30 ServerName devtest.myco.com
31
32 #Devtest Document Root Directory
33 <Directory /www/devtest/htdocs>
34 AllowOverride None
35 Options +Includes
36 order allow,deny
37 allow from all
38 </Directory>
39
40
41 AllowOverride None
42 Options +Indexes
43 order allow,deny
44 ProfileToken On
45 AuthName "Developers Private Area"
46 AuthType Basic
47 UserID %%SERVER%%
48 PasswdFile %%SYSTEM%%
49
50 require valid-user
51
52
53 </VirtualHost>
54
55 <VirtualHost 10.130.39.223>
56 DocumentRoot /www/devtest2/htdocs
57 ServerName devtest2.myco.com
58
59 #Devtest2 Document Root Directory
60 <Directory /www/devtest2/htdocs>
61 AllowOverride None
62 Options +Includes
63 order allow,deny
64 allow from all
65 </Directory>
66
67 </VirtualHost>
68
69 ScriptAliasMatch ^/cgi-bin/(.*) /qsys.lib/cgidev.lib/$1.pgm
70 Alias /test/ /www/devtest/htdocs/private/test/
71 ScriptAlias /cgi-dta/ /qsys.lib/nddevtst.lib/nddevtst.pgm/
72
73 #Server Root
74
75 Options +Indexes +Includes
76 AllowOverride None
77 order deny,allow
78 deny from all
79
80
81 #Net.Data Directory
82
83 AllowOverride None
84 Options +ExecCGI +Includes
85 order allow,deny
86 allow from all
87 CGIConvMode %%EBCDIC/MIXED%%
88
89
90 #CGI Directory
91
92 AllowOverride None
93 Options +ExecCGI +Includes
94 order allow,deny
95 allow from all
96 CGIConvMode %%EBCDIC/MIXED%%
97
Figure 1: This HTTP Server instance contains a name-based virtual host
configuration that supports two virtual hosts.
Lines 1 through 25 are identical to the single
server instances that I created and described earlier. On line 3, I named the
server instance using the ServerName devtest.myco.com directive. This becomes
the default server for older browsers that do not support the HTTP 1.0 extension
or HTTP 1.1 HOST name header field. On line 4, I am binding the server to a
single IP address that is listening on port 80 using the Listen 10.130.39.223:80
directive. Lines 1 through 25 apply to the server in general and affect both of
the virtual hosts that I will code.
Line 26 declares that I want to use
virtual hosts on IP address 10.130.39.223. The NameVirtualHost 10.130.39.223
directive turns on virtual hosting for this server instance and enables virtual
hosting for the specified IP address. I can code multiple NameVirtualHost
directives if I want to support virtual hosts on multiple IP addresses.
The next important statements are the <VirtualHost> and </VirtualHost>
container tags. Using XML-like syntax, the first tag defines the beginning of a
virtual host definition and the second terminates the container definition.
Information on the lines falling within the tags controls the behavior and
characteristics of the virtual host that I am defining.
On line 28, I
coded the start virtual host container directive for the first virtual host as
follows:
<VirtualHost 10.130.39.223>
This directive tells
the server that I am beginning to define a virtual host and that I want the
server to listen for messages on 10.130.39.223. The cross-platform version of
Apache allows you to code the wildcard asterisk (*) symbol, which is not
supported in IBM's OS/400 implementation. If you were allowed to code the
asterisk, the statement would instruct the server to listen for this virtual
host on the IP address defined in the Listen directive as specified in line 4.
An alternative is to code this directive with a host name in place of the IP
address (for example, <VirtualHost devtest.myco.com>), which instructs the
server to perform a DNS lookup to resolve the address.
The first DocumentRoot directive is coded on line 29,
and it appears as follows:
DocumentRoot /www/devtest/htdocs
The next DocumentRoot directive is coded on line 56, and it appears as
follows:
DocumentRoot /www/devtest2/htdocs
Using these directives, I am defining a specific document root path for each of
my virtual hosts. The directive on line 56 is in the second virtual host
container (devtest2.myco.com), and it specifies a document root that is
different from the one on line 29 (which was coded for the first virtual host
container, devtest.myco.com).
I want to serve
different Web pages for each virtual server, so I added the following directives
to the first virtual server, devtest.myco.com. Remember, all of
that server's directives are defined between the <VirtualHost> and
</VirtualHost> tags on lines 28 and 53.
On line 30, I coded a
ServerName directive that provides the domain name for this specific virtual
host.
On lines 33 through 38, I have coded a <Directory> container
to define the document root directory for this server.
On lines 40
through 51, I have defined a private authenticated directory that is protected
by OS/400 User Profiles. See
"Putting OS/400 Apache to Work with CGI, Authentication, and SSI"
for a description of how this authentication works.
The second virtual
server for devtest2.myco.com begins on line 55 with a second <VirtualHost>
tag and ends on line 67 with its matching </VirtualHost> tag. Line 56
defines a unique document root directory for this server. Similar to line 30,
line 57 names this server devtest2.myco.com. Lines 60 through 65 define the
document directory used by this server.
Lines 69 through 97 are
identical to items that I described in
"Putting OS/400 Apache to Work with CGI, Authentication, and SSI,"
and they are available to both virtual servers, since they are coded outside the
<VirtualHost> and </VirtualHost> containers.
Any directives
that are not coded within a virtual host container apply to all virtual hosts.
This means that both servers can call CGI programs (lines 90 through 97) or
Net.Data macros (lines 81 through 88) from the same libraries. All of these
directives can be moved into the virtual host containers and be made unique for
each virtual host. The IBM HTTP Server (powered by Apache) gives you incredible
flexibility in this area.
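For instance, a sketch of the second virtual host with its own CGI mapping
might look like the following. The cgidev2.lib library name is hypothetical and
is used only to show a mapping that differs from the shared one on line 69.

```apache
<VirtualHost 10.130.39.223>
    DocumentRoot /www/devtest2/htdocs
    ServerName devtest2.myco.com

    # Hypothetical library: CGI programs visible only to this virtual host.
    ScriptAliasMatch ^/cgi-bin/(.*) /qsys.lib/cgidev2.lib/$1.pgm
</VirtualHost>
```

A mapping coded this way applies only to requests for devtest2.myco.com, while
the directives left outside the containers continue to serve every host.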
If you want to know which directives can and cannot
be coded inside a virtual host container, refer to the
Apache Directives Web site.
You may also use the "HTTP Server (powered by Apache) Directives Organized
by Module" online reference manual I listed earlier. IBM's online reference
manual follows the Apache Software Foundation's convention for server directive
documentation. For example, the following is a small segment copied from the
Apache Software Foundation's documentation. IBM's OS/400 Apache documentation
also follows this format:
<Directory> directive
Syntax: <Directory directory> ... </Directory>
Context: server config, virtual host
Status: Core
In this case, the <Directory> and </Directory> tags are
used to enclose a group of directives that will apply only to the named
directory and subdirectories of that directory. Any directive that is allowed in
a directory context may be used.
Pay particular attention to the context
line, which tells you that this directive may appear in the general server
configuration or in a virtual host. For more information on virtual host
matching, see Apache's online documentation,
"An In-depth Discussion of Virtual Host Matching."
The Apache-based server provides an amazing array of flexible features
and options to build an HTTP server that best suits your specific business
requirements. Unlike the original IBM HTTP Server that supported both
virtual IP and virtual name-based hosting, the Apache-based server provides many
more options, allowing you to mix and match to create the exact environment that
best suits your needs.
Earlier, I described name-based virtual
hosting in detail, but you may find that a mixture of all four methods works
best for you. Still, you may want to run some single-instance servers to have
better isolation and control over that single server. You also need to consider
change control and the impact of running development and quality assurance
servers in addition to your production server on the same or different machines.
Bob Cancilla has been actively involved in the
development of e-business systems using iSeries technology since 1994 and is the
managing director of IGNITe/400, a nonprofit iSeries e-business user group. You
can reach Bob at
bobc@ignite400.org.