19
Fri, Apr
5 New Articles

BCP: The Incredible Bulk

Typography
  • Smaller Small Medium Big Bigger
  • Default Helvetica Segoe Georgia Times

Ask any network administrator what the most important developments of the last 20 years are, and most will name TCP/IP near the top of the list. Prior to the development of a standard network protocol, getting computers to exchange information was a difficult and arduous task. Each type of computer used its own method of communication, and converting between methods often involved several steps. Unfortunately, most database administrators still face that situation every day.

Although the world has standardized on a single network protocol, every RDBMS uses its own method for handling data. It is common to find one RDBMS at the branch or small-office level, while the central office uses another, and yet a third is used in decision- support functions.

Transferring data between different RDBMS systems can involve considerable difficulty and expense, especially if real-time data consistency is required. Often, the solution involves expensive middleware products and replicated data, which add another level of complexity and administrative headaches. If real-time data consistency is not required, however, you can often transfer data between databases by using the tools and utilities included with the RDBMS. This article describes just such a solution for transferring data between Microsoft SQLServer and IBM DB2 on an AS/400.

A Solution to a Problem

The problem involves a medium-sized company with about 20 branch offices running Windows NT as the primary file and print server. The central office uses an AS/400 running DB2 as its primary decision-support database. The company has decided to implement an SQLServer application at each of the branches but still needs a way to collect nightly summary information and import it into the central DB2 database. Since the data is only a daily summary, real-time data consistency between the two platforms is not a requirement. In addition, the customer is looking for the simplest and most cost-effective solution, preferably with a minimum of custom programming.

SQLServer includes native database replication and partitioning that can transfer data but only with other SQLServers. In addition, SQLServer 7.0 now includes a number of data-transformation utilities designed to move data to and from other databases.

However, these utilities are primarily graphical, and the lack of a local database administrator at each branch creates a need to find the simplest solution possible.

The answer to this problem involves an SQLServer command-line utility: the Bulk Copy Program (BCP). BCP is designed specifically to transfer information between SQLServer and a flat-text file. The text file can then be copied across the network and imported into any other RDBMS. The command-line nature of BCP makes it very easy to incorporate into batch files, which the Windows NT Scheduler service can then execute at specific times. In the end, the final solution takes the following form:

1. At each of the branch NT servers, the Windows NT Scheduler service calls a batch file every night at midnight.

2. The batch file first executes a BCP command that executes a query against the database and stores the output in a comma-delimited text file with the branch code as part of the filename.

3. The batch file then starts the NT FTP client by using a predefined script and transfers the results file to an FTP server on the AS/400.

4. A process on the AS/400 imports the files into the DB2 database.
5. Each morning, an administrator at the central office checks the results to be sure that all files have been received and that the import process has finished correctly. If any files have not been received correctly, the administrator recopies the file manually and reruns the import process on that file.

The solution proves extremely reliable, in part because of its simplicity. This article discusses the first three steps in the process: calling a batch file, using the batch file to call a BCP, and FTPing the file to an AS/400.

The BCP Command Line

By default, BCP is installed in the MSSQL7BINN directory and included with SQLServer specifically to provide a way to import and export data by using flat-text files. These text files can then be shared with many other programs, including spreadsheets, word processors, and, in this case, other databases.

Unfortunately, this speed and flexibility comes at a price. The complex nature of the program requires a large number of command-line parameters. You probably need to experiment by testing a BCP command before trying it in a production environment. Figure 1 shows several sample BCP command lines that illustrate most of the parameters commonly used with the utility along with definitions for the most common parameters. All these command lines are designed to work with the Pubs sample database that comes with SQLServer. For more information on all of the BCP command line options, you can refer to the Web sites listed at the end of this article.

You can execute BCP against any table or view, or BCP can execute an ad hoc SQL query, as shown in example C (Figure 1). In each of the examples, I have also specified the database name (Pubs) and the table or view owner (dbo); both of these items are optional. If no database is specified, BCP automatically executes against the user’s default database. However, if the user executing BCP is not the table owner and the table owner is not specified, an error is returned and the BCP operation aborts.

The next parameter defines the mode of operation for BCP. If a table or view is specified, you can transfer data either in or out. If used with an ad hoc query, BCP should use the queryout mode, which will execute the SQL statement included in the command line. This allows for more complex data manipulation before the text file is generated, such as column summation or table joins. Finally, you use the FORMAT mode with a format file to alter data as it is imported or exported. (The FORMAT mode is discussed later in more detail.) Each command line specifies the data file used in the operation (in this case, bcpdata.txt).

Having told BCP what mode to use, the next step is to give the command its security context. The -T switch, seen in example A, specifies that BCP use a trusted

connection, meaning that you must integrate the SQLServer with the Windows NT security subsystem. If SQLServer were installed to use its own security, you must give BCP a login ID and password by using the -U and -P switches respectively, as in example B. Finally, example B also shows the -S switch, which can determine the server that the BCP command is executed against. If the server is not listed, the BCP command executes against the local SQLServer.

A user can also specify the field and row terminators used during the BCP operation by using the -t and -r switches respectively. The default is to use a tab ( ) as the field separator and a new line ( ) as the row separator, although these are not needed if a format file is used. One very common change is to use the -t,” switch to identify a comma- delimited data file, as shown in example A.

The -F and -L switches specify the first and last rows to be copied by using BCP. The default is for BCP to duplicate all rows in the table or view (-F1 to -L0). BCP also includes several methods for handling errors. The -e switch, seen in example B, records any error messages to the listed file, while the -m switch causes BCP to abort after it has reached the set number of errors.

Data Types and Format Files

Having selected from this wide range of parameters, you are now ready to perform your first BCP operation. As any experienced database administrator knows, however, the fun is just beginning. Most problems that arise while transferring data between databases involve the different data types that each RDBMS uses. Fortunately, BCP has many options for controlling the type of data being imported or exported. Several default switches can be used for standardized data types:

• The -n switch indicates that the data file uses native database data types.
• The -c switch indicates that the data file uses character data types for all fields. (This switch is the most portable format.)

• The -w switch indicates that the data file uses Unicode data types. (This switch does not work with SQLServer 6.5 and earlier.)

• The -N switch indicates that the data file uses Unicode for character data and native data types for numeric data. (This switch is a faster alternative to -w and also does not work with SQLServer 6.5 and earlier.)

• The -6 switch indicates that the data file uses SQLServer 6.0 or 6.5 data types. You should try your BCP command by using several of these data types to see which one works best for your situation. Often, you can use the -n switch even when moving data between different RDBMSs, however, the -c switch will create the most portable data file. This will probably be the best option when transferring data to DB2/400. If none of these options works for your situation, never fear. BCP has another tool for manipulating data: the format file.

If you execute a BCP command without listing one of the five options above, the command prompts you for the name, data type, and length of each field in the table or view. The information you enter can then be stored in a format file by using the -f switch (FORMAT mode) to specify the name of the output file. You can also use the information with future BCP operations to provide customized data types. The FORMAT mode of BCP automatically generates a format file for the given table by using one of the five standard data typing schemes listed above. If none of these schemes proves satisfactory, however, you can manually create or edit the format file to fit your particular needs. Figure 2 shows a sample format file.

The first row of the format file specifies the version of BCP to be used, and the second is the number of fields or columns in the database table. Each subsequent row defines a field in the table, listing the name, length, data type, field terminator, and row terminator. For example, all the items in Figure 2 are of the generic SQLCHAR data type and use tabs as the field terminator. Figure 3 shows three different format files that were

generated from the discounts table in the Pubs database and that use three different sets of data types. In general, the character mode is the most useful mode when transferring data to another RDBMS, although care has to be taken to be sure that the data types are read correctly when importing data on the remote database.

In addition to controlling data types, format files are useful for combining or separating fields while moving data between a table and the data file. They are also useful for changing the order of the fields. (More information about using format files to change data types and field order is available from the Microsoft Web site at technet.microsoft.com/cdonline/content/complete/srvnetwk/SQL/manuals/admincmp/75517 c02.htm.)

FTP Scripts and Batch Files

Returning to the original problem of transferring data to DB2/400, you have now created a BCP command and format file that creates the desired output data file. You have also tested the data file with the import process on the AS/400 and determined that the data is being successfully entered into DB2. All that remains is to design a way to transfer the flat file from the NT server to the AS/400. Here again, you turn to simple command-line utilities to move the data file (namely FTP, batch files, and the NT Scheduler service).

Figure 4 shows a sample batch file that can start BCP and FTP the file to the AS/400. The batch file first uses BCP to create an output file from two different tables on the SQLServer. It then starts the FTP client and transfers those files to the AS/400 by using an FTP script similar to the one in Figure 5. The batch file is started by using the NT Scheduler service and AT command and is set to run every evening at midnight. (You can find more information about using FTP to transfer data between NT and the AS/400 in
“Setting Up Windows NT to FTP to the AS/400,” AS/400 Network Expert, September/October 1999.)

Simplify! Simplify! Simplify!

More can certainly be done to provide a more automated solution for the customer. There is no error checking anywhere in the process, other than in the BCP error output file. All error checking is performed the following morning by an employee at the central office, who checks to see that all files have been transferred to the AS/400 and properly imported into the DB2 database. The client has a very reliable network, and the simple nature of the solution means that problems are rare. However, in the event of an error, the employee reruns the BCP and FTP processes manually, remotely running the same batch file called by the Scheduler service. The new files are then reimported into DB2/400. You can automate this whole process easily by using any of the more complicated scripting hosts, such as Perl or Windows Script Host (WSH). The command-line nature of BCP means that it can be called from any programming language that can execute operating-system instructions. In this case, the customer is looking for an extremely simple solution. By using native utilities, you can deliver one and meet the client’s needs.

References

• “Importing and Exporting Data,” Microsoft TechNet: technet.microsoft.com/ cdonline/content/complete/srvnetwk/SQL/manuals/admincmp/75517c02.htm
• “Setting Up Windows NT to FTP to the AS/400,” John P. Lyons, AS/400 Network Expert, September/October 1999

Related Materials

• “Command Prompt Utilities,” Microsoft TechNet: technet.microsoft.com/cdonline/ content/complete/srvnetwk/SQL/manuals/utilref2/chpt4/75524c16.htm
• “Optimizing Utility and Tool Performance,” Microsoft TechNet: technet. microsoft.com/cdonline/content/complete/srvnetwk/SQL/manuals/diag/part/75528c04.htm

IN BCP transfers data from the text file into the specified table OUT BCP transfers data from the table into a text file QUERYOUT BCP executes the SQL query from the command line FORMAT BCP creates a format file named with the -f switch
-T Use a trusted connection (integrated with NT security) -Uxxxx Specify a database login user ID
-Pxxxx Specify a password
-Sxxxx Specify the SQLServer to log in to
-t"x" Specify the field terminator in quotes (default is tab, ) -r"x" Specify the row terminator in quotes (default is a line feed, )
-Fx Specifies the number of the first row to copy from the table -Lx Specifies the number of the last row to copy from the table -e"x:xxx" Specifies the error output file
-m10 Specifies the maximum number of errors before the BCP process aborts
-n BCP will use native SQLServer 7.0 data types for the output -c BCP will use character output for all fields
-w BCP will use Unicode datatypes for the output
-N BCP will use Unicode for character data and native data types for numeric
-6 BCP will use SQL 6.0 and 6.5 datatypes for the output -f"x:xxx" Specifies the name of the format file to be created or used by this BCP operation Ex. a) bcp pubs.dbo.authors in bcpdata.txt -T -t"," -F100 -L200 -n Ex. b) bcp pubs.dbo.titles out bcpdata.txt -Smssql -Usa -Ppassworde"c:error.txt" -m10 -c Ex. c) bcp pubs.dbo "select * from authors where authors.state = ëCAí" queryout bcpdata.txt -T Ex. d) bcp pubs.dbo.titleview format bcpdata.txt -Usa -Ppassword -t","f"C:format.txt"

Figure 1: Here are common BCP command-line options and illustrations of their use.

Version Number of

Columns

Host File Field Order
7.0

5

1 SQLCHAR 0 4 “ ” 1 pub_id 2 SQLCHAR 0 40 “ ” 2 pub_name 3 SQLCHAR 0 20 “ ” 3 city
4 SQLCHAR 0 2 “ ” 4 state
5 SQLCHAR 0 30 “ ” 5 country

Host File Data Type Host File Data Length

Prefix Length Terminator Server

Column Name

Figure 2: These items all use tabs as the field terminator.

BCP Format Files
Native Mode (-n)
7.0

5

1 SQLCHAR 2 40 “” 1 discounttype
2 SQLCHAR 2 4 “” 2 stor_id
3 SQLSMALLINT 1 2 “” 3 lowqty
4 SQLSMALLINT 1 2 “” 4 highqty
5 SQLDECIMAL 1 19 “” 5 discount

Character Mode (-c)

7.0

5

1 SQLCHAR 0 40 “ ” 1 discounttype
2 SQLCHAR 0 4 “ ” 2 stor_id
3 SQLCHAR 0 7 “ ” 3 lowqty
4 SQLCHAR 0 7 “ ” 4 highqty
5 SQLCHAR 0 41 “ ” 5 discount

Unicode Mode (-w)

7.0

5

1 SQLNCHAR 0 80 “ ” 1 discounttype
2 SQLNCHAR 0 8 “ ” 2 stor_id
3 SQLNCHAR 0 14 “ ” 3 lowqty
4 SQLNCHAR 0 14 “ ” 4 highqty
5 SQLNCHAR 0 82 “ ” 5 discount

Figure 3: These format files use three different sets of data types.

dumpdb.bat

bcp db1.owner.table1 out c:dataoutserver1table1.dat -Usa -Ppassword -ftransfer1.fmt
bcp db1.owner.table2 out C:dataoutserver1table2.dat -Usa -Ppassword -ftransfer2.fmt

ftp -s:c:atas400ftp.cmd

Figure 4: This batch file can start BCP.

as400ftp.cmd

open ftp.as400.mycompany.com
myuserid

Server Column Order

mypassword
cd /pub/ftp-in/db2files
lcd d:dataout
binary
prompt
mput *.dat
quit

Figure 5: This FTP script transfers files.

BLOG COMMENTS POWERED BY DISQUS

LATEST COMMENTS

Support MC Press Online

$0.00 Raised:
$

Book Reviews

Resource Center

  • SB Profound WC 5536 Have you been wondering about Node.js? Our free Node.js Webinar Series takes you from total beginner to creating a fully-functional IBM i Node.js business application. You can find Part 1 here. In Part 2 of our free Node.js Webinar Series, Brian May teaches you the different tooling options available for writing code, debugging, and using Git for version control. Brian will briefly discuss the different tools available, and demonstrate his preferred setup for Node development on IBM i or any platform. Attend this webinar to learn:

  • SB Profound WP 5539More than ever, there is a demand for IT to deliver innovation. Your IBM i has been an essential part of your business operations for years. However, your organization may struggle to maintain the current system and implement new projects. The thousands of customers we've worked with and surveyed state that expectations regarding the digital footprint and vision of the company are not aligned with the current IT environment.

  • SB HelpSystems ROBOT Generic IBM announced the E1080 servers using the latest Power10 processor in September 2021. The most powerful processor from IBM to date, Power10 is designed to handle the demands of doing business in today’s high-tech atmosphere, including running cloud applications, supporting big data, and managing AI workloads. But what does Power10 mean for your data center? In this recorded webinar, IBMers Dan Sundt and Dylan Boday join IBM Power Champion Tom Huntington for a discussion on why Power10 technology is the right strategic investment if you run IBM i, AIX, or Linux. In this action-packed hour, Tom will share trends from the IBM i and AIX user communities while Dan and Dylan dive into the tech specs for key hardware, including:

  • Magic MarkTRY the one package that solves all your document design and printing challenges on all your platforms. Produce bar code labels, electronic forms, ad hoc reports, and RFID tags – without programming! MarkMagic is the only document design and print solution that combines report writing, WYSIWYG label and forms design, and conditional printing in one integrated product. Make sure your data survives when catastrophe hits. Request your trial now!  Request Now.

  • SB HelpSystems ROBOT GenericForms of ransomware has been around for over 30 years, and with more and more organizations suffering attacks each year, it continues to endure. What has made ransomware such a durable threat and what is the best way to combat it? In order to prevent ransomware, organizations must first understand how it works.

  • SB HelpSystems ROBOT GenericIT security is a top priority for businesses around the world, but most IBM i pros don’t know where to begin—and most cybersecurity experts don’t know IBM i. In this session, Robin Tatam explores the business impact of lax IBM i security, the top vulnerabilities putting IBM i at risk, and the steps you can take to protect your organization. If you’re looking to avoid unexpected downtime or corrupted data, you don’t want to miss this session.

  • SB HelpSystems ROBOT GenericCan you trust all of your users all of the time? A typical end user receives 16 malicious emails each month, but only 17 percent of these phishing campaigns are reported to IT. Once an attack is underway, most organizations won’t discover the breach until six months later. A staggering amount of damage can occur in that time. Despite these risks, 93 percent of organizations are leaving their IBM i systems vulnerable to cybercrime. In this on-demand webinar, IBM i security experts Robin Tatam and Sandi Moore will reveal:

  • FORTRA Disaster protection is vital to every business. Yet, it often consists of patched together procedures that are prone to error. From automatic backups to data encryption to media management, Robot automates the routine (yet often complex) tasks of iSeries backup and recovery, saving you time and money and making the process safer and more reliable. Automate your backups with the Robot Backup and Recovery Solution. Key features include:

  • FORTRAManaging messages on your IBM i can be more than a full-time job if you have to do it manually. Messages need a response and resources must be monitored—often over multiple systems and across platforms. How can you be sure you won’t miss important system events? Automate your message center with the Robot Message Management Solution. Key features include:

  • FORTRAThe thought of printing, distributing, and storing iSeries reports manually may reduce you to tears. Paper and labor costs associated with report generation can spiral out of control. Mountains of paper threaten to swamp your files. Robot automates report bursting, distribution, bundling, and archiving, and offers secure, selective online report viewing. Manage your reports with the Robot Report Management Solution. Key features include:

  • FORTRAFor over 30 years, Robot has been a leader in systems management for IBM i. With batch job creation and scheduling at its core, the Robot Job Scheduling Solution reduces the opportunity for human error and helps you maintain service levels, automating even the biggest, most complex runbooks. Manage your job schedule with the Robot Job Scheduling Solution. Key features include:

  • LANSA Business users want new applications now. Market and regulatory pressures require faster application updates and delivery into production. Your IBM i developers may be approaching retirement, and you see no sure way to fill their positions with experienced developers. In addition, you may be caught between maintaining your existing applications and the uncertainty of moving to something new.

  • LANSAWhen it comes to creating your business applications, there are hundreds of coding platforms and programming languages to choose from. These options range from very complex traditional programming languages to Low-Code platforms where sometimes no traditional coding experience is needed. Download our whitepaper, The Power of Writing Code in a Low-Code Solution, and:

  • LANSASupply Chain is becoming increasingly complex and unpredictable. From raw materials for manufacturing to food supply chains, the journey from source to production to delivery to consumers is marred with inefficiencies, manual processes, shortages, recalls, counterfeits, and scandals. In this webinar, we discuss how:

  • The MC Resource Centers bring you the widest selection of white papers, trial software, and on-demand webcasts for you to choose from. >> Review the list of White Papers, Trial Software or On-Demand Webcast at the MC Press Resource Center. >> Add the items to yru Cart and complet he checkout process and submit

  • Profound Logic Have you been wondering about Node.js? Our free Node.js Webinar Series takes you from total beginner to creating a fully-functional IBM i Node.js business application.

  • SB Profound WC 5536Join us for this hour-long webcast that will explore:

  • Fortra IT managers hoping to find new IBM i talent are discovering that the pool of experienced RPG programmers and operators or administrators with intimate knowledge of the operating system and the applications that run on it is small. This begs the question: How will you manage the platform that supports such a big part of your business? This guide offers strategies and software suggestions to help you plan IT staffing and resources and smooth the transition after your AS/400 talent retires. Read on to learn: