Linux/AS400: download and convert database files from ebcdic to ascii and excel

I was involved in trasferring data from an AS/400 and to do a minimal datawarehouse, loading that data on an Oracle database. The datawarehouse system was my responsability, the AS/400 was managed by another group located in another city. The AS400 database had packed and zoned fields.
At that time I had been involved in Unix and TC/IP administration and configuration since 1992. However I never worked in an AS/400 or IBM Host environment, this means that I knew something about Network and Unix, but I knew almost nothing about the AS/400.
For this low budget project, after evaluating alternatives, I choose RedHat Linux for its stability and flexibility, the Oracle database for its features, stability and performance, and Perl for its developing speed and abundance of specialized modules.
In the early days of this project, I had the following issues:

  • how to script the transfer of database files from AS400 to the linux box to load them on the Oracle database
  • how to automate some interactive activities on the AS400 terminal emulator

I tried to find the solution to this kind of problems on the web, but it seemed that there were some AS400 experts, many Linux/Oracle/Perl experts, but the two groups had a very small intersection. At the end I decided to develop some scripts and some techniques that I am now happy to share with everybody interested.

Basic AS400/Linux tools used

To interface the Linux box to the AS/400 I used:

  • x3270 a very nice 3270 terminal emulator, using tips on his home page it can be used also as a 5250 emulator. It has the feature to be controlled by another program via standard input/standard output and it has also a companion program, s3270 that doesn't display the emulator window and is perfect to be used only to be controlled by another program;
  • curl a nice utility to download ftp or http pages unattended
  • the Perl language with the following modules:
    Expect
    Expect allows to either spawn a process or take an existing filehandle and interact with it such that normally interactive tasks can be done without operator assistance. This concept makes more sense if you are already familiar with the versatile Tcl version of Expect;
    Convert::EBCDIC
    Perl module for string conversion between EBCDIC and ASCII;
    Spreadsheet::WriteExcel
    Perl module to write to a cross-platform Excel binary file

Downloading database files from AS/400

Overview

Initially we tried to transfer database files from AS/400 to our linux box using ftp in ascii mode. This was almost ok: we got a text file with each record converted on a single line, within each line, each field had his own lenght. The problem was with some files due to the way the AS/400 manages packed and zoned or signed fields.
The fact is that in the AS/400 each field in the database file is stored according to his type:

Strings, Integer and Decimals
are stored using the ebcdic standard codes, during an ftp ascii transfer these fields are transferred without any errors, one minor problem is that decimals are transferred without the decimal point;
Packed fields
AS/400 allows the definition of packed fields, these are decimal fields, but each decimal digit is stored in a nibble (4 bits, half of a byte) to save space. The sign is stored on the last nibble using the convention that the hex digit 'D' (debit) means negative value, the hex digit 'C' (credit) means positive value. During an ftp ascii transfer these fields are converted to pure garbage;
Zoned or Signed fields
In signed fields each digit is stored on a single byte, but on the last byte one nibble (the first 4 bits) is used to store the sign: the hex digit 'D' means negative value, the hex digit 'C' means positive value. During an ftp ascii transfer these fields are transferred almost correctly: only the last digit is garbage.

I solved this problem in the following way:

  • download in binary mode the AS/400 database file with the curl utility using a command similar to the following:
    curl -o mytest \
      ftp://user:password@as400host.mydomain.com/library/mytest
    

    I now have the ebcdic downloaded file on the file mytest

  • download the file layout, I downloaded this giving the dspffd command and using the screen capture function of the x3270 terminal emulator (function file -> Save screen(s) in file) and converted these screens to the file mytest.layout. Later I developed a Perl/Expect script ( as400-file-layout.pl and as400-include.pl) that automated this downloading with a command similar to the following:
    as400-file-layout.pl -u userid -p password -l library \
                 -f file > file.layot
    
  • convert the downloaded ebcdic file in an ASCII file with one record per line and each field separated by the "|" character. I used a command similar to the following to get this result:
    ebcdic2ascii.pl mytest.layout mytest > mytest.txt
    
    
  • convert the text file mytest.txt in an excel file using a command similar to the following:
    csv2xls.pl mytest.txt mytest.xls
    

Layout file

To be able to convert the downloaded database from AS/400 I need the layout information stored on a file. The perl script ebcdic2ascii.pl does expect a layout file in the format similar to the following mytest.layout file:

 MYTRK      Record type                          A     1          1      1    1
 MYPLT      Plant code                           A     2          2      2    3
 MYIDF      Part number                          A    20         20      4   23
 MYWHA      Wharehouse code                      A     1          1     24   24
 MYOTY      Owner type                           A     1          1     25   25
 MYSUP      Supplier code                        A     7          7     26   32
 MYLOC      Location                             A     6          6     33   38
 MYBOY      Begin of Year quantity               P    11  3       6     39   44
 MYCUR      Current quantity                     P    11  3       6     45   50
 MYBOM      Begin of month quantity              P    11  3       6     51   56
 MYLDA      Last fill in date                    P     8  0       5     57   61

This information is same information you can get with the AS/400 command dspffd to display file layout information. This file was generated by an utility (dsptrk) that was installed on one of the AS/400 I used. At the time I tought that this was a standard AS/400 command, instead it was a command locally developed.

Because my script ebcdic2ascii.pl does expect the layout file in this format, if you have the layout file in a different format you should convert it to this format or change the ebcdic2ascii.pl script or use the expect based script as400-file-layout.pl to get this file in this format from the AS400.

If we look at the row related to the field MYBOM we will found the following information on the layout file:

MYBOM
Field name
Begin of month quantity
Field description
P
Field type, in this case it is a Packed Decimal
11
Field total lenght
3
Number of decimals
6
Field lenght in byte. It is a packed field (two digit per byte) so we have that this lenght is half of the field lenght
51
this field start at column 51
56
this field end at column 56

Database conversion to an ASCII text file

To convert the AS400 database file just downloaded in binary mode we use the command ebcdic2ascii.pl. The format of this command is:

ebcdic2ascii.pl [-s c] [-d lev] [-f|-b] file.layout file-ebcdic

it reads layout information from the file.layout and convert the AS400 file-ebcdic to an ASCII, pipe delimited text file on standard output. (To have an idea on what this command do you can look at the files mytest (AS400 file trasferred with ftp in binary mode), mytest.layout and mytest.txt ). The options of this command are the followings:

-d lev
set debug level to lev
-f
use field names instead of descriptions on the heading row of the text file
-b
use both field names and descriptions on the heading row of the text file
-s c
use c as field separator (default is '|')
-a
use "." as the decimal point (default is "," because this script was developed in Italy where the use of "." and "," in numbers is reversed)

If we look at the source of this file (see ebcdic2ascii.pl.lst ) we can see:

  • from line 1 to line 25 there are some comments and
    use statments;
  • from line 26 to line 43 there are assignments to arrays used to correctly read the layout file: @descr(11,35), for example, means that we are storing in this array the starting column number and ending column number of the field description in
    the layout file;
  • from line 44 to line 79 there is the function packed2ascii: this function has two input parameters: $str (the packed decimal field) and $decimals (the number of
    decimal digits); it returns a decimal value. Some examples:

    packed2ascii(0x1234567c,2) # returns  12345.67
    
    packed2ascii(0x1234567d,2) # returns -12345.67
    
    packed2ascii(0x1234567f,2) # returns  12345.67
    
  • from line 80 to line 121 there is the function signed2ascii: this function has two input parameters: $str (the siged or zoned decimal field) and $decimals (the number of decimal digits); it returns a decimal value. Some examples:
    signed2ascii(0xf1f2f3f4f5f6c7,2) # returns  12345.67
    
    signed2ascii(0xf1f2f3f4f5f6d7,2) # returns -12345.67
    
  • from line 122 to line 134 ther is the function autocrop used to strip off leading and trailing white spaces from a string;
  • from line 135 to line 146 there is the function usage used to print some usage text;
  • from line 147 the main program starts and till the line 175 it processes options and parameters
  • from line 176 to line 213 the layout file is processed and the following main variables are filled in:
    • @campi a list of field names;
    • many hashes (%campo, %descr, %dacol ... ), the keys of this hashes are the field names
  • from line 214 to line 231 the header row is printed on the standard output;
  • from line 232 to line 255 the database file is processed:
    • the file is read record by record (line 240)
    • each field of the record is processed (lines 243-255) and is converted to an ASCII string according to his type
  • from line 256 to the end each ASCII converted field is printed on the standard output

Download the layout file with a "screen scraper" application

To download the mytest.layout layoutfile easily I wrote a script that using x3270 or s3270 is able to connect to the AS/400, to logon, to give the DSPFFD command (Display File Format Description), capture the terminal screens, reformat this information and print, on the standard output, the file layout.
Using the same approach described here it is easy to write CGI web interface to "green screen" AS/400 applications (I did this). The application to download the layout file is composed by two little perl script: as400-include.pl and as400-file-layout.pl.

The as400-file-layout.pl script

This is the main script, the one that is executed do download the file layout. The format of this command is:

as400-file-layout.pl [-d debuglevel] [-x] -u userid \
  -p password -l library -f file

it connects to the AS/400, logon to the system using the userid and password specified on the comman line, gives the DSPFFD command, captures the screens and print the
layout file on the standard output. It gives some information on what it is doing on the standard error.

-d debuglevel
set debug level to debuglevel
-u userid
specify the user profile to use to logon to the AS/400
-p password
specify the password to use to logon to the AS/400
-l library
the library where the file is stored
-f file
the file we are interested in

If we look at the source file of this script we can see what it does:

  • include the as400-include.pl;
  • process command line options and set global variables. In particular the variable $as400 defines the AS/400 hostname or IP address (you shold change this if you want to use this script), $x3270cmd defines the emulator program to use: x3270 for the option -x the s3270 without the -x option;
  • define the usual usage subroutine;
  • define the ffd2layout function (File field description to layout); this function takes as input a string that is a result of "screen scraping" the output of the DSPFFD command and gives back, as output, a string in the usual mytest.layout format;
  • the main program, using the Expect module and some functions defined inside the included as400-include.pl does:
    • spawn the 3270 emulator (x3270 or s3270) and assign the handle of this Expect object to the global variable $x3270
    • connect to the AS/400 (function x3270_connect);
    • login to the AS/400 (function x3270_login)
    • store on the string $s the "screen scraping" output of the AS/400 DSPFFD command (function x3270_dspffd);
    • convert the output of the previous string to a different format (function ffd2layout) and print it on the standard output.

The as400-include.pl script

The main functions used to interact with the AS/400 and with the x3270 or with the s3270 emulator are grouped here.

If we look at the source of this script we will find the following interesting things (description of not used or very simple functions is omissed)

  • it uses the following global variables that are defined on the
    parent script that it is including this file:

    • $as400 the hostname or IP address of the AS/400 host
    • $x3270 the handle for the Expect object
    • $user and $password the user profile and the password to use to logon to the AS/400;
  • the use Expect statement to use the Perl Expect module;
  • the x3270_die function that prints the error string passed to it, the current content of the x3270 emulator screen and exit;
  • the x3270_connect function that connects to the AS/400 controlling the x3270 application;
  • the x3270_login function that logins on the AS/400:
    • before logging in this function checks that on the login screen on the row 1, column 69 there is the 8 characters string "S65DD4AA" (in my case this is the name of my AS/400). This check is done to be sure to have a login prompt on the screen.
    • After this check the function "type" the user profile and password to logon (statement print $x3270 "String($user)\nTab()\nString($password)\nEnter()\n").
    • it looks at the screen contents and if it find the string "Press enter to continue" it "type" Enter to continue;
    • it checks to have the "AS/400 Main Menu" on the screen to be sure to have successfully logon on the system;
  • the function x3270_asciiscreen returns, in a multiline string, the entire emulator screen;
  • the function x3270_asciiregion returns a rectangular region of the screen;
  • the function x3270_ascii returns a string, part of a line, from the screen;
  • the function x3270_dspffd gives the DSPFFD command (statement "print $x3270 "String(\"dspffd file($library/$file)\")\nEnter()\n""), "type" the "F8" key to move forward, page by page, till the "More ..." information disappear from the screen and store the lines between 5 and 20 of each screen on a multiline string that is returned to the caller;

Conclusion

Using the examples shown here it is possible not only to download and convert, unattended, AS/400 database files, but also design "screep scraping" applications that gives a web interface to the old fashioned applications of the AS/400.
I currently use these applications on Linux and Solaris, but they can be easily ported to the Windows environmet because Perl, and its modules, are available for this environment and using Cygwin should be possible to compile s3270 and c3270 also on this platform.

I would appreciate if someone using the scripts and tips described here is so kind to drop me an e-mail to let me know, to ask me some question or to post a comment here.

My email address: Valerio _at_ DiGiampietro.com (replace ' _at ' with '@').

Thanks

Thanks for article...

---------
Oyun Hileleri


infinity downline

Cool stuff great post! I love finding little hidden gem blogs like this!


script

The script is good and useful.
Can you help following issue ?
if the file in AS400 is empty or no data, how to handle this situation ?


yeap

this script really great.. i wanna use to that .. özel turlar can u help as.. for how can i use that .. thanks


Variable length records

Hi Valerio,

I know this post is quite old, but I'm hoping you're still around to read this.

I downloaded and modified your ebcdic2ascii.pl script to work for testing purposes. I think a version of this script may fill a niche in our env. I successfully converted some of our files using manually created layouts.

We already have applications that handle the ftp process, but I don't have access to the source's mainframe computer, so I'm writing a perl script that will generated the layout files from copybooks. I'm pretty sure I can make it work, at least for fixed length records, but I forsee some issues. For example, how would you handle variable length records?

Thank you for creating these scripts and taking time to provide the instructions and examples. You have been most helpful.

Gary Powell
Senior Operations Analyst
INGENIX Government Solutions
ge.powell@comcast.net


Useful article + alternatives

About 9 months ago I was faced with a similar requirement and used CPYTOIMPF to convert the AS/400 database files and kermit to script the interaction with the AS/400. I mirror about a dozen database files daily which comprise the key data of our core banking system. The whole system has proved very effective if a bit crude and I have a number of shell and PHP scripts to interact with the data.

Since I am familiar with tcl probably expect would have been a better choice than kermit but for some reason did not occur to me at the time.

I also came across this surprisingly recent article regarding using unixODBC which is an approach I had been searching for months previous to implementing the above: http://blogs.23.nu/c0re/stories/17514

Now that I have the whole framework in place and because the AS/400 should be phased out this year I am unsure whether I will use this fully on-demand capability but it would have been great to have known about it last year. The free availability of these downloads from IBM is great but they could do more to advertise their features/benefits.

By the way the equivalent package for Informix works great as well. In my case Informix is running on a remote RS/6000 under AIX.

Regards,
Mark