Debugging CGI scripts in Perl

CGI scripts are among the hardest applications to debug. As a rule, they are debugged on the server where they will run. Finding even simple errors, such as syntax errors, then becomes very difficult: because of the way the CGI interface works, compile-time error messages never reach the person debugging the script from a client machine.

And when Internet access is billed by the minute, debugging CGI scripts this way also becomes quite expensive :).

The purpose of this article is to present some methods and techniques designed, in the author's opinion, to significantly simplify the process of debugging CGI scripts in Perl, as well as to point out some of the most common errors when writing them.

The following steps assume that you are debugging CGI scripts under Windows-9x.

Preparing the working environment.

To write and debug CGI scripts in Perl, it is convenient to use a local Web server, i.e. a Web server installed on your own computer. From the browser's point of view, working with such a server is no different from working with a server on the Internet. Almost any Windows Web server can serve as a local Web server; it only needs to support running CGI scripts.

Among the simplest options are PWS (Personal Web Server) from the Windows 9x kit and IIS (Internet Information Server) from NT. However, if your site uses things like SSI, it is best to install Apache for Windows. Most hosting providers today run Apache under UNIX, so this compatibility won't hurt. In addition, Apache provides some additional non-standard CGI environment variables, so CGI scripts that will run on Apache "in real life" are best debugged on it (although this is just my opinion).

The local server has a "domain name" localhost and an IP address of 127.0.0.1. Accordingly, it is accessed through a URL of the form:

http://localhost/ ... 

However, installing a Web server is not enough to run CGI scripts in Perl; you also need to install Perl itself. I would recommend Perl for Win32 from ActiveState. After installing Perl, you must register it in the Web server settings as the script handler for Perl scripts. This is done differently on different Web servers, so read the documentation for your server. For PWS and IIS, setting up Perl is a task of its own. On most Web servers for Windows, the "type" of a CGI script (Perl script, some other script ...), and therefore its handler, is determined by the file name extension. If this is the case on your server, register the same extension for Perl CGI scripts that your hosting provider uses (the de facto standards are *.cgi and *.pl). In Apache, as on almost all UNIX servers, the handler is determined not by the extension but by the "#!..." line at the beginning of the script.

When installing all of the above, it is very important for us to be able to run Perl scripts "without a web server", i.e. as regular programs. This is very useful when checking them for syntax errors.

You can check the correct installation of everything by writing a simple script:

#!(path to Perl)
print "Content-Type: text/plain ;charset=windows-1251\n\n";
print "The script worked successfully! Congratulations!";

Save this script in a file, for example, test.cgi and put this file in your cgi-bin folder.

Now test the script by launching the browser and opening the URL:

http://localhost/cgi-bin/test.cgi

If everything is fine, you should see the following output in the browser window:

The script worked successfully! Congratulations!

If instead of this output of the script you saw the script itself on the screen or something like "Access Denied", "Permission Denied", "Forbidden" or did not see anything at all (as when downloading a "blank" document), then check the settings of your web server - most likely, you have incorrect access rights to your cgi folder.

Now let's try to run our CGI script as a regular Perl program. Open a command shell (for example, FAR manager) and type:

perl (path to your script) 

As a result, you should see the output:

Content-Type: text/plain ;charset=windows-1251

The script worked successfully! Congratulations!

If the message contains non-ASCII text (for example, Cyrillic), the last line may show gibberish instead of words. That's okay: it is just a mismatch between the encoding used in the script and the one used by the shell in which we run it. The main thing is that it works.

If both of the above tests were successful - congratulations! Now you can debug most CGI scripts on your computer without paying the provider a penny for it! Now you have your own little "Internet in miniature" :)))

Script debugging techniques.

A fairly common syntax error is omitting the ";" at the end of a statement. This is especially true for those accustomed to BASIC, where a line break separates statements. In Perl, as in C/C++, line feeds, carriage returns, and tabs are all equivalent to a space and are called "whitespace characters". They are not statement separators. The only exception is inside string constants, where they stand for themselves, but that only confirms the rule that they do not separate statements.

So, if your script contains a syntax error, then the message about this error will still not reach the browser. Most often, when a syntax error occurs in a script, the server generates a "500 Internal Server Error". Well, this is indeed considered an "internal server error"... But in what line is it?!

But now we can run the CGI script "as a program" and see a Perl error message!

If we make a deliberate mistake in the above script by removing the ";" at the end of the penultimate line, then running it "through the server", we are likely to see in large letters written "500 Internal Server Error". And if we run it "as a program", we will see a message like the following:

syntax error at test.cgi line 3, near "print"
Execution of test.cgi aborted due to compilation errors.

The first line indicates that a syntax error occurred in the file test.cgi, on line 3, near the word print. I think that's pretty exhaustive! :) (The second line says that execution of the script was aborted because of compilation errors.)
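
The same check can be scripted. The following is a minimal sketch, assuming Perl is on hand locally: it uses Perl's -c switch, which compiles a script without executing it, and captures the messages. The helper name check_syntax is hypothetical and not part of any module.

```perl
use strict;
use warnings;

# Compile a script with "perl -c" (syntax check only, nothing is executed)
# and return a success flag plus whatever Perl printed.
# check_syntax() is a hypothetical helper name used for illustration.
sub check_syntax {
    my ($path) = @_;
    # $^X is the path of the Perl interpreter running this program;
    # 2>&1 merges the error messages into the captured output.
    my $command = '"' . $^X . '" -c "' . $path . '" 2>&1';
    my $report  = `$command`;
    my $ok      = ($? == 0) ? 1 : 0;
    return ($ok, $report);
}

# Usage: perl check.pl test.cgi
if (@ARGV) {
    my ($ok, $report) = check_syntax($ARGV[0]);
    print $report;
    print $ok ? "No compilation errors.\n" : "Compilation failed.\n";
}
```

For a one-off check, typing perl -c followed by the path to the script in the command shell gives the same messages.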

Now we look for line 3, the print statement, and correct the error. Note that the missing ";" belongs at the end of the preceding statement; Perl only notices the problem when it reaches the next print.

Another rather unpleasant error, which leads to behavior that at first glance is completely incomprehensible, is omitting a closing curly brace (}). Most often Perl reports this as a compilation error, but I've had cases where no errors were reported.

In general, all errors in CGI scripts can, in my opinion, be divided into the following categories:

  1. Syntax (compilation errors);
  2. CGI interoperability errors;
  3. Errors in interaction with other programs and/ or files;
  4. Logical.

Errors of the first kind are most conveniently found in the way described above.

The second kind is incorrect output from the script: the script must produce its response in HTTP format, i.e. the response header fields, then an empty line, and then the actual body of the response. In the example above, the header line was output:

Content-Type: text/plain ;charset=windows-1251

followed by an empty line and then the body of the response:

The script worked successfully! Congratulations!

The server, having received the response from the script, parses the header the script produced and adds further fields and the status line of the response to it. Therefore, the CGI script does not need to produce a complete header. Typically you must specify at least the Content-Type field, and in any case the empty line after the header must be present.
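
The required layout can be sketched as a small function. This is only an illustration; cgi_response is a hypothetical helper name, not something from the article's scripts.

```perl
use strict;
use warnings;

# Build a minimal CGI response: one header field, the mandatory empty
# line, then the body. cgi_response() is a hypothetical helper name.
sub cgi_response {
    my ($content_type, $body) = @_;
    return "Content-Type: $content_type\n"   # header field
         . "\n"                              # empty line that ends the header
         . $body;                            # body of the response
}

# In a script this would be used as:
#   print cgi_response('text/plain', 'The script worked successfully!');
```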

A few words about so-called nph CGI scripts. These are scripts that produce the complete HTTP header themselves. Therefore, the server should not "parse" the headers output by such scripts, but pass everything "as is" to the browser. Hence the name: nph (non-parsed headers). On some servers such scripts must follow a specific file naming rule (the name must begin with nph-); on others this is not necessary.
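
For comparison, here is a rough sketch of what an nph script has to produce: the status line comes from the script itself. The helper name nph_response is hypothetical, the status line shown is an assumption, and a real server may expect further fields (such as Date) that are omitted here.

```perl
use strict;
use warnings;

# Build a complete HTTP response, as an nph script must.
# nph_response() is a hypothetical helper name; only the status line
# and one header field are produced in this sketch.
sub nph_response {
    my ($content_type, $body) = @_;
    return "HTTP/1.0 200 OK\r\n"               # status line, normally added by the server
         . "Content-Type: $content_type\r\n"   # header field
         . "\r\n"                              # empty line that ends the header
         . $body;
}

# Under Windows, call binmode(STDOUT) before printing this, so that the
# explicit \r\n pairs are not converted a second time.
```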

CGI interoperability errors also include using CGI environment variables incorrectly and, for scripts that process forms, a mismatch in the method of passing data from the form (GET or POST). If the script expects data sent by the POST method (on standard input) and the form mistakenly specifies GET, the script will "hang": it will wait for data to arrive on standard input! If the script expects data from the GET method and the form specifies POST, the script will run but will not receive any data from the form (as if the form had no fields).

For scripts that output images (GIF, JPG), a typical error is using the ASCII (text) transfer mode. Immediately after the open statement opens a file, the file handle is in ASCII read/write mode, which is intended for plain text (text/plain), such as text edited in Notepad.
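
One way to make a script tolerant of this mismatch is to look at the REQUEST_METHOD environment variable before touching standard input. The following is a minimal sketch, not a replacement for a full form-parsing library; raw_form_data and parse_form are hypothetical helper names.

```perl
use strict;
use warnings;

# Return the raw, still URL-encoded form data for the current request.
# raw_form_data() and parse_form() are hypothetical helper names.
sub raw_form_data {
    my $method = uc($ENV{REQUEST_METHOD} || '');
    if ($method eq 'POST') {
        # Read exactly CONTENT_LENGTH bytes from standard input, and
        # nothing at all if the server announced no body.
        my $length = $ENV{CONTENT_LENGTH} || 0;
        my $data   = '';
        read(STDIN, $data, $length) if $length > 0;
        return $data;
    }
    # GET (or a plain command-line run): data comes from QUERY_STRING.
    return defined $ENV{QUERY_STRING} ? $ENV{QUERY_STRING} : '';
}

# Split "name=value&name=value" into a hash and decode + and %XX escapes.
sub parse_form {
    my ($raw) = @_;
    my %fields;
    for my $pair (split /&/, $raw) {
        next unless length $pair;
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
        }
        $fields{$name} = $value;
    }
    return %fields;
}
```

Because the data source is chosen from the environment, the same script can be tried "as a program" by setting REQUEST_METHOD and QUERY_STRING in the command shell before running it.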

All file handles involved in outputting the image (including STDOUT!) must be switched to "binary" mode with the Perl function:

binmode FILE; 

Otherwise the data will be corrupted, because it will be transmitted as text: first, line endings will be converted, and second, when a file is read in text mode under Windows, the character with code 26 (Ctrl-Z) will be treated as the end of the file.
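
A minimal sketch of an image-sending routine with both handles switched to binary mode might look like this. The name send_image and the file name in the usage comment are hypothetical.

```perl
use strict;
use warnings;

# Copy an image file to an output handle without any text-mode conversion.
# send_image() is a hypothetical helper name.
sub send_image {
    my ($path, $mime_type, $out) = @_;
    open(my $in, '<', $path) or die "Cannot open $path: $!";
    binmode($in);     # the file being read
    binmode($out);    # the handle being written (STDOUT in a CGI script)
    print {$out} "Content-Type: $mime_type\n\n";
    my $buffer;
    while (read($in, $buffer, 8192)) {
        print {$out} $buffer;
    }
    close($in);
    return;
}

# In a CGI script this would be called as:
#   send_image('logo.gif', 'image/gif', \*STDOUT);
```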

Errors in interaction with external programs and files include incorrect calls to, for example, the UNIX commands sendmail, date, etc. This category also includes a wrong path to Perl in the first line, after #!.

How do you debug, on a local machine, scripts that use, say, sendmail if you (under Windows) don't have sendmail? The easiest way is to comment out the part of the script that sends the letter. In general it most likely looks like this:

open EMAIL,"|path_to_sendmail recipient_list";
print EMAIL "....";
....
close EMAIL;

Here sendmail is opened by the open statement, and EMAIL is the file handle, whose name can be anything. If sending e-mail is only an auxiliary function (for example, in guest books that notify the webmaster of a new entry), commenting it out lets you check the rest of the script without sending any message.

There is another way: in the open statement, instead of opening a pipe to sendmail, open a file for writing. Nothing else needs to be changed.

In this case, "sending a message" creates a file containing the letter, and you can inspect its format by hand.
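
The file-instead-of-sendmail trick can be sketched as follows. The sendmail path, the -t switch usage, and the helper names open_mail and write_message are all assumptions made for illustration; use whatever your hosting provider specifies.

```perl
use strict;
use warnings;

# Hypothetical location of sendmail; -t tells it to take the recipients
# from the message header. Check your hosting provider's documentation.
my $SENDMAIL = '/usr/sbin/sendmail -t';

# Return a handle for the outgoing letter: a plain file when a debug file
# name is given, a pipe to sendmail otherwise.
sub open_mail {
    my ($debug_file) = @_;
    my $fh;
    if (defined $debug_file) {
        open($fh, '>', $debug_file) or die "Cannot write $debug_file: $!";
    } else {
        open($fh, '|-', $SENDMAIL) or die "Cannot start sendmail: $!";
    }
    return $fh;
}

# The code that writes the letter is the same in both cases.
sub write_message {
    my ($fh, $to, $subject, $body) = @_;
    print {$fh} "To: $to\n", "Subject: $subject\n", "\n", "$body\n";
    close($fh);
    return;
}
```

While debugging locally, pass a file name to open_mail; on the server, call it without arguments.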

By the way, since an e-mail address has the form user@hostname, and the @ symbol in Perl denotes an array, writing such an address "directly" in a double-quoted string will cause an error! Therefore you need to put a backslash (\) before the @. That is, the address in a Perl string should be written as user\@hostname. The same applies to other special characters, such as $.

As for the date command, in most cases it can be replaced by the Perl expression
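
A short illustration of the two safe ways to write such strings. Depending on the Perl version and on whether use strict is in effect, the unescaped form is either rejected at compile time or silently loses everything from the @ up to the end of the array name, so it is shown only in a comment.

```perl
use strict;
use warnings;

# Wrong: "user@hostname" would try to interpolate an array named @hostname.

my $escaped = "user\@hostname";   # double quotes: the @ must be escaped
my $literal = 'user@hostname';    # single quotes: nothing is interpolated
my $price   = "Total: \$5";       # the same rule applies to $

print "$escaped\n$literal\n$price\n";
```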

scalar localtime 

which returns a string with the current date/time, for example:

Sun Oct 22 16:11:42 2000 

Such a replacement is quite possible if the resulting date/time value is not parsed by the CGI script, but is simply written to the log file (or used "as is" in a different way).
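
For the log file case, a minimal sketch could be the following; log_line is a hypothetical helper name and the message text is made up.

```perl
use strict;
use warnings;

# Build one log line stamped with the current (or a given) time, using
# scalar localtime in place of the external date command.
# log_line() is a hypothetical helper name.
sub log_line {
    my ($message, $time) = @_;
    $time = time unless defined $time;
    return scalar(localtime($time)) . " $message\n";
}

# Typical use inside a script:
#   print LOG log_line('new guest book entry');
```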

Thus, in most cases, the script can be checked even if it does not have the necessary programs that are standard for UNIX.

As for errors in working with files, the one worth mentioning is incorrectly set access rights on the auxiliary files used by the script.

Final debugging of CGI scripts on the server.

So, your script works fine on your local computer, now it's time to transfer it to the server.

Here is what you should pay attention to:

1. The path to Perl in the first line.

Change it to the path to Perl on your hosting. While debugging the script locally, you either did not need this line at all or, if you were debugging on Apache, the path on your computer was most likely different.

2. Paths to other programs used by the CGI script.

3. The names of the files that the script accesses.

In Windows, uppercase and lowercase letters in file names are not distinguished, i.e. A.TXT and a.txt are the same name. In UNIX, on which most Internet servers run, uppercase and lowercase letters in file names are different characters. Thus, a script that opens the file a.txt with the command:

open FILE,"a.TXT"; 

will work fine under Windows but will fail under UNIX (the file will not be found).
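
Such mismatches can be caught before uploading by comparing the name used in the script with the names actually stored in the directory. A minimal sketch, with the hypothetical helper name exists_exact_case:

```perl
use strict;
use warnings;

# Return 1 only if the directory holds an entry whose name matches
# letter for letter, including case. Unlike a plain -e test, this gives
# the same answer under Windows and under UNIX.
# exists_exact_case() is a hypothetical helper name.
sub exists_exact_case {
    my ($dir, $name) = @_;
    opendir(my $dh, $dir) or die "Cannot read directory $dir: $!";
    my $found = grep { $_ eq $name } readdir($dh);
    closedir($dh);
    return $found ? 1 : 0;
}

# Example: warn "case mismatch\n" unless exists_exact_case('.', 'a.TXT');
```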

4. Mode of uploading files to the server.

The most common mistake is to upload the entire site in "binary" mode. While html and txt files uploaded this way usually cause no particular problems (although they may), scripts uploaded this way will not work. All CGI script files, as well as the text files they use, must be uploaded in ASCII mode.
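
The usual cause is line endings: files written under Windows end each line with CR+LF, a binary upload keeps the CR, and the UNIX server then sees a stray CR at the end of the "#!" line. The following sketch checks a file for such line endings; has_dos_line_endings is a hypothetical helper name.

```perl
use strict;
use warnings;

# Return 1 if the file contains Windows-style CR+LF line endings.
# has_dos_line_endings() is a hypothetical helper name.
sub has_dos_line_endings {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode($fh);                          # look at the raw bytes
    my $content = do { local $/; <$fh> };  # read the whole file at once
    close($fh);
    return (defined $content && $content =~ /\r\n/) ? 1 : 0;
}

# Usage: perl check_eol.pl test.cgi
if (@ARGV) {
    print has_dos_line_endings($ARGV[0])
        ? "CR+LF found: upload this file in ASCII mode\n"
        : "No CR+LF line endings found\n";
}
```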

5. File permissions.

Even if everything is done correctly, the script after uploading to a UNIX server is unlikely to start working immediately.

In order for it to work, you need to set access rights for the CGI script files and the files they use.

As a rule, immediately after files are uploaded to the site, they are all given some "standard" (default) set of rights, for example:

-rw-r--r-- 

In general, all files in terms of the necessary access to them can be divided into 3 groups:

  1. CGI script files;
  2. Files used by the CGI script to read;
  3. Files that the CGI script uses to read and write;

Typically, a hosting provider that allows the use of CGI specifies what access rights should be set for each file type.

If not, the following settings can be used as a compromise:

CGI script - -rwxr-xr-x (755);
Files to read - -rw-r--r-- (644);
Files to be written - -rw-rw-rw- (666).

ATTENTION! Some hosting providers recommend other, stricter access rights that give your site and the system as a whole better protection against break-ins! Therefore, follow the recommendations of your hosting provider, if any!
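
If your hosting gives you shell access, or lets you run a setup script, the compromise settings listed above can be applied with Perl's built-in chmod. This is only a sketch: the group names, the file names in the usage comment, and the helper set_mode are hypothetical, and the modes should be replaced by whatever your provider recommends.

```perl
use strict;
use warnings;

# The compromise modes from the list above, written as octal numbers.
my %RECOMMENDED_MODE = (
    script    => 0755,   # -rwxr-xr-x  CGI script files
    read_only => 0644,   # -rw-r--r--  files the script only reads
    writable  => 0666,   # -rw-rw-rw-  files the script reads and writes
);

# Apply the mode for one group to a list of files and return the number
# of files that were changed. set_mode() is a hypothetical helper name.
sub set_mode {
    my ($kind, @files) = @_;
    my $mode = $RECOMMENDED_MODE{$kind};
    die "Unknown file group: $kind" unless defined $mode;
    return chmod($mode, @files);
}

# Example with made-up file names:
#   set_mode('script',   'guestbook.cgi');
#   set_mode('writable', 'guestbook.dat');
```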