Basics of web design: Perl and CGI

So far we have met only one type of script: client-side scripts, which run on the client (that is, on the computer of the visitor to the web page) and are written, as a rule, in JavaScript. A significant limitation of this approach is, for example, the inability to save the received data: for security reasons, a script executed in the browser has no access to the vast majority of the computer's resources, in particular the file system. Of course, this is not the only drawback of client-side scripts.

In today's lesson we'll talk about programs that run on the server, the same server that stores the web pages we view. These programs are triggered by the visitor's browser: just as the server returns hypertext in response to a request naming an html file, in response to a request naming a file of type .cgi or .pl (there may be no extension at all; it all depends on how the server is configured) the server runs the requested file as a CGI application. The output of the application is then sent back to the browser.

Contrary to a common misconception among beginners, CGI (Common Gateway Interface) is not a language but only a type of interface: a protocol for exchanging data between a server and a program. A CGI program can be written in almost any language whose compiler or interpreter runs on the OS under which the web server is running. Most often, CGI applications are written in Perl, which is what we will talk about today.

In order to experiment with cgi scripts, you first need to install Perl and configure the server correctly. The ActivePerl distribution for Windows is available from the ActiveState website. The product is distributed free of charge. I think there should be no problems finding and installing it. I will only note that it is best to install ActivePerl in the folder C:\Perl, and not where the installer suggests.

To work with Perl, you will need a specialized editor. You can make do with Notepad, but it has no syntax highlighting, cannot show diagnostics inside the editor itself, and so on. There are dozens of suitable editors; I myself use Perlbuilder, a trial version of which can be downloaded from the Solutionsoft server.

Configuring Apache

Now open the httpd.conf file.

1. First of all, you need to tell the server in which folder scripts may be run: by default, this is forbidden. Locate the ScriptAlias directive in httpd.conf. It needs to be rewritten something like this:

ScriptAlias /cgi-bin/ "D:/MyWeb/artefact.list.ru/cgi/"
<Directory "D:/MyWeb/artefact.list.ru/cgi">
   AllowOverride None
   Options ExecCGI
   Order allow,deny
   Allow from all
</Directory>

Of course, you need to enter the paths that correspond to your computer. Note that the directory where our scripts will be stored, D:/MyWeb/artefact.list.ru/cgi, has the ExecCGI option, which allows Apache to run executable programs in this directory.

2. Now you need to determine which type of files Apache should treat as scripts. Look in httpd.conf for the AddHandler line. Rewrite it like this:

AddHandler cgi-script .pl

Actually, almost any extension can be specified for cgi scripts, but most often .pl or .cgi is used. In fact, these are ordinary text files that contain the code of scripts written in the Perl language.

This completes the Apache settings. Restart the server. It remains to make sure that all the described procedures are done correctly.

In the folder where you want to store the scripts, create a text file. Its content should be as follows:

#!C:/Perl/bin/Perl.exe
print "Content-type: text/html\n\n";
print "Hello world!";

Note the first two lines of the script.

The first line should contain the path to the Perl interpreter installed on your computer. This line must be present in every script; otherwise the server will generate an error and the script will not be executed.

The second line must precede any output to the browser. If it is missing, the server will also issue an error message; the difference from the previous case is that the script code will have time to execute before the output operations.
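
To make the role of that second line concrete, here is a small sketch that assembles a complete CGI response as a string: the header, the mandatory empty line, and then the body. The helper name cgi_response is made up for this illustration; it is not part of Perl or of any module.

```perl
use strict;
use warnings;

# Hypothetical helper (not a standard function): build the full text
# that a CGI script has to send back to the server.
sub cgi_response {
    my ($body) = @_;
    # Header line, then an empty line, and only then the page itself.
    return "Content-type: text/html\n\n" . $body;
}

print cgi_response("Hello world!");
```

Run from the command line, this prints exactly what the server expects to receive from the script.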

So, rename the file with the code to test.pl. Now type the following in the address bar of the browser:

http://localhost/cgi-bin/test.pl

If you did everything right, the browser window will say "Hello world!"; otherwise you will see an error message. To find out what exactly the server did not like, look in the error.log file in the logs folder: server errors are explained there. To display errors in the browser (instead of the usual uninformative Internal Server Error message), add the following use directive to your scripts:

use CGI::Carp qw (fatalsToBrowser);

Note that the remote server on which you are going to store your scripts is probably configured differently than your personal computer. On Unix systems, the path to the Perl interpreter usually looks like this:

#!/usr/bin/perl

Accordingly, change the first line of your scripts before uploading them to a remote server. In addition, you may need to change the permissions of the uploaded files: this is done with the chmod 755 command. This command exists only on Unix/Linux systems; under Windows you do not need to do anything like this. Any decent ftp client that lets you access a remote server (LeapFTP, for example) supports the chmod command.
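
What the number 755 means is easier to see when it is decoded. The sketch below is only an illustration: mode_string is a name invented here, and the script changes no permissions; it just translates the three octal digits into the familiar rwx notation.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a numeric mode such as 0755 into "rwxr-xr-x".
sub mode_string {
    my ($mode) = @_;
    my $out = '';
    # Walk over the three octal digits: owner, group, everyone else.
    for my $shift (6, 3, 0) {
        my $bits = ($mode >> $shift) & 7;
        $out .= ($bits & 4) ? 'r' : '-';
        $out .= ($bits & 2) ? 'w' : '-';
        $out .= ($bits & 1) ? 'x' : '-';
    }
    return $out;
}

print mode_string(0755), "\n";   # prints "rwxr-xr-x"
```

In other words, the owner may read, write and run the file, while everyone else may read and run it, which is what a web server needs in order to execute a script.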

The Perl Language

The Perl language is both very simple and quite complex. Simple, because elementary tasks are solved with it in one step, as in school BASIC.

$a = 2;
$b = 2;
$c = $a * $b;
print $c;

Do not be afraid of "extra" characters: in Perl it is customary to precede the names of variables with prefixes. The dollar sign means that we are dealing with a scalar, that is, a number or a string of characters. There are also arrays, whose names begin with the symbol "@", and associative arrays (whose names begin with "%"). Note that the individual elements of these arrays are scalars and are accordingly written with "$": for example, @lines, but $lines[4] (the fifth element of the array @lines; numbering starts from zero).
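
A short sketch of the three kinds of variables just described. The names $count, @lines and %ages and their contents are invented for the example.

```perl
use strict;
use warnings;

# A scalar holds one number or one string.
my $count = 3;

# An array: the name starts with "@", a single element is written with "$".
my @lines = ("first", "second", "third", "fourth", "fifth");
my $fifth = $lines[4];            # numbering starts from zero

# An associative array (hash): the name starts with "%".
my %ages = (anna => 30, boris => 25);
my $age  = $ages{anna};           # a single element is again a scalar

print "$count $fifth $age\n";     # prints "3 fifth 30"
```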

Perl is complex for two reasons. First, it makes it possible to solve any problem in more than one way, if only because the syntax of the language itself is not too strict. Second, Perl allows a developer familiar with the undocumented and little-used features of the language to write code that a beginner, even one already familiar with two or three popular programming languages, will not be able to read at all. Don't believe me? Never mind, it will pass.

If you stick to some "average" level of programming, then Perl is rather simple. It has directives familiar to any programmer: while, if, for, read, print, etc. There is also an amazing tool called regular expressions: with its help you can do anything with strings of characters. True, to the unprepared eye it looks creepy:

s/%([0-9a-fA-F])/pack("c",hex($1))/ge;

or

s/(href=\".*?)\.+(\")/$1$2/igm;

In fact, as the unforgettable Professor Iceberg said, everything is very, very simple.

Of course, I won't be writing a Perl programming guide. There are already too many of them, these guides. But a few examples of solving the most common problems faced by a web programmer will not hurt us.

Examples of CGI scripts

1. Issuing information to the browser

Let's take the simplest task: let the script calculate something and give the result to the browser.

Replace the contents of the file test.pl with the following code:

#!C:/Perl/bin/Perl.exe
# Set $a to 2:
$a = 2;
# Set $b to twice the value of $a:
$b = $a * 2;
# Give the browser the type of expected data:
print "Content-type: text/html\n\n";
# Display the data itself:
print qq~
<HTML>
<HEAD>
<TITLE>An example of a dynamically generated page</TITLE>
</HEAD>
<BODY>
<H1>Result:</H1>
<p class=body>Variable value b = $b</p>
</BODY>
</HTML>
~;

All information located in the block

print qq~
   ...
~;

will be passed to the browser as is, i.e. unchanged. As the above example shows, this way of organizing output is very convenient, because variables (in this case, $b) can be inserted directly into the html code, and in the source of the resulting page their values will appear in that place. There can be as many such blocks in a program as you like; they can, say, be grouped into subroutines (the sub keyword) and moved to the end of the script file so as not to clutter the Perl program with html.
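
Here is one way such a grouping might look. This is a sketch under assumptions: the subroutine names page_header and page_footer and the variable $result are invented for the example, and on Windows the first line would be the path from the earlier scripts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Main part of the script: calculations and calls only.
my $result = 2 * 2;
my $page = page_header("An example of a dynamically generated page")
         . "<p>Result = $result</p>\n"
         . page_footer();
print "Content-type: text/html\n\n", $page;

# The html blocks live at the end of the file, out of the way.
sub page_header {
    my ($title) = @_;
    return qq~
<HTML>
<HEAD>
<TITLE>$title</TITLE>
</HEAD>
<BODY>
~;
}

sub page_footer {
    return qq~
</BODY>
</HTML>
~;
}
```

Perl finds the subroutines even though they are defined after the place where they are called, so the main logic can stay at the top of the file.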

By the way, this method of displaying html-code is convenient for another reason.

Imagine that you need to display a string like 12375"ghr in the browser. Well, stranger things happen. ;-) So, if you write this without thinking:

print "12375"ghr";

we will get an error and the program will not run. The reason is that the second character " will be taken as the end of the string, and Perl will try to interpret everything that follows it as code, which produces the error message. In other situations (for example, in regular expressions, whose syntax is quite complex) "/", "?", ".", "\", "$", "^" and many other characters also count as special. Typically, this problem is solved by using the "\" character (backslash):

print "12375\"ghr";

A special character preceded by a backslash is interpreted as an ordinary character and has no effect on program execution.

Suppose you need to send the browser dozens of lines of html code. If you do this line by line, then, firstly, each line has to be wrapped in print "..."; but this is not the only trouble: html code contains many characters that will be taken as special and will have to be preceded by "\" (in jargon, "escaped"). Block output (print qq~ ... ~;) lets you leave the code unchanged and save time for something more important.
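
The difference between the two ways of writing is easiest to see side by side. In the sketch below the same piece of html (made up for the example) is written once with escaped quotes and once as a qq~ ~ block; the comparison confirms that Perl sees the same string in both cases.

```perl
use strict;
use warnings;

# The same text written twice: once with escaped quotes...
my $escaped = "<a href=\"index.html\" class=\"menu\">Home</a>";
# ...and once in a qq~ ~ block, where quotes need no backslash.
my $block   = qq~<a href="index.html" class="menu">Home</a>~;

print "Same string\n" if $escaped eq $block;
```

Keep in mind that variables are still substituted inside qq~ ~, so a literal "$" or "@" (for example, in an e-mail address) still needs a backslash there.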

2. Receive data from the form

Let's complicate the task a bit. Let the script now perform calculations with the data that is transmitted to it by the browser. The first question is: how to transmit this data?

Method one: pass parameters using a URL.

So far, we have called the script by directing the browser to the following address:

http://localhost/cgi-bin/test.pl

Suppose we have two parameters. Let's rewrite the URL as follows:

http://localhost/cgi-bin/test.pl?a=2&b=4

Now our script will be able to read the names and values of the variables passed to it; exactly how, I'll describe a little later. Note: the sequence of data passed to the script is opened with the symbol "?", and the name/value pairs are separated from each other by an ampersand ("&").

The disadvantage of the method: all the transmitted parameters are explicitly present in the address bar of the browser, which in some cases can be inconvenient.

Method two: data transfer through the html-form.

Suppose the web page contains a form similar to the following:

<form method=POST action="http://localhost/test.pl">
<input type=hidden name="par1" value="56817">
<input type=text name="par2" value="rtrtrt">
<input type=submit value="Submit data">
</form>

After clicking the "Submit data" button, the script will be able to read the names and values of all the fields of the form, including hidden ones. The result is similar to the previous one, except that the page visitor will not see the passed parameters in the address bar; for the script, the data is the same as with the call

http://localhost/test.pl?par1=56817&par2=rtrtrt

The way to read this form is the same as in the case of passing parameters through a URL. We're going to talk about it now.

Let's create a page containing a link

http://localhost/cgi-bin/test.pl?a=2&b=4

Now let's rewrite test.pl again:

#!C:/Perl/bin/Perl.exe
# Read input parameters: form data sent with POST arrives on STDIN,
# parameters from the URL arrive in the QUERY_STRING variable:
if ($ENV{'REQUEST_METHOD'} eq 'POST') {
read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
} else {
$buffer = $ENV{'QUERY_STRING'};
}
# Divide the parameters into pairs:
@pairs = split(/&/, $buffer);
# Loop through each pair:
foreach $pair (@pairs) {
($name, $value) = split(/=/, $pair);
# Decode %XX sequences back into characters:
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
# Store the value under the name of the parameter:
$in{$name} = $value;
}
$c = $in{'a'} * $in{'b'};
# Give the browser the type of expected data:
print "Content-type: text/html\n\n";
# Display the data itself:
print qq~
<HTML>
<HEAD>
<TITLE>An example of a dynamically generated page</TITLE>
</HEAD>
<BODY>
<H1>Result:</H1>
<p class=body>Variable value c = $c</p>
</BODY>
</HTML>
~;

I will not go into this part of the code in detail, because, as mentioned, learning Perl is beyond the scope of this course. I will only say that the result of the foreach loop is an associative array containing the names and values of all input parameters. It is enough to understand that each of these values can now be obtained by referring to the element $in{par_name} of that array, where par_name is the name of the parameter we need. The line $c = $in{'a'} * $in{'b'}; illustrates this very well.
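
The parsing step can also be packaged as a subroutine and tried from the command line, without a web server. This is a sketch only: parse_query is a name invented here, the sample string is made up, and the code assumes the input is an ordinary name=value&name=value string.

```perl
use strict;
use warnings;

# Hypothetical helper: split "a=2&b=4" into a hash of names and values.
sub parse_query {
    my ($buffer) = @_;
    my %in;
    foreach my $pair (split /&/, $buffer) {
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        # "+" stands for a space, %XX for a character given by its hex code.
        $value =~ tr/+/ /;
        $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
        $in{$name} = $value;
    }
    return %in;
}

my %in = parse_query("a=2&b=4&note=two%20words");
my $product = $in{a} * $in{b};
print "$product\n";       # prints "8"
print "$in{note}\n";      # prints "two words"
```

For real projects the CGI module from CPAN does the same job more thoroughly; the hand-written version is shown only to make the mechanism visible.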

After calling the script, the value of the variable c should appear in the browser window as the product of the values of the variables a and b, which we passed to the script in the URL string. If something didn't work out, welcome to error.log. Perlbuilder itself will tell you about syntax errors.

3. Work with files

New task: the script must read the text file and display its contents in the browser window.

Create a file text.txt in the same folder as test.pl and copy some text into it. Then replace the contents of test.pl with the following code:

#!C:/Perl/bin/Perl.exe
# Set the name of the text file:
$fname = "text.txt";
# Give the browser a content type:
print "Content-type: text/html\n\n";
# Print the header of the HTML file:
print qq~
<HTML>
<HEAD>
<TITLE>An example script that reads a local file</TITLE>
</HEAD>
<BODY>
<H1>File content:</H1>
~;
# Open the file (or issue an error message):
open(TFILE,$fname) or die "No such file";
# Read the file line by line into an array:
@LINES=<TFILE>;
# Close the file:
close(TFILE);
# Get the number of elements of the resulting array:
$LTFILE=@LINES;
# Start the element-by-element loop over the array:
for ($i=0;$i<$LTFILE;$i++) {
# Print each line as an HTML paragraph:
print "<p>".$LINES[$i]."</p>";
}
# Print the end of the HTML file.
print qq~
</BODY>
</HTML>
~;

That's it. Notice that string concatenation in Perl is written with a period (for example, two text variables are joined by writing $a.$b).

Run the script by typing its address in the browser bar. If something is wrong, see the last two sentences of example #2.
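
A variation on the same task, offered as a sketch rather than a replacement: it reads the file with a while loop instead of an array, and it escapes the characters "<", ">" and "&" so that text from the file cannot be mistaken for html tags. The subroutine names escape_html and file_as_paragraphs are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: make a line of plain text safe to show in html.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;     # must come first
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical helper: return the file contents as a string of paragraphs.
sub file_as_paragraphs {
    my ($fname) = @_;
    open(my $fh, '<', $fname) or die "Cannot open $fname: $!";
    my $html = '';
    while (my $line = <$fh>) {
        chomp $line;
        $html .= "<p>" . escape_html($line) . "</p>\n";
    }
    close($fh);
    return $html;
}

print escape_html("2 < 3 & 4 > 1"), "\n";
# Print the paragraphs only if the sample file is actually there.
print file_as_paragraphs("text.txt") if -e "text.txt";
```

The three-argument form of open with a variable as the file handle is the style recommended in current Perl documentation; the TFILE form used in the lesson works as well.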

4. Regular expressions

As mentioned above, regular expressions are designed to handle strings. Recognizing a regexp (short for Regular Expression) is quite simple: usually it is preceded by the sign "=~", and the expression itself begins with a letter (or letters) that specifies the type of operation: m (match) searches for a substring, s (substitute) replaces a substring, and tr (translate) translates characters. Regular expressions are difficult to read because they are built mainly from special characters and their sequences, but they rest on simple logic that lets you work real miracles. Here are a couple of simple examples:

# Search for the first shortest substring,
# located between the substrings "Van" and "ver".
# the result is passed to the service variable $1.
$a =~ m/Van(.*?)ver/i;
# Global replacement of the word Toronto with the word Vancouver
# throughout the line.
$b =~ s/Toronto/Vancouver/ig;
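
To watch these two operations at work, the sketch below applies them to sample strings made up for the purpose.

```perl
use strict;
use warnings;

my $text = "From Vancouver with love";
my $middle = '';
# Match: the shortest text between "Van" and "ver" lands in $1.
if ($text =~ m/Van(.*?)ver/i) {
    $middle = $1;
}
print "Found: $middle\n";         # prints "Found: cou"

my $route = "Toronto to Montreal, then back to toronto";
# Substitute: /g replaces every occurrence, /i ignores letter case.
$route =~ s/Toronto/Vancouver/ig;
print "$route\n";
```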

Perhaps this will seem pointless: at least it seemed so to me at the time. But look at what even the simplest regular expressions can do:

# With one movement of the hand, remove all spaces at the beginning
# and the end of the line. When no target is given explicitly,
# the operation is performed on the service variable $_:
s/(^\s+)|(\s+$)//igm;
# Remove HTML that accidentally got into the processed string:
s/<.*?>(.*?)<.*?>/$1/igm;
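
The same trimming can be done on a named variable instead of $_, and tr, which the lesson names but does not demonstrate, fits into a one-line example. The sample strings are invented.

```perl
use strict;
use warnings;

my $line = "   some text with spaces around   ";
# Remove whitespace at both ends of $line.
$line =~ s/^\s+|\s+$//g;
print "[" . $line . "]\n";        # prints "[some text with spaces around]"

my $word = "perl";
# tr replaces characters one for one; here lower case becomes upper case.
(my $upper = $word) =~ tr/a-z/A-Z/;
print "$upper\n";                 # prints "PERL"
```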

Like it? I doubt it. But you will come to love it. By the way, all the texts in the library were processed using regular expressions, which means that there is still some benefit from them... ;-)