What are cookies and how to work with them

What are cookies

No one knows for sure where the term "cookie" came from, although it is believed that in the early days of Unix systems the phrase "magic cookies" was in use. It referred to "receipts" (tokens, tickets) that programs exchanged with one another.

Cookies are a solution to one of the inherited problems of the HyperText Transfer Protocol (HTTP). The problem is that, unlike an FTP or Telnet session, the connection between the client and the server is not persistent: a separate request is sent for each document (or file) transferred over HTTP. In other words, a transaction is complete once the browser has made a request and the server has issued the corresponding response. Immediately after that, the server "forgets" about the user, and each subsequent request from the same user is treated as if it came from a new user. The inclusion of cookies in HTTP provided a partial solution to this problem.

Using cookies, you can emulate a session over HTTP. In short, the principle of session emulation is as follows: on the first request the server issues a cookie value, and with each subsequent request this value is read from the HTTP_COOKIE environment variable and processed accordingly.

A simple example: a form asks the user to enter a name and calls a script that stores this name as a cookie in the user's browser. On each subsequent visit, the cookie value sent by the browser is analyzed, and the page shows either a personal greeting (if the cookie is set) or the initial form asking for the user's name (if it is not).
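The greeting example can be sketched as a small CGI-style handler. This is a minimal Python sketch, not the script the article refers to: the cookie name USER_NAME, the form field and both function names are assumptions made for illustration, and `environ` stands in for the CGI environment (for example `os.environ`).

```python
import html
from http.cookies import SimpleCookie


def render_page(environ):
    """Return the HTML body for a request, based on HTTP_COOKIE.

    USER_NAME is a hypothetical cookie name chosen for this sketch.
    """
    cookies = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    if "USER_NAME" in cookies:
        # The cookie is set: greet the returning visitor by name.
        name = html.escape(cookies["USER_NAME"].value)
        return "<p>Hello again, {}!</p>".format(name)
    # No cookie yet: show the initial form asking for the name.
    return '<form method="post"><input name="user_name"></form>'


def remember_name(name):
    """Build the header line that stores the submitted name in the browser."""
    cookie = SimpleCookie()
    cookie["USER_NAME"] = name
    return cookie.output()  # a "Set-Cookie: ..." header line
```

On the first visit `render_page({})` returns the form; once the browser sends `USER_NAME` back in HTTP_COOKIE, the same function returns the greeting.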

A cookie is a small piece of textual information that a server transmits to the browser. The browser stores this information and passes it back to the server with each request as part of the HTTP header. Some cookie values are kept for one session only and are deleted when the browser is closed. Others, set for a certain period of time, are written to a file. Usually this file is called 'cookies.txt' and is located in the working directory of the browser installed on the computer. For example, I have the following in this file:

What can I do with cookies?


By themselves, cookies cannot do anything; they are merely textual information. However, the server can read the information contained in cookies and perform certain actions based on it. For example, in the case of authorized access to something via the WWW, the login and password are stored in cookies during the session, so the user does not have to enter them again when requesting each password-protected document.

Cookies are also often used to build ordering functions in online stores. In particular, the largest virtual bookstore, Amazon Books, implements a kind of virtual shopping cart, as in an ordinary supermarket, in which the server records information about all ordered books. The user simply marks the books of interest and then purchases all the marked books at once.

Another common use of cookies is storing the individual profile settings of each registered user.

And finally, the most recent area is the use of the cookie mechanism in the Internet advertising business. A year ago, paid advertising on the Internet was a rather exotic service; now this business is established and rapidly developing. However, advertisers are beginning to impose more stringent requirements on assessing the effectiveness of their spending. Cookies are used to target advertising (determining the target audience, for example, by the geographical location of users), to track the interests of users, and to count banner impressions and click-throughs.

Working with cookies


Now that the principles and uses of cookies are more or less clear, we can move on to the format and syntax, as well as the ways of setting cookie values.

Description of the cookie format and syntax

The description provided by me in this article is a loose paraphrase of the original Netscape Communications specification "Persistent Client State HTTP Cookies". A stricter specification for cookies is currently being developed. So, the cookie is part of the HTTP header. Full description of the Set-Cookie HTTP header field:

    Set-Cookie: NAME=VALUE; expires=DATE; domain=DOMAIN_NAME; path=PATH; secure

Minimum description of the Set-Cookie HTTP header field:

    Set-Cookie: NAME=VALUE

NAME=VALUE - a string of characters excluding semicolons, commas and spaces. NAME is the name of the cookie, VALUE is its value.

expires=DATE - the cookie storage time. DATE must be given in the format "Wdy, DD-Mon-YYYY HH:MM:SS GMT"; once this date is reached, the cookie expires. If this attribute is not specified, the cookie is stored for one session, until the browser is closed.
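Producing a date in this layout by hand is error-prone, so here is a minimal Python sketch of a formatter. The function name and the fixed name tables are my own choices for illustration; the tables are used instead of `strftime` so that the result does not depend on the system locale.

```python
import calendar
import time

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def format_expires(timestamp):
    """Format a Unix timestamp as 'Wdy, DD-Mon-YYYY HH:MM:SS GMT'."""
    t = time.gmtime(timestamp)  # always GMT, regardless of local time zone
    return "{}, {:02d}-{}-{:04d} {:02d}:{:02d}:{:02d} GMT".format(
        WEEKDAYS[t.tm_wday], t.tm_mday, MONTHS[t.tm_mon - 1],
        t.tm_year, t.tm_hour, t.tm_min, t.tm_sec)


# Example: an arbitrary moment on 9 November 2025 (the time of day is made up).
moment = calendar.timegm((2025, 11, 9, 23, 12, 40))
attribute = "expires=" + format_expires(moment)
```

To keep a cookie for, say, thirty days, the same function can be called with `time.time() + 30 * 86400`.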

domain=DOMAIN_NAME - the domain for which the cookie value is valid. For example, "domain=cit-forum.com". In this case, the cookie value will be valid both for the cit-forum.com domain and for www.cit-forum.com. Note, however, that specifying only two periods in the domain name is enough only for domains in the "COM", "EDU", "NET", "ORG", "GOV", "MIL" and "INT" hierarchies. For the seven new first-level domains currently under discussion (FIRM, SHOP, WEB, ARTS, REC, INFO, NOM), this condition is likely to remain. For domains in other hierarchies, RU for example, you will have to specify three periods.

If this attribute is omitted, the domain name of the server that set the cookie value is used by default.
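The domain check amounts to comparing the tail of the requested host name with the cookie's domain. Below is a minimal Python sketch of that comparison; the function name is hypothetical, and requiring the match to fall on a label boundary (a period) is my own simplifying assumption, stricter than a plain string-tail comparison. The rule about the number of periods is not modelled here.

```python
def domain_matches(host, cookie_domain):
    """Return True if `host` falls within the cookie's domain.

    The match is made on the tail of the host name and must start
    at a label boundary (assumption of this sketch).
    """
    host = host.lower()
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)
```

With "domain=cit-forum.com", both cit-forum.com and www.cit-forum.com pass the check, while an unrelated host does not.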

path=PATH - this attribute sets the subset of documents for which the cookie value is valid. For example, specifying "path=/win" makes the cookie value valid for documents in the /win/ directory, in the /wings/ directory, and for files in the current directory with names like wind.html and windows.shtml. For the cookie to be sent with every request to the server, specify the root directory of the server, for example, "path=/".

If this attribute is not specified, the cookie value applies only to documents in the same directory as the document in which the cookie value was set.
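The behaviour described for "path=/win" is a plain prefix comparison, which a couple of lines of Python can illustrate. Both function names are hypothetical, and `default_path` is only my reading of "the same directory as the document"; later cookie specifications tightened the matching rule, so treat this as a sketch of the behaviour described here and nothing more.

```python
def path_matches(request_path, cookie_path="/"):
    """Prefix match: the cookie applies if the request path starts with it."""
    return request_path.startswith(cookie_path)


def default_path(document_path):
    """Directory of the document that set the cookie (used when path is omitted)."""
    return document_path[:document_path.rfind("/") + 1] or "/"
```

So `path_matches("/wings/index.html", "/win")` is true, just as the text warns, while "path=/" matches every request on the server.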

secure - if this token is present, the cookie information is sent only via HTTPS (HTTP over SSL, Secure Sockets Layer), that is, in secure mode. If this token is not specified, the information is sent in the usual way.
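To see how the attributes combine into one header line, here is a minimal sketch using the `http.cookies` module from the Python standard library. The helper name `build_set_cookie` and the cookie value are assumptions for illustration; note that the module decides the order and capitalisation of the attributes itself.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value, expires=None, domain=None,
                     path=None, secure=False):
    """Return a 'Set-Cookie: ...' header line with the given attributes."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    if expires is not None:
        morsel["expires"] = expires  # already formatted date string
    if domain is not None:
        morsel["domain"] = domain
    if path is not None:
        morsel["path"] = path
    if secure:
        morsel["secure"] = True      # flag attribute, emitted without a value
    return cookie.output()


# Hypothetical cookie using every attribute described above.
header = build_set_cookie(
    "PART_NUMBER", "ROCKET_0001",
    expires="Sun, 09-Nov-2025 23:12:40 GMT",
    domain="cit-forum.com", path="/", secure=True)
```

Calling the helper with only a name and a value yields the minimal form of the header.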

HTTP header syntax for the Cookie field

When a document is requested from the HTTP server, the browser checks its cookies for compliance with the server domain and other information. If cookie values that meet all the conditions are found, the browser sends them to the server in the form of name/value pairs:
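In the request, the matching name/value pairs travel in a single Cookie field, separated by semicolons. The following Python sketch builds and parses such a line; the function names and the NAME1/VALUE1 placeholders are mine, chosen only for illustration.

```python
def cookie_header(pairs):
    """Join (name, value) pairs into one Cookie request header line."""
    return "Cookie: " + "; ".join("{}={}".format(n, v) for n, v in pairs)


def parse_cookie_header(line):
    """Split a Cookie header line back into a list of (name, value) pairs."""
    prefix = "Cookie:"
    if not line.startswith(prefix):
        raise ValueError("not a Cookie header")
    pairs = []
    for item in line[len(prefix):].split(";"):
        item = item.strip()
        if item:
            name, _, value = item.partition("=")
            pairs.append((name, value))
    return pairs
```

For two placeholder cookies, `cookie_header([("NAME1", "VALUE1"), ("NAME2", "VALUE2")])` gives `Cookie: NAME1=VALUE1; NAME2=VALUE2`.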

More information
You can set multiple cookie values at the same time.

If a cookie is set while the browser already holds a cookie with the same NAME, domain and path parameters, the old value is replaced with the new one. In all other cases, the new cookie value is added alongside the existing ones.
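The replacement rule means that a cookie is identified by the combination of name, domain and path, not by the name alone. A minimal Python sketch of such a store follows; the class `TinyJar` and the sample values are hypothetical.

```python
class TinyJar:
    """Cookie store keyed by (name, domain, path), as the rule above implies."""

    def __init__(self):
        self._cookies = {}

    def set(self, name, value, domain, path="/"):
        # Same name, domain and path: the old value is overwritten.
        # Any difference in the key: a separate cookie is stored.
        self._cookies[(name, domain, path)] = value

    def values(self, name):
        """All stored values carrying the given cookie name."""
        return [v for (n, _d, _p), v in self._cookies.items() if n == name]


jar = TinyJar()
jar.set("PART_NUMBER", "ROCKET_0001", "cit-forum.com", "/")
jar.set("PART_NUMBER", "ROCKET_0002", "cit-forum.com", "/")      # replaces
jar.set("PART_NUMBER", "ROCKET_0023", "cit-forum.com", "/ammo")  # coexists
```

After these three calls the jar holds two PART_NUMBER cookies: the second value for "/" and the separate one for "/ammo".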

The use of expires does not guarantee that the cookie is kept for the specified period of time, since the client (browser) may delete the record for lack of allocated space or for other reasons.

The client (browser) has the following restrictions on cookies:

  • up to 300 cookie values can be stored in total
  • each cookie cannot exceed 4KB
  • up to 20 cookie values can be stored from a single server or domain

If the limit of 300 or 20 is exceeded, the oldest record is deleted. If the size limit of 4KB is exceeded, the cookie value is corrupted: a piece of the record equal in size to the excess is cut off from its beginning.
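The count limits can be illustrated with a store that evicts its oldest record when a limit is exceeded. This is a rough Python sketch under my own assumptions: the class name is hypothetical, "oldest" means the record set longest ago, and an oversized cookie is simply rejected here instead of being trimmed as described above.

```python
from collections import OrderedDict

MAX_TOTAL = 300          # cookies in the whole store
MAX_PER_DOMAIN = 20      # cookies from a single server or domain
MAX_COOKIE_BYTES = 4096  # size of one cookie (name and value)


class LimitedJar:
    def __init__(self):
        self._cookies = OrderedDict()  # insertion order: oldest first

    def set(self, name, value, domain, path="/"):
        if len((name + "=" + value).encode("utf-8")) > MAX_COOKIE_BYTES:
            return False  # simplification: reject instead of trimming
        key = (name, domain, path)
        self._cookies.pop(key, None)  # a cookie set again counts as newest
        self._cookies[key] = value
        self._evict(domain)
        return True

    def _evict(self, domain):
        same_domain = [k for k in self._cookies if k[1] == domain]
        excess = max(0, len(same_domain) - MAX_PER_DOMAIN)
        for key in same_domain[:excess]:  # oldest records of this domain
            del self._cookies[key]
        while len(self._cookies) > MAX_TOTAL:
            self._cookies.popitem(last=False)  # oldest record overall

    def __len__(self):
        return len(self._cookies)

    def __contains__(self, key):
        return key in self._cookies
```

Setting a twenty-first cookie for one domain pushes out the first one that was set for it, which is why a site should not rely on many small cookies surviving.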

When documents are cached, for example by a proxy server, the Set-Cookie HTTP header field is never cached.

If the proxy server receives a response that contains a Set-Cookie field in the header, the field is expected to reach the client regardless of whether the return code is 304 (Not Modified) or 200 (OK). Likewise, if the client request contains a Cookie field in the header, it must reach the server even if the If-Modified-Since header is set.

Below are some examples illustrating the use of cookies.

Example 1: Managing a subset of documents for which cookie values are valid and their expiration date

The browser requests a document and receives from the server in response:

When the browser requests a URL with the path "/" on this server, it sends to the server:

The browser requests the document and receives from the server in response:

When the browser requests a URL with the path "/" on this server, it sends two cookie values to the server:

The server set another cookie value, this time with a different scope:

Now the browser, when requesting a URL with the path "/" on this server, sends only two cookie values:

and only when the browser requests documents with the path "/foo" are all three cookie values sent to that server:
 
Comment: After closing the browser, only one cookie value will remain in the 'cookies.txt' file, since only it has an expiration date (9 November 2025). All other values will not be stored.
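The sequence in Example 1 can be replayed in a few lines of Python. The cookie names, values and the time of day below are illustrative assumptions, not the original listings; only the paths "/" and "/foo" and the date 9 November 2025 come from the example itself.

```python
from collections import namedtuple

Cookie = namedtuple("Cookie", "name value path expires")

# Three cookies set one after another (all names and values are made up).
jar = [
    Cookie("CUSTOMER", "WILE_E_COYOTE", "/", "Sun, 09-Nov-2025 23:12:40 GMT"),
    Cookie("PART_NUMBER", "ROCKET_LAUNCHER_0001", "/", None),
    Cookie("SHIPPING", "FEDEX", "/foo", None),
]


def sent_for(request_path, cookies):
    """Names of the cookies the browser sends for a given request path."""
    return [c.name for c in cookies if request_path.startswith(c.path)]


def kept_after_close(cookies):
    """Names of the cookies that survive closing the browser."""
    return [c.name for c in cookies if c.expires is not None]
```

Here `sent_for("/", jar)` yields two names, `sent_for("/foo", jar)` yields all three, and `kept_after_close(jar)` leaves only the cookie that has an expiration date.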

Example 2: Cookie values with the same name, but different parameters


The browser requests a document and accepts a response from the server:

When the browser requests a URL with the path "/" on this server, it sends the value:

The second time, when requesting a document, the browser accepts a cookie value from the server with a different scope:

When the browser requests a URL with the path "/ammo" on this server, it sends the value:

Comment: Here we have two name/value pairs with the same name "PART_NUMBER". When you close the browser, none of these values will be saved because the expires parameter is not specified.

How to set cookie values

The method of setting cookie values depends on how those values will be used and what server resources are available. You can control the lifetime of the cookies you set and define the subsets of URLs (Uniform Resource Locators) for which the values are valid. There are several ways to set cookies; the three most commonly used are META tags of the HTML language, JavaScript and CGI scripts. With any of them, you can specify one value or several at once. A word of warning: do not forget about the limits on the size and number of cookie values, or about the domain parameter, since in addition to its main domain name a site often has several aliases.
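Two of the three routes can be sketched from the server side in Python: a CGI script writes the Set-Cookie field before the blank line that ends the headers, and a page can carry the same string in a META tag with HTTP-EQUIV. The function names and the VOTED cookie are hypothetical, and whether a given browser honours the META form is an assumption you should check. The JavaScript route assigns the same kind of string to `document.cookie` and is not shown here.

```python
import html


def cgi_response(set_cookie_value, body):
    """CGI output: header lines, a blank line, then the document itself."""
    return ("Content-Type: text/html\n"
            "Set-Cookie: {}\n"
            "\n"
            "{}".format(set_cookie_value, body))


def meta_tag(set_cookie_value):
    """The same cookie expressed as a META tag for the page's HEAD section."""
    return '<META HTTP-EQUIV="Set-Cookie" CONTENT="{}">'.format(
        html.escape(set_cookie_value, quote=True))


# Hypothetical cookie valid for the whole site, kept for one session only.
response = cgi_response("VOTED=yes; path=/", "<p>Thank you.</p>")
tag = meta_tag("VOTED=yes; path=/")
```

A CGI script would simply write `response` to standard output; several Set-Cookie lines can be emitted the same way to set several values at once.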

In it, by setting and analyzing cookie values, I either do not allow the user to vote (if cookies are disabled in the browser or the user has already voted once) or allow voting (if the corresponding value is not set). Such a system can be deceived only by deleting the cookies.txt file each time. You could use a voting log file on the site instead, but then there would be problems with shared access to the file and slowdowns caused by slow disk operations.

A little about the problems associated with the use of cookies


The main problem is the initial distrust of users toward remote servers writing information to their local disks without their knowledge and consent. There were also rumors that the cookie mechanism can be used to read any information from any computer. This is not true; moreover, modern browsers allow you to control the acceptance of cookies or even block them altogether. In addition, there are many special utilities for managing the acceptance of cookies, the so-called Cookie Managers.

The other side of this problem is that huge amounts of personal data needed by commercial servers accumulate on the nodes of the Network. Hence the increased requirements for protecting this data against unauthorized access. Users of such servers must be sure that their names, e-mail addresses, phone numbers, etc., will not fall into the wrong hands. Otherwise, the consequences can be disastrous for the commercial servers at fault.