Functions for working with strings in Lua

Let's move on to string operations in Lua. Lua offers a wide range of functions for working with text. As we mentioned earlier, Lua is Unicode agnostic: it treats strings as plain sequences of bytes. That is good enough for most things, but there are situations where it just doesn't work, so we'll also introduce a library for working with UTF-8 in Lua. Let's first look at the basic string operations.

Basic string functions

Like everything else in Lua, string indexes start at 1, not 0! Keep this in mind for anything index-related. Let's start with a table and then go through each function in turn.

Function | Description
string.len( s ) | Returns the length of "s" in bytes
string.upper( s ) | Returns an uppercase copy of "s"
string.lower( s ) | Returns a lowercase copy of "s"
string.reverse( s ) | Returns "s" reversed
string.sub( s [, i [, j ] ] ) | Returns the substring of "s" that starts at "i" and ends at "j"
string.rep( s, n ) | Returns "s" repeated n times
string.match( s, pattern [, i] ) | Returns the substring of "s" that matches "pattern" (see below for patterns), starting the search at the optional index "i"
string.gmatch( s, pattern ) | Returns an iterator over the substrings of "s" that match "pattern" (see below)
string.gsub( s, pattern, replacement [, n] ) | Returns a copy of "s" in which each match of "pattern" is replaced with "replacement", plus the number of replacements made; the optional n limits how many replacements are made
string.find( s, pattern [, index [, boolplain ] ] ) | Returns the start and end index of "pattern" in "s" (or nil if it is not found); the search starts at the optional "index", and the optional boolean "boolplain" turns off pattern matching so the search is literal
string.format( s, o1, ..., on ) | Returns "s" formatted with the values o1 through on, similar to the printf options in C
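To make the 1-based indexes and the "boolplain" flag from the table concrete, here is a minimal sketch. The sample strings and variable names are made up for illustration and are not part of the original examples.

```lua
-- Indexes are 1-based: the first "l" in "hello" is at position 3.
local first, last = string.find( "hello", "l" )
print( first, last )

-- string.find returns nil (not 0) when nothing matches.
local missing = string.find( "hello", "z" )
print( missing )

-- In a pattern, "." matches any character, so this matches at position 1.
local any_start = string.find( "hello.world", "." )
-- With the plain flag set to true, "." is searched for literally.
local dot_start = string.find( "hello.world", ".", 1, true )
print( any_start, dot_start )

-- string.format works like printf in C.
local formatted = string.format( "%s has %d items", "list", 3 )
print( formatted )
```

Note that the plain flag is the fourth argument, so the start index has to be passed explicitly (here 1) whenever you want a literal search.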

We'll take a look at each of them and how to use them. Things like match and gmatch will be described in detail as each has different use cases. Let's also take a look at some of the less commonly used string functions that we won't spend a lot of time on.

Let's first dive into the easiest string functions to learn.

  • string.len( s ),
  • string.upper( s ),
  • string.lower( s ),
  • string.reverse( s )

Each one is simple, as they just take one parameter. Here's an example of what they do:

#!/usr/bin/lua5.1

test = "123456789"
test2 = "aBcDeFgHiJkLmNoPqRsTuVwXyZ"

print( "test length: " .. string.len( test ) )
print( "test length 2: " .. test2:len() )
print( "uppercase test: " .. test:upper() )
print( "test 2 written in uppercase: " .. string.upper( test2 ) )
print( "test 2 written in lowercase: " .. test2:lower() )
print( "reverse test: " .. test:reverse() )
print( "reverse test 2 in lowercase: " .. string.reverse( test2:lower() ) )

At the output we get:

./luastring.lua
test length: 9
test length 2: 26
uppercase test: 123456789
test 2 written in uppercase: ABCDEFGHIJKLMNOPQRSTUVWXYZ
test 2 written in lowercase: abcdefghijklmnopqrstuvwxyz
reverse test: 987654321
reverse test 2 in lowercase: zyxwvutsrqponmlkjihgfedcba
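Because string.len counts bytes rather than characters, the result can be surprising for non-ASCII text. The following is a small sketch of that effect; it assumes the text is UTF-8 encoded, and it spells "é" out as its two UTF-8 bytes so the result does not depend on how the script file itself is saved. The variable names are illustrative.

```lua
-- "é" in UTF-8 is the two bytes 195 and 169.
local e_acute = "\195\169"
local word = "caf" .. e_acute   -- "café", four characters on screen

print( string.len( e_acute ) )  --> 2
print( word:len() )             --> 5
```

This is the kind of situation where a UTF-8 aware library becomes necessary.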

string.sub( s [, i [, j ] ] )

string.sub(...) is extremely useful for retrieving known parts of a string. As in Perl and some other languages, you can also use negative indexes. A negative index counts from the end of the string, so -1 is the last character. Let's see all this in practice:

#!/usr/bin/lua5.1

test = "123456789"

print( "test substring starting at 5: " .. test:sub( 5 ) )
print( "test substring from 5 to 8: " .. string.sub( test, 5, 8 ) )
print( "test substring from -3 to -1: " .. string.sub( test, -3, -1 ) )

At the output we get:

./luastring2.lua
test substring starting at 5: 56789
test substring from 5 to 8: 5678
test substring from -3 to -1: 789

If you want to get one character at a certain position, set "i" and "j" to the same number.
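As a quick sketch of the single-character case, reusing the same test string (the variable names are only for illustration):

```lua
local test = "123456789"

-- Same start and end index: exactly one character.
local fifth = test:sub( 5, 5 )
-- Negative indexes work here too: -1 is the last character.
local last_char = test:sub( -1, -1 )
-- An index past the end does not raise an error; it yields an empty string.
local beyond = test:sub( 20, 20 )

print( fifth, last_char, "[" .. beyond .. "]" )
```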

string.rep( s, n )

This is another trivial string function. It repeats "s" n times. Let's look at a brief example:

#!/usr/bin/lua5.1

print( "repeat 'abcdefg' 3 times: " .. string.rep( "abcdefg", 3 ) )

At the output we get:

./luastring3.lua
repeat 'abcdefg' 3 times: abcdefgabcdefgabcdefg

Patterns and special characters

First, we need to understand the basics of patterns and how to handle special characters in Lua. Special characters are things like newlines and tabs. Let's look at special characters first, since they are the simplest.

Escaped character | Description
\n | New line
\r | Carriage return
\t | Tab
\\ | \
\" | "
\' | '
\[ | [
\] | ]

All of them are extremely common, work in almost all string operations, and are worth remembering. The newline is used on all operating systems, but Windows pairs it with a carriage return ("\r\n"). I tend to just deal with that using a pattern (see the patterns below for details). You will also need to take the magic characters into account (see below).
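One possible way to handle Windows line endings is sketched below. This is an assumption about the kind of cleanup meant above, not necessarily the exact approach the author uses, and the sample text is made up.

```lua
-- Text as it might arrive from a file written on Windows.
local windows_text = "first line\r\nsecond line\r\n"

-- Replace every carriage return + newline pair with a plain newline.
-- gsub returns the new string and the number of replacements.
local unix_text, count = string.gsub( windows_text, "\r\n", "\n" )

print( count )
```

Neither "\r" nor "\n" is a magic character, so the search string needs no extra escaping here.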

Unusual special characters

These are rarely used, but they are listed here for completeness.

Symbol | Description
\a | Bell
\b | Backspace
\f | Form feed
\v | Vertical tab

Greedy Matching

Greedy matching means that we match as much as possible while still satisfying the pattern, while non-greedy matching means that we stop at the first point where the pattern is satisfied. In our previous test string, we have "123", then a bunch of extra stuff, and the string ends with "123". We also have another "123" in the middle of the string.

We match one or more digits, then anything, then one or more digits. Because of this, the greedy match runs all the way to that last "123", since our digits also count as "any character", but the non-greedy match ends as soon as it sees the next batch of digits.

Greedy matching is useful when you want everything between two things, even if the closing thing also appears somewhere in the middle. Non-greedy matching is useful when you want something between a known set of delimiters and never want it to spill over. This is useful when you have sequences of similar formatting (such as tags in a string). The difference between greedy and non-greedy matching becomes more important with some other string formats.
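The difference can be sketched in a few lines. In Lua patterns, ".*" is the greedy form of "anything" and ".-" is the non-greedy form. The strings below are hypothetical stand-ins with the same shape as the test string described above ("123" at the start, in the middle, and at the end), plus a tag example.

```lua
local line = "123 abc 123 def 123"

-- Greedy: ".*" grabs as much as it can, so the match runs to the last "123".
local greedy = string.match( line, "%d+.*%d+" )
-- Non-greedy: ".-" grabs as little as it can, so the match stops at the middle "123".
local lazy = string.match( line, "%d+.-%d+" )

print( greedy )  --> 123 abc 123 def 123
print( lazy )    --> 123 abc 123

-- The same effect with tags; the parentheses capture only the text in between.
local html = "<b>one</b> and <b>two</b>"
local greedy_tag = string.match( html, "<b>(.*)</b>" )
local lazy_tag = string.match( html, "<b>(.-)</b>" )

print( greedy_tag )  --> one</b> and <b>two
print( lazy_tag )    --> one
```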

string.gmatch( s, pattern )

string.gmatch( s, pattern ) is similar to string.match, except that it gives you every match instead of just the first. You can iterate through the matches with a for loop. Let's look at an example:

#!/usr/bin/lua5.1

test = "abc 123 ABC 456 !!! catd0g -+[] 789"

for s in string.gmatch( test, "%d+" ) do
  print( "found: " .. s )
end

At the output we get:

./luamatch.lua
found: 123
found: 456
found: 0
found: 789
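If you need the matches later rather than just printing them, one option is to collect them into a table. This is a minimal sketch using the same test string; the "numbers" table is an illustrative name, not something from the original example.

```lua
local test = "abc 123 ABC 456 !!! catd0g -+[] 789"

-- Gather every run of digits, converted from string to number.
local numbers = {}
for s in string.gmatch( test, "%d+" ) do
  numbers[#numbers + 1] = tonumber( s )
end

print( #numbers )  --> 4
```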

string.gsub( s, pattern, replacement [, n] )

string.gsub( s, pattern, replacement [, n] ) is one of the most useful functions in all of Lua for working with strings. It lets you take a string and replace whatever matches the pattern. You get back the new string and the number of substitutions made. Here's a simple example:

#!/usr/bin/lua5.1

test = "abc 123 ABC 456 !!! catd0g -+[] 789"

local newstring, replacements = string.gsub( test, "%d+", "[Numbers]" )
print( "Replaced: " .. replacements )
print( "New string: " .. newstring )

At the output we get:

./luamatch.lua
Replaced: 4
New string: abc [Numbers] ABC [Numbers] !!! catd[Numbers]g -+[] [Numbers]

You can limit the number of substitutions by passing an integer n as the fourth argument; only the first n matches are replaced. This is useful in scenarios where you know roughly where replacements should occur, such as data processing.
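Here is a short sketch of the limit in action, reusing the same test string with n set to 2 (the variable names are illustrative):

```lua
local test = "abc 123 ABC 456 !!! catd0g -+[] 789"

-- Only the first two runs of digits are replaced; the rest are left alone.
local limited, made = string.gsub( test, "%d+", "[Numbers]", 2 )

print( made )     --> 2
print( limited )  --> abc [Numbers] ABC [Numbers] !!! catd0g -+[] 789
```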