Troubleshooters.Com, Code Corner and Ruby Revival Present

Ruby Basic Tutorial
Copyright (C) 2005 by Steve Litt

See the Troubleshooters.Com Bookstore.


Note: All materials in Ruby Revival are provided AS IS. By reading the materials in Ruby Revival you are agreeing to assume all risks involved in the use of the materials, and you are agreeing to absolve the authors, owners, and anyone else involved with Ruby Revival of any responsibility for the outcome of any use of these materials, even in the case of errors and/or omissions in the materials. If you do not agree to this, you must not read these materials.
To the 99.9% of you honest readers who take responsibility for your own actions, I'm truly sorry it is necessary to subject all readers to the above disclaimer.


About this Tutorial

This is a Ruby tutorial for people who don't yet know Ruby. Therefore, we use many constructs and styles that, while familiar to programmers and intuitive to beginners, are not optimal for Ruby. A companion document, Ruby the Right Way, discusses how to use Ruby to full advantage and have your code compatible with the vast body of Ruby code out there.

Ruby can be used as a fully object oriented language, in which case you'd create classes and objects to accomplish everything. However, it can be used quite nicely with only the objects and classes that ship with Ruby, in which case it can be used as a procedural language, except that functions are typically methods of the program's variables.

If that doesn't make any sense to you, don't worry, it's just a way of saying that Ruby can be very easy to learn and use.

Even if you want to become a Ruby expert, you need to learn the basic functionality before you can become a Ruby OOP ninja. This tutorial gives you those basics.


Hello World

This is the simplest possible Ruby program, hello.rb. As you'd expect, it prints "Hello World" on the screen. Be sure to set it executable.

#!/usr/bin/ruby
print "Hello World\n"

Although this program works as expected, it goes against the philosophy of Ruby because it's not object oriented. But as a proof of concept that Ruby's working on your computer, it's just fine.

Besides print, there's also a puts method. The difference is that puts automatically inserts a newline at the end of the string being printed, whereas print does not. In other words, puts is more convenient, but print is necessary if separate statements print to the same line. Throughout this tutorial we'll use both print and puts.

Loops

Let's count to 10...

#!/usr/bin/ruby
for ss in 1...10
  print ss, " hello\n"
end

The ellipsis (...) indicates the range through which to loop. The for is terminated by an end. You don't need braces for a loop. Whew!

The following is the output:

[slitt@mydesk slitt]$ ./loop.rb
1 hello
2 hello
3 hello
4 hello
5 hello
6 hello
7 hello
8 hello
9 hello
[slitt@mydesk slitt]$

Notice that it stops on 9. The number following the ellipsis causes termination at the top of the loop. The 1...10 means 1 TO BUT NOT INCLUDING 10; it does NOT mean 1 through 10. Please remember this when using Ruby loops.

NOTE

There are actually two versions of the ellipsis operator, the three period version as shown previously, and the two period version. The two period version is inclusive. In other words, 1...3 means 1 up to but not including 3, while 1..3 means 1 through 3.

By using the appropriate version of the ellipsis operator you can save having to code convoluted end conditions.
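As a quick check of the note above, the following sketch converts both kinds of range to arrays with to_a so you can see exactly which numbers each one covers. The variable names inclusive and exclusive are just illustrations.

```ruby
#!/usr/bin/ruby
inclusive = (1..3).to_a     # two periods: the end value is included
exclusive = (1...3).to_a    # three periods: the end value is left out
print "1..3  gives ", inclusive.join(", "), "\n"
print "1...3 gives ", exclusive.join(", "), "\n"
```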


Now let's iterate through an array.

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
for ss in 0...presidents.length
print ss, ": ", presidents[ss], "\n";
end

We defined an array of presidents using a Perl like syntax (except we used brackets instead of parens), and we iterated from 0 (Ruby is 0 based, like most languages), through the final subscript in the presidents array. Remember, the triple dot stops before executing the final number, which is why it doesn't count to 6. If you had wanted it to count to 6 (which in this case would have walked off the end of the array), you would have used the double dot. The output of the preceding code follows:

[slitt@mydesk slitt]$ ./loop.rb
0: Ford
1: Carter
2: Reagan
3: Bush1
4: Clinton
5: Bush2
[slitt@mydesk slitt]$

Now let's list the presidents backwards by calculating the array's subscript as the array's length minus the counter, minus one. Ugly, but it gets the job done:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
for ss in 0...presidents.length
print ss, ": ", presidents[presidents.length - ss - 1], "\n";
end

The preceding program produces the following output:

[slitt@mydesk slitt]$ ./hello.rb
0: Bush2
1: Clinton
2: Bush1
3: Reagan
4: Carter
5: Ford
[slitt@mydesk slitt]$

Ruby has a much nicer way of iterating backwards through a list: Negative subscripts. The following iterates backward through the array, using the fact that array[-1] is the last item, array[-2] is the second to last, etc:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
for ss in 0...presidents.length
	print ss, ": ", presidents[-ss -1], "\n";
end
      

If you're familiar with C, Pascal or Perl, you're probably disappointed you couldn't just use presidents.length...0. A range only counts up: if its start is greater than its end the range is empty, so the loop body never runs.
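If you do want to count down, a range is not the tool for it, but Ruby's integers have a downto method and arrays have a reverse_each method. The following is a minimal sketch of both; it uses the block syntax explained in the next section, and the names empty_range and visited exist only for this illustration.

```ruby
#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]

# A range whose start is past its end is simply empty
empty_range = (presidents.length...0).to_a
print "Elements in the backwards range: ", empty_range.length, "\n"

# downto counts from the last subscript down to 0
visited = []
(presidents.length - 1).downto(0) do |ss|
  print ss, ": ", presidents[ss], "\n"
  visited << presidents[ss]
end

# reverse_each walks the elements themselves from last to first
presidents.reverse_each { |prez| puts prez }
```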

Iterators and Blocks

Another way to loop through an array is to use an iterator (the each method in the following code) and a block (the brace enclosed part of the following code):

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
presidents.each {|prez| puts prez}

In the preceding code, the block argument (prez) contains the current array element, and everything else until the closing brace contains code to operate on the block argument. The block argument is always enclosed in vertical lines (pipe symbols). The following is the output of the preceding code:

[slitt@mydesk slitt]$ ./hello.rb
Ford
Carter
Reagan
Bush1
Clinton
Bush2
[slitt@mydesk slitt]$


The block needn't be on one line:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
presidents.each {
|prez|
puts prez
}

As shown in the previous examples, you can define the block by enclosing it in curly braces. You can also define it by enclosing it in a do and an end, where the do replaces the opening brace, and the end replaces the closing brace:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
presidents.each do
|prez|
puts prez
end

Personally, I greatly prefer the do/end syntax for multiline blocks, because as a Perl/C/C++ guy I have a very different perception of braces than their limited use in Ruby, and also because of all the brace placement religious wars I've endured (I'm a Whitesmith type guy myself). However, on short single line blocks, using the braces saves valuable line space. From what I understand, the methods are interchangeable in features and performance, with one small exception...

Speaking of performance: in Ruby 1.8, if you declare the block argument outside the block (in other words, make it a local variable), performance improves because Ruby needn't recreate a variable every iteration. HOWEVER, the loop messes with the value of the variable, so it's best to use a specific variable for that purpose, and not use it for other purposes within the subroutine. (Ruby 1.9 and later always make block arguments local to the block, so on those versions the outer variable is left untouched and the last line of the output would read After : -99.) Here's an example of using a local variable as a block argument, with its Ruby 1.8 output:

#!/usr/bin/ruby
i = -99
puts "Before: " + i.to_s
(1..10).each{|i| puts i}
puts "After : " + i.to_s

[slitt@mydesk slitt]$ ./loop.rb
Before: -99
1
2
3
4
5
6
7
8
9
10
After : 10
[slitt@mydesk slitt]$

If you use a local variable for a block argument, do so only in loops with huge numbers of iterations, and use only variables that are specifically intended to serve as block arguments and nothing else.
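Whether a block argument overwrites an outer variable of the same name depends on the Ruby version, so it's worth checking on your own machine. The following sketch is one way to do that; the variable name counter is just an illustration. Ruby 1.9 and later keep the block argument local to the block, while Ruby 1.8 lets the block overwrite the outer variable.

```ruby
#!/usr/bin/ruby
counter = -99
(1..3).each { |counter| puts counter }
if counter == -99
  puts "Block arguments are local to the block (Ruby 1.9 or later)"
else
  puts "The block overwrote the outer variable (Ruby 1.8 behavior)"
end
```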

A Difference Between {} and do/end

As mentioned, there's one small difference between brace enclosed blocks and do/end enclosed blocks: Braces bind tighter. Watch this:

#!/usr/bin/ruby
my_array = ["alpha", "beta", "gamma"]
puts my_array.collect {
|word|
word.capitalize
}
puts "======================"
puts my_array.collect do
|word|
word.capitalize
end

[slitt@mydesk slitt]$ ./test.rb
Alpha
Beta
Gamma
======================
alpha
beta
gamma
[slitt@mydesk slitt]$

The braces bound tightly like this:

puts(my_array.collect {|word| word.capitalize})

Whereas do/end bind more loosely, like this:

puts(my_array.collect) do |word| word.capitalize end

In the latter, the block is handed to puts (which ignores it) and collect runs with no block at all, so the words are never capitalized. The output above is from Ruby 1.8, where collect without a block returns a copy of the array; Ruby 1.9 and later return an Enumerator instead, and puts prints a line starting with #<Enumerator. Wrapping the entire call, block included, in parentheses does make do/end bind to collect, but it's usually more readable to assign the iterator's results to a new array and use that array. It's one more variable and one more line of code. If the code is short, use braces. If it's long, the added overhead is so small a percentage that it's no big deal:

#!/usr/bin/ruby
my_array = ["alpha", "beta", "gamma"]
puts my_array.collect {
|word|
word.capitalize
}
puts "======================"
new_array = my_array.collect do
|word|
word.capitalize
end
puts new_array

[slitt@mydesk slitt]$ ./test.rb
Alpha
Beta
Gamma
======================
Alpha
Beta
Gamma
[slitt@mydesk slitt]$

Generally speaking, if you want to directly use the result of iterators, use braces. For longer blocks, do/end is more readable, and the overhead for the extra variable and line of code is trivial.
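For completeness, here is a sketch of the two ways to get a do/end block to bind to collect rather than to puts: parentheses around the whole call (block included), and an intermediate variable. The variable name capitalized is only an illustration.

```ruby
#!/usr/bin/ruby
my_array = ["alpha", "beta", "gamma"]

# Parentheses around the whole call, block included, make do/end
# bind to collect rather than to puts
puts(my_array.collect do |word|
  word.capitalize
end)

puts "======================"

# The same thing with an intermediate variable
capitalized = my_array.collect do |word|
  word.capitalize
end
puts capitalized
```

The parenthesized form works, but most people find the intermediate variable easier to read once the block grows past a line or two.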

while Loops

All the loops previously discussed looped through either an array or a set of numbers. Sometimes you need a more generic loop. That's when you use a while loop:

#!/usr/bin/ruby
ss = 4
while ss > 0
puts ss
ss -= 1
end
puts "======================"
while ss < 5
puts ss
ss += 1
break if ss > 2
end
puts "======================"
ss = 5
while ss > 0
puts ss
ss -= 2
if ss == 1
ss += 5
end
end

[slitt@mydesk slitt]$ ./loop.rb
4
3
2
1
======================
0
1
2
======================
5
3
6
4
2
[slitt@mydesk slitt]$

The first while loop iterated from 4 down to 1, quitting when ss became 0 and hit the while condition. The second loop was intended to iterate up to 4 and quit when 5 was encountered, but a break statement inside the loop caused it to terminate after printing 2 and then incrementing to 3. This demonstrates the break statement.

The third loop was intended to loop from 5 down to 1, quitting after printing 1 and then decrementing. However, the statement in the body of the loop added 5 when it reached 1, pushing it back up to 6, so it had to count down again. On the second countdown, the numbers were even, so it didn't trigger the if statement. This shows that unlike Pascal, it's OK to tamper with the loop variable inside the loop.

Branching

Looping is one type of flow control in pure procedural languages. The other is branching. The following program implements an array called democrats and another called republicans. Depending on the command line argument, the program prints either the Democratic presidents since 1974, the Republican presidents since 1974, or an appropriate error message.

#!/usr/bin/ruby
democrats = ["Carter", "Clinton"]
republicans = ["Ford", "Reagan", "Bush1", "Bush2"]
party = ARGV[0]
if party == nil
print "Argument must be \"democrats\" or \"republicans\"\n"
elsif party == "democrats"
democrats.each { |i| print i, " "}
print "\n"
elsif party == "republicans"
republicans.each { |i| print i, " "}
print "\n"
else
print "All presidents since 1974 were either Democrats or Republicans\n"
end

Note the if, elsif, else and end keywords, and how they delineate the branching. Note also the democrats.each syntax, which is a very shorthand way of iterating through an array, assuming what you want to do to each element can be stated succinctly.

One last note. The error handling in the preceding would be much better handled by exceptions, but they haven't been covered yet.

Like Perl, the if keyword can follow the action instead of preceding it:

#!/usr/bin/ruby
democrats = ["Carter", "Clinton"]
republicans = ["Ford", "Reagan", "Bush1", "Bush2"]
party = ARGV[0]
if party != nil
democrats.each { |i| print i, " "} if party == "democrats"
republicans.each { |i| print i, " "} if party == "republicans"
print "All presidents since 1974 were either Democrats or Republicans\n"\
if (party != "democrats" && party != "republicans")
end

The preceding is a very contrived program to showcase using the if keyword after the action. Note the following:
  1. The if keyword must be on the same line as the action
  2. Only a single action can precede the if keyword. Multiple actions separated by semicolons will do quite unexpected things.
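The second point in the list above is easy to demonstrate. In the following sketch (the arrays ungrouped and grouped and the party value "whigs" are hypothetical, chosen so the test is false), the trailing if governs only the last statement on the line. Wrapping several actions in begin and end is one way to make a single trailing if govern all of them.

```ruby
#!/usr/bin/ruby
party = "whigs"    # hypothetical value, chosen so that the test below is false

# Only the LAST statement on the line is governed by the if,
# so "first" is recorded even though the condition is false
ungrouped = []
ungrouped << "first"; ungrouped << "second" if party == "democrats"

# A begin/end pair turns several actions into one, so the if governs them all
grouped = []
begin
  grouped << "first"
  grouped << "second"
end if party == "democrats"

print "ungrouped recorded ", ungrouped.length, " item(s)\n"
print "grouped recorded ", grouped.length, " item(s)\n"
```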

Containers

Containers are entities that contain other entities. Ruby has two native container types, arrays and hashes. Arrays are groups of objects ordered by subscript, while hashes are groups of key->value pairs. Besides these two native container types, you can create your own container types.

Arrays

You've already seen how to initialize an array and how to use the each method to quickly iterate each element:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
presidents.each { |i| print i, "\n"}


[slitt@mydesk slitt]$ ./array.rb          
Ford
Carter
Reagan
Bush1
Clinton
Bush2
[slitt@mydesk slitt]$

Now let's manipulate the array, starting by deleting the last three presidents:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
presidents.pop
presidents.pop
presidents.pop
presidents.each { |i| print i, "\n"}

The pop method deletes the final element and returns it. If you assign the result of pop to a variable, that variable holds the element that was just removed from the array. In the preceding code, you pop the last three presidents. Here is the result:

[slitt@mydesk slitt]$ ./array.rb          
Ford
Carter
Reagan
[slitt@mydesk slitt]$

Now let's prepend the previous three presidents, Kennedy, Johnson and Nixon:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
presidents.pop
presidents.pop
presidents.pop
presidents.unshift("Nixon")
presidents.unshift("Johnson")
presidents.unshift("Kennedy")
presidents.each { |i| print i, "\n"}

The result is as expected:

[slitt@mydesk slitt]$ ./array.rb          
Kennedy
Johnson
Nixon
Ford
Carter
Reagan
[slitt@mydesk slitt]$

However, you might not like the idea of prepending in the reverse order. In that case, prepend all three at once:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
presidents.pop
presidents.pop
presidents.pop
presidents.unshift("Kennedy", "Johnson", "Nixon")
presidents.each { |i| print i, "\n"}

Ruby arrays have methods shift, unshift, push, and pop:

push
  Action:   Appends its argument to the end of the array.
  Argument: Element(s) to be appended to end of the array.
  Returns:  The array itself, AFTER the action was taken.

pop
  Action:   Returns the last element in the array and deletes that element.
  Argument: None.
  Returns:  The last element of the array.

shift
  Action:   Returns the first element of the array, deletes that element, and shifts all other elements down one location to fill its empty spot.
  Argument: None.
  Returns:  The first element in the array.

unshift
  Action:   Shifts all elements of the array up, and places its argument at the beginning of the array.
  Argument: Element(s) to be prepended to start of array.
  Returns:  The array itself, AFTER the action was taken.
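The return values are easy to verify for yourself. The following sketch captures what each of the four methods hands back; the variable names pushed, popped, shifted and unshifted are just illustrations.

```ruby
#!/usr/bin/ruby
presidents = ["Ford", "Carter"]

pushed = presidents.push("Reagan")        # returns the array itself
popped = presidents.pop                   # returns "Reagan" and removes it
shifted = presidents.shift                # returns "Ford" and removes it
unshifted = presidents.unshift("Nixon")   # returns the array itself

print "popped : ", popped, "\n"
print "shifted: ", shifted, "\n"
print "array  : ", presidents.join(", "), "\n"
```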

You can assign individual elements of an array:

#!/usr/bin/ruby
presidents = []
presidents[2] = "Adams"
presidents[4] = "Madison"
presidents[6] = "Adams"
presidents.each {|i| print i, "\n"}
print "=======================\n"
presidents[6] = "John Quincy Adams"
presidents.each {|i| print i, "\n"}
print "\n"

The preceding code produces this output:

[slitt@mydesk slitt]$ ./array.rb
nil
nil
Adams
nil
Madison
nil
Adams
=======================
nil
nil
Adams
nil
Madison
nil
John Quincy Adams

[slitt@mydesk slitt]$

The length of the array is determined by the last initialized element, even if that element was initialized to nil. That can be very tricky, especially because reading past the end of the array also returns nil. Be careful.

You can insert an element by assignment, as shown in the preceding code. If you assign to an element that already exists, you simply change its value, as we changed "Adams" to "John Quincy Adams".

Another thing you can do is get a slice of an array.

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
p123=presidents[1..3]
p123.each { |i| print i, "\n"}

Notice this time I used the two period version of the ellipsis operator, so you'd expect it to list Carter, Reagan and Bush1, and indeed it does. The preceding slice produces the following output:

[slitt@mydesk slitt]$ ./array.rb
Carter
Reagan
Bush1
[slitt@mydesk slitt]$

Another way to slice an array is with a start and a count instead of a range. The following is another way to write basically the same code as the preceding code:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
p123=presidents[1,3]
p123.each { |i| print i, "\n"}


The preceding used a starting subscript of 1 and a count of 3, instead of  a range 1 through 3.

You can also use slices in insertions, deletions and replacements, and you can insert/replace with elements or whole arrays. Our first example deletes unneeded elements from the middle of an array:

#!/usr/bin/ruby
numbers = ["one", "two", "buckle", "my", "shoe", "three", "four"]
numbers.each { |i| print i, "\n"}
print "=====================\n"
numbers[2,3]=[]
numbers.each { |i| print i, "\n"}

In the preceding, we have extraneous elements "buckle", "my" and "shoe", which we want to delete. So we replace the slice starting at element 2 with a count of 3 (element 2 and the next 2, in other words) with an empty array, effectively deleting them. The result follows:


[slitt@mydesk slitt]$ ./array.rb
one
two
buckle
my
shoe
three
four
=====================
one
two
three
four
[slitt@mydesk slitt]$

Next, let's replace three numeric representations with their spelled out equivalents, plus add in another element we had forgotten:

#!/usr/bin/ruby
numbers = ["one", "two", "3", "4", "5", "seven"]
numbers.each { |i| print i, "\n"}
print "=====================\n"
numbers[2,3]=["three", "four", "five", "six"]
numbers.each { |i| print i, "\n"}

You can see we deleted the three numerics, and then added the four spelled out versions in their place. Here's the output:

[slitt@mydesk slitt]$ ./array.rb
one
two
3
4
5
seven
=====================
one
two
three
four
five
six
seven
[slitt@mydesk slitt]$

But what if you don't want to replace anything -- what if you just want to insert in the middle? No problem -- use 0 for the count...

#!/usr/bin/ruby
numbers = ["one", "two", "five"]
numbers.each { |i| print i, "\n"}
print "=====================\n"
numbers[2,0]=["three", "four"]
numbers.each { |i| print i, "\n"}

The only trick here is that the insertion occurs BEFORE the element at the starting subscript: "five" was element 2, and the new elements went in ahead of it. Here is the output:

[slitt@mydesk slitt]$ ./array.rb
one
two
five
=====================
one
two
three
four
five
[slitt@mydesk slitt]$

Because slice type insertion inserts BEFORE the starting point, you can even prepend with a start of 0 and a count of 0, although the unshift method is usually the simplest way to add elements ahead of the first element.
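Here is a short sketch showing where a zero-count slice assignment puts its new elements. A start of 0 inserts ahead of the first element, and a start equal to the array's length appends at the end.

```ruby
#!/usr/bin/ruby
numbers = ["three", "four"]
numbers[0, 0] = ["one", "two"]   # start 0, count 0: insert before element 0
numbers[4, 0] = ["five"]         # start equal to the length: append at the end
numbers.each { |i| print i, "\n" }
```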

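You can construct an array from a parenthesized range by calling its to_a method. (The range by itself is a Range object, not an Array, although the each method works on both.)

#!/usr/bin/ruby
myArray = (0..9).to_a
myArray.each{|i| puts i}

[slitt@mydesk slitt]$ ./array.rb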
0
1
2
3
4
5
6
7
8
9
[slitt@mydesk slitt]$

Finally, remembering that Ruby is intended to be an object oriented language, let's look at some of the more common methods associated with arrays (which are really objects in Ruby):

#!/usr/bin/ruby
numbers = Array.new
numbers[3] = "three"
numbers[4] = nil
print "Class=", numbers.class, "\n"
print "Length=", numbers.length, "\n"
numbers.each { |i| print i, "\n"}

The Array.new method creates numbers as an empty array. You could have done the same thing with numbers=[]. The next line assigns the text three to the element with subscript 3, thereby setting the element and also setting the array's length. The next line sets the element whose subscript is 4 to nil, which, when you view the output, will prove that the length method returns one plus the subscript of the last initialized element, even if it's initialized to nil. This, in my opinion, could cause trouble.

The class method returns the variable's class, which in a non-oop language could be thought of as its type.  The following is the output:

[slitt@mydesk slitt]$ ./hello.rb
Class=Array
Length=5
nil
nil
nil
three
nil
[slitt@mydesk slitt]$

We've gone through arrays in great detail, because you'll use them regularly. Now it's time to review Ruby's other built in container class...

Hashes

There are two ways to think of a hash:
  1. A set of key->value pairs
  2. An array whose subscripts aren't necessarily ordered or numeric
Both of the preceding are correct, and do not conflict with each other.

#!/usr/bin/ruby
litt = {"lname"=>"Litt", "fname"=>"Steve", "ssno"=>"123456789"}
print "Lastname : ", litt["lname"], "\n"
print "Firstname : ", litt["fname"], "\n"
print "Social Security Number: ", litt["ssno"], "\n"
print "\n"
litt["gender"] = "male"
litt["ssno"] = "987654321"
print "Corrected Social Security Number: ", litt["ssno"], "\n"
print "Gender : ", litt["gender"], "\n"
print "\n"
print "Hash length is ", litt.length, "\n"
print "Hash class is ", litt.class, "\n"

In the preceding, we initialized the hash with three elements whose keys were lname, fname and ssno. We later added a fourth element whose key was gender, as well as correcting the value of ssno. The class and length methods do just what we'd expect, given our experience from arrays. This hash could be thought of as a single row in a database table. Here is the result:

[slitt@mydesk slitt]$ ./hash.rb
Lastname : Litt
Firstname : Steve
Social Security Number: 123456789

Corrected Social Security Number: 987654321
Gender : male

Hash length is 4
Hash class is Hash
[slitt@mydesk slitt]$


Better yet, hash values can be objects of other classes. For instance, consider a hash of hashes:

#!/usr/bin/ruby
people = {
"torvalds"=>{"lname"=>"Torvalds", "fname"=>"Linus", "job"=>"maintainer"},
"matsumoto"=>{"lname"=>"Matsumoto", "fname"=>"Yukihiro", "job"=>"Ruby originator"},
"litt"=>{"lname"=>"Litt", "fname"=>"Steve", "job"=>"troubleshooter"}
}

keys = people.keys

for key in 0...keys.length
print "key : ", keys[key], "\n"
print "lname: ", people[keys[key]]["lname"], "\n"
print "fname: ", people[keys[key]]["fname"], "\n"
print "job : ", people[keys[key]]["job"], "\n"
print "\n\n"
end

Here's the output:

[slitt@mydesk slitt]$ ./hash.rb
key : litt
lname: Litt
fname: Steve
job : troubleshooter


key : matsumoto
lname: Matsumoto
fname: Yukihiro
job : Ruby originator


key : torvalds
lname: Torvalds
fname: Linus
job : maintainer


[slitt@mydesk slitt]$

Basically, you just implemented the equivalent of a database table, whose rows correspond to Litt, Matsumoto and Torvalds, and whose columns are lname, fname and job. There are probably a dozen better ways to actually print this information, but at this point I'm still learning Ruby, so I did it with a distinctively Perl accent. Perhaps that's a good thing -- it proves that Ruby follows ordinary programming logic in addition to its many wonderful features.
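One of those better ways is to iterate the keys directly with each instead of subscripting a keys array. The following is a minimal sketch, not the only way to do it. The sort call is my addition so the rows come out in alphabetical order on any Ruby version (Ruby 1.8 hashes have no guaranteed order, while Ruby 1.9 and later keep insertion order), and printed_keys exists only to record the order for illustration.

```ruby
#!/usr/bin/ruby
people = {
  "torvalds"  => {"lname"=>"Torvalds", "fname"=>"Linus", "job"=>"maintainer"},
  "matsumoto" => {"lname"=>"Matsumoto", "fname"=>"Yukihiro", "job"=>"Ruby originator"},
  "litt"      => {"lname"=>"Litt", "fname"=>"Steve", "job"=>"troubleshooter"}
}

printed_keys = []
people.keys.sort.each do |key|
  person = people[key]          # the inner hash for this row
  print "key  : ", key, "\n"
  print "lname: ", person["lname"], "\n"
  print "fname: ", person["fname"], "\n"
  print "job  : ", person["job"], "\n"
  print "\n"
  printed_keys << key
end
```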

Sorting Hashes

You sort hashes by converting them to 2 dimensional arrays (an array of key/value pairs) and then sorting those arrays. The sort method does just that. Here's an example:

#!/usr/bin/ruby -w

h = Hash.new
h['size'] = 'big'
h['color'] = 'red'
h['brand'] = 'ford'

av = h.sort{|a,b| a[1] <=> b[1]}
ak = h.sort{|a,b| a[0] <=> b[0]}
ak.each do
|pair|
print pair[0]
print "=>"
print pair[1]
puts
end
puts "=============="
av.each do
|pair|
print pair[0]
print "=>"
print pair[1]
puts
end

[slitt@mydesk ~]$ ./test.rb
brand=>ford
color=>red
size=>big
==============
size=>big
brand=>ford
color=>red
[slitt@mydesk ~]$

Notice that often a simple <=> comparison does not suffice, and you actually need to write your own function to establish collation order. Simply write a function taking two arguments (a and b) that returns 1 when a is superior to b, -1 when a is inferior to b, and 0 when they are equivalent.
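As a sketch of such a function, the following sorts a hash by key while ignoring upper and lower case. The function name compare_keys_ignoring_case and the sample data are hypothetical. A plain <=> would put "Color" first because uppercase letters sort ahead of lowercase ones.

```ruby
#!/usr/bin/ruby
# Hypothetical comparison function: orders key/value pairs by key, ignoring case.
# Returns -1, 0 or 1, just like <=>.
def compare_keys_ignoring_case(a, b)
  a[0].downcase <=> b[0].downcase
end

h = {"size" => "big", "Color" => "red", "brand" => "ford"}
sorted = h.sort { |a, b| compare_keys_ignoring_case(a, b) }
sorted.each { |pair| print pair[0], "=>", pair[1], "\n" }
```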

Tests and Info Requests on Hashes

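has_key?(key)
  What it does: Tests whether the key is present in the hash.
  Synonyms:     include?(key), key?(key) and member?(key)

has_value?(value)
  What it does: Tests whether any element of the hash has the value, returning true or false.
  Synonyms:     value?(value)

index(value)
  What it does: Returns the key for an element with the value. If several elements have that value, only one key (the first one found) is returned. Newer Ruby versions call this method key(value).

select {|key, value| block}
  What it does: Returns the key/value pairs for which block evaluates true, for example h.select {|k,v| v < 200}. Ruby 1.8 returns them as an array of pairs; Ruby 1.9 and later return a hash.

empty?
  What it does: Returns true if the hash has no key/value pairs.

inspect
  What it does: Returns the contents of the hash as a string.

invert
  What it does: Returns a new hash with keys and values switched.

length
  What it does: Returns the number of key/value pairs in the hash.
  Synonyms:     size()

sort {| a, b | block }
  What it does: Converts the hash to an array of key/value pairs and returns that array sorted, as shown in the preceding section.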
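The following sketch exercises several of the methods listed above on a small hash. The hash contents and the variable names cheap and swapped are made up for the illustration.

```ruby
#!/usr/bin/ruby
h = {"apples" => 150, "pears" => 250, "plums" => 50}

puts h.has_key?("pears")      # is there an element with this key?
puts h.has_value?(999)        # is there an element with this value?
puts h.empty?
puts h.length
puts h.inspect

# Keep only the pairs whose value is under 200
cheap = h.select { |k, v| v < 200 }
puts cheap.inspect

# Swap keys and values, then look up by the old value
swapped = h.invert
puts swapped[250]
```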

Strings

String is a class that ships with Ruby. The String class has a huge number of methods, such that memorizing them all would be futile. If you really want a list of them all, go to http://www.rubycentral.com/book/ref_c_string.html, but don't say I didn't warn you.

What I'd like to do here is give you the 10% of strings you'll need for 90% of your work. By the way, Ruby has regular expressions, and that will be covered in the following section. This section covers only Ruby's String class methods.

Let's start with string assignment and concatenation:

#!/usr/bin/ruby
myname = "Steve Lit"
myname_copy = myname
print "myname = ", myname, "\n"
print "myname_copy = ", myname_copy, "\n"
print "\n=========================\n"
myname << "t"
print "myname = ", myname, "\n"
print "myname_copy = ", myname_copy, "\n"

The double less than sign is a Ruby String overload for concatenation. If all goes well, we'll change the original string but the copy won't change. Let's verify that:

[slitt@mydesk slitt]$ ./string.rb
myname = Steve Lit
myname_copy = Steve Lit

=========================
myname = Steve Litt
myname_copy = Steve Litt
[slitt@mydesk slitt]$

Uh oh, it changed them both. Assignment doesn't copy the string; it makes both variables refer to the same String object. Do you think that might mess up your loop break logic?

Use the String.new() method instead:

#!/usr/bin/ruby
myname = "Steve Lit"
myname_copy = String.new(myname)
print "myname = ", myname, "\n"
print "myname_copy = ", myname_copy, "\n"
print "\n=========================\n"
myname << "t"
print "myname = ", myname, "\n"
print "myname_copy = ", myname_copy, "\n"

Here's the proof that it works the way you want it:

[slitt@mydesk slitt]$ ./hello.rb
myname = Steve Lit
myname_copy = Steve Lit

=========================
myname = Steve Litt
myname_copy = Steve Lit
[slitt@mydesk slitt]$
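Another way to get an independent copy is the dup method, which every Ruby object has. The sketch below puts the two cases side by side; the variable names same_object and real_copy are just illustrations.

```ruby
#!/usr/bin/ruby
myname = String.new("Steve Lit")
same_object = myname          # a second name for the same string
real_copy = myname.dup        # a separate string with the same contents

myname << "t"
print "myname      = ", myname, "\n"
print "same_object = ", same_object, "\n"
print "real_copy   = ", real_copy, "\n"
```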

One really nice thing about the Ruby String class is it works like an array of characters with respect to slicing:

#!/usr/bin/ruby
myname = "Steve was here"
print myname[6, 3], "\n"
myname[6, 3] = "is"
print myname, "\n"

[slitt@mydesk slitt]$ ./string.rb
was
Steve is here
[slitt@mydesk slitt]$

This gets more powerful when you introduce the index string method, which returns the subscript of the first occurrence of a substring:


#!/usr/bin/ruby
mystring = "Steve was here"
print mystring, "\n"

substring = "was"
start_ss = mystring.index(substring)
mystring[start_ss, substring.length] = "is"
print mystring, "\n"

In the preceding, the start point for replacement was the return from the index method, and the count to replace is the return from the length method (on the search text). The result is a generic replacement:

[slitt@mydesk slitt]$ ./string.rb
Steve was here
Steve is here
[slitt@mydesk slitt]$

Naturally, in real life you'd need to add code to handle cases where the search string wasn't found.
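The index method returns nil when the substring isn't there, so the check is short. The following is one possible sketch, wrapped in a hypothetical function called replace_first that works on a copy and hands the text back unchanged when there's nothing to replace.

```ruby
#!/usr/bin/ruby
# Hypothetical helper: replaces the first occurrence of search in text
def replace_first(text, search, replacement)
  start_ss = text.index(search)
  return text if start_ss.nil?        # not found: hand back the text unchanged
  result = String.new(text)           # work on a copy, not the original
  result[start_ss, search.length] = replacement
  result
end

puts replace_first("Steve was here", "was", "is")
puts replace_first("Steve was here", "will be", "is")
```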

You already saw in-place concatenation with the << method, but in addition there's the more standard plus sign concatenation:

#!/usr/bin/ruby
mystring = "Steve" + " " + "was" + " " + "here"
print mystring, "\n"

[slitt@mydesk slitt]$ ./string.rb
Steve was here
[slitt@mydesk slitt]$

If the addition sign means to add strings together, it's natural that the multiplication sign means to string together multiple copies:

#!/usr/bin/ruby
mystring = "Cool " * 3
print mystring, "\n"

[slitt@mydesk slitt]$ ./string.rb
Cool Cool Cool
[slitt@mydesk slitt]$

Do you like the sprintf() command in C? Use the % method in Ruby:

#!/usr/bin/ruby
mystring = "There are %6d people in %s" % [1500, "the Grand Ballroom"]
print mystring, "\n"

[slitt@mydesk slitt]$ ./string.rb
There are   1500 people in the Grand Ballroom
[slitt@mydesk slitt]$

You can compare strings:

#!/usr/bin/ruby
print "frank" <=> "frank", "\n"
print "frank" <=> "fred", "\n"
print "frank" <=> "FRANK", "\n"

[slitt@mydesk slitt]$ ./hello.rb
0
-1
1
[slitt@mydesk slitt]$

Here are some other handy string methods:


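mystring.capitalize
Returns a new string equal to mystring except that the first character is uppercase and the rest are lowercase. Only the first letter of the string is capitalized, not the first letter of every word.
mystring.capitalize!
Same as capitalize, but in place.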
mystring.center(mynumber)
Returns a new string mynumber long with mystring centered within it. If mynumber is already less than the length of mystring, returns a copy of mystring.
mystring.chomp
Returns a new string equal to mystring except that a newline at the end, if there is one, is deleted. If chomp has an argument, that argument serves as the record separator, replacing the default newline.
mystring.chomp!
Same as chomp, but in place. Equivalent of Perl chomp().
mystring.downcase
Returns new string equal to mystring but entirely lower case.
mystring.downcase!
In place modifies mystring, making everything lower case.
mystring.reverse
Returns new string with all characters reversed. IOWA becomes AWOI.
mystring.reverse!
Reverses mystring in place.
mystring.rindex(substring)
Returns the subscript of the last occurrence of the substring. Like index except that it returns the last instead of first occurrence. This method actually has more options, so you might want to read the documentation.
mystring.rjust(mynumber)
Returns a copy of mystring, except the new copy is mynumber long, and mystring is right justified in that string. If mynumber is smaller than the original length of mystring, it returns an exact copy of mystring.
mystring.split(pattern, limit)
Returns a new array with parts of the string split wherever pattern was encountered as a substring. If limit is given, returns at most that many elements in the array.
mystring.strip
Returns a new string that is a copy of mystring except all leading and trailing whitespace have been removed.
mystring.to_f
Returns the floating point number represented by mystring. Returns 0.0 if it's not a valid number, and never raises exceptions. Careful!
mystring.to_i
Returns an integer represented by mystring. Non-numerics at the end are ignored. Returns 0 on invalid numbers, and never raises exceptions. Careful!
mystring.upcase
Returns a new string that's an uppercase version of mystring.
mystring.upcase!
Uppercases mystring in place.





There are many, many more methods, but the preceding should get you through most programming tasks. If you end up using Ruby a lot, it would help to learn all the methods.
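A few one-liners make the behavior of the methods listed above concrete. This is just a sampler with made-up strings; inspect is used where the result contains whitespace or is an array, so you can see exactly what came back.

```ruby
#!/usr/bin/ruby
puts "hello WORLD".capitalize        # only the first character is uppercased
puts "line\n\n".chomp.inspect        # only one trailing newline is removed
puts "hi".center(6).inspect          # padded on both sides
puts "hi".rjust(5).inspect           # padded on the left
puts "a,b,c".split(",", 2).inspect   # limit of 2: at most two elements
puts "  padded  ".strip.inspect      # leading and trailing whitespace removed
puts "42abc".to_i                    # trailing non-numerics ignored
puts "abc".to_i                      # invalid number gives 0, no exception
puts "3.7xyz".to_f
puts "IOWA".reverse
```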

A word about mystring.split(pattern). What about the reverse -- turning an array into a string? Try this:

#!/usr/bin/ruby
mystring=""
presidents = ["reagan", "bush1", "clinton", "bush2"]
presidents.each {|i| mystring << i+" "}
mystring.strip!
print mystring, "\n"

[slitt@mydesk slitt]$ ./string.rb
reagan bush1 clinton bush2
[slitt@mydesk slitt]$

Here's a version that turns it into a comma delimited file with quotes:

#!/usr/bin/ruby
mystring=""
presidents = ["reagan", "bush1", "clinton", "bush2"]
presidents.each {|i| mystring << "\"" + i + "\", "}
mystring[mystring.rindex(", "), 2] = ""
print mystring, "\n"
[slitt@mydesk slitt]$ ./string.rb
"reagan", "bush1", "clinton", "bush2"
[slitt@mydesk slitt]$
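
For what it's worth, Ruby's Array class also has a join method, which is the direct reverse of split. The following is a minimal sketch of the two programs above rewritten with join. The variable names spaced and quoted are made up for this illustration:

```ruby
#!/usr/bin/ruby
presidents = ["reagan", "bush1", "clinton", "bush2"]

# join glues the elements together with the given separator,
# so there is no trailing separator to strip off afterwards
spaced = presidents.join(" ")

# map wraps each name in double quotes before the join
quoted = presidents.map { |name| "\"" + name + "\"" }.join(", ")

puts spaced
puts quoted
```

Because join puts the separator only between elements, neither the strip nor the rindex cleanup is needed.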

You now know most of the Ruby string techniques you need for the majority of your work. Well, except for regular expressions, of course...

Regular Expressions


NOTE

This section assumes you understand the concept of regular expressions. If you do not, there are many fine regular expression tutorials on the web, including this one on my Litt's Perls of Wisdom subsite.

Regular expressions make life so easy, often replacing 100 lines of code with 5. Perl is famous for its easy to use and intuitive regular expressions.

Ruby is a little harder because most regular expression functionality is achieved by a regular expression object that must be instantiated. However, you CAN test for a match the same as in Perl:

#!/usr/bin/ruby
string1 = "Steve was here"
print "e.*e found", "\n" if string1 =~ /e.*e/
print "Sh.*e found", "\n" if string1 =~ /Sh.*e/
[slitt@mydesk slitt]$ ./regex.rb
e.*e found
[slitt@mydesk slitt]$


Here's the code to actually retrieve the first match of /w.ll/ in the string:

#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."
if string1 =~ /(w.ll)/
print "Matched on ", $1, "\n"
else
puts "NO MATCH"
end
[slitt@mydesk slitt]$ ./regex.rb
Matched on will
[slitt@mydesk slitt]$

This was almost just like Perl. You put parentheses in the regular expression to make a group, perform the regular expression search with the =~ operator, and then the match for the group is contained in the $1 variable. If there had been multiple groups in the regular expression, matches would have also been available in $2, $3, and so on, up to the number of groups in the regular expression.


The more OOPish method of doing all this is to instantiate a new Regexp object and use its methods to gain the necessary information:

#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."
regex = Regexp.new(/w.ll/)
matchdata = regex.match(string1)
if matchdata
puts matchdata[0]
puts matchdata[1]
else
puts "NO MATCH"
end

[slitt@mydesk slitt]$ ./hello.rb
will
nil
[slitt@mydesk slitt]$

If you change /w.ll/ to /z.ll/, which of course does not match because there's not a "z" in string1, the output looks like this:

[slitt@mydesk slitt]$ ./hello.rb
NO MATCH
[slitt@mydesk slitt]$

The preceding example shows how to do complete regex in Ruby. Start by creating a regular expression object using Regexp.new(). Then use that object's match method to find a match and return it in a MatchData object. Test that the MatchData object exists, and if it does, get the first match (matchdata[0]). The reason we also printed matchdata[1] was to show that, in the absence of groups surrounded by parentheses, the match method returns only a single match. Later you'll see a special way to return all matches of a single regular expression.

Another thing to notice is that the match was just "will", not a long stretch of the string. That is not because Ruby's matching is ungreedy. Ruby's quantifiers (*, + and ?) are greedy by default, just like Perl's. It's because the dot in /w.ll/ matches exactly one character, so this regular expression can never match more than four characters. Had the regular expression been /w.*ll/, the greedy match would have been this:

"will drill for a well in walla wall"

In other words, it would have returned everything from the first w to the last double l. Greedy matching is usually what you want, but when you want the shortest match instead, put a question mark after the quantifier, as in /w.*?ll/.
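
As a quick check on greediness, the following sketch tries three patterns against the same string. It uses String's [] method with a regular expression argument, which returns the matched text. The variable names single, greedy and lazy are made up for this illustration:

```ruby
#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."

single = string1[/w.ll/]     # the dot matches exactly one character
greedy = string1[/w.*ll/]    # .* grabs as much as it can
lazy   = string1[/w.*?ll/]   # .*? grabs as little as it can

puts single   # will
puts greedy   # will drill for a well in walla wall
puts lazy     # will
```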

You can return several matches using multiple groups, like this:

#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."
regex = Regexp.new(/(w.ll).*(in).*(w.ll)/)
matchdata = regex.match(string1)
if matchdata
for ss in 0...matchdata.length
puts matchdata[ss]
end
else
puts "NO MATCH"
end
[slitt@mydesk slitt]$ ./hello.rb
will drill for a well in walla wall
will
in
wall
[slitt@mydesk slitt]$

Note the different behavior when you use parentheses. Here you see that the 0 subscript element matches the entire regular expression, while elements 1, 2 and 3 are the individual matches for the first, second and third parenthesized groups.

What if you wanted to find ALL the matches for /w.ll/ in the string, without guessing beforehand how many parentheses to put in? Here's the way you do it:

#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."
regex = Regexp.new(/w.ll/)
matchdata = regex.match(string1)
while matchdata != nil
puts matchdata[0]
string1 = matchdata.post_match
matchdata = regex.match(string1)
end
[slitt@mydesk slitt]$ ./regex.rb
will
well
wall
wall
[slitt@mydesk slitt]$

What you've done here is repeated the match, over and over again, each time assigning the remainder of the string after the match to string1 via the post_match method. The loop terminates when no match is found.
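
The String class also has a scan method that collects every match into an array in one call. A minimal sketch of the same search, with all_matches as an illustrative variable name:

```ruby
#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."

# scan returns an array holding each non-overlapping match, in order
all_matches = string1.scan(/w.ll/)

all_matches.each { |match| puts match }
```

Unlike the loop above, scan leaves string1 untouched, so the original string is still available afterwards.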

Regex Substitution

My research tells me Ruby's regular expressions do not, in and of themselves, have a provision for substitution. From what I've found, you need to use Ruby itself, specifically the String.gsub() method, to actually perform the substitution. If that's true, to me that represents a significant hassle, although certainly not a showstopper. If I'm wrong about this, please let me know.

The following makes all occurrences of /w.ll/ uppercase in the string:

#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."
string1.gsub!(/(w.ll)/){$1.upcase}
puts string1
[slitt@mydesk slitt]$ ./hello.rb
I WILL drill for a WELL in WALLa WALLa washington.
[slitt@mydesk slitt]$

The preceding depends on the block form of the String.gsub() method. I could not get the non-block form to accept the matches of the regular expression.
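
For what it's worth, the non-block form of gsub does accept group matches, provided they are written as backreferences such as \1 inside a single quoted replacement string. What it cannot do is call a method like upcase on them, which is why the uppercase example needs the block form. A minimal sketch that wraps each match in angle brackets (the bracket idea and the variable name marked are just for illustration):

```ruby
#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."

# '\1' stands for whatever the first parenthesized group matched
marked = string1.gsub(/(w.ll)/, '<\1>')

puts marked
```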

If you had wanted to replace only the first occurrence of /w.ll/, you would have had to do this (warning, ugly!):

#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."
regex = Regexp.new(/w.ll/)
match = regex.match(string1)
offsets = match.offset(0)
startOfMatch = offsets[0]
endOfMatch = offsets[1]
string1[startOfMatch...endOfMatch] = match[0].upcase
puts string1
[slitt@mydesk slitt]$ ./regex.rb
I WILL drill for a well in walla walla washington.
[slitt@mydesk slitt]$

Being a Perl guy, I'm used to having the regular expression do the entire substitution in a single line of code, and find the preceding quite cumbersome. Obviously, some of the preceding code was inserted just for readability. For instance, I could have done this:

#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."
match = /w.ll/.match(string1)
string1[match.offset(0)[0]...match.offset(0)[1]] = match[0].upcase
puts string1

Or even this, which I'm sure would have fit right in with K&R first edition:

#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."
match = /w.ll/.match(string1)
string1[/w.ll/.match(string1).offset(0)[0].../w.ll/.match(string1).offset(0)[1]] = match[0].upcase
puts string1

If you can read the preceding, you're a better programmer than I.
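
For what it's worth, the String class also has sub and sub! methods, which behave like gsub and gsub! except that they replace only the first match. A minimal sketch of the first-occurrence replacement using the block form:

```ruby
#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."

# sub! changes string1 in place, touching only the first match
string1.sub!(/w.ll/) { |match| match.upcase }

puts string1
```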

In my opinion, Ruby beats the daylights out of Perl in most aspects, but not in regular expressions.

Subroutines

A subroutine starts with def and ends with a corresponding end. Subroutines pass back values with the return keyword. In a welcome change from Perl, variables declared inside a subroutine are local by default, as shown by this program:

#!/usr/bin/ruby
def passback
howIfeel="good"
return howIfeel
end

howIfeel="excellent"
puts howIfeel
mystring = passback
puts howIfeel
puts mystring

In the preceding, note that the puts command writes the string and then prints a newline, as opposed to the print command, which doesn't print a newline unless you add a newline to the string being printed.

If the howIfeel variable inside subroutine passback were global, then after running the subroutine, the howIfeel variable in the main program would change from "excellent" to "good". However, when you run the program you get this:

[slitt@mydesk slitt]$ ./hello.rb
excellent
excellent
good
[slitt@mydesk slitt]$

The first and second printing of the howIfeel variable in the main program both print as "excellent", while the value passed back from the subroutine, and stored in variable mystring prints as "good", as we'd expect. Ruby's variables are local by default -- a huge encapsulation benefit.

You can pass variables into a subroutine as shown in the following code:

#!/usr/bin/ruby
def mult(multiplicand, multiplier)
multiplicand = multiplicand * multiplier
return multiplicand
end

num1 = 4
num2 = 5
result = mult(num1, num2)
print "num1 is ", num1, "\n"
print "num2 is ", num2, "\n"
print "result is ", result, "\n"
[slitt@mydesk slitt]$ ./hello.rb
num1 is 4
num2 is 5
result is 20
[slitt@mydesk slitt]$

The value of num1 was not changed by running mult(), because the assignment inside the subroutine changes only the subroutine's own multiplicand variable. But what about objects like strings?

#!/usr/bin/ruby
def concat(firststring, secondstring)
firststring = firststring + secondstring
return firststring
end

string1 = "Steve"
string2 = "Litt"
result = concat(string1, string2)
print "string1 is ", string1, "\n"
print "string2 is ", string2, "\n"
print "result is ", result, "\n"
[slitt@mydesk slitt]$ ./hello.rb
string1 is Steve
string2 is Litt
result is SteveLitt
[slitt@mydesk slitt]$

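Once again, assigning to an argument inside the subroutine does not change the variable passed as an argument. Be careful, though. What Ruby passes is a reference to the object, so while an assignment inside the subroutine only rebinds the subroutine's local name, a method that modifies the object in place (such as << or upcase!) changes the caller's string too.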
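
The difference between assigning to an argument and modifying it in place can be seen in the following sketch. The subroutine names rebind and mutate are hypothetical, made up for this illustration:

```ruby
#!/usr/bin/ruby
def rebind(str)
  str = str + " Litt"   # assignment: only the local name str changes
  return str
end

def mutate(str)
  str << " Litt"        # in-place append: the caller's object changes
  return str
end

name1 = "Steve".dup     # dup guarantees a modifiable copy
rebind(name1)
name2 = "Steve".dup
mutate(name2)

print "After rebind: ", name1, "\n"
print "After mutate: ", name2, "\n"
```

If a subroutine must not disturb its caller's strings, it should stick to methods that return new strings, or work on a dup of the argument.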

Exceptions

Growing up with C, I wrote code for every possible error condition. Or, when I was too lazy to write code for error conditions, my code was less robust.

The modern method of error handling is with exceptions, and Ruby has that feature. Use them.

There are two things you can do: handle an exception, and raise an exception. You raise an exception by recognizing an error condition, and then associating it with an exception type. You usually don't need to raise an exception because most system calls already raise exceptions on errors. However, if you've written a new bit of logic, and encounter a forbidden state, then you would raise an exception.

You handle an exception that gets raised -- typically by system calls but possibly by your code. This handling is only for protected code starting with begin and ending with end. Here's a simple example:

#!/usr/bin/ruby
begin
input = File.new("/etc/resolv.conf", "r")
rescue
print "Failed to open /etc/resolv.conf for input. ", $!, "\n"
end
input.each {
|i|
puts i;
}
input.close()

The preceding code produces the following output:

[slitt@mydesk slitt]$ ./hello.rb
search domain.cxm
nameserver 192.168.100.103

# ppp temp entry
[slitt@mydesk slitt]$

However, if the filename in File.new() is changed to the nonexistent /etc/resolX.conf, the output looks like this:

[slitt@mydesk slitt]$ ./hello.rb
Failed to open /etc/resolv.conf for input. No such file or directory - /etc/resolX.conf
./hello.rb:7: undefined method `each' for nil:NilClass (NoMethodError)
[slitt@mydesk slitt]$

Global variable $! had the value "No such file or directory - /etc/resolX.conf", so that printed along with the error message in the rescue section. Because the rescue section handled that exception, the program kept running past the end statement, but input had never been assigned and was still nil. Calling input.each on nil raised a second exception, a NoMethodError, which nothing handled, so Ruby printed its message and terminated the program.
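
One way to avoid the NoMethodError shown in that output is to keep the code that uses the file inside the protected begin block, so that it is skipped when the open fails. A minimal sketch, assuming /etc/resolX.conf does not exist (the filename variable and the lines_printed counter are only there for illustration):

```ruby
#!/usr/bin/ruby
filename = "/etc/resolX.conf"   # assumed not to exist
lines_printed = 0

begin
  input = File.new(filename, "r")
  # Everything that needs the open file stays inside the begin block
  input.each { |line|
    puts line
    lines_printed += 1
  }
  input.close
rescue SystemCallError
  # File-not-found and similar operating system errors land here
  print "Failed to open ", filename, " for input. ", $!, "\n"
end

print "Lines printed: ", lines_printed, "\n"
```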

Exceptions are implemented as classes (objects), all of which are descendants of the Exception class. Some have methods over and above those of the Exception class, some do not. Here is a list of the exceptions I was able to find in documentation on the web:

The following is a more generic error handling syntax:
begin
# attempt code here
rescue SyntaxError => mySyntaxError
print "Unknown syntax error. ", mySyntaxError, "\n"
# error handling specific to problem here
rescue StandardError => myStandardError
print "Unknown general error. ", myStandardError, "\n"
# error handling specific to problem here
else
# code that runs ONLY if no error goes here
ensure
# code that cleans up after a problem and its error handling goes here
end
In the preceding, variables mySyntaxError and myStandardError are local variables to store the contents of global variable $!, the exception that was raised.
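
As a runnable sketch of that syntax, the following hypothetical safe_divide() subroutine records which sections ran, once for a division that works and once for a division by zero. The subroutine and its log array are made up for this illustration:

```ruby
#!/usr/bin/ruby
def safe_divide(numerator, denominator)
  log = []
  begin
    result = numerator / denominator
  rescue ZeroDivisionError => myZeroError
    log << "rescued: " + myZeroError.message
    result = nil
  else
    log << "no error"      # runs ONLY if nothing was raised
  ensure
    log << "cleanup"       # runs either way
  end
  return result, log
end

good_result, good_log = safe_divide(10, 2)
bad_result, bad_log = safe_divide(10, 0)

puts good_log
puts bad_log
```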

Retry

There's a retry keyword enabling a retry on error. This is handy when performing an activity that might benefit from a retry (reading a CD, for instance):
begin
# attempt code here
rescue
puts $!
if EscNotPressed()
print "Reload the CD, or press ESC\n"
retry
else
puts "User declined to retry further"
end
end
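
Here is a self-contained sketch of retry that can be run as is. Instead of a CD drive and an ESC key test, it assumes a made-up operation that fails on its first two attempts, and it gives up after five tries. Without a limit like that, retry can loop forever:

```ruby
#!/usr/bin/ruby
attempts = 0
begin
  attempts += 1
  # Hypothetical flaky operation: fails on the first two attempts
  raise "Drive not ready" if attempts < 3
  print "Succeeded on attempt ", attempts, "\n"
rescue
  puts $!
  if attempts < 5
    retry                  # jumps back to the begin statement
  else
    puts "Giving up"
  end
end
```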

Raising an Exception

Sometimes neither the system nor the language detects an error, but you do. Perhaps the user entered someone 18 years old for Medicare. Linux doesn't know that's wrong. Ruby doesn't know that's wrong. But you do.

You can raise a generic exception (or the current exception if there is one) like this:
raise if age < 65

#!/usr/bin/ruby
age = 18
raise if age < 65
print "Age is ", age, ". This happens after the exception was raised\n"
[slitt@mydesk slitt]$ ./hello.rb
./hello.rb:3: unhandled exception
[slitt@mydesk slitt]$

To raise a RuntimeError exception with your own message, do this:
raise "Must be 65 or older for Medicare"

#!/usr/bin/ruby
age = 18
raise "Must be 65 or older for Medicare." if age < 65
print "Age is ", age, ". This happens after the exception was raised\n"
[slitt@mydesk slitt]$ ./hello.rb
./hello.rb:3: Must be 65 or older for Medicare. (RuntimeError)
[slitt@mydesk slitt]$

To raise a RangeError exception (you wouldn't really do this), you'd do this:
raise RangeError, "Must be 65 or older for Medicare", caller

#!/usr/bin/ruby
age = 18
raise RangeError, "Must be 65 or older for Medicare", caller if age < 65
print "Age is ", age, ". This happens after the exception was raised\n"
[slitt@mydesk slitt]$ ./hello.rb
./hello.rb:3: Must be 65 or older for Medicare (RangeError)
[slitt@mydesk slitt]$

Perhaps the best way to do it is to create a new exception class specific to the type of error:

#!/usr/bin/ruby
class MedicareEligibilityException < RuntimeError
end

age = 18
raise MedicareEligibilityException, "Must be 65 or older for Medicare", caller if age < 65
print "Age is ", age, ". This happens after the exception was raised\n"
[slitt@mydesk slitt]$ ./hello.rb
./hello.rb:6: Must be 65 or older for Medicare (MedicareEligibilityException)
[slitt@mydesk slitt]$

Now let's combine raising and handling by creating a subroutine called signHimUp(), which raises the exception, and a calling main routine, which handles it. In this rather contrived program, information about the person who triggered the exception is stored in the exception itself. The exception's initialize() method assigns its arguments to the class's instance variables, so that this call:
myException = MedicareEligibilityException.new(name, age)
creates an instance of class MedicareEligibilityException whose instance variables contain the person's name and age for later reference. Once again, this is very contrived, but it illustrates some of the flexibility of exception handling:

#!/usr/bin/ruby
class MedicareEligibilityException < RuntimeError
def initialize(name, age)
@name = name
@age = age
end
def getName
return @name
end
def getAge
return @age
end
end

def writeToDatabase(name, age)
# This is a stub routine
print "Diagnostic: ", name, ", age ", age, " is signed up.\n"
end

def signHimUp(name, age)

if age >= 65
writeToDatabase(name, age)
else
myException = MedicareEligibilityException.new(name, age)
raise myException , "Must be 65 or older for Medicare", caller
# raise MedicareEligibilityException , "Must be 65 or older for Medicare", caller
end
end

# Main routine
begin
signHimUp("Oliver Oldster", 78)
signHimUp("Billy Boywonder", 18)
signHimUp("Cindy Centurinarian", 100)
signHimUp("Bob Baby", 2)

rescue MedicareEligibilityException => elg
print elg.getName, " is ", elg.getAge, ", which is too young.\n"
print "You must obtain an exception from your supervisor. ", elg, "\n"

end

print "This happens after signHimUp was called.\n"


In the preceding code, the main routine calls subroutine signHimUp() for each of four people, two of whom are underage. The begin/rescue/end structure in the main routine allows exceptions of type MedicareEligibilityException to be handled cleanly, although such exceptions are raised by the called subroutine, signHimUp(). The signHimUp() routine tests for age 65 and older, and if so, calls dummy writeToDatabase(), and if not, creates a new instance of MedicareEligibilityException containing the person's name and age, and then raises that exception, with the hope that the calling routine's exception handling will be able to use that information in its error message.

The MedicareEligibilityException definition itself is a typical class definition, with instance variables beginning with @, an initialize() constructor that assigns its arguments to the instance variables, and get routines for the instance variables. All of this will be covered later when we discuss classes and objects.

Here is the result:

[slitt@mydesk slitt]$ ./hello.rb
Diagnostic: Oliver Oldster, age 78 is signed up.
Billy Boywonder is 18, which is too young.
You must obtain an exception from your supervisor. Must be 65 or older for Medicare
This happens after signHimUp was called.
[slitt@mydesk slitt]$

As you can see, the first call to signHimUp() successfully ran the stub write to database routine, as indicated by the diagnostic line. The next call to signHimUp() raised a MedicareEligibilityException exception, and the code in the rescue block got the patient's name and age from the exception, and wrote it. At that point the begin block was terminated, so the remaining two calls to signHimUp() never ran, and execution fell through to the line below the end matching the exception handling's begin. If we had wanted to, we could have terminated the program from within the rescue block in many ways, including ending that block with a raise command or, to bail immediately, an exit command.

Catch and Throw

The catch and throw methods enable you to jump up the call stack, thereby in effect performing a goto. If you can think of a good reason to do this, research these two methods on your own. Personally, I'd prefer to stay away from them.
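
For the curious, here is a minimal sketch of one common use, which is bailing out of nested loops in a single jump. The grid data and the :found label are made up for this illustration:

```ruby
#!/usr/bin/ruby
grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# catch returns whatever value is handed to the matching throw
found = catch(:found) do
  grid.each do |row|
    row.each do |cell|
      throw :found, cell if cell > 4   # leaves both loops at once
    end
  end
  nil   # value of the catch block when nothing is thrown
end

print "First cell greater than 4 is ", found, "\n"
```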

We've just scratched the surface of exception handling, but you probably have enough now to at least write simple exceptions and read other people's exception code.

Terminal IO

This section will cover just a few of the many ways you can do terminal IO. You've already learned about print and puts:

#!/usr/bin/ruby
print "This is the first half of Line 1. "
print "This is the second half.", "\n"
puts "This is line 2, no newline necessary."

The preceding code produces the following result:

[slitt@mydesk slitt]$ ./hello.rb
This is the first half of Line 1. This is the second half.
This is line 2, no newline necessary.
[slitt@mydesk slitt]$

Ruby has a printf() command similar to C:

#!/usr/bin/ruby
printf "There were %7d people at the %s.\n", 439, "Avalanche Auditorium"
[slitt@mydesk slitt]$ ./hello.rb
There were     439 people at the Avalanche Auditorium.
[slitt@mydesk slitt]$

You get line oriented keyboard input with gets:

#!/usr/bin/ruby
print "Name please=>"
name = gets
print "Your name is ", name, "\n"
[slitt@mydesk slitt]$ ./hello.rb
Name please=>Steve Litt
Your name is Steve Litt

[slitt@mydesk slitt]$
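
The blank line in that output appears because gets hands back the newline the user typed along with the name. The chomp method removes it. In this sketch the typed text is hard coded so the program runs without a keyboard. In a real program the first line would be typed = gets:

```ruby
#!/usr/bin/ruby
typed = "Steve Litt\n"   # stands in for what gets would return

# chomp returns a copy without the trailing newline
name = typed.chomp

print "Your name is >", name, "<\n"
```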

You can get a single character with getc(). However, the user will need to press the Enter key before getc() will accept the character. To enable instantaneous recognition of the character, you must set cbreak before getc() and then reset it afterwards, like this:

#!/usr/bin/ruby
print "Character please=>"
system "stty cbreak </dev/tty >/dev/tty 2>&1";
int = STDIN.getc
system "stty -cbreak </dev/tty >/dev/tty 2>&1";
print "\nYou pressed >", int, "<, char >", int.chr, "<\n"
[slitt@mydesk slitt]$ ./hello.rb
Character please=>A
You pressed >65<, char >A<
[slitt@mydesk slitt]$

The cbreak commands seem to work on modern Linuces. They are VERY system dependent, and as far as I know don't work on Windows at all. On some Unices you might try these instead:
system "stty", '-icanon', 'eol', "\001";
int = STDIN.getc
system "stty", 'icanon', 'eol', '^@'; # ASCII null
Terminal I/O is pretty simple in Ruby. So is file I/O...

File IO

File I/O uses the File object. It's very straightforward, as you can see from the following program, which opens resolv.conf for input, and junk.jnk for output, and then copies each line from the input file to the output file:

#!/usr/bin/ruby
infile = File.new("/etc/resolv.conf", "r")
outfile = File.new("junk.jnk", "w")
infile.each {
|i|
outfile.write i
}
outfile.close()
infile.close()

outfile = File.new("junk.jnk", "r")
outfile.each {
|i|
print ">>", i
}
[slitt@mydesk slitt]$ ./hello.rb
>>search domain.cxm
>>nameserver 192.168.100.103
>>
>># ppp temp entry
[slitt@mydesk slitt]$
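
File.open also accepts a block, in which case Ruby closes the file when the block ends, even if an exception is raised inside it. A minimal sketch, using a scratch file in the system temporary directory so it runs anywhere (the file name junk_example.jnk and the copied array are made up for this illustration):

```ruby
#!/usr/bin/ruby
require "tmpdir"

path = File.join(Dir.tmpdir, "junk_example.jnk")   # hypothetical scratch file

# Write two lines. The file is closed when the block ends.
File.open(path, "w") do |outfile|
  outfile.puts "search domain.cxm"
  outfile.puts "nameserver 192.168.100.103"
end

# Read them back, again with no explicit close
copied = []
File.open(path, "r") do |infile|
  infile.each { |line| copied << line }
end

copied.each { |line| print ">>", line }
File.delete(path)
```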

Perl has a way to immediately read a whole file into an array, and so does Ruby:

#!/usr/bin/ruby
infile = File.new("/etc/resolv.conf", "r")
linearray = infile.readlines
linearray.each{|i| print i}
infile.close
[slitt@mydesk slitt]$ ./hello.rb
search domain.cxm
nameserver 192.168.100.103

# ppp temp entry
[slitt@mydesk slitt]$

Ruby can also read one character at a time:

#!/usr/bin/ruby
infile = File.new("/etc/resolv.conf", "r")
infile.each_byte {
|i|
if i.chr == "e"
print("!")
else
print(i.chr)
end
}
infile.close
[slitt@mydesk slitt]$ ./hello.rb
s!arch domain.cxm
nam!s!rv!r 192.168.100.103

# ppp t!mp !ntry
[slitt@mydesk slitt]$

If for some reason you don't want to use the each construct, you can use readchar like this:

#!/usr/bin/ruby
infile = File.new("/etc/resolv.conf", "r")
until infile.eof
i = infile.readchar
if i.chr == "e"
print("!")
else
print(i.chr)
end
end
infile.close

[slitt@mydesk slitt]$ ./hello.rb
s!arch domain.cxm
nam!s!rv!r 192.168.100.103
nam!s!rv!r 209.63.57.200

# ppp t!mp !ntry
[slitt@mydesk slitt]$

In the preceding code, the eof method looks ahead to see whether the next character read will be valid, and if so, loops through, reads and prints it. You might think of doing a priming read, then putting the next read at the bottom of the loop, testing for i==nil. Unfortunately, if you read into the end of file, it triggers an exception which prints an error message, and nobody wants that. Instead, use eof to look ahead and read just enough.

You can also use readline to read a line at a time, again using eof to look ahead.
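
As a minimal sketch of the readline approach, the following program writes a two line scratch file and then reads it back one line at a time, using eof to look ahead. The scratch file name and the lines array are made up for this illustration:

```ruby
#!/usr/bin/ruby
require "tmpdir"

path = File.join(Dir.tmpdir, "readline_example.txt")   # hypothetical scratch file
File.open(path, "w") { |f| f.print "first line\nsecond line\n" }

lines = []
infile = File.new(path, "r")
until infile.eof
  lines << infile.readline   # readline keeps the trailing newline
end
infile.close
File.delete(path)

lines.each { |line| print ">>", line }
```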

How OOP is Ruby?

You hear it all the time. "Ruby's a purely Object Oriented language!"

On some levels that's a true statement, but it's misleading. It misled me into staying away from Ruby for three years.

See, to me "purely OOP" means a language you can't write procedural code with. Java, for instance, where you need to create a class to write a "hello world" program, and you can't make a subroutine outside of a class.

Ruby's not like that. A Ruby "hello world" program is two lines, you can write subroutines outside of any class that are accessible anywhere, and if you'd like you can write complete and complex programs without creating a single class or object.

In Ruby's case, what they mean by "purely OOP" is that all variables are objects. Integers, floating point numbers, characters, strings, arrays, hashes, files -- they're all objects. You manipulate these objects with their methods, not with Ruby built in operators. For instance, consider the following:
profit = revenue - expense
Here profit, revenue and expense are all objects of class Float. The minus sign (-) is not an operator built into the language the way it is in C -- it's a method of the Float class. Incidentally, a plus sign method is implemented in class Fixnum for integers, where once again it adds the values, and in the String class, where it concatenates strings.
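
To see that the minus sign really is a method, you can call it with ordinary method syntax. In this sketch the dollar amounts and variable names are made up for illustration:

```ruby
#!/usr/bin/ruby
revenue = 1000.50
expense = 400.25

profit_operator = revenue - expense           # the familiar spelling
profit_method   = revenue.-(expense)          # the same call, written as a method
profit_send     = revenue.send(:-, expense)   # the same call, by method name

puts profit_operator
puts profit_method
puts profit_send
puts 2.+(3)                # plus as a method of an integer
puts "Steve ".+("Litt")    # plus as a method of a string
```

All three profit lines print the same number, because all three are the same method call.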

So Ruby's "purely OOP" in that when you use it you'll definitely be using objects, but you do not need to create objects to write a substantial Ruby program. So if you do not consider yourself an Object Oriented programmer, or even if you hate OOP, don't let that stop you from using Ruby.

Object Oriented Programming Concepts

In my opinion, objects are all about data. In programs using objects to simulate real world things like cannonballs, such data might be position, velocity and mass. In business programs, an object might contain a person's first and last name, employee number, job classification and health insurance.

An object is a wonderful place to store a program's configuration information. All such info is kept in one place such that only a single object is kept global or passed in and out of subroutines.

All of these ideas precede object orientation. Since the dawn of time programmers have put all data for an entity in a data structure, and then manipulated the structure. Here's some code I wrote in 1986 to manipulate the page of a dot matrix printer. Keep in mind that back in those days, computers didn't have enough RAM for everyone to store their printed page in an 80x66 array. Much of my job back then was programming computers to print out medical insurance forms, each with about 40 boxes to fill out in very tight quarters. There were several different form layouts, and they changed frequently. So here's some 1986 C code (note the original K&R style -- no prototypes):

/* THE REPORT VARIABLE */
typedef struct
{
FILE *fil; /* report file file variable */
int y; /* y coord on page, changed only by atyxpr */
int x; /* x coord on page, changed only by atyxpr */
int pglength; /* lines per page, changed only by openrpt */
int stringlength; /* maximum length of string to be printed */
int lineno; /* line number, changed only by applcatn pgmr */
int pageno; /* page number, changed only by applictn pgmr */
char status[10]; /* set to @REPORT or @CLOSED */
} REPORT;


void atyxpr(rpt,y,x,st)
REPORT *rpt; /* the report variable pointer */
int y; /* the present vertical print position */
int x; /* the present horizontal print position */
char *st; /* the string to be printed */

{
int i;

checkopen(rpt);
if ((x == 0) && (y == 0))
{ /* continue printing at last position */
y = rpt->y;
x = rpt->x;
}

/* formfeed if the print line you're seeking is higher than the last time */
if (y < rpt->y)
formfeed(rpt);

/* insert a '^' if you've overwritten a column */
if ((y == rpt->y) && (x < rpt->x))
{
strcpy(st, st +(1 + rpt->x - x));
writestring(rpt, "^");
x = rpt->x;
fprintf(stderr, "?-warning-atyxpr- column overwrite in line %d.\n", rpt->y);
}

/* bring the print position to the new coordinates */
while (y > rpt->y)
{
linefeed(rpt->fil);
rpt->y = rpt->y + 1;
rpt->x = 1;
}
while (x > rpt->x)
{
spaceout(rpt->fil);
rpt->x = rpt->x + 1;
}

/* do the actual write of the string */
writestring(rpt, st);

/* bring the x position up to date after the write */
rpt->x = rpt->x + strlen(st);
}
The REPORT structure kept track of the current position of the print head (y and x), the number of lines on a page (pglength), and the file to which to write the output (the file was usually a printer device). All this information remained persistent in the report structure.

The report structure was manipulated by a function called atyxpr(). To print a string at a specific line and column, the programmer specified the string to print and the y and x coordinates (row and column) at which to start printing the string. Also specified was the report structure.

If the row and column were specified as both being 0, atyxpr() printed the string at the current print head position, as if the print was done by a simple printf().

If the row was the same as the current printhead row but the column was farther out, atyxpr() printed spaces until the printer head was in the desired place, and then the string was printed.

If the desired row was below the current printhead position, atyxpr() printed linefeeds to get to the desired row, printed spaces to get to the desired column, and then printed the string.

If the desired row was above the current printhead position, that meant that it needed to be printed on the next page, so a formfeed was issued, then enough linefeeds to get to the desired row, then enough spaces to get to the desired column, and then the string was printed.

What does this have to do with Ruby? Believe it or not, there's a purpose to showing this obsolete C code from an era of monospace printers and computers too anemic to store 80x66 worth of characters. That purpose is to show that there's absolutely nothing new about congregating all data about a specific entity or device in a single place, nor is there anything new about encapsulation. You do not need object orientation to do these things. I did it in 1986 using K&R C, and people were doing it long before me.

What IS new about object oriented programming (OOP) is that you can store the subroutines that manipulate the data (atyxpr() in this example) right along with the data. But so what? What's the advantage?

The advantage is the avoidance of something called namespace collision. The name of the subroutine manipulating the data is in scope only within the context of that data. If that name is used elsewhere, it refers to a different subroutine. In old C, if you had geometric figures square, circle, point and parabola, look what you'd need:
You need to remember four subroutine names (circle_move, square_move, point_move, and parabola_move), none of which is especially memorable. Now consider an object oriented language, where objects circle, square, point and parabola each implement their own move routine:
In Object Oriented Programming (OOP), move means move -- it's intuitive.
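
As a rough sketch of that idea in Ruby, here are two hypothetical shape classes, each with its own move method. Classes are covered in the sections that follow, and the class names, coordinates and position method here are made up for this illustration:

```ruby
#!/usr/bin/ruby
# Hypothetical shape classes, for illustration only
class Circle
  def initialize(x, y)
    @x = x
    @y = y
  end

  def move(dx, dy)   # Circle's own move
    @x += dx
    @y += dy
  end

  def position() return [@x, @y]; end
end

class Point
  def initialize(x, y)
    @x = x
    @y = y
  end

  def move(dx, dy)   # Point's own move, same name, no collision
    @x += dx
    @y += dy
  end

  def position() return [@x, @y]; end
end

circle = Circle.new(0, 0)
point = Point.new(5, 5)

# One name to remember, whatever the shape
[circle, point].each { |shape| shape.move(1, 2) }

print "circle is at ", circle.position.join(","), "\n"
print "point is at ", point.position.join(","), "\n"
```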

Others will state additional benefits. They'll tell of the ability to redefine operators depending on the types being manipulated. They'll speak of inheritance, where you can create a new object type that's an enhancement of one already made, and you can even create a family of similar object types that can be manipulated by same named, similar performing subroutines. These are all nice, but in my opinion the only essentials are encapsulation and reduction of namespace collision.

Many tout OOP for purposes of reusability. I disagree. Everyone's talking about reusable code, but few are writing it, with OOP or anything else. Reusability is harder to find than the fountain of youth. If OOP were really that reusable, that wouldn't be true.

Classes and Objects

Think of a class as a set of architectural drawings for a house. Think of objects as the houses built according to those drawings. The drawings can be used as a plan for many, many houses. Not only that, the houses needn't be the same. Some can have carpeting, some have wood floors, but they were all created from the drawings. Once the house is created, the owner can put in a 14 cubic foot refrigerator or a 26 cubic foot one. The owner can put in the finest entertainment center, or a 14" TV with rabbit ears on a wooden crate. No matter, they were all made from the same drawings. The drawing is the class, the house is the object.

A class is a plan to create objects. Ideally it lists all the data elements that will appear in any of its objects. It lists any subroutines the objects will need to manipulate the data. Those subroutines are called methods in OOP speak. It might even give the data elements initial values so that if the programmer doesn't change them, he has intelligent defaults. But typically, the computer program changes at least some of those data elements while it's being run.

Simple OOP in Ruby

In Ruby, a class begins with the class keyword, and ends with a matching end. The simplest class that can be made contains nothing more than the class statement and corresponding end:
class Myclass
end
The preceding class would not error out, but it does nothing other than tell the name of its class:

#!/usr/bin/ruby
class Myclass
end

myclass = Myclass.new
print myclass.class, "\n"
[slitt@mydesk slitt]$ ./hello.rb
Myclass
[slitt@mydesk slitt]$

To be useful, a class must encapsulate data, giving the programmer methods (subroutines associated with the class) to read and manipulate that data. As a simple example, imagine a class that produces objects that maintain a running total. This class maintains one piece of data, called @total, which is the total being maintained. Note that the at sign (@) designates this variable as an instance variable -- a variable in scope only within objects of this class, and persistent within those objects.

This class has a method called hasTotal() that returns true if the total is defined, false if it's nil. That way you can test to make sure you don't perform operations on a nil value. It also has getTotal() to read the total, setTo() to set the total to its argument, and increaseBy() and multiplyBy() to add an argument to the total or multiply the total by an argument.

Last but not least, it has initialize(), which is called whenever Total.new() is executed. This happens because initialize() is a special reserved name -- you needn't do anything to indicate it's a constructor. The number of arguments in initialize() is the number of arguments Total.new() expects. The other thing that happens in initialize() is that all the instance variables are declared and initialized (in this case to the argument passed in through new()).

Here is the code:

#!/usr/bin/ruby
class Total
  def initialize(initial_amount)
    @total=initial_amount
  end

  def increaseBy(increase)
    @total += increase
  end

  def multiplyBy(increase)
    @total *= increase
  end

  def setTo(amount)
    @total = amount
  end

  def getTotal() return @total; end
  def hasTotal() return @total!=nil; end
end

total = Total.new(0)
for ss in 1..4
  total.increaseBy(ss)
  puts total.getTotal if total.hasTotal
end
print "Final total: ", total.getTotal, "\n" if total.hasTotal
[slitt@mydesk slitt]$ ./hello.rb
1
3
6
10
Final total: 10
[slitt@mydesk slitt]$

The main routine instantiates an object of type Total, initializing the total to a value of 0. Then a loop repeatedly adds the loop subscript to the total, printing the total after each add. Finally, outside the loop, the total is printed, which is 10, otherwise known as 1+2+3+4.

Take some time to study the preceding example, and I think you'll find it fairly self-explanatory.
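The preceding program never calls multiplyBy() or setTo(). Here's a minimal sketch that exercises them, assuming the same Total class trimmed to the methods used here. The parameter name factor is my own choice for illustration; the listing above calls it increase.

```ruby
#!/usr/bin/ruby
# Same Total class as above, trimmed to the methods this sketch uses.
class Total
  def initialize(initial_amount)
    @total = initial_amount
  end

  def multiplyBy(factor)       # "factor" is an illustrative name
    @total *= factor
  end

  def setTo(amount)
    @total = amount
  end

  def getTotal() return @total; end
  def hasTotal() return @total != nil; end
end

total = Total.new(3)
total.multiplyBy(4)            # 3 * 4
puts total.getTotal            # prints 12
total.setTo(nil)               # wipe out the total
puts total.hasTotal            # prints false
total.setTo(100)
puts total.getTotal            # prints 100
```

Setting the total to nil and then testing hasTotal() shows why that method exists: it lets the caller check before doing arithmetic on a missing value.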

Now for a little controversy. Remember I said you declare all instance variables inside initialize()? You don't have to. You could declare them in other methods:

#!/usr/bin/ruby
class Total
  def initialize(initial_amount)
    @total=initial_amount
  end

  def setName(name) @name = name; end
  def hasName() return @name != nil; end
  def getName() return @name; end

  def increaseBy(increase)
    @total += increase
  end

  def multiplyBy(increase)
    @total *= increase
  end

  def setTo(amount)
    @total = amount
  end

  def getTotal() return @total; end
  def hasTotal() return @total!=nil; end
end

total = Total.new(15)
print total.getTotal(), "\n"
print total.getName(), "\n"
total.setName("My Total")
print total.getName(), "\n"
[slitt@mydesk slitt]$ ./hello.rb
15
nil
My Total
[slitt@mydesk slitt]$
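If you want to see exactly when an instance variable comes into existence, Ruby's built-in instance_variables method lists the ones an object currently has. The following is a small sketch, assuming a cut-down version of the Total class above. The names are converted to strings and sorted only so the printed order is predictable.

```ruby
#!/usr/bin/ruby
# Cut-down Total: @total is set in initialize, @name only in setName.
class Total
  def initialize(initial_amount)
    @total = initial_amount
  end

  def setName(name) @name = name; end
  def getName() return @name; end
end

total = Total.new(15)
before = total.instance_variables.map { |v| v.to_s }.sort
total.setName("My Total")
after = total.instance_variables.map { |v| v.to_s }.sort

puts before.inspect    # prints ["@total"]
puts after.inspect     # prints ["@name", "@total"]
```

Until setName() runs, the object simply has no @name, which is why getName() returned nil in the earlier output.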

From a viewpoint of pure modularity, readability and encapsulation, you'd probably want to have all instance variables listed in the initialize() method. However, Ruby gives you ways to access instance variables directly, either read-only or read-write. Here's a read only example:

#!/usr/bin/ruby
class Person
  def initialize(lname, fname)
    @lname = lname
    @fname = fname
  end

  def lname
    return @lname
  end
  def fname
    return @fname
  end
end

steve = Person.new("Litt", "Steve")
print "My name is ", steve.fname, " ", steve.lname, ".\n"
[slitt@mydesk slitt]$ ./hello.rb
My name is Steve Litt.
[slitt@mydesk slitt]$

You and I know fname and lname are accessed as methods, but because they're read as steve.fname, it seems like you're directly reading the data. Now let's go for a read/write example:


#!/usr/bin/ruby
class Person
  def initialize(lname, fname)
    @lname = lname
    @fname = fname
  end

  def lname
    return @lname
  end

  def fname
    return @fname
  end

  def lname=(myarg)
    @lname = myarg
  end

  def fname=(myarg)
    @fname = myarg
  end
end

steve = Person.new("Litt", "Stove")
print "My name is ", steve.fname, " ", steve.lname, ".\n"
steve.fname = "Steve"
print "My name is ", steve.fname, " ", steve.lname, ".\n"

When I instantiated the object in the preceding code, I accidentally spelled my name "Stove". So I changed it as if it were a variable. This behavior was facilitated by the def fname=(myarg) method. The output of the preceding code follows:

[slitt@mydesk slitt]$ ./hello.rb
My name is Stove Litt.
My name is Steve Litt.
[slitt@mydesk slitt]$

The methods facilitating the seeming ability to write directly to the data are called accessor methods. Because accessor methods are so common, Ruby has a shorthand for them:

#!/usr/bin/ruby
class Person
  def initialize(lname, fname)
    @lname = lname
    @fname = fname
  end

  attr_reader :lname, :fname
  attr_writer :lname, :fname
end

steve = Person.new("Litt", "Stove")
print "My name is ", steve.fname, " ", steve.lname, ".\n"
steve.fname = "Steve"
print "My name is ", steve.fname, " ", steve.lname, ".\n"
[slitt@mydesk slitt]$ ./hello.rb
My name is Stove Litt.
My name is Steve Litt.
[slitt@mydesk slitt]$

In the preceding code, the attr_reader line takes the place of the hand-written read accessor methods, while the attr_writer line takes the place of the hand-written write accessor methods. Notice that when you write the names of the instance variables, you substitute a colon for the instance variables' at signs. There is a syntax reason for this, consistent with the rest of Ruby: attr_reader and attr_writer are themselves methods, and :lname and :fname are symbols passed to them as arguments, naming the accessor methods to create.

Remember, this seeming direct access must be explicitly enabled by the class's programmer, so this usually doesn't compromise encapsulation beyond what needs to be available. In my opinion this is a really handy option.
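When a variable needs both a reader and a writer, Ruby also provides attr_accessor, which creates the two at once. The following sketch assumes the same Person class as above, with the attr_reader and attr_writer lines swapped for a single attr_accessor line. The last two lines are there only to show that :fname is a symbol and that the writer method really exists.

```ruby
#!/usr/bin/ruby
class Person
  def initialize(lname, fname)
    @lname = lname
    @fname = fname
  end

  attr_accessor :lname, :fname     # reader and writer for both variables
end

steve = Person.new("Litt", "Stove")
steve.fname = "Steve"              # uses the generated fname= method
puts steve.fname                   # prints Steve
puts :fname.class                  # prints Symbol
puts steve.respond_to?(:fname=)    # prints true
```

If you want a variable to stay read only from the outside, stick with attr_reader for that variable.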

Inheritance

Inheritance is where a more specific kind of class is made from a more general one. For instance, an employee is a kind of person. Specifically (and oversimplistically), it's a person with an employee number. See this inheritance example:

#!/usr/bin/ruby
class Person
  def initialize(lname, fname)
    @lname = lname
    @fname = fname
  end

  attr_reader :lname, :fname
  attr_writer :lname, :fname
end

class Employee < Person          # Declare Person to be parent of Employee
  def initialize(lname, fname, empno)
    super(lname, fname)          # Initialize Parent's (Person) data
                                 # by calling Parent's initialize()
    @empno = empno               # Initialize Employee specific data
  end
  attr_reader :empno             # Accessor for employee specific data
  attr_writer :empno             # Accessor for employee specific data
                                 # Parent's data already given accessors
                                 # by parent class definition
end

steve = Employee.new("Litt", "Steve", "12345")
print steve.fname, " ", steve.lname, " is employee number ", steve.empno, ".\n"
[slitt@mydesk slitt]$ ./hello.rb
Steve Litt is employee number 12345.
[slitt@mydesk slitt]$

Ruby REALLY makes inheritance easy. On the class line you declare the child class's parent. In the child class's initialize() you call the parent's initializer by the super(supers_args) syntax. Because the parent's data is initialized and available to the child, you needn't redeclare accessor methods for the parent's data -- only for the child's data. In other words, in the child class you need code only for data specific to the child. It's handy, intuitive, and smooth.
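The super keyword isn't limited to initialize(). A child class can override any parent method and still call the parent's version. Here's a minimal sketch of that idea. The describe method is hypothetical, made up for this illustration, and is not part of the earlier listing.

```ruby
#!/usr/bin/ruby
class Person
  def initialize(lname, fname)
    @lname = lname
    @fname = fname
  end
  attr_reader :lname, :fname

  def describe                   # hypothetical method for illustration
    return fname + " " + lname
  end
end

class Employee < Person
  def initialize(lname, fname, empno)
    super(lname, fname)
    @empno = empno
  end
  attr_reader :empno

  def describe                   # overrides Person's describe
    return super() + " (employee " + empno + ")"
  end
end

steve = Employee.new("Litt", "Steve", "12345")
puts steve.describe              # prints Steve Litt (employee 12345)
puts steve.is_a?(Person)         # prints true
puts Employee.superclass         # prints Person
```

The is_a? and superclass calls are built into Ruby, and confirm that an Employee really is a kind of Person.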

Redefining Operators

It is nice to have total.add() and total.increaseBy() methods. But in many cases it's even more intuitive to use the + or += operator. In C++ it's always somewhat difficult to remember how to redefine operators. Not so in Ruby:

#!/usr/bin/ruby
class Total
  def getTotal() return @total; end
  def hasTotal() return @total!=nil; end
  def initialize(initial_amount)
    @total=initial_amount
  end

  def increaseBy(b)
    @total += b
  end

  def add(b)
    if b.class == Total
      return Total.new(@total + b.getTotal())
    else
      return Total.new(@total + b)
    end
  end

  def +(b)
    self.add(b)
  end

  def *(b)
    if b.class == Total
      return Total.new(@total * b.getTotal())
    else
      return Total.new(@total * b)
    end
  end
end

total5 = Total.new(5)
total2 = Total.new(2)
total3 = Total.new(3)

myTotal = total5 + total2 + total3
print myTotal.getTotal(), "\n"

myTotal *= 2
print myTotal.getTotal(), "\n"

myTotal += 10
print myTotal.getTotal(), "\n"

In the preceding, we define add() as returning a new Total whose value is the argument plus @total. Notice that @total is not changed in-place. We might want to add a Total to the existing Total, or we might want to add an integer. Therefore, Total::add() checks the argument's type, and if it's a Total it adds the argument's value, otherwise it adds the argument.

With add() safely defined, we now define + as basically a synonym for add(). The fascinating thing about Ruby is that if you define +, you get += free of charge, without further coding, and += does the right thing. You cannot redefine += on its own, because Ruby treats a += b as shorthand for a = a + b. Luckily, that means += "just does the right thing", consistent with the definition of +.

It's not necessary to define a word function before redefining an operator, as the * operator (really a method) in the preceding code shows. Once again, it has an if statement so that the total can be multiplied by either an integer or another Total.

In the main part of the routine, we test by creating three totals with values 5, 2 and 3 respectively. We then add them together to create myTotal, which should be 10 and indeed is. We then in-place multiply by 2 to get the expected 20, and then in-place add 10 to get the expected 30:

[slitt@mydesk slitt]$ ./hello.rb
10
20
30
[slitt@mydesk slitt]$
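One consequence of += being built on + is worth seeing in action: because + returns a new Total, myTotal += 10 makes the variable point at a new object rather than changing the old one. The following sketch assumes a trimmed Total class with only + defined. The variable name original is mine, used to keep a second reference to the first object.

```ruby
#!/usr/bin/ruby
class Total
  def initialize(initial_amount)
    @total = initial_amount
  end
  def getTotal() return @total; end

  def +(b)
    if b.class == Total
      return Total.new(@total + b.getTotal())
    else
      return Total.new(@total + b)
    end
  end
end

original = Total.new(20)
myTotal = original               # two names for one object
myTotal += 10                    # same as: myTotal = myTotal + 10
puts myTotal.getTotal            # prints 30
puts original.getTotal           # prints 20, the first object is unchanged
puts myTotal.equal?(original)    # prints false, myTotal names a new object
```

If you need a true in-place change that every reference sees, use a method like increaseBy() instead.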

Note that += cannot be redefined separately, because Ruby expands a += b into a = a + b, and that Ruby has no ++ operator at all. Operators such as <<, >>, == and <=>, however, are ordinary methods, and can be redefined the same way as + and *, even though they're more than one character long.
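To illustrate that some operators longer than one character can be defined like any other method, here's a minimal sketch that gives a trimmed Total class a << and an == operator. What these two operators should mean for a Total is my own assumption for the example: << adds in place and returns the object so calls can be chained, and == compares the wrapped values.

```ruby
#!/usr/bin/ruby
class Total
  def initialize(initial_amount)
    @total = initial_amount
  end
  def getTotal() return @total; end

  # << adds in place and returns self so calls can be chained
  def <<(b)
    @total += b
    return self
  end

  # == is true when the other object is a Total holding the same value
  def ==(other)
    return other.class == Total && @total == other.getTotal()
  end
end

total = Total.new(1)
total << 2 << 3                  # 1 + 2 + 3, changed in place
puts total.getTotal              # prints 6
puts(total == Total.new(6))      # prints true
puts(total == Total.new(7))      # prints false
```

Unlike +, the << defined here changes the object it's called on, much as << does for Ruby's own arrays and strings.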


