Chapter 7

Strings

Based on How to Think Like a Computer Scientist: Python Version, Chapter 7

7.1 A compound data type

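So far we have seen three data types: ints, floats and strings. Strings differ from the other two because they are made up of smaller pieces, namely characters, which is why they are called a compound data type. Several operations and functions are provided to access and manipulate the characters that make up a string.

The square brackets operator ([ and ]) selects and reads a single character from a string: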

  >>> fruit = "banana"
  >>> letter = fruit[1]
  >>> print letter
The output is:
  a
This is probably not the expected result, b. String indices in Python start at 0, so the first ("zero-eth") letter of the string is at index 0:
  >>> letter = fruit[0]
  >>> print letter
  b

7.2 Length

The len function returns the number of characters in the given string:
  >>> fruit = "banana" 
  >>> len(fruit)
  6
The WRONG way to find the last letter of a string:
  length = len(fruit)
  last = fruit[length]       # ERROR!
This raises an IndexError because the six letters are numbered from 0 to 5, so there is no letter at index 6. To get the last letter, subtract 1 from length:
  length = len(fruit)
  last = fruit[length-1]
Python also supports negative indices, which count backwards from the end of the string:
  last = fruit[-1]
  secondToLast = fruit[-2]
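The relationship between positive and negative indices can be checked directly. The following is a minimal sketch, written with print() so that it runs under Python 3 as well as the Python 2 used in this chapter; the helper name last_letter is hypothetical and is not part of the original text.

```python
fruit = "banana"

def last_letter(s):
    # Hypothetical helper: the last character sits at index len(s) - 1.
    return s[len(s) - 1]

# A negative index -k refers to the same position as len(s) - k.
same_last = last_letter(fruit) == fruit[-1]
same_second = fruit[len(fruit) - 2] == fruit[-2]

# Indexing one past the end raises IndexError.
try:
    fruit[len(fruit)]
    out_of_range = False
except IndexError:
    out_of_range = True

print(same_last)
print(same_second)
print(out_of_range)
```

All three lines print True: the two forms of indexing agree, and index 6 is out of range for a six-letter string.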

7.3 Traversal and the for loop

A common thing to do with a string is to start at the beginning, select each character in turn, do something to it, and continue until the end. This pattern of processing is called a traversal. One way to write a traversal is with a while statement:
  index = 0
  while index < len(fruit): 
    letter = fruit[index]
    print letter
    index = index + 1
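The loop variable is index. The loop stops when index equals len(fruit), so the last character accessed is the one at index len(fruit)-1. Because traversing by index is so common, Python provides a simpler alternative, the for loop, which assigns each character of the string to char in turn: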
  for char in fruit:
    print char
The following example uses concatenation and a for loop:
  prefixes = "JKLMNOPQ"
  suffix = "ack"

  for letter in prefixes: 
    print letter + suffix
The output of this program is:
  Jack
  Kack
  Lack
  Mack
  Nack
  Oack
  Pack
  Qack
FYI: this is an abecedarian series, where the elements appear in alphabetical order.
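The while loop and the for loop at the start of this section visit the same characters in the same order. The sketch below checks this by building up a copy of the string with each loop; the variable names by_index and by_for are chosen for illustration only. It also shows the built-in enumerate, which is one way to get both the index and the character when a for loop needs them.

```python
fruit = "banana"

# Traversal by index, as in the while loop above.
by_index = ""
index = 0
while index < len(fruit):
    by_index = by_index + fruit[index]
    index = index + 1

# The same traversal with a for loop.
by_for = ""
for char in fruit:
    by_for = by_for + char

print(by_index == by_for)

# enumerate yields (index, character) pairs when both are needed.
for index, char in enumerate(fruit):
    print(str(index) + " " + char)
```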

7.4 Slicing

Python supports reading part, or a slice, of a larger string:
  >>> s = "Peter, Paul, and Mary"
  >>> print s[0:5]
  Peter
  >>> print s[7:11]
  Paul
  >>> print s[17:21]
  Mary
The operator [n:m] returns the part of the string from the character at index n up to, but not including, the character at index m. In s[7:11], for example, the characters at indices 7, 8, 9 and 10 are returned.
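Either index of a slice may be left out. As a brief sketch (the variable names first, last and whole are illustrative), omitting the first index starts the slice at the beginning of the string, and omitting the second runs it to the end:

```python
s = "Peter, Paul, and Mary"

first = s[:5]      # same as s[0:5]
last = s[17:]      # same as s[17:len(s)]
whole = s[:]       # both indices omitted: the entire string

print(first)
print(last)
print(whole == s)

# When both indices are in range, the slice s[n:m] has m - n characters.
print(len(s[7:11]))
```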

7.5 String comparison

The comparison operators in Section 4.2 also work on strings. To see if two strings are equal:
  if word == "banana":
    print "Yes, we have no bananas!"
Other comparison operations are useful for putting words in alphabetical order.
  if word < "banana":
    print "Your word, " + word + ", comes before banana."
  elif word > "banana":
    print "Your word, " + word + ", comes after banana."
  else:
    print "Yes, we have no bananas!"
Python does not handle upper and lower case letters the same way that people do. All the upper case letters come before all the lower case letters, so if word is "Zebra" the program above prints:
  Your word, Zebra, comes before banana.
A common way to address this problem is to convert strings to a standard format, such as all lower case, before performing the comparison.
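One possible way to do this conversion is with the string method lower, which returns a lower-case copy of a string. The function name compare_to_banana below is hypothetical; the sketch returns the message instead of printing it so that the result is easy to check.

```python
def compare_to_banana(word):
    # Hypothetical helper: compare a lower-case copy of word,
    # but report the word as the caller wrote it.
    lowered = word.lower()
    if lowered < "banana":
        return "Your word, " + word + ", comes before banana."
    elif lowered > "banana":
        return "Your word, " + word + ", comes after banana."
    else:
        return "Yes, we have no bananas!"

# Without conversion, 'Z' sorts before 'b'.
print("Zebra" < "banana")

# With conversion, Zebra sorts after banana, as a person would expect.
print(compare_to_banana("Zebra"))
```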

7.6 Strings are not mutable

Strings are immutable: an existing string cannot be changed. In particular, the [] operator cannot be used on the left side of an assignment to change a character:
  greeting = "Hello, world!"
  greeting[0] = 'J'            # ERROR!
  print greeting
Instead of producing the output Jello, world!, this code produces an error message like
  TypeError: 'str' object does not support item assignment
An alternative is to create a new string by concatenating a new character and the remainder of the original string:
  greeting = "Hello, world!"
  new_greeting = 'J' + greeting[1:]
  print new_greeting
Keep in mind that this operation does not modify the original string.
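Both points can be seen in one short sketch: the assignment fails with a TypeError, and building a new string leaves the original untouched. The flag name assignment_failed is introduced here for illustration only.

```python
greeting = "Hello, world!"

# Item assignment on a string raises TypeError.
try:
    greeting[0] = 'J'
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Build a new string from a new first character and a slice of the old one.
new_greeting = 'J' + greeting[1:]

print(assignment_failed)
print(greeting)        # the original is unchanged
print(new_greeting)
```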

7.7 A find function

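The following function takes a string and a character and returns the index at which the character first appears:
  def find(s, ch):
    index = 0
    while index < len(s):
      if s[index] == ch:
        return index       # found: leave the loop early
      index = index + 1
    return -1              # reached the end without finding ch
In a sense, find is the opposite of the [] operator: instead of taking an index and extracting the corresponding character, it takes a character and finds the index where that character appears. If the character does not appear in the string, the function returns -1.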
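A natural extension of this function is to let the caller choose where the search begins. The version below is only a sketch; the name find_from and the optional start parameter are assumptions made for this example and do not appear in the original text.

```python
def find_from(s, ch, start=0):
    # Hypothetical variant of find: begin searching at index start.
    index = start
    while index < len(s):
        if s[index] == ch:
            return index
        index = index + 1
    return -1

print(find_from("banana", "a"))       # first 'a', at index 1
print(find_from("banana", "a", 2))    # first 'a' at or after index 2
print(find_from("banana", "x"))       # not present
```

With a start index, repeated calls can locate every occurrence of a character: search again from one past the index just returned.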

7.8 Looping and counting

The following program counts the number of times the letter 'a' appears in a string:
  fruit = "banana"
  count = 0
  for char in fruit:
    if char == 'a':
      count = count + 1
  print count
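This program demonstrates another common idiom, called a counter. The variable count is initialized to 0 and then incremented each time an 'a' is found, so when the loop ends count holds the total number of a's, which is 3 for "banana".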
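The counter idiom is easy to wrap in a function that accepts the string and the letter as parameters. The name count_letter is hypothetical; the sketch also compares its result with the built-in string method count, which performs the same job.

```python
def count_letter(s, ch):
    # Hypothetical helper: count how many times ch appears in s.
    count = 0
    for char in s:
        if char == ch:
            count = count + 1
    return count

print(count_letter("banana", "a"))

# The built-in count method gives the same answer.
print(count_letter("banana", "n") == "banana".count("n"))
```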

7.9 The string module

The string module contains a number of functions that are useful for manipulating strings. We have to import the string module before we use the functions in it.
  >>> import string
The module includes a function named find that does the same thing as the function we wrote. To call it,
  >>> fruit = "banana"
  >>> index = string.find(fruit, "a")
  >>> print index
  1
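We specify the name of the module and the name of the function, separated by a dot. string.find is more general than the version we wrote: it can find substrings as well as single characters, and it accepts optional additional arguments giving the index at which to start the search and the index at which to stop. There are many other functions in the string module. Consult the on-line documentation included with Python.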
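The same search is also available as a method on the string itself, and that is the form to use in Python 3, where the module-level function string.find no longer exists. The sketch below uses only the find method of strings; the variable names are illustrative.

```python
fruit = "banana"

# Method form of find: look for a single character.
first_a = fruit.find("a")

# find also accepts substrings.
first_na = fruit.find("na")

# An optional second argument is the index at which to start searching.
later_na = fruit.find("na", 3)

# An optional third argument is the index at which to stop (exclusive).
bounded = fruit.find("n", 1, 2)

print(first_a)
print(first_na)
print(later_na)
print(bounded)
```

Like the function written earlier in this chapter, the method returns -1 when nothing is found, as in the last call, where only index 1 is searched.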

7.10 Character classification

It is often useful to examine a character and test whether it is upper or lower case, or whether it is a letter or a digit. The string module provides several string constants that are useful for these purposes.
  >>> print string.lowercase
  abcdefghijklmnopqrstuvwxyz
  >>> print string.uppercase
  ABCDEFGHIJKLMNOPQRSTUVWXYZ
  >>> print string.digits
  0123456789
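We can use these constants and the find function from Section 7.7 to classify characters. If ch appears in string.lowercase, find returns its index; otherwise it returns -1: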
  def isLower(ch):
    return find(string.lowercase, ch) != -1
The same test can also be written with comparison operators, because the lower case letters are contiguous in character order:
  def isLower(ch):
    return 'a' <= ch <= 'z'
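Two further ways to write the same test are sketched below. The first uses the in operator with the constant string.ascii_lowercase, which is available in both Python 2 and Python 3 (string.lowercase exists only in Python 2). The second uses the built-in string method islower. The function names is_lower_in and is_lower_method are hypothetical.

```python
import string

def is_lower_in(ch):
    # Hypothetical helper: a single character that appears
    # in the constant of lower case ASCII letters.
    return len(ch) == 1 and ch in string.ascii_lowercase

def is_lower_method(ch):
    # Hypothetical helper: rely on the built-in islower method.
    return ch.islower()

print(is_lower_in("q"))
print(is_lower_in("Q"))
print(is_lower_in("7"))
print(is_lower_method("q"))
print(is_lower_method("Q"))
```

The length check in is_lower_in matters because the in operator on strings tests for substrings, so without it a multi-letter argument such as "abc" would also pass.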