On this page:
2 DNA decompression
3 DNA compression
4 Challenges
8.7

Lab 11: Generative recursion 1

Remember to follow (and when necessary, adapt) the design recipe for each function. When you write a generative recursive function, you must provide a termination argument.

1 Power

As you saw in Lecture 22: Merge sort, creatively decomposing problems can make them faster to solve. In this section, you’ll apply this lesson to a different problem: raising a given number to a given power. First you’ll use structural recursion to design a slow solution to the problem; then you’ll use generative recursion to design a fast solution to the problem.

Exercise 1. Design a function power that takes a number x and a natural number n and returns the number that is x raised to the n-th power. Here n is the exponent. Do not use any built-in function for exponentiation.
(check-expect (power 10 3) 1000)
(check-expect (power 2 10) 1024)
(check-expect (power 1.1 4) 1.4641)
Hint: Do not use generative recursion; this exercise is just to warm up. Instead, stick with structural recursion and follow the template for processing a NaturalNumber:
; A NaturalNumber is one of:
; - 0
; - (+ 1 NaturalNumber)

Exercise 2. A typical bank account yields 0.01% interest per day. In such an account, how much money does $1 become in 10,000 days? Compute the answer by running (power 1.0001 10000). How long does the computation take? (You can measure how long it takes using time.)

Exercise 3. Because you designed power by following the template for processing a NaturalNumber, it decomposes the problem (power 1.0001 10000) into the problem (power 1.0001 9999). We can make power faster by decomposing the exponent 10000 differently, into the exponent 5000 instead of 9999. After all, the answer to (power 1.0001 10000) is simply the square of the answer to (power 1.0001 5000).

In general, let’s decompose a NaturalNumber into a PowerTree defined below:
; A PowerTree is one of:
; - (make-zeroth)
; - (make-oddth  PowerTree)
; - (make-eventh PowerTree)
(define-struct zeroth [])
(define-struct oddth  [sub1])
(define-struct eventh [half])
The interpretation of a PowerTree is specified by the following function.
; power-tree-exponent : PowerTree -> NaturalNumber
; returns the meaning of the given PowerTree
(define (power-tree-exponent pt)
  (cond [(zeroth? pt) 0]
        [(oddth?  pt) (+ 1 (power-tree-exponent (oddth-sub1  pt)))]
        [(eventh? pt) (* 2 (power-tree-exponent (eventh-half pt)))]))

Design a function generate-power-tree that takes a natural number and decomposes it into a PowerTree whose interpretation is it. For example, (generate-power-tree 10000) should generate a PowerTree whose interpretation is 10000.
(check-expect (power-tree-exponent (generate-power-tree 10000)) 10000)
To keep the generated PowerTree as small as possible, do not follow the structural recursion template for processing a NaturalNumber. Instead, check if the input is zero, odd, or even.
  • If the input is zero, then just return (make-zeroth) because its interpretation is zero.

  • If the input is odd, then make a recursive call on a natural number that is one less.

  • If the input is even, then make a recursive call on a natural number that is half.

Because this function makes recursive calls without following any processing template, you must remember to make it terminate, and write a termination argument that explains why the function eventually stops; it is not enough to say only when the function stops. Your definition of generate-power-tree should pass these tests:
(check-expect (generate-power-tree 0)
              (make-zeroth))
(check-expect (generate-power-tree 1)
              (make-oddth (make-zeroth)))
(check-expect (generate-power-tree 2)
              (make-eventh (make-oddth (make-zeroth))))
(check-expect (generate-power-tree 4)
              (make-eventh (make-eventh (make-oddth (make-zeroth)))))
(check-expect (generate-power-tree 8)
              (make-eventh (make-eventh
               (make-eventh (make-oddth (make-zeroth))))))
(check-expect (generate-power-tree 9)
              (make-oddth (make-eventh (make-eventh
               (make-eventh (make-oddth (make-zeroth)))))))

Exercise 4. Design a function power-tree-result that takes a number x and a PowerTree pt and returns the number that is x raised to the (power-tree-exponent pt)-th power. But do not use the slow power function you designed, and do not use any built-in function for exponentiation either. Instead, follow the template for processing a PowerTree.
(check-expect (power-tree-result 10
                (make-oddth (make-eventh (make-eventh
                 (make-eventh (make-oddth (make-zeroth)))))))
              1000000000)
In order to fill in the template and complete the definition, you need to turn the recursion outcomes provided by the template into the expected output, so make sure you write enough examples, and perhaps use the table method. For example, to handle the input pt = (generate-power-tree 9), you need to turn the recursion outcome x8 into the expected output x9, by multiplying by x. And to handle the input pt = (generate-power-tree 8), you need to turn the recursion outcome x4 into the expected output x8, by squaring. In general, you need to use the equation x2n = (xn)2.

Because this function follows a processing template, you can be sure that it always terminates, and you do not need to write any termination argument.

Exercise 5. Combine generate-power-tree and power-tree-result to design a function fast-power like power: it takes a number x and a natural number n and returns the number that is x raised to the n-th power. The definition of fast-power is very short, does not use recursion, and can be written just by looking at the signatures of generate-power-tree and power-tree-result. Your fast-power should pass the same tests as power above. Again, do not use the slow power function you designed, and do not use any built-in function for exponentiation either.

Exercise 6. Compute the same answer as Exercise 6 by running (fast-power 1.0001 10000). How long does the computation take? (You can measure how long it takes using time.)

2 DNA decompression

DNA is often modeled by sequences of the characters A, C, G and T. These sequences are very long and so often need to be compressed. A simple way to do this is run-length encoding. It means to replace identical consecutive characters with the character followed by its count. For example, the run-length encoding of AAAAAAAAAAGCCCCC is A10G1C5. This is the only run-length encoding—something like A4A6G1C5 is not valid, because each count must cover as many identical consecutive characters as possible.

The goal of this section is decoding, to convert run-length encodings to DNA sequences. For example, we want to convert the run-length encoding A10G1C5 to the DNA sequence AAAAAAAAAAGCCCCC, and we want to convert the empty run-length encoding to the empty DNA sequence. We will decompose the problem of decoding A10G1C5 to the subproblem of decoding G1C5.

We will represent both the run-length encoding and the DNA sequence as [ListOf 1String]. Recall that a 1String is a string of length 1, in other words, a character.

Exercise 7. Design a helper function drop-digits, which drops all the digits at the beginning of a given [ListOf 1String].
(check-expect (drop-digits (explode "10G1C5")) (explode "G1C5"))
The example above is written using the built-in function explode. Recall that explode converts a String to a [ListOf 1String]. Use it to write at least two more examples: one whose input starts with a non-digit character, and one whose input is empty.

Hint: Your definition of drop-digits can use the built-in function string-numeric?

Exercise 8. Design another helper function, which collects all the digits at the beginning of a given [ListOf 1String].

Exercise 9. Next, you will use the helpers above to decompose decoding problems via a DNATree defined below:
; A DNATree is one of:
; - (make-end)
; - (make-run 1String NaturalNumber DNATree)
(define-struct end [])
(define-struct run [base count rest])
Below is an example of a DNATree:
(define dt1
  (make-run "A" 10
    (make-run "G" 1
      (make-run "C" 5 (make-end)))))
Design a function generate-decoding that takes a run-length encoding as input and decomposes its decoding problem into a DNATree. Here are two examples:
(check-expect (generate-decoding (explode "A10G1C5")) dt1)
(check-expect (generate-decoding empty) (make-end))

Hint: Use the helper functions you designed in Exercises 7 and 8 to turn the problem (explode "A10G1C5") into the subproblem (explode "G1C5"). Also, use the built-in function string->number.

Exercise 10. Design a function decode-dna-tree that takes a DNATree as input and produces the DNA sequence. Here are two examples:
(check-expect (decode-dna-tree dt1) (explode "AAAAAAAAAAGCCCCC"))
(check-expect (decode-dna-tree (make-end)) empty)
Hint: Follow the template for processing a DNATree. Also, use the built-in function make-list.

Exercise 11. Combine generate-decoding and decode-dna-tree to design a function decode that consumes the run-length encoding of a DNA sequence and produces the DNA sequence. Here are two examples:
(check-expect (decode (explode "A10G1C5")) (explode "AAAAAAAAAAGCCCCC"))
(check-expect (decode empty) empty)
Hint: The definition of decode is very short, does not use recursion, and can be written just by looking at the signatures of generate-decoding and decode-dna-tree.

3 DNA compression

The goal of this section is encoding, to convert DNA sequences to run-length encodings. For example, we want to convert the DNA sequence AAAAAAAAAAGCCCCC to the run-length encoding A10G1C5, and we want to convert the empty DNA sequence to the empty run-length encoding. We will decompose the problem of encoding AAAAAAAAAAGCCCCC to the subproblem of encoding GCCCCC.

Again, we will represent both the DNA sequence and the run-length encoding as [ListOf 1String], in other words, a list of characters.

Exercise 12. Design a helper function drop-letter, which drops all the copies of a given 1String at the beginning of a given [ListOf 1String].
(check-expect (drop-letter "A" (explode "AAAAAAAAAAGCCCCC"))
              (explode "GCCCCC"))
Write at least two more examples: one whose input starts with a different character, and one whose list input is empty.

Hint: Your definition of drop-digits can use the built-in function string=?

Exercise 13. Design another helper function, which collects all the copies of a given 1String at the beginning of a given [ListOf 1String].

Exercise 14. Next, you will use the helpers above to decompose encoding problems via the same DNATree that you used in the previous section to decompose decoding problems.

Design a function generate-encoding that takes a DNA sequence as input and decomposes its encoding problem into a DNATree. Assume that the given DNA sequence will only consist of upper-case A, C, G and T. Here are two examples:
(check-expect (generate-encoding (explode "AAAAAAAAAAGCCCCC")) dt1)
(check-expect (generate-encoding empty) (make-end))
Hint: Use the helper functions you designed in Exercises 12 and 13 to turn the problem (explode "AAAAAAAAAAGCCCCC") into the subproblem (explode "GCCCCC").

Exercise 15. Design a function encode-dna-tree that takes a DNATree as input and produces the run-length encoding. Here are two examples:
(check-expect (encode-dna-tree dt1) (explode "A10G1C5"))
(check-expect (encode-dna-tree (make-end)) empty)
Hint: Follow the template for processing a DNATree. Also, use the built-in functions number->string and explode.

Exercise 16. Combine generate-encoding and encode-dna-tree to design a function encode that consumes a DNA string and produces its run-length encoding. Again, assume that the given DNA string will only consist of upper-case A, C, G and T. Here are two examples:
(check-expect (encode (explode "AAAAAAAAAAGCCCCC")) (explode "A10G1C5"))
(check-expect (encode empty) empty)
Hint: The definition of encode is very short, does not use recursion, and can be written just by looking at the signatures of generate-encoding and encode-dna-tree.

4 Challenges

Exercise 17. Abstract from the helpers drop-digits and drop-letter you designed in Exercises 7 and 12.

Exercise 18. Abstract from the helpers you designed in Exercises 8 and 13.

Exercise 19. Think: on what kinds of input is run-length encoding most efficient (compresses the most) versus least efficient (compresses the least)? Express your answer by defining two constant DNA sequences very-efficient and very-inefficient, such that
(> (compression-ratio very-efficient) 100)
(< (compression-ratio very-inefficient) 1)
 
; compression-ratio : [ListOf 1String] -> Number
; returns how many times shorter encode makes the given string
(check-expect (compression-ratio (make-list 4 "A")) 2)
(check-expect (compression-ratio (make-list 2 "A")) 1)
(define (compression-ratio dna)
  (/ (length dna) (length (encode dna))))