Ruby 中的函数式编程

April 18th, 2012
摘自 <The Ruby Programming Language>，最近学习Ruby，于是结合着中英文版的这书把<函数式编程>章节敲出来了。
6.8. Functional Programming
6.8.1. 对一个Enumerable对象应用一个函数 Applying a Function to an Enumerable

map和inject是Enumerable类定义的两个最重要的迭代器，它们都需要一个代码块。如果用以函数为中心的方式编写程序，我们会喜欢那些可以让函数应用到Enumerable对象上的方法：

# This module defines methods and operators for functional programming.
module Functional
  # Apply this function to each element of the specified Enumerable,
  # returning an array of results. This is the reverse of Enumerable.map.
  # Use | as an operator alias. Read "|" as "over" or "applied over".
  #
  # Example:
  #   a = [[1,2],[3,4]]
  #   sum = lambda {|x,y| x+y}
  #   sums = sum|a   # => [3,7]
  def apply(enum)
    enum.map &self
  end
  alias | apply

  # Use this function to "reduce" an enumerable to a single quantity.
  # This is the inverse of Enumerable.inject.
  # Use <= as an operator alias.
  # Mnemonic: <= looks like a needle for injections
  # Example:
  #   data = [1,2,3,4]
  #   sum = lambda {|x,y| x+y}
  #   total = sum<=data   # => 10
  def reduce(enum)
    enum.inject &self
  end
  alias <= reduce
end

# Add these functional programming methods to Proc and Method classes.
class Proc; include Functional; end
class Method; include Functional; end

注意，我们是在Functional模块中定义方法，然后把这个模块包含(include)在Proc和Method类中，这样apply和reduce方法就可以用在proc和lambda对象上。后面的绝大多数方法仍然定义在Functional模块中，因此它们也可以被Proc和Method对象使用。

有了上面定义的apply和reduce方法，我们可以重构下面的统计方法：

sum = lambda {|x,y| x+y }        # A function to add two numbers
mean = (sum<=a)/a.size           # Or sum.reduce(a) or a.inject(&sum)
deviation = lambda {|x| x-mean } # Function to compute difference from mean
square = lambda {|x| x*x }       # Function to square a number
standardDeviation = Math.sqrt((sum<=square|(deviation|a))/(a.size-1))

值得注意的是，最后一行尽管很简洁，但是那些非标准的操作符使得它难以阅读。还要注意|符是我们自己定义的，它是左连接的，因此，上面的代码要在一个Enumerable对象上应用多个函数，则须要使用圆括号。
也就是说，必须使用 square|(deviation|a)，而不能使用 square|deviation|a。

6.8.2. 复合函数 Composing Functions

如果有两个函数f、g，有时我们会希望定义一个新函数 h=f(g())，它也可被称为f由g复合。
我们可以用一个方法自动进行函数复合，其代码如下：

module Functional
  # Return a new lambda that computes self[f[args]].
  # Use * as an operator alias for compose.
  # Examples, using the * alias for this method.
  #
  # f = lambda {|x| x*x }
  # g = lambda {|x| x+1 }
  # (f*g)[2]   # => 9
  # (g*f)[2]   # => 5
  #
  # def polar(x,y)
  #   [Math.hypot(y,x), Math.atan2(y,x)]
  # end
  # def cartesian(magnitude, angle)
  #   [magnitude*Math.cos(angle), magnitude*Math.sin(angle)]
  # end
  # p,c = method :polar, method :cartesian
  # (c*p)[3,4]  # => [3,4]
  #
  def compose(f)
    if self.respond_to?(:arity) && self.arity == 1
      lambda {|*args| self[f[*args]] }
    else
      lambda {|*args| self[*f[*args]] }
    end
  end
  # * is the natural operator for function composition.
  alias * compose
end

示例中的注释演示了如何对Method对象和lambda使用compose方法。我们可以用这个新的*函数复合操作符来简化标准差的计算。仍然使用上面定义的sum、square和deviation，标准差的计算变为：

standardDeviation = Math.sqrt((sum<=square*deviation|a)/(a.size-1))

区别在于在square和deviation应用到数组a之前，我们先将它们复合成单个函数。

6.8.3. 局部应用函数 Partially Applying Functions

在函数式编程中，局部应用是指用一个函数和部分参数值产生一个新的函数，这个函数等价于用某些固定参数调用原有的函数。例如：

product = lambda {|x, y| x*y }       # A function of two arguments
double = lambda {|x| product(2,x) }  # Apply one argument

局部应用可以用Functional模块中的方法（和操作符）进行简化:

module Functional
  #
  # Return a lambda equivalent to this one with one or more initial
  # arguments applied. When only a single argument
  # is being specified, the >> alias may be simpler to use.
  # Example:
  #   product = lambda {|x,y| x*y}
  #   doubler = lambda >> 2
  #
  def apply_head(*first)
    lambda {|*rest| self[*first.concat(rest)]}
  end
  #
  # Return a lambda equivalent to this one with one or more final arguments
  # applied. When only a single argument is being specified,
  # the << alias may be simpler.
  # Example:
  #  difference = lambda {|x,y| x-y }
  #  decrement = difference << 1
  #
  def apply_tail(*last)
    lambda {|*rest| self[*rest.concat(last)]}
  end
  # Here are operator alternatives for these methods. The angle brackets
  # point to the side on which the argument is shifted in.
  alias >> apply_head    # g = f >> 2 -- set first arg to 2
  alias << apply_tail    # g = f << 2 -- set last arg to 2
end

使用这些方法和操作符，可以简单地用 product>>2 来定义我们的double函数。
使用局部应用，我们使标准差计算变得更加抽象，这可以通过一个更通用的difference函数来实现：

difference = lambda {|x,y| x-y }  # Compute difference of two numbers
deviation = difference<<mean      # Apply second argument

6.8.4. 缓存函数 Memoizing Functions

Memoization是函数式编程的一个术语，表示缓存函数调用的结果。如果一个函数对同样的参数输入总是返回相同的结果，另外出于某种需要我们认为这些参数会不断使用，而且执行这个函数比较消耗资源，那么memoization可能是一个有用的优化。我们可以通过下面的方法使Proc和Method对象的memoization自动化：

module Functional
  #
  # Return a new lambda that caches the results of this function and
  # only calls the function when new arguments are supplied.
  #
  def memoize
    cache = {}  # An empty cache. The lambda captures this in its closure.
    lambda {|*args|
      # notice that the hash key is the entire array of arguments!
      unless cache.has_key?(args)  # If no cached result for these args
        cache[args] = self[*args]  # Compute and cache the result
      end
      cache[args]                  # Return result from cache
    }
  end
  # A (probably unnecessary) unary + operator for memoization
  # Mnemonic: the + operator means "improved"
  alias +@ memoize        # cached_f = +f
end

下面是如何使用memoize方法或一元+操作符的例子：

# A memoized recursive factorial function
factorial = lambda {|x| return 1 if x==0; x*factorial[x-1]; }.memoize
# Or, using the unary operator syntax
factorial = +lambda {|x| return 1 if x==0; x*factorial[x-1]; }

注意这里的factorial方法是一个递归函数，它自己也会调用缓存版本的自身函数，得到最佳的缓存效果。否则，如果定义一个非缓存版本的递归函数，然后根据它定义一个缓存版本方法，优化效果就没有那么突出了：

factorial = lambda {|x| return 1 if x==0; x*factorial[x-1]; }
cached_factorial = +factorial # Recursive calls aren't cached!

6.8.5. 符号、方法和Proc Symbols, Methods, and Procs

Symbol、Method和Proc类有亲密的关系。我们已经看到method方法可以用一个Symbol对象作为参数，然后返回一个Method方法。
Ruby1.9为Symbol类增加了一个有用的to_proc方法，这个方法允许用&打头的符号作为代码块被传入到一个迭代器中。在这里，这个符号被假定为一个方法名。在用to_proc方法创建的Proc对象被调用时，它会调用第一个参数所表示的名字的方法，剩下的参数则作为参数传递给这个方法。示例如下：

# Increment an array of integers with the Fixnum.succ method
[1,2,3].map(&:succ)  # => [2,3,4]

如果不用Symbol.to_proc方法，我们会不得不更啰嗦一些：
[1,2,3].map {|n| n.succ }

Symbol.to_proc 最初是作为Ruby1.8的扩展被引入的，它通常使用如下方式实现：

class Symbol
  def to_proc
    lambda {|receiver, *args| receiver.send(self, *args)}
  end
end

这个实现使用send方法（参见第8.4.3节）来调用一个符号命名的方法。不过，也可以像下面这样做：

class Symbol
  def to_proc
    lambda {|receiver, *args| receiver.method(self)[*args]}
  end
end

除了to_proc，还能定义一些相关而且有用的工具方法，首先从Module类开始：

class Module
  # Access instance methods with array notation. Returns UnboundMethod,
  alias [] instance_method
end

这里我们为Module类的instance_method定义了一个快捷方式。注意，这个方法会返回一个UnboundMethod对象，它在绑定到一个对象前不能被调用，下面是一个使用这种新记号的例子（注意，我们可以用名字像索引一样获得一个方法!）：

String[:reverse].bind("hello").call   # => "olleh"

用同样的语法糖，绑定一个无绑定的方法也可以变得简单：

class UnboundMethod
  # Allow [] as an alternative to bind. 
  alias [] bind
end

使用这里定义的别名，以及已有的用来调用方法的[]别名，上面的代码会变成：

String[:reverse]["hello"][]   # => "olleh"

第一个方括号索引一个方法，第二个方括号将它绑定，而第三个方括号则调用它。

接下来，既然我们用[]操作符来索引一个类的方法，那么就用[]=来定义一个实例方法：

class Module
  # Define a instance method with name sym and body f.
  # Example: String[:backwards] = lambda { reverse }
  def []=(sym, f)
    self.instance_eval { define_method(sym, f) }
  end
end

[]=操作符的定义有点让人费解——这是Ruby的髙级内容。
define_method是Module的私有方法，我们用instance_eval方法（Object的一个公开方法）运行一个代码块（包括一个私有方法的调用），就像它位于方法定义的模块中一样。
在第8章中我们将再次看到instance_eval和define_method方法。

让我们用新的[]=操作符定义一个新的Enumerable.average方法：

Enumerable[:average] = lambda do
  sum, n = 0.0, 0
  self.each {|x| sum += x; n += 1 }
  if n == 0
    nil
  else
    sum/n
  end
end

现在我们有了[]和[]=操作符，它们可以用于获得和设置一个类或模块的成员方法。我们也可以对类的单键方法做相似的事情（也包括类或模块的类方法)。任何对象都可以有单键方法，不过给Object类定义一个[]操作符有点说不通，因为已经有太多的子类已经定义了这个操作符，因此我们可以用另外一种方式，给Symbol类定义这样的操作符：

#
# Add [] and []= operators to the Symbol class for accessing and setting
# singleton methods of objects. Read : as "method" and [] as "of".
# So :m[o] reads "method m of o".
#
class Symbol
  # Return the Method of obj named by this symbol. This may be a singleton
  # method of obj (such as a class method) or an instance method defined
  # by obj.class or inherited from a superclass.
  # Examples:
  #   creator = :new[Object]  # Class method Object.new
  #   doubler = :*[2]         # * method of Fixnum 2
  #
  def [](obj)
    obj.method(self)
  end
  # Define a singleton method on object o, using Proc or Method f as its body.
  # This symbol is used as the name of the method.
  # Examples:
  #
  #  :singleton[o] = lambda { puts "this is a singleton method of o" }
  #  :class_method[String] = lambda { puts "this is a class method" }
  #
  # Note that you can't create instance methods this way. See Module.[]=
  #
  def []=(o,f)
    # We can't use self in the block below, as it is evaluated in the
    # context of a different object. So we have to assign self to a variable.
    sym = self
    # This is the object we define singleton methods on.
    eigenclass = (class << o; self end)
    # define_method is private, so we have to use instance_eval to execute it.
    eigenclass.instance_eval { define_method(sym, f) }
  end
end

通过这里定义的Symbol.[]方法，以及前面描述的Functional模块，我们可以写出如下精巧的（也是难读的）代码：

dashes = :*['-']       # Method * of '-'
puts dashes[10]        # Prints "----------"
y = (:+[1]*:*[2])[x]   # Another way to write y = 2*x + 1

Symbol类定义的[]=与Module类的[]=相似，都使用instance_eval调用define_method方法。不同点在于单键方法并不像实例方法一样在类中定义，而是在对象的中定义，在第7章中，我们还将见到eigenclass。
WarmGrid

Answerers: April and Probe
Ruby 中的函数式编程

Share Post

WarmGrid