Bottom-up Explicit Search.

Chentian Wu

2024-06-12

参考 MIT CS 6.S981 Lec 3

1. Simple bottom up search

最简单的 Bottom-up Synthesis 算法是从语法中的 terminal 开始, 根据语法明确构建所有可能的 programs.

Synthesize(inputs, outputs):
  plist := set of all terminals
  while(true):
    plist := grow(plist);
    plist := elimEquvalents(plist, inputs);
    forall(p in plist):
      if(isCorrect(p, inputs, outputs)): return p;

算法的关键步骤是这样的两个函数:

grow(·): uses non-terminals in the grammar to construct new terms from all the terms in plist
elimEquivalents(·, ·): check of equivalence is not an real equivalence check, which would be expensive. Instead, the expressions are tested on the target inputs, and any two expression that produce the same outputs on these inputs are deemed equivalent, regardless of whether they are truly equivalent or not. (所谓的 observational equivalence)

这个简单的算法有几个好处:

自然地满足从小到大遍历, 所以我们找到的程序自然就是最小的程序.
很容易将启发式方法引入 grow 和 elimEquivalents 指导程序搜索，以便首先发现被认为更理想的程序，或者根据哪些程序更可能正确的 先验知识 加快搜索速度.
适用于黑盒语言构建块.

2. Formal Requirements

并不是所有的语法都允许我们进行自底向上的搜索. 需要满足一些要求.

最重要的要求是程序的语法必须是上下文无关文法 (CFG).
语义具有下面描述的 Context independent equivalence.

Definition. Context independent equivalence

Given two expressions $$e_1$$ , $$e_2$$ evaluated on inputs $\sigma$ , bottom-up search requires that the following condition be true:

\forall\sigma.\textsf{ObsEquiv}(e_1, e_2, \sigma)=\forall\mathcal C.\textsf{ObsEquiv}(\mathcal C[e_1], \mathcal C[e_2], \sigma)

where $\textsf{ObsEquiv}(e_1, e_2, \textsf{inputs})$ is the observational equivalence of two expressions under a given set of inputs.

其中,

$\mathcal C$ 是任意的上下文 (i.e. expressions with a hole)
$\mathcal C[e]$ is the expression by filling the hole with expression $$e$$

Note. 显然 observatioanl equivalent 不是真的 equivalent, 因此我们在使用 observational equivalence 的时候必须非常小心. 对于涉及到递归调用的语言, 我们要保证它的 Trace Completeness, 也就是说用户提供的 examples 必须要包括所有递归调用的 IO.

Trace Completeness 从某种程度来说是一种逃避. 对于通过示例问题进行编程, 要求 Trace Completeness 可能对用户要求过高. 但程序综合有许多应用, 例如逆向工程, 其中请求更多示例相对便宜, 因此对于这些应用 Trace Completeness 可能是一种适当的解决方案.

即使满足了上下文无关的等价性, 该算法还有另一个重要限制: scalaibility. 即使采用非常激进的 prunning, 也很难将算法扩展到少数几个项之外.

3. Synthesis through Unification (STUN)

为了扩展程序合成的 scalability, 一个好的办法是搜索适用于不同情况的多个程序, 然后找到将它们组合成一个适用于所有输入的程序的方法.

TBD.