用python写语言的解释器

来源:岁月联盟 编辑:exp 时间:2011-12-26

我花了一下午的时间完成了一个简单语言的解释器,我会在最后帖出所有代码,但是今天不打算详细解释程序的每一个步骤,最近考虑找实习、做论文,要花一些时间。有时间了我会解释每一部分,在这里只提醒一下读者,程序的写作过程和它呈现出来的不一样,总体来说我的写作过程是先写一个只包含一条指令的解释器,然后逐渐加入其他指令。ps:我是多么的想看看高手们写程序的过程,而不只是结果,但是就像graham说的“写的过程往往显得冗长”,所以这方面的书籍不多。我觉得比较接近的书包括《clean code》、《paip》、《software tools》。只可惜现在只看完了一本半。

想法的来源
昨天圣诞节,坐在不能上网的实验室,觉得好无聊。然后想起了学习《可计算性与计算复杂性》的时候,里面用到了一种含有5条指令的原语言,老师也要求我们用这个语言写程序,所以我就想写一个解释器,来执行用该语言,并把这个解释器作为自己的圣诞礼物。
原语言描述
书中说它是一个fortran式的简单语言,所以我就给它起了个简单的名字Metafor,表示meta fortran。下面来看一下它的具体描述,然后贴上我在代码里更详细的描述,我为每一种指令都起了个名字(注意其中把赋值语句命名为setf是由于我最近在写common lisp多一些)。
+-------------------------------------------------------------------------------------------+
指令                         描述
x = x + 1                   变量x的值加1
x = x - 1                    变元x的值减1。若x的值为0,则结果仍为0
TO A IF x≠0             若x≠0,则转标号为A的指令;否则执行下一条指令
TO A                        无条件转到标号为A的指令
y=x                           把x的值赋给变元y,x值保持不变
+-------------------------------------------------------------------------------------------+

1. Metafor has five instructions: 
 
name instruction description 
 
inc: x = x + 1, Increase by 1 the value of the variable x. 
dec: x = x - 1, If the value of x is 0, leave it unchanged; otherwise 
                decrease by 1 the value of x. 
con_goto: TO A IF x != 0, If the value of x is nonzero, perform the instruction 
                          with the label A next; otherwise proceed to the next 
                          instruction in the list. 
goto: TO A, Perform the instruction with the label A next. 
setf: y = x, Change the value variable y to the value of variable x. 
 
2. Some specification: 
 
(1) Input variables are represented as: 
x1, x2, x3, ... 
 
(2) Local variables are represented as: 
z1, z2, z3, ... 
 
(3) Output variable is represented as: 

 
note: the num 1 is often omitted(i.e., x stands for x1 and z stands 
for z1). 
 
3. Labeled instructions: 
 
Instructions may or may not have labels. When an instruction is labeled, 
the label is written to its left in square brackets. For example, 
 
[B] Z = Z - l 
 
4. A smaple program: 
 
"Program to compute y = x1+ x2" 
 
    y = x1 
[B] TO A IF x2 != 0 
    TO E 
[A] x2 = x2 - 1 
    y = y + 1 
    TO B 
 
4. For more information, please refer to 《可计算性与计算复杂性》, 
周长林、李占山著. 
整体思路
由于该语言中含有goto语句,所以我不能一句一句的解释执行,我需要把整个程序读进来,转换成一种方面的内部格式后,作为整体执行。例如,这是计算y = x + x2的语言程序:

y = x 
TO A IF x2 != 0 
TO E 
x2 = x2 - 1 
y = y + 1 
TO B 
它转换为内部格式后,为:

[['setf', 'y', 'x'], 
 ['labeled_exp', 'B', ['con_goto', 'A', 'x2']], 
 ['goto', 'E'], 
 ['labeled_exp', 'A', ['dec', 'x2']], 
 ['inc', 'y'], 
 ['goto', 'B']] 
运行演示——从文件获得程序
还是前面那个y = x + x2的程序,首先我需要为两个输入参数(x, x2)给定值,如果没有为其赋值,那么它们的值默认为0。另外,我们还可以查看程序的内部表示结构,以及简单调试(就查看其他环境变量的值)。其演示结果如下:

====================================================== 
*Metafor 1.0, Welecome to Metafor shell environment. * 
*Author: Zhu zhaolong(zzljlu@gmail.com)              * 
====================================================== 
 
Metafor> 2 3 

Metafor> 5 6 
11 
Metafor> code 
[['setf', 'y', 'x'], 
 ['labeled_exp', 'B', ['con_goto', 'A', 'x2']], 
 ['goto', 'E'], 
 ['labeled_exp', 'A', ['dec', 'x2']], 
 ['inc', 'y'], 
 ['goto', 'B']] 
>>> ========================================== RESTART ========================================== 
>>>  
====================================================== 
*Metafor 1.0, Welecome to Metafor shell environment. * 
*Author: Zhu zhaolong(zzljlu@gmail.com)              * 
====================================================== 
 
Metafor> 234 5 
239 
Metafor> debug 
debug> x 
234 
debug> x2 

debug> y 
239 
debug> exit 
>>>  
运行演示——从终端用户直接输入原语言程序
我写了一个简单的repl交互环境,这样我们可以方便的测试简单的原语言程序,例如下面给出了一个y =  x + 1的程序,示例如下:

>>> repl() 
====================================================== 
*Metafor 1.0, Welecome to Metafor shell environment. * 
*Author: Zhu zhaolong(zzljlu@gmail.com)              * 
====================================================== 
 
 
 
Input your program: 
x = x + 1 
y = x 
 
inputs> 5 
=>6 
 
 
Input your program: 
全部代码
在这里我贴出所有python代码:

""" Metafor: meta fortran, is a small language used in
"Computabilit and Complexity" course in Jilin University.
 
@Author: Zhu Zhaolong(zzljlu@gmail.com)
@Date: 2011.12.25
 
 
1. Metafor has five instructions:
 
name instruction description
 
inc: x = x + 1, Increase by 1 the value of the variable x.
dec: x = x - 1, If the value of x is 0, leave it unchanged; otherwise
                decrease by 1 the value of x.
con_goto: TO A IF x != 0, If the value of x is nonzero, perform the instruction
                          with the label A next; otherwise proceed to the next
                          instruction in the list.
goto: TO A, Perform the instruction with the label A next.
setf: y = x, Change the value variable y to the value of variable x.
 
2. Some specification:
 
(1) Input variables are represented as:
x1, x2, x3, ...
 
(2) Local variables are represented as:
z1, z2, z3, ...
 
(3) Output variable is represented as:
y
 
note: the num 1 is often omitted(i.e., x stands for x1 and z stands
for z1).
 
3. Labeled instructions:
 
Instructions may or may not have labels. When an instruction is labeled,
the label is written to its left in square brackets. For example,
 
[B] Z = Z - l
 
4. A smaple program:
 
"Program to compute y = x1+ x2"
 
    y = x1
[B] TO A IF x2 != 0
    TO E
[A] x2 = x2 - 1
    y = y + 1
    TO B
 
4. For more information, please refer to 《可计算性与计算复杂性》,
周长林、李占山著.
 
""" 
###======================================================== 
import re 
from pprint import pprint 
 
###===========================parse======================== 
def parse(program): 
    return [read(e) for e in program] 
 
def read(s): 
    """Read a metafor insctruction from a string. and change
    it to internal represtation.""" 
    return read_from(tokenize(s)) 
 
def tokenize(s): 
    "Convert a string into alist of tokens." 
    return s.split() 
 
def read_from(tokens): 
    "Read an exp from a sequence of tokens." 
    if len(tokens) == 0: 
        raise SyntaxError("uncexpected EOF while reading.") 
 
    if is_setf(tokens): 
        return make_setf(tokens[0], tokens[2]) 
    elif is_inc(tokens): 
        return make_inc(tokens[0]) 
    elif is_dec(tokens): 
        return make_dec(tokens[0]) 
    elif is_goto(tokens): 
        return make_goto(tokens[1]) 
    elif is_con_goto(tokens): 
        return make_con_goto(tokens[1], tokens[3]) 
    elif is_labeled_exp(tokens): 
        return make_labeled_exp(get_label(tokens[0]), read_from(tokens[1:])) 
    else: 
        raise SyntaxError("unexpected instructions: %s" % " ".join(tokens)) 
 
 
#======================================================= 
 
def is_variable(token): 
    if token.startswith("x") or / 
       token.startswith("z") or / 
       token == "y": 
        return True 
    else: 
        return False 
 
def get_label(token): 
    "Token is like [A1]. we want get A1." 
    return token.replace('[', '').replace(']', '') 
 
## setf     
def is_setf(tokens): 
    if tokens[1] == "=" and len(tokens) == 3: 
        if is_variable(tokens[0]) and is_variable(tokens[2]): 
            return True 
        else: 
            raise SyntaxError("unexpected setf instruction: %s" % " ".join(tokens)) 
    else: 
        return False 
 
def make_setf(des_var, src_var): 
    return ["setf", des_var, src_var] 
 
## inc  
def is_inc(tokens): 
    if len(tokens) == 5 and tokens[1] == "=" / 
       and tokens[3] == "+" and tokens[4] == "1": 
        if tokens[0] == tokens[2] and is_variable(tokens[0]): 
            return True 
        else: 
            raise SyntaxError("unexpected inc instruction: %s" % " ".join(tokens)) 
    else: 
        return False 
def make_inc(var): 
    return ["inc", var] 
 
## dec 
def is_dec(tokens): 
    if len(tokens) == 5 and tokens[1] == "=" / 
       and tokens[3] == "-" and tokens[4] == "1": 
        if tokens[0] == tokens[2] and is_variable(tokens[0]): 
            return True 
        else: 
            raise SyntaxError("unexpected dec instruction: %s" % " ".join(tokens)) 
    else: 
        return False 
def make_dec(var): 
    return ["dec", var] 
 
## goto 
def is_goto(tokens): 
    if len(tokens) == 2 and tokens[0] == "TO": 
        return True 
    else: 
        return False 
 
def make_goto(lable): 
    return ["goto", lable] 
 
## con_goto 
def is_con_goto(tokens): 
    if len(tokens) == 6 and tokens[0] == "TO" / 
       and tokens[2] == "IF" and is_variable(tokens[3]) / 
       and tokens[4] == "!=" and tokens[5] == "0": 
        return True 
    else: 
        return False 
 
def make_con_goto(lable, var): 
    return ["con_goto", lable, var] 
 
## labeled exp 
def is_labeled_exp(tokens): 
    return tokens[0].startswith('[') 
 
def make_labeled_exp(label, exp): 
    return ["labeled_exp", label, exp] 
 
 
###==========================CodeSeg====================== 
 
def is_label(s): 
    return s == "labeled_exp" 
         
class CodeSeg: 
    """ To store internal instructions (exps).
    N : the number of instructions.
    exps: the list of instructions.
    label_2_exps: a dict of {'label': num} pairs.""" 
     
    def __init__(self, exps): 
        self.N = len(exps) 
        self.exps = [] 
        self.label_2_exp={} 
        for (i, e) in enumerate(exps): 
            if is_label(e[0]): 
                (_, label, exp) = e 
                self.label_2_exp[label] = i 
                self.exps.append(exp) 
            else: 
                self.exps.append(e) 
 
    def get_instruction_num(self, label): 
        "Return instruction num with the label." 
        if label in self.label_2_exp.keys(): 
            return self.label_2_exp[label] 
        # IF there is no instruction with that label, 
        # return a instruction num that doesn't exit. 
        else:  
            return self.N 
 
    def get_instruction(self, pc): 
        return self.exps[pc] 
     
         
###=============================Env====================== 
 
class Env(): 
    "An enviromnet: a dict of {'var': val} paris, var's default value is 0." 
    def __init__(self, args): 
        self.pairs = {} 
        for (i, v) in enumerate(args): 
            if i == 0: 
                self.pairs['x'] = int(v) 
            else: 
                self.pairs['x'+str(i+1)] = int(v) 
 
    def get_v(self, var): 
        if var in self.pairs.keys(): 
            return self.pairs[var] 
        else: 
            return 0 
         
    def set_v(self, var, value): 
        self.pairs[var] = value 
 
###=========================mainloop======================= 
 
def mainloop(codeseg, env): 
    "pc refers to the current instruction."  
    pc = 0 
    while pc < codeseg.N: 
        e = codeseg.get_instruction(pc) 
        if e[0] == 'inc': 
            (_, v) = e 
            env.set_v(v, env.get_v(v) + 1) 
            pc += 1 
        elif e[0] == 'dec': 
            (_, v) = e 
            env.set_v(v, max(env.get_v(v) - 1, 0)) 
            pc += 1 
        elif e[0] == 'setf': 
            (_, x, y) = e 
            env.set_v(x, env.get_v(y)) 
            pc += 1 
        elif e[0] == 'goto': 
            (_, label) = e 
            pc = codeseg.get_instruction_num(label) 
        elif e[0] == "con_goto": 
            (_, label, var) = e 
            val = env.get_v(var) 
            if val != 0: 
                pc = codeseg.get_instruction_num(label) 
            else: 
                pc += 1 
        else: 
            raise SyntaxError("unexpected instructions: %s" % " ".join(e)) 
    return env.get_v('y') 
 
###========================repl================================ 
 
def repl(input_prompt='inputs> '): 
    "A prompt-read-eval-print loop." 
    print_info() 
    while True: 
        program = get_prog_from_shell() 
        if program == []: 
            return "Return successfully!" 
        codes = CodeSeg(parse(program)) 
        inputs = raw_input(input_prompt) 
        env = Env(inputs.split()) 
        print("=>" + str(mainloop(codes, env))) 
             
 
def get_prog_from_shell(): 
    program = [] 
    exp = raw_input("/n/nInput your program:/n") 
    while exp: 
        program.append(exp) 
        exp = raw_input() 
    return program 
 
def print_info(): 
    print("======================================================/n" 
          "*Metafor 1.0, Welecome to Metafor shell environment. */n" 
          "*Author: Zhu zhaolong(zzljlu@gmail.com)              */n" 
          "======================================================/n") 
 
###========================run============================= 
 
def run(filepath, prompt='Metafor> '): 
    "This function can run Metafor program from file." 
    print_info() 
    program = get_prog_from_file(filepath) 
    parsed_prog = parse(program) 
    codes = CodeSeg(parsed_prog) 
 
    inputs = raw_input(prompt) 
    while True: 
        env = Env(inputs.split()) 
        print(mainloop(codes, env)) 
         
        inputs = raw_input(prompt) 
        if inputs == "exit": 
            return "Return successfully!" 
        if inputs == "debug": 
            while True: 
                var = raw_input('debug> ') 
                if var == "exit": 
                    return None 
                print(env.get_v(var)) 
        if inputs == "code": 
            pprint(parsed_prog) 
            return None 
 
def get_prog_from_file(filepath): 
    f = open(filepath, "r") 
    program = f.readlines() 
    f.close() 
 
    return program 
 
###===========================test================================== 
 
# y = x + x2 
test_file_1 = "metafor_test.mf" 
# y = 2 * x 
test_file_2 = "metafor_test2.mf" 
 
if __name__ == "__main__":  
    run(test_file_1) 
  
 摘自   zzljlu的专栏     

上一篇:python发邮件

图片内容