用python写语言的解释器
我花了一下午的时间完成了一个简单语言的解释器,我会在最后帖出所有代码,但是今天不打算详细解释程序的每一个步骤,最近考虑找实习、做论文,要花一些时间。有时间了我会解释每一部分,在这里只提醒一下读者,程序的写作过程和它呈现出来的不一样,总体来说我的写作过程是先写一个只包含一条指令的解释器,然后逐渐加入其他指令。ps:我是多么的想看看高手们写程序的过程,而不只是结果,但是就像graham说的“写的过程往往显得冗长”,所以这方面的书籍不多。我觉得比较接近的书包括《clean code》、《paip》、《software tools》。只可惜现在只看完了一本半。
想法的来源
昨天圣诞节,坐在不能上网的实验室,觉得好无聊。然后想起了学习《可计算性与计算复杂性》的时候,里面用到了一种含有5条指令的原语言,老师也要求我们用这个语言写程序,所以我就想写一个解释器,来执行用该语言,并把这个解释器作为自己的圣诞礼物。
原语言描述
书中说它是一个fortran式的简单语言,所以我就给它起了个简单的名字Metafor,表示meta fortran。下面来看一下它的具体描述,然后贴上我在代码里更详细的描述,我为每一种指令都起了个名字(注意其中把赋值语句命名为setf是由于我最近在写common lisp多一些)。
+-------------------------------------------------------------------------------------------+
指令 描述
x = x + 1 变量x的值加1
x = x - 1 变元x的值减1。若x的值为0,则结果仍为0
TO A IF x≠0 若x≠0,则转标号为A的指令;否则执行下一条指令
TO A 无条件转到标号为A的指令
y=x 把x的值赋给变元y,x值保持不变
+-------------------------------------------------------------------------------------------+
1. Metafor has five instructions:
name instruction description
inc: x = x + 1, Increase by 1 the value of the variable x.
dec: x = x - 1, If the value of x is 0, leave it unchanged; otherwise
decrease by 1 the value of x.
con_goto: TO A IF x != 0, If the value of x is nonzero, perform the instruction
with the label A next; otherwise proceed to the next
instruction in the list.
goto: TO A, Perform the instruction with the label A next.
setf: y = x, Change the value variable y to the value of variable x.
2. Some specification:
(1) Input variables are represented as:
x1, x2, x3, ...
(2) Local variables are represented as:
z1, z2, z3, ...
(3) Output variable is represented as:
y
note: the num 1 is often omitted(i.e., x stands for x1 and z stands
for z1).
3. Labeled instructions:
Instructions may or may not have labels. When an instruction is labeled,
the label is written to its left in square brackets. For example,
[B] Z = Z - l
4. A smaple program:
"Program to compute y = x1+ x2"
y = x1
[B] TO A IF x2 != 0
TO E
[A] x2 = x2 - 1
y = y + 1
TO B
4. For more information, please refer to 《可计算性与计算复杂性》,
周长林、李占山著.
整体思路
由于该语言中含有goto语句,所以我不能一句一句的解释执行,我需要把整个程序读进来,转换成一种方面的内部格式后,作为整体执行。例如,这是计算y = x + x2的语言程序:
y = x
TO A IF x2 != 0
TO E
x2 = x2 - 1
y = y + 1
TO B
它转换为内部格式后,为:
[['setf', 'y', 'x'],
['labeled_exp', 'B', ['con_goto', 'A', 'x2']],
['goto', 'E'],
['labeled_exp', 'A', ['dec', 'x2']],
['inc', 'y'],
['goto', 'B']]
运行演示——从文件获得程序
还是前面那个y = x + x2的程序,首先我需要为两个输入参数(x, x2)给定值,如果没有为其赋值,那么它们的值默认为0。另外,我们还可以查看程序的内部表示结构,以及简单调试(就查看其他环境变量的值)。其演示结果如下:
======================================================
*Metafor 1.0, Welecome to Metafor shell environment. *
*Author: Zhu zhaolong(zzljlu@gmail.com) *
======================================================
Metafor> 2 3
5
Metafor> 5 6
11
Metafor> code
[['setf', 'y', 'x'],
['labeled_exp', 'B', ['con_goto', 'A', 'x2']],
['goto', 'E'],
['labeled_exp', 'A', ['dec', 'x2']],
['inc', 'y'],
['goto', 'B']]
>>> ========================================== RESTART ==========================================
>>>
======================================================
*Metafor 1.0, Welecome to Metafor shell environment. *
*Author: Zhu zhaolong(zzljlu@gmail.com) *
======================================================
Metafor> 234 5
239
Metafor> debug
debug> x
234
debug> x2
0
debug> y
239
debug> exit
>>>
运行演示——从终端用户直接输入原语言程序
我写了一个简单的repl交互环境,这样我们可以方便的测试简单的原语言程序,例如下面给出了一个y = x + 1的程序,示例如下:
>>> repl()
======================================================
*Metafor 1.0, Welecome to Metafor shell environment. *
*Author: Zhu zhaolong(zzljlu@gmail.com) *
======================================================
Input your program:
x = x + 1
y = x
inputs> 5
=>6
Input your program:
全部代码
在这里我贴出所有python代码:
""" Metafor: meta fortran, is a small language used in
"Computabilit and Complexity" course in Jilin University.
@Author: Zhu Zhaolong(zzljlu@gmail.com)
@Date: 2011.12.25
1. Metafor has five instructions:
name instruction description
inc: x = x + 1, Increase by 1 the value of the variable x.
dec: x = x - 1, If the value of x is 0, leave it unchanged; otherwise
decrease by 1 the value of x.
con_goto: TO A IF x != 0, If the value of x is nonzero, perform the instruction
with the label A next; otherwise proceed to the next
instruction in the list.
goto: TO A, Perform the instruction with the label A next.
setf: y = x, Change the value variable y to the value of variable x.
2. Some specification:
(1) Input variables are represented as:
x1, x2, x3, ...
(2) Local variables are represented as:
z1, z2, z3, ...
(3) Output variable is represented as:
y
note: the num 1 is often omitted(i.e., x stands for x1 and z stands
for z1).
3. Labeled instructions:
Instructions may or may not have labels. When an instruction is labeled,
the label is written to its left in square brackets. For example,
[B] Z = Z - l
4. A smaple program:
"Program to compute y = x1+ x2"
y = x1
[B] TO A IF x2 != 0
TO E
[A] x2 = x2 - 1
y = y + 1
TO B
4. For more information, please refer to 《可计算性与计算复杂性》,
周长林、李占山著.
"""
###========================================================
import re
from pprint import pprint
###===========================parse========================
def parse(program):
return [read(e) for e in program]
def read(s):
"""Read a metafor insctruction from a string. and change
it to internal represtation."""
return read_from(tokenize(s))
def tokenize(s):
"Convert a string into alist of tokens."
return s.split()
def read_from(tokens):
"Read an exp from a sequence of tokens."
if len(tokens) == 0:
raise SyntaxError("uncexpected EOF while reading.")
if is_setf(tokens):
return make_setf(tokens[0], tokens[2])
elif is_inc(tokens):
return make_inc(tokens[0])
elif is_dec(tokens):
return make_dec(tokens[0])
elif is_goto(tokens):
return make_goto(tokens[1])
elif is_con_goto(tokens):
return make_con_goto(tokens[1], tokens[3])
elif is_labeled_exp(tokens):
return make_labeled_exp(get_label(tokens[0]), read_from(tokens[1:]))
else:
raise SyntaxError("unexpected instructions: %s" % " ".join(tokens))
#=======================================================
def is_variable(token):
if token.startswith("x") or /
token.startswith("z") or /
token == "y":
return True
else:
return False
def get_label(token):
"Token is like [A1]. we want get A1."
return token.replace('[', '').replace(']', '')
## setf
def is_setf(tokens):
if tokens[1] == "=" and len(tokens) == 3:
if is_variable(tokens[0]) and is_variable(tokens[2]):
return True
else:
raise SyntaxError("unexpected setf instruction: %s" % " ".join(tokens))
else:
return False
def make_setf(des_var, src_var):
return ["setf", des_var, src_var]
## inc
def is_inc(tokens):
if len(tokens) == 5 and tokens[1] == "=" /
and tokens[3] == "+" and tokens[4] == "1":
if tokens[0] == tokens[2] and is_variable(tokens[0]):
return True
else:
raise SyntaxError("unexpected inc instruction: %s" % " ".join(tokens))
else:
return False
def make_inc(var):
return ["inc", var]
## dec
def is_dec(tokens):
if len(tokens) == 5 and tokens[1] == "=" /
and tokens[3] == "-" and tokens[4] == "1":
if tokens[0] == tokens[2] and is_variable(tokens[0]):
return True
else:
raise SyntaxError("unexpected dec instruction: %s" % " ".join(tokens))
else:
return False
def make_dec(var):
return ["dec", var]
## goto
def is_goto(tokens):
if len(tokens) == 2 and tokens[0] == "TO":
return True
else:
return False
def make_goto(lable):
return ["goto", lable]
## con_goto
def is_con_goto(tokens):
if len(tokens) == 6 and tokens[0] == "TO" /
and tokens[2] == "IF" and is_variable(tokens[3]) /
and tokens[4] == "!=" and tokens[5] == "0":
return True
else:
return False
def make_con_goto(lable, var):
return ["con_goto", lable, var]
## labeled exp
def is_labeled_exp(tokens):
return tokens[0].startswith('[')
def make_labeled_exp(label, exp):
return ["labeled_exp", label, exp]
###==========================CodeSeg======================
def is_label(s):
return s == "labeled_exp"
class CodeSeg:
""" To store internal instructions (exps).
N : the number of instructions.
exps: the list of instructions.
label_2_exps: a dict of {'label': num} pairs."""
def __init__(self, exps):
self.N = len(exps)
self.exps = []
self.label_2_exp={}
for (i, e) in enumerate(exps):
if is_label(e[0]):
(_, label, exp) = e
self.label_2_exp[label] = i
self.exps.append(exp)
else:
self.exps.append(e)
def get_instruction_num(self, label):
"Return instruction num with the label."
if label in self.label_2_exp.keys():
return self.label_2_exp[label]
# IF there is no instruction with that label,
# return a instruction num that doesn't exit.
else:
return self.N
def get_instruction(self, pc):
return self.exps[pc]
###=============================Env======================
class Env():
"An enviromnet: a dict of {'var': val} paris, var's default value is 0."
def __init__(self, args):
self.pairs = {}
for (i, v) in enumerate(args):
if i == 0:
self.pairs['x'] = int(v)
else:
self.pairs['x'+str(i+1)] = int(v)
def get_v(self, var):
if var in self.pairs.keys():
return self.pairs[var]
else:
return 0
def set_v(self, var, value):
self.pairs[var] = value
###=========================mainloop=======================
def mainloop(codeseg, env):
"pc refers to the current instruction."
pc = 0
while pc < codeseg.N:
e = codeseg.get_instruction(pc)
if e[0] == 'inc':
(_, v) = e
env.set_v(v, env.get_v(v) + 1)
pc += 1
elif e[0] == 'dec':
(_, v) = e
env.set_v(v, max(env.get_v(v) - 1, 0))
pc += 1
elif e[0] == 'setf':
(_, x, y) = e
env.set_v(x, env.get_v(y))
pc += 1
elif e[0] == 'goto':
(_, label) = e
pc = codeseg.get_instruction_num(label)
elif e[0] == "con_goto":
(_, label, var) = e
val = env.get_v(var)
if val != 0:
pc = codeseg.get_instruction_num(label)
else:
pc += 1
else:
raise SyntaxError("unexpected instructions: %s" % " ".join(e))
return env.get_v('y')
###========================repl================================
def repl(input_prompt='inputs> '):
"A prompt-read-eval-print loop."
print_info()
while True:
program = get_prog_from_shell()
if program == []:
return "Return successfully!"
codes = CodeSeg(parse(program))
inputs = raw_input(input_prompt)
env = Env(inputs.split())
print("=>" + str(mainloop(codes, env)))
def get_prog_from_shell():
program = []
exp = raw_input("/n/nInput your program:/n")
while exp:
program.append(exp)
exp = raw_input()
return program
def print_info():
print("======================================================/n"
"*Metafor 1.0, Welecome to Metafor shell environment. */n"
"*Author: Zhu zhaolong(zzljlu@gmail.com) */n"
"======================================================/n")
###========================run=============================
def run(filepath, prompt='Metafor> '):
"This function can run Metafor program from file."
print_info()
program = get_prog_from_file(filepath)
parsed_prog = parse(program)
codes = CodeSeg(parsed_prog)
inputs = raw_input(prompt)
while True:
env = Env(inputs.split())
print(mainloop(codes, env))
inputs = raw_input(prompt)
if inputs == "exit":
return "Return successfully!"
if inputs == "debug":
while True:
var = raw_input('debug> ')
if var == "exit":
return None
print(env.get_v(var))
if inputs == "code":
pprint(parsed_prog)
return None
def get_prog_from_file(filepath):
f = open(filepath, "r")
program = f.readlines()
f.close()
return program
###===========================test==================================
# y = x + x2
test_file_1 = "metafor_test.mf"
# y = 2 * x
test_file_2 = "metafor_test2.mf"
if __name__ == "__main__":
run(test_file_1)
摘自 zzljlu的专栏