Python in Web 2.0 Websites (QCon Beijing 2010)

Post on 01-Sep-2014


Python in Web 2.0 Websites

Hong Qiangning (洪强宁), QCon Beijing 2010

http://www.flickr.com/photos/arnolouise/2986467632/

Saturday, April 24, 2010

About Me

• Python programmer

• Started using Python in 2002

• Has worked entirely in Python since 2004

• http://www.douban.com/people/hongqn/

• hongqn@douban.com

• http://twitter.com/hongqn

Python

• Python is a programming language that lets you work more quickly and integrate your systems more effectively. You can learn to use Python and see almost immediate gains in productivity and lower maintenance costs. (via http://python.org/)

Languages at Douban

• Python: 58%

• C: 27%

• JavaScript: 12%

• C++: 3%

• Other (Pyrex/R/Erlang/Go/Shell): 1%

Why Python?

Easy to learn

• Hello World: 1 minute

• A small utility script: 1 afternoon

• A practical program: 1 week

• Building Douban: 3 months

Fast to develop

Counting lines of code per language (by file extension): 13 lines

    import os
    from collections import defaultdict

    d = defaultdict(int)

    for dirpath, dirnames, filenames in os.walk('.'):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            ext = os.path.splitext(filename)[1]
            d[ext] += len(list(open(path)))

    for ext, n_lines in d.items():
        print ext, n_lines
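The line counter on this slide is Python 2. A minimal Python 3 sketch of the same idea follows; the function name `count_lines_by_extension` and the choice to read files as bytes are assumptions for illustration, not part of the talk.

```python
import os
from collections import defaultdict


def count_lines_by_extension(root='.'):
    """Return {extension: total line count} for every file under root."""
    counts = defaultdict(int)
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            ext = os.path.splitext(filename)[1]
            # Read as bytes so binary or oddly encoded files do not raise
            with open(path, 'rb') as f:
                counts[ext] += sum(1 for _ in f)
    return dict(counts)


if __name__ == "__main__":
    for ext, n_lines in sorted(count_lines_by_extension('.').items()):
        print(ext, n_lines)
```

Wrapping the walk in a function makes it reusable on any directory, and the `with` block closes each file promptly instead of relying on garbage collection.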

Easy to collaborate on

• Mandatory indentation keeps code structure clear and readable

• Pythonic conventions discourage strongly personal styles

Easy to deploy

• Going live in three steps:

1. svn ci

2. svn up

3. restart

Widely applicable

• Web applications

• Offline computation

• Operations scripts

• Data analysis

Rich resources

• Batteries included: 200+ modules in the standard library

• PyPI: 9613 packages currently

• Networking / databases / desktop / games / scientific computing / security / text processing / ...

• Easily extensible

More importantly, Lao Zhao recommends Python too

Just kidding :-p

Examples

Web Server

• python -m SimpleHTTPServer

web.py

    import web

    urls = (
        '/(.*)', 'hello'
    )
    app = web.application(urls, globals())

    class hello:
        def GET(self, name):
            if not name:
                name = 'World'
            return 'Hello, ' + name + '!'

    if __name__ == "__main__":
        app.run()

http://webpy.org/

Flask

    from flask import Flask
    app = Flask(__name__)

    @app.route("/<name>")
    def hello(name):
        if not name:
            name = 'World'
        return 'Hello, ' + name + '!'

    if __name__ == "__main__":
        app.run()

http://flask.pocoo.org/

WSGI

http://www.python.org/dev/peps/pep-0333/
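For readers who have not seen WSGI, here is a minimal sketch of what the interface looks like in Python 3 (where PEP 3333 updates PEP 333 and response bodies are bytes). The greeting logic mirrors the web.py and Flask examples; the `serve` helper and port 8000 are assumptions for illustration.

```python
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """A WSGI app is a callable taking the request environ and a start_response callback."""
    name = environ.get('PATH_INFO', '/').lstrip('/') or 'World'
    body = ('Hello, ' + name + '!').encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/plain; charset=utf-8'),
        ('Content-Length', str(len(body))),
    ])
    # The return value is an iterable of byte strings
    return [body]


def serve(port=8000):
    """Serve the app with the standard library reference server (blocks until interrupted)."""
    make_server('', port, application).serve_forever()
```

Calling `serve()` starts a local development server. Any WSGI-compliant server could host `application` unchanged, which is the point of the interface.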

Why so many Python web frameworks?

• Because you can write your own framework in 3 hours and a total of 60 lines of Python code.

• http://bitworking.org/news/Why_so_many_Python_web_frameworks

doctest

    def cube(x):
        """
        >>> cube(10)
        1000
        """
        return x * x * x

    def _test():
        import doctest
        doctest.testmod()

    if __name__ == "__main__":
        _test()

nose

http://somethingaboutorange.com/mrl/projects/nose/

    from cube import cube

    def test_cube():
        result = cube(10)
        assert result == 1000

numpy

    >>> from numpy import *
    >>> A = arange(4).reshape(2, 2)
    >>> A
    array([[0, 1],
           [2, 3]])
    >>> dot(A, A.T)
    array([[ 1,  3],
           [ 3, 13]])

http://numpy.scipy.org/

ipython

    $ ipython -pylab
    In [1]: X = frange(0, 10, 0.1)
    In [2]: Y = [sin(x) for x in X]
    In [3]: plot(X, Y)

http://numpy.scipy.org/

virtualenv

Creates a clean, isolated Python environment

    $ python go-pylons.py --no-site-packages mydevenv
    $ cd mydevenv
    $ source bin/activate
    (mydevenv)$ paster create -t new9 helloworld

http://virtualenv.openplans.org/

Pyrex/Cython

    cdef extern from "math.h":
        double sin(double)

    cdef double f(double x):
        return sin(x*x)

Philosophy: Pythonic

>>> import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.

Explicit is better than implicit.

Simple is better than complex.

Complex is better than complicated.

Flat is better than nested.

Sparse is better than dense.

Readability counts.

Special cases aren't special enough to break the rules.

Although practicality beats purity.

Errors should never pass silently.

Unless explicitly silenced.

In the face of ambiguity, refuse the temptation to guess.

There should be one-- and preferably only one --obvious way to do it.

Although that way may not be obvious at first unless you're Dutch.

Now is better than never.

Although never is often better than *right* now.

If the implementation is hard to explain, it's a bad idea.

If the implementation is easy to explain, it may be a good idea.

Namespaces are one honking great idea -- let's do more of those!

Chinese translation by Lai Yonghao (赖勇浩): http://bit.ly/pyzencn

Simple is better than complex

    class HelloWorld
    {
        public static void main(String args[])
        {
            System.out.println("Hello World!");
        }
    }

    print "Hello World!"

Readability counts

• Mandatory block indentation; no {} or end

• No cryptic characters (except "@" for decorators)

    if limit is not None and len(ids) > limit:
        ids = random.sample(ids, limit)

TOOWTDI

• There (should be) Only One Way To Do It.

• vs. Perlish TIMTOWTDI (There Is More Than One Way To Do It)

    a = [1, 2, 3, 4, 5]
    b = []
    for i in range(len(a)):
        b.append(a[i]*2)

    b = []
    for x in a:
        b.append(x*2)

    b = [x*2 for x in a]

http://twitter.com/hongqn/status/9883515681

http://twitter.com/robbinfan/status/9879724095

Pictures don't lie

Python: http://www.flickr.com/photos/nicksieger/281055485/

C: http://www.flickr.com/photos/nicksieger/281055530/

The pictures speak for themselves

Ruby: http://www.flickr.com/photos/nicksieger/280661836/

Java: http://www.flickr.com/photos/nicksieger/280662707/

Using Python's language features to simplify development

Case 0

• Keep the default configuration in svn; developer and production environments override it as needed

• The configuration needs compound data structures (such as lists)

• Multiple config files, merged automatically at deploy time?

• Write a parser for a custom config file format?

config.py

    MEMCACHED_ADDR = ['localhost:11211']

    from local_config import *

local_config.py

    MEMCACHED_ADDR = [
        'frodo:11211',
        'sam:11211',
        'pippin:11211',
        'merry:11211',
    ]

If the file name does not end in .py, exec can be used instead
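The exec variant mentioned for files without a .py suffix could look roughly like this in Python 3. The helper `load_config`, the `.conf` file name, the `DEBUG` setting, and the convention of keeping only upper-case names are all assumptions made for this sketch.

```python
import os
import tempfile


def load_config(defaults, override_path):
    """Return a copy of defaults, updated with upper-case names set in the override file."""
    config = dict(defaults)
    if os.path.exists(override_path):
        namespace = {}
        with open(override_path) as f:
            # exec runs arbitrary code, so only use it on files you trust
            exec(f.read(), namespace)
        config.update((k, v) for k, v in namespace.items() if k.isupper())
    return config


DEFAULTS = {'MEMCACHED_ADDR': ['localhost:11211'], 'DEBUG': False}

# Demonstration with a throwaway override file
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'local_config.conf')
    with open(path, 'w') as f:
        f.write("MEMCACHED_ADDR = ['frodo:11211', 'sam:11211']\n")
    config = load_config(DEFAULTS, path)

print(config['MEMCACHED_ADDR'])
```

Because the override file is ordinary Python, lists and other compound values need no custom parser, which is the motivation given in Case 0.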

Case 1

• Some pages require a specific permission to access

    class GroupUI(object):
        def new_topic(self, request):
            if self.group.can_post(request.user):
                return new_topic_ui(self.group)
            else:
                request.response.set_status(403, "Forbidden")
                return error_403_ui(msg="You must be a group member to post")

        def join(self, request):
            if self.group.can_join(request.user):
                ...

    class Group(object):
        def can_post(self, user):
            return self.has_member(user)

        def can_join(self, user):
            return not self.has_banned(user)

    class GroupUI(object):
        @check_permission('post', msg="You must be a group member to post")
        def new_topic(self, request):
            return new_topic_ui(self.group)

        @check_permission('join', msg="You cannot join this group")
        def join(self, request):
            ...

    class Group(object):
        def can_post(self, user):
            return self.has_member(user)

        def can_join(self, user):
            return not self.has_banned(user)

decorator

    def print_before_exec(func):
        def _(*args, **kwargs):
            print "decorated"
            return func(*args, **kwargs)
        return _

    @print_before_exec
    def double(x):
        print x*2

    double(10)

Output:

    decorated
    20
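A Python 3 sketch of the same decorator. The use of `functools.wraps` and the change from printing to returning the doubled value are additions for illustration, not part of the slide.

```python
import functools


def print_before_exec(func):
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        print("decorated")
        return func(*args, **kwargs)
    return wrapper


@print_before_exec
def double(x):
    """Return twice x."""
    return x * 2


result = double(10)
print(result)
```

The `@print_before_exec` line is shorthand for `double = print_before_exec(double)`: the name `double` ends up bound to the wrapper.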

    class check_permission(object):
        def __init__(self, action, msg=None):
            self.action = action
            self.msg = msg

        def __call__(self, func):
            def _(ui, req, *args, **kwargs):
                # look up can_<action> on the object being protected
                f = getattr(ui.perm_obj, 'can_' + self.action)
                if f(req.user):
                    return func(ui, req, *args, **kwargs)
                raise BadPermission(ui.perm_obj, self.action, self.msg)
            return _
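To see the permission decorator run end to end, here is a self-contained Python 3 sketch. `BadPermission`, `Request`, the set-based `Group`, and the string returned by `new_topic` are hypothetical stand-ins; the talk does not show how Douban defines them.

```python
import functools


class BadPermission(Exception):
    """Hypothetical exception raised when a permission check fails."""
    def __init__(self, obj, action, msg=None):
        super().__init__(msg or "permission denied: " + action)
        self.obj = obj
        self.action = action


class check_permission:
    def __init__(self, action, msg=None):
        self.action = action
        self.msg = msg

    def __call__(self, func):
        @functools.wraps(func)
        def wrapper(ui, request, *args, **kwargs):
            # Look up can_<action> on the object the page protects
            checker = getattr(ui.perm_obj, 'can_' + self.action)
            if checker(request.user):
                return func(ui, request, *args, **kwargs)
            raise BadPermission(ui.perm_obj, self.action, self.msg)
        return wrapper


class Group:
    def __init__(self, members):
        self.members = set(members)

    def can_post(self, user):
        return user in self.members


class Request:
    def __init__(self, user):
        self.user = user


class GroupUI:
    def __init__(self, group):
        self.perm_obj = group

    @check_permission('post', msg="You must be a group member to post")
    def new_topic(self, request):
        return "new topic form for " + request.user
```

Raising an exception instead of returning a 403 page keeps the decorator independent of any response object; a single handler higher up can turn `BadPermission` into the error page.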

Case 2

• Call functions asynchronously through a message queue

    def send_notification_mail(email, subject, body):
        msg = MSG_SEND_MAIL + '\0' + email + '\0' + subject + '\0' + body
        mq.put(msg)

    def async_worker():
        msg = mq.get()
        msg = msg.split('\0')
        cmd = msg[0]
        if cmd == MSG_SEND_MAIL:
            email, subject, body = msg[1:]
            fromaddr = 'no-reply@douban.com'
            email_body = make_email_body(fromaddr, email, subject, body)
            smtp = smtplib.SMTP('mail')
            smtp.sendmail(fromaddr, email, email_body)
        elif cmd == MSG_xxxx:
            ...
        elif cmd == MSG_yyyy:
            ...

    @async
    def send_notification_mail(email, subject, body):
        fromaddr = 'no-reply@douban.com'
        email_body = make_email_body(fromaddr, email, subject, body)
        smtp = smtplib.SMTP('mail')
        smtp.sendmail(fromaddr, email, email_body)

    # Python 2 code: async is a reserved word from Python 3.7 on
    def async(func):
        mod = sys.modules[func.__module__]
        fname = 'origin_' + func.__name__
        # keep the original function reachable under a new name in its module
        setattr(mod, fname, func)
        def _(*a, **kw):
            body = cPickle.dumps((mod.__name__, fname, a, kw))
            mq.put(body)
        return _

    def async_worker():
        modname, fname, a, kw = cPickle.loads(mq.get())
        __import__(modname)
        mod = sys.modules[modname]
        getattr(mod, fname)(*a, **kw)
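A runnable Python 3 sketch of the queue-backed decorator. Two things are swapped in and are not the talk's approach: an in-process `queue.Queue` stands in for the real message queue, and an explicit `_registry` dict replaces storing the original function on its module. The name `run_async` is used because `async` is a keyword in current Python.

```python
import functools
import pickle
import queue

mq = queue.Queue()   # in-process stand-in for a real message queue
_registry = {}       # function name -> original function (assumed lookup scheme)


def run_async(func):
    """Replace func with a stub that enqueues the call instead of running it."""
    _registry[func.__name__] = func

    @functools.wraps(func)
    def stub(*args, **kwargs):
        mq.put(pickle.dumps((func.__name__, args, kwargs)))
    return stub


def async_worker():
    """Take one message off the queue and run the call it describes."""
    name, args, kwargs = pickle.loads(mq.get())
    return _registry[name](*args, **kwargs)


sent = []


@run_async
def send_notification_mail(email, subject, body):
    # A real worker would build the message and talk to SMTP here
    sent.append((email, subject, body))


send_notification_mail('user@example.com', 'Hello', 'Welcome!')
print(len(sent), mq.qsize())   # the call is queued, not yet executed
async_worker()
print(sent)
```

In a real deployment the worker runs in another process, so it must import the module that defines the function before it can look the function up; the slide's `__import__(modname)` step exists for that reason.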

Case 3

• Cache function results (SQL queries, expensive computations, etc.)

    def get_latest_review_id():
        review_id = mc.get('latest_review_id')
        if review_id is None:
            review_id = exc_sql("select max(id) from review")
            mc.set('latest_review_id', review_id)
        return review_id

    @cache('latest_review_id')
    def get_latest_review_id():
        return exc_sql("select max(id) from review")

    def cache(key):
        def deco(func):
            def _(*args, **kwargs):
                r = mc.get(key)
                if r is None:
                    r = func(*args, **kwargs)
                    mc.set(key, r)
                return r
            return _
        return deco

What if the cache key has to be generated dynamically?

    def get_review(id):
        key = 'review:%s' % id
        review = mc.get(key)
        if review is None:  # cache miss
            id, author_id, text = exc_sql("select id, author_id, text from review where id=%s", id)
            review = Review(id, author_id, text)
            mc.set(key, review)
        return review

How should the decorator be written when the cache key is generated dynamically?

    @cache('review:{id}')
    def get_review(id):
        id, author_id, text = exc_sql("select id, author_id, text from review where id=%s", id)
        return Review(id, author_id, text)

    def cache(key_pattern, expire=0):
        def deco(f):
            arg_names, varargs, varkw, defaults = inspect.getargspec(f)
            if varargs or varkw:
                raise Exception("not support varargs")
            gen_key = gen_key_factory(key_pattern, arg_names, defaults)

            def _(*a, **kw):
                key = gen_key(*a, **kw)
                r = mc.get(key)
                if r is None:
                    r = f(*a, **kw)
                    mc.set(key, r, expire)
                return r
            return _
        return deco

inspect.getargspec

    >>> import inspect
    >>> def f(a, b=1, c=2):
    ...     pass
    ...
    >>> inspect.getargspec(f)
    ArgSpec(args=['a', 'b', 'c'], varargs=None, keywords=None, defaults=(1, 2))
    >>>
    >>> def f(a, b=1, c=2, *args, **kwargs):
    ...     pass
    ...
    >>> inspect.getargspec(f)
    ArgSpec(args=['a', 'b', 'c'], varargs='args', keywords='kwargs', defaults=(1, 2))

hint:

• str.format in python 2.6: '{id}'.format(id=1) => '1'

• dict(zip(['a', 'b', 'c'], [1, 2, 3])) => {'a': 1, 'b': 2, 'c': 3}
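The slides leave `gen_key_factory` as an exercise. Following the hints, one possible Python 3 implementation is sketched below; it is a guess at the intent, not the speaker's code. A plain dict stands in for the memcached client `mc`, `inspect.getfullargspec` replaces the older `getargspec`, the `expire` argument is left out, and the `lang` parameter is invented to exercise default values.

```python
import functools
import inspect

mc = {}   # a plain dict standing in for a memcached client


def gen_key_factory(key_pattern, arg_names, defaults):
    """Build a function that maps call arguments to a formatted cache key."""
    defaults = defaults or ()
    # Defaults belong to the last len(defaults) argument names
    default_map = dict(zip(arg_names[len(arg_names) - len(defaults):], defaults))

    def gen_key(*a, **kw):
        values = dict(default_map)
        values.update(zip(arg_names, a))   # positional arguments, in order
        values.update(kw)                  # keyword arguments override
        return key_pattern.format(**values)
    return gen_key


def cache(key_pattern):
    def deco(f):
        spec = inspect.getfullargspec(f)
        if spec.varargs or spec.varkw:
            raise TypeError("varargs are not supported")
        gen_key = gen_key_factory(key_pattern, spec.args, spec.defaults)

        @functools.wraps(f)
        def wrapper(*a, **kw):
            key = gen_key(*a, **kw)
            r = mc.get(key)
            if r is None:          # cache miss
                r = f(*a, **kw)
                mc[key] = r
            return r
        return wrapper
    return deco


calls = []


@cache('review:{id}')
def get_review(id, lang='en'):
    calls.append(id)   # record real executions to show the cache working
    return {'id': id, 'lang': lang}


get_review(7)
get_review(id=7)       # same key, served from the cache
print(calls, sorted(mc))
```

Building the key from argument names is what lets `get_review(7)` and `get_review(id=7)` share one cache entry.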

Case 4

• A feed reader shows entries from several feeds at once, merged and sorted by entry_id.

    class Feed(object):
        def get_entries(self, limit=10):
            ids = exc_sqls("select id from entry where feed_id=%s order by id desc limit %s", (self.id, limit))
            return [Entry.get(id) for id in ids]

    class FeedCollection(object):
        def get_entries(self, limit=10):
            mixed_entries = []
            for feed in self.feeds:
                entries = feed.get_entries(limit=limit)
                mixed_entries += entries
            mixed_entries.sort(key=lambda e: e.id, reverse=True)
            return mixed_entries[:limit]

Rows fetched from the database = len(self.feeds) * limit

Wasted Entry.get calls = (len(self.feeds) - 1) * limit

iterator and generator

    def fib():
        x, y = 1, 1
        while True:
            yield x
            x, y = y, x+y

    def odd(seq):
        return (n for n in seq if n%2)

    def less_than(seq, upper_limit):
        for number in seq:
            if number >= upper_limit:
                break
            yield number

    print sum(odd(less_than(fib(), 4000000)))

itertools

• count([n]) --> n, n+1, n+2, ...

• cycle(p) --> p0, p1, ... plast, p0, p1, ...

• repeat(elem [,n]) --> elem, elem, elem, ... endless or up to n times

• izip(p, q, ...) --> (p[0], q[0]), (p[1], q[1]), ...

• islice(seq, [start,] stop [, step]) --> elements from seq[start:stop:step]

• ... and more ...

    class Feed(object):
        def iter_entries(self):
            start_id = sys.maxint
            while True:
                entry_ids = exc_sqls("select id from entry where feed_id=%s and id<%s order by id desc limit 5", (self.id, start_id))
                if not entry_ids:
                    break
                for entry_id in entry_ids:
                    yield Entry.get(entry_id)
                start_id = entry_ids[-1]

    class FeedCollection(object):
        def iter_entries(self):
            return imerge(*[feed.iter_entries() for feed in self.feeds])

        def get_entries(self, limit=10):
            return list(islice(self.iter_entries(), limit))

Rows fetched from the database = len(self.feeds) * 5 ~ len(self.feeds)*5 + limit - 5

Wasted Entry.get calls = 0
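The slides do not show `imerge`. In Python 3.5+ the standard library's `heapq.merge(..., reverse=True)` does the same job for streams already sorted in descending order, so the lazy merge can be sketched as below. The in-memory `fetch_ids` is a stand-in for the SQL query, entries are represented by their ids alone, and `fetched_rows` exists only to count rows.

```python
import heapq
from itertools import islice

PAGE_SIZE = 5
fetched_rows = []   # every id the fake database hands out, for counting


def fetch_ids(all_ids, before, limit=PAGE_SIZE):
    """Stand-in for the SQL query: ids below `before`, newest first."""
    page = [i for i in sorted(all_ids, reverse=True) if i < before][:limit]
    fetched_rows.extend(page)
    return page


def iter_entries(all_ids):
    """Yield a feed's entry ids newest first, fetching one page at a time."""
    start_id = float('inf')
    while True:
        ids = fetch_ids(all_ids, start_id)
        if not ids:
            break
        for entry_id in ids:
            yield entry_id
        start_id = ids[-1]


def get_entries(feeds, limit=10):
    # heapq.merge pulls from each stream only as needed
    merged = heapq.merge(*(iter_entries(f) for f in feeds), reverse=True)
    return list(islice(merged, limit))


feeds = [range(0, 100, 3), range(1, 100, 3), range(2, 100, 3)]
latest = get_entries(feeds, limit=10)
print(latest)
print(len(fetched_rows), "rows fetched; the eager version fetches", len(feeds) * 10)
```

Because each generator fetches its next page only when its current page is exhausted, the number of rows read stays close to one page per feed plus whatever the requested limit actually consumes.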

Decorators and generators are powerful tools for simplifying code

Case 5

• Speed up deserialization of immutable objects

    class User(object):
        def __init__(self, id, username, screen_name, sig):
            self.id = id
            self.username = username
            self.screen_name = screen_name
            self.sig = sig

    user = User('1002211', 'hongqn', 'hongqn', "巴巴布、巴巴布巴布巴布!")

cPickle vs. marshal

    $ python -m timeit -s '
    > from user import user
    > from cPickle import dumps, loads
    > s = dumps(user, 2)' \
    > 'loads(s)'
    100000 loops, best of 3: 6.6 usec per loop

    $ python -m timeit -s '
    > from user import user
    > from marshal import dumps, loads
    > d = (user.id, user.username, user.screen_name, user.sig)
    > s = dumps(d, 2)' 'loads(s)'
    1000000 loops, best of 3: 0.9 usec per loop

7x faster

    $ python -c '
    > import cPickle, marshal
    > from user import user
    > print "pickle:", len(cPickle.dumps(user, 2))
    > print "marshal:", len(marshal.dumps((user.id, \
    > user.username, user.screen_name, user.sig), 2))'
    pickle: 129
    marshal: 74

43% smaller

namedtuple

    from collections import namedtuple

    User = namedtuple('User', 'id username screen_name sig')

    user = User('1002211', 'hongqn', 'hongqn', sig="巴巴布、巴巴布巴布巴布!")

    user.username
    -> 'hongqn'

__metaclass__

    class User(tuple):
        __metaclass__ = NamedTupleMetaClass
        __attrs__ = ['id', 'username', 'screen_name', 'sig']

    user = User('1002211', 'hongqn', 'hongqn', sig="巴巴布、巴巴布巴布巴布!")

    s = marshal.dumps(user.__marshal__())
    User.__load_marshal__(marshal.loads(s))

    from operator import itemgetter

    class NamedTupleMetaClass(type):
        def __new__(mcs, name, bases, dict):
            assert bases == (tuple,)
            attrs = dict['__attrs__']
            for i, a in enumerate(attrs):
                dict[a] = property(itemgetter(i))
            dict['__slots__'] = ()
            # a plain function, so that it binds as a method on instances
            dict['__marshal__'] = lambda self: tuple(self)
            dict['__load_marshal__'] = classmethod(tuple.__new__)
            dict['__getnewargs__'] = lambda self: tuple(self)
            argtxt = repr(tuple(attrs)).replace("'", "")[1:-1]
            template = """def newfunc(cls, %(argtxt)s):
        return tuple.__new__(cls, (%(argtxt)s))""" % locals()
            namespace = {}
            exec template in namespace
            dict['__new__'] = namespace['newfunc']
            return type.__new__(mcs, name, bases, dict)

Warning!

Case 6

• Simplify the way request.get_environ(key) is written

• e.g. request.get_environ('REMOTE_ADDR') --> request.remote_addr

descriptor

• An object that has a __get__, __set__, or __delete__ method

    class Descriptor(object):
        def __get__(self, instance, owner):
            return 'descriptor'

    class Owner(object):
        attr = Descriptor()

    owner = Owner()
    owner.attr --> 'descriptor'

Commonly used descriptors

• classmethod

• staticmethod

• property

    class C(object):
        def get_x(self):
            return self._x
        def set_x(self, x):
            self._x = x
        x = property(get_x, set_x)

    class environ_getter(object):
        def __init__(self, key, default=None):
            self.key = key
            self.default = default

        def __get__(self, obj, objtype):
            if obj is None:
                return self
            return obj.get_environ(self.key, self.default)

    class HTTPRequest(quixote.http_request.HTTPRequest):
        for key in ['HTTP_REFERER', 'REMOTE_ADDR', 'SERVER_NAME', 'REQUEST_URI', 'HTTP_HOST']:
            # inside a class body, locals() is the class namespace being built
            locals()[key.lower()] = environ_getter(key)
        del key
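A self-contained Python 3 sketch of the same descriptor. The minimal `HTTPRequest` class here is a hypothetical stand-in for the Quixote request class, and `setattr` after the class statement is used in place of the `locals()` trick.

```python
class environ_getter:
    """Descriptor that reads one key from the owning request's environ."""
    def __init__(self, key, default=None):
        self.key = key
        self.default = default

    def __get__(self, obj, objtype=None):
        if obj is None:        # accessed on the class, not an instance
            return self
        return obj.get_environ(self.key, self.default)


class HTTPRequest:
    """Hypothetical minimal request class with a get_environ method."""
    def __init__(self, environ):
        self.environ = environ

    def get_environ(self, key, default=None):
        return self.environ.get(key, default)


# Attach one descriptor per environ key, e.g. REMOTE_ADDR -> remote_addr
for _key in ['HTTP_REFERER', 'REMOTE_ADDR', 'SERVER_NAME', 'REQUEST_URI', 'HTTP_HOST']:
    setattr(HTTPRequest, _key.lower(), environ_getter(_key))

request = HTTPRequest({'REMOTE_ADDR': '127.0.0.1'})
print(request.remote_addr)
```

Writing to `locals()` inside a class body works in CPython but is not guaranteed by the language reference, which is why this sketch assigns the attributes with `setattr` instead.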

Case 7

• Make urllib.urlopen automatically go through a SOCKS proxy to get past the firewall

Monkey Patch

    import httplib

    orig_connect = httplib.HTTPConnection.connect

    def _patched_connect(self):
        if HOSTS_BLOCKED.match(self.host):
            return _connect_via_socks_proxy(self)
        else:
            return orig_connect(self)

    def _connect_via_socks_proxy(self):
        ...

    httplib.HTTPConnection.connect = _patched_connect

Things to watch out for when using Python

• Pythonic!

• Avoid gotchas http://www.ferg.org/projects/python_gotchas.html

• Unicode / Character Encoding

• GIL (Global Interpreter Lock)

• Garbage Collection

Development environment

• Editors: Vim / Emacs / Ulipad

• Version control: subversion / mercurial / git

• Wiki / issue tracking / code browsing: Trac

• Continuous integration: Bitten

Python Implementations

• CPython http://www.python.org/

• Unladen Swallow http://code.google.com/p/unladen-swallow/

• Stackless Python http://www.stackless.com/

• IronPython http://ironpython.net/

• Jython http://www.jython.org/

• PyPy http://pypy.org/

Thanks to the nation, thanks to everyone

Q & A