Collecting web information with open source tools


Upload: sammy-fung

Post on 13-May-2015


DESCRIPTION

My lightning talk slides at COSCUP 2011, Taipei.

TRANSCRIPT

Collecting useful information from the web with open source tools

@sammyfung

Hong Kong

First chairman of the Hong Kong Linux User Group

opensource.hk webmaster

How do programmers solve problems in daily life?

Coding!

A lot of popular web sites in Hong Kong run on II$.

They are very slow when you use them!

To follow the latest news, you end up visiting the same website manually, again and again.

Would you still be addicted to Plurk/Twitter without automatic alerts for new responses and replies?

What do you need?

Regular Expression

HTML Parser

Web Crawling Framework
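As a rough illustration of the first two tools, the sketch below uses only the Python standard library. The sample page, the `HeadingParser` class, and the variable names are hypothetical and not taken from the talk. A regular expression suits small, regular patterns such as times, while an HTML parser follows the document structure.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample page; not taken from the talk.
PAGE = """
<html><body>
<h1>Match schedule</h1>
<p>Kick-off: 19:30, channel 61</p>
<p>Kick-off: 22:00, channel 62</p>
</body></html>
"""

# Regular expression: pick out every HH:MM time in the raw text.
kickoff_times = re.findall(r"\b\d{2}:\d{2}\b", PAGE)


class HeadingParser(HTMLParser):
    """Collect the text inside every <h1> element."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_h1 = False

    def handle_data(self, data):
        # Only keep text that appears between <h1> and </h1>.
        if self.in_h1:
            self.headings.append(data.strip())


parser = HeadingParser()
parser.feed(PAGE)

print(kickoff_times)    # ['19:30', '22:00']
print(parser.headings)  # ['Match schedule']
```

A crawling framework adds what this sketch leaves out: fetching pages, following links, and scheduling requests.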

scrapy.org

About Scrapy

Written in Python

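from scrapy.selector import HtmlXPathSelector  # selector class in older Scrapy releases

x = HtmlXPathSelector(response)  # wrap the downloaded page for XPath queries
torrent = TorrentItem()  # a Scrapy Item subclass, assumed to be defined elsewhere
torrent['url'] = response.url
torrent['name'] = x.select("//h1/text()").extract()  # text of every <h1>, as a list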

<h1>Hello World</h1>
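To see what the `//h1/text()` query picks out of that markup without installing Scrapy, here is a minimal sketch using the standard library's `xml.etree.ElementTree` as a stand-in. It is not Scrapy's selector, the surrounding `<html><body>` wrapper is an assumption, and ElementTree only accepts well-formed markup, which many real pages are not.

```python
import xml.etree.ElementTree as ET

# The sample markup from the slide, wrapped in a minimal page (the wrapper is an assumption).
page = "<html><body><h1>Hello World</h1></body></html>"

root = ET.fromstring(page)

# ".//h1" is ElementTree's limited XPath for "every <h1> below the root";
# reading .text plays the role of the text() step in "//h1/text()".
names = [h1.text for h1 in root.findall(".//h1")]

print(names)  # ['Hello World']
```

Like `extract()` in the slide, the result is a list, because a page may contain more than one matching element.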

All of the above are available as open source!

Problem #1: A lot of popular web sites in Hong Kong run on II$.

Developed a schedule of live football matches on cable TV.

Problem #2: Some web sites don't provide a data API.

Hong Kong Weather Info

@weatherhk

Alerts of Tropical Cyclones in Northwest Pacific Ocean

@tctrack @tropicalhk

Path and Forecast of active tropical cyclone
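The slides do not show how these accounts decide when to post. One common approach, assumed here purely for illustration, is to fingerprint the scraped text and alert only when it changes. The function names and sample bulletins below are hypothetical.

```python
import hashlib


def fingerprint(text):
    """Return a stable hash of the scraped text, ignoring whitespace differences."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def new_alerts(scrapes):
    """Yield each scraped text whose fingerprint differs from the previous one."""
    last = None
    for text in scrapes:
        current = fingerprint(text)
        if current != last:
            yield text
        last = current


# Hypothetical successive scrapes of the same bulletin page.
scrapes = [
    "Tropical storm A: 20.1N 130.5E",
    "Tropical storm A:  20.1N 130.5E",  # same content, extra space
    "Tropical storm A: 20.8N 129.9E",
]

alerts = list(new_alerts(scrapes))
print(alerts)  # the first and third bulletins only
```

In a real bot the last fingerprint would need to be stored between runs, for example in a small file, so that a restart does not repost the same bulletin.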

So make good use of open source tools to solve the problems you meet in daily life.

Thank you!

Solving problems with open source.

Thank you.