node cluster

27
node-cluster 张轩丞(朋春) @我是aleafs 进程管理的利器

Upload: aleafs

Post on 06-Jul-2015

2.686 views

Category:

Technology


3 download

TRANSCRIPT

Page 1: Node cluster

node-cluster张轩丞(朋春)

@我是aleafs

进程管理的利器

Page 2: Node cluster

• npm install node-cluster

• https://github.com/aleafs/node-cluster

项目地址

Page 3: Node cluster

千万pv的服务

Page 4: Node cluster

千万pv的服务

• 多进程协同提供服务

Page 5: Node cluster

千万pv的服务

• 多进程协同提供服务• 进程“死”掉时的容灾

Page 6: Node cluster

千万pv的服务

• 多进程协同提供服务• 进程“死”掉时的容灾

• 服务的平滑重启

Page 7: Node cluster

千万pv的服务

• 多进程协同提供服务• 进程“死”掉时的容灾

• 服务的平滑重启• 监控、管理接口

Page 8: Node cluster

进程

Page 9: Node cluster

进程

|-- master

Page 10: Node cluster

进程

|-- master

|-- n * core service

Page 11: Node cluster

进程

|-- master

|-- n * core service

|-- analysis daemon

Page 12: Node cluster

进程

|-- master

|-- n * core service

|-- analysis daemon

|-- control service

Page 13: Node cluster

Usagevar master = require(‘node-cluster’).Master({

‘pidfile‘ : __dirname + ‘/run/app.pid’,‘statusfile’ : __dirname + ‘/run/status.log’,

});

master.register(‘core’, __dirname + ‘/http.js’, {‘listen‘ : [ port1, socket1, /** ... */ ],‘children’ : 4,

});master.dispatch();

Page 14: Node cluster

Usagevar http = require(‘http’);var server = http.createServer(function (req, res) { res.end(‘hello world’);});

require(‘node-cluster’).Worker({‘heartbeat_interval’ : 2000,‘terminate_timeout’ : 1000,

}).ready(function (socket) { server.emit(‘connection’, socket);});

Page 15: Node cluster

协同服务

• 多进程“共用”同⼀一个端口

• 请求的负载均衡

Page 16: Node cluster

端口“共享”

1.request

3.response

2.child_process.sendhandle

worker 1

worker 2

worker 4

worker 3

listen

master

Page 17: Node cluster

worker分配算法

master推 worker拉

原理 master选择⼀一个worker,让它干活

master将请求投入队列,等worker来抢

分配算法 轮巡 / 权重 / 最空闲worker

worker来抢

劣势 worker“死”掉,影响⼀一批 多⼀一次进程通信

Page 18: Node cluster

case MESSAGE.WAKEUP: if (STATUS.RUNNING === mstat.status) { _send(MESSAGE.GET_FD); } break;

case MESSAGE.REQ_FD: mstat.scores++; _accept(handle, callback); if (STATUS.RUNNING === mstat.status && msg.data) { process.nextTick(function () { _send(MESSAGE.GET_FD); }); } break;

Page 19: Node cluster

var usepush = 2 * _options.children;_options.listen.forEach(function (item) { // ... _listener[item] = Listen(item, function (handle) { if (_fdqueue.push(handle) <= usepush) { _wakeups = (_wakeups + 1) % _pobject.length; try { _pobject[_wakeups].send({ 'type' : MESSAGE.WAKEUP, }); } catch (e) { } } });});

Page 20: Node cluster

进程容灾

sub.on(‘exit’, function (code, signal) {

// start a new child process

// max_fatal_restart

});

Page 21: Node cluster

进程容灾

setInterval(function () {

if (sub.last_hb_time < ?) {

// start a new child process and then kill this

}

}, 30000);

Page 22: Node cluster

平滑重启restart reload

原理 stop && start 新worker开始工作了停掉旧worker

信号 SIGTERM SIGUSR1

区别 master有退出 master无退出

Page 23: Node cluster

reload用在哪里?// 配置、资源的“热”加载

app.init(function () { require(‘node-cluster’).Worker({

‘heartbeat_interval’ : 2000, ‘terminate_timeout’ : 1000,

}).ready(function (socket) { // 正式开始提供服务 });});

Page 24: Node cluster

监控接口master.on(‘giveup’, function (name, fatals) { // XXX: alert // 暂时放弃某个子进程的尝试重启});

master.on(‘state’, function (name, current, before) { // XXX: alert // 可工作子进程数量变化});

Page 25: Node cluster

statusfile905: daemon 908 {"status":2,"scores":0,"mem":{"rss":22204416,"heapTotal":5306944,"heapUsed":2775704},"_time":1341026802149}905: http 910 {"status":2,"scores":1831,"mem":{"rss":27009024,"heapTotal":9085760,"heapUsed":4811288},"_time":1341026802157}905: daemon 907 {"status":2,"scores":0,"mem":{"rss":22200320,"heapTotal":5315072,"heapUsed":2756800},"_time":1341026804137}905: daemon 908 {"status":2,"scores":0,"mem":{"rss":22290432,"heapTotal":5306944,"heapUsed":2787056},"_time":1341026804149}

Page 27: Node cluster