Logging and Big Data - wunca.uni.net.th


Upload: ngoduong

Post on 05-Feb-2018


TRANSCRIPT

  • WUNCA-33

    Logging and Big Data

    [email protected]

    @natawutn

    http://natawutn.wordpress.com

    http://www.slideshare.net/natawutnupairoj


  • IT LOG SERVER ?


  • LOG

    Real-Time

    Server

  • 2-3

  • IT LOG

    Users = 40,000+

    Servers = 500+

    Wifi + NAT

    Manual processes

    Approx. traffic: 5,000 events / sec

    Storage for 90 days = approx. 39,000,000,000 events (6.5 TB)

    Use Syslog + Graylog2 (based on Elasticsearch technology)
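The 90-day storage figure on this slide follows directly from the event rate. The short Python sketch below reproduces the arithmetic and derives an average event size; the decimal interpretation of "6.5 TB" (6.5 x 10^12 bytes) is an assumption, not something the slide states.

```python
# Back-of-the-envelope sizing from the slide's figures.
EVENTS_PER_SEC = 5_000      # approx. traffic from the slide
RETENTION_DAYS = 90         # retention period from the slide
STORAGE_BYTES = 6.5e12      # "6.5 TB", assumed to mean decimal terabytes

SECONDS_PER_DAY = 86_400


def retained_events(events_per_sec: int, days: int) -> int:
    """Number of events accumulated over the retention window."""
    return events_per_sec * SECONDS_PER_DAY * days


total = retained_events(EVENTS_PER_SEC, RETENTION_DAYS)
avg_bytes = STORAGE_BYTES / total

print(f"{total:,} events retained")          # roughly 39 billion
print(f"~{avg_bytes:.0f} bytes per event on average")
```

The exact product is slightly under the rounded 39 billion quoted on the slide, and the implied average of well under 200 bytes per stored event is only as good as the TB assumption.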

  • Log

    Real-Time

  • ARCHITECTURAL PATTERN #1: BASIC ARCHITECTURE

  • ARCHITECTURAL PATTERN #2: LAMBDA ARCHITECTURE

    Speed Layer

    Batch Layer

    Serving Layer
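To make the three layers concrete, here is a minimal single-process Python sketch of the Lambda pattern applied to log events. The class name `LambdaLogStore`, the `host` field, and the per-host count are all hypothetical choices for illustration; a real deployment would spread these layers across separate systems.

```python
from collections import Counter


class LambdaLogStore:
    """Toy Lambda architecture: per-host event counts (illustrative only)."""

    def __init__(self):
        self.master = []              # append-only master dataset (batch layer input)
        self.batch_view = Counter()   # precomputed view produced by the batch layer
        self.speed_view = Counter()   # incremental view for events since the last batch

    def ingest(self, event: dict) -> None:
        # Every event goes to both the master dataset and the speed layer.
        self.master.append(event)
        self.speed_view[event["host"]] += 1

    def run_batch(self) -> None:
        # Batch layer: recompute the view from scratch over all data,
        # then discard the speed view that the new batch view now covers.
        self.batch_view = Counter(e["host"] for e in self.master)
        self.speed_view.clear()

    def query(self, host: str) -> int:
        # Serving layer: merge the batch view with the real-time view.
        return self.batch_view[host] + self.speed_view[host]


store = LambdaLogStore()
store.ingest({"host": "web1", "msg": "login ok"})
store.ingest({"host": "web1", "msg": "logout"})
store.run_batch()
store.ingest({"host": "web1", "msg": "login failed"})
print(store.query("web1"))
```

The point of the design is that queries never wait for the slow batch recomputation: recent events are answered from the speed view until the next batch run absorbs them.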

  • Python

  • Open-source software framework modeled on Google's search engine architecture

    Runs on commodity hardware

    MapReduce: parallel processing across a cluster

    Hadoop Distributed File System (HDFS): reliable storage

    Users: Yahoo!, Facebook, Amazon, eBay, American Airlines, Apple, Google, HP, IBM, Microsoft, Netflix, New York Times, ...
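The MapReduce model mentioned on this slide can be illustrated in plain Python without a cluster. The sketch below counts log lines per host; treating the first whitespace-separated field as a host name is an assumption about the log format, and the function names are illustrative rather than part of any Hadoop API.

```python
from collections import defaultdict


def map_phase(line: str):
    """Map: emit (host, 1) for each non-empty log line."""
    fields = line.split()
    if fields:
        yield fields[0], 1   # assumption: first field is the host name


def shuffle(pairs):
    """Shuffle: group all values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list):
    """Reduce: sum the counts for one key."""
    return key, sum(values)


def map_reduce(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


logs = ["web1 GET /index", "web2 GET /about", "web1 POST /login"]
print(map_reduce(logs))
```

On a real cluster the map and reduce calls run in parallel on different machines and the shuffle moves data over the network; the data flow is the same.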

  • In-memory data processing, from UC Berkeley

    Supports MapReduce-style batch execution, interactive queries, and stream processing

    Java, Python, Scala, and R APIs; analytic libraries (machine learning, graph processing)

    10-100x faster than Hadoop

  • ELASTICSEARCH

    Open-source search engine

    Real-Time data

    Scale-Out Cluster

    Shard

    Shard Timestamp Log

    Each shard can have copies (replication)

  • # of shards = 1, # of replicas = 1

    # of shards = 2, # of replicas = 1

  • # of shards = 3, # of replicas = 1

  • # of shards = 3, # of replicas = 2
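The shard and replica settings on these slides determine how many shard copies the cluster has to host. The Python sketch below computes that total and shows one toy round-robin placement that keeps copies of the same shard on different nodes. The placement function is a simplified illustration with hypothetical node names, not Elasticsearch's actual allocation algorithm.

```python
def shard_copies(num_shards: int, num_replicas: int) -> int:
    """Total shard copies: each primary plus its replicas."""
    return num_shards * (1 + num_replicas)


def place_shards(num_shards: int, num_replicas: int, nodes: list) -> dict:
    """Toy round-robin placement; copies of one shard never share a node."""
    if num_replicas + 1 > len(nodes):
        raise ValueError("need at least num_replicas + 1 nodes")
    placement = {node: [] for node in nodes}
    for shard in range(num_shards):
        for copy in range(num_replicas + 1):
            node = nodes[(shard + copy) % len(nodes)]
            role = "P" if copy == 0 else f"R{copy}"   # P = primary, Rn = nth replica
            placement[node].append(f"{shard}{role}")
    return placement


for shards, replicas in [(1, 1), (2, 1), (3, 1), (3, 2)]:
    print(shards, replicas, "->", shard_copies(shards, replicas), "copies")

print(place_shards(3, 1, ["node1", "node2", "node3"]))
```

With 3 shards and 2 replicas, every one of three nodes ends up holding a copy of every shard, which is why that configuration can lose two nodes without losing data.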

  • DATA COLLECTOR (LOG SHIPPER)

    Ships data from servers to the Big Data system in real time

    Processes in-flight data

    Adapter / plugin architecture

    Reliability and availability
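The adapter / plugin idea behind a log shipper can be sketched as a pipeline of interchangeable parts: a source, a chain of filters that act on in-flight events, and a sink. Everything in the Python example below (`lines_source`, `drop_debug`, `add_severity`, `ListSink`, `ship`) is a hypothetical name for illustration and does not correspond to any particular shipper's API.

```python
def lines_source(lines):
    """Input plugin: turn raw lines into event dictionaries."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def drop_debug(event):
    """Filter plugin: discard debug noise by returning None."""
    return None if "DEBUG" in event["message"] else event


def add_severity(event):
    """Filter plugin: enrich the in-flight event with a severity field."""
    event["severity"] = "error" if "ERROR" in event["message"] else "info"
    return event


class ListSink:
    """Output plugin: collects events in memory (stand-in for a real backend)."""

    def __init__(self):
        self.events = []

    def write(self, event):
        self.events.append(event)


def ship(source, filters, sink) -> int:
    """Run every event through the filter chain; return the number shipped."""
    shipped = 0
    for event in source:
        for apply_filter in filters:
            event = apply_filter(event)
            if event is None:      # a filter dropped the event
                break
        else:
            sink.write(event)
            shipped += 1
    return shipped


sink = ListSink()
count = ship(lines_source(["ERROR disk full\n", "DEBUG tick\n", "login ok\n"]),
             [drop_debug, add_severity], sink)
print(count, sink.events)
```

Swapping `ListSink` for a sink that writes to a search cluster, or adding another filter, requires no change to `ship`, which is the practical benefit of the plugin design.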

  • Source: Sematext, Top 5 Most Popular Log Shippers, http://blog.sematext.com/2014/10/06/top-5-most-popular-log-shippers/

  • APACHE FLUME

    Open source

    Distributed / Reliable / Scalable

    Event
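Flume moves events from a source through a channel to a sink, and its reliability comes from removing an event from the channel only after the next hop has accepted it. The Python sketch below imitates that behaviour with an in-memory queue. The class `MemoryChannel` and the function `drain` are illustrative stand-ins, not Flume's Java API or configuration.

```python
from collections import deque
from itertools import islice


class MemoryChannel:
    """Bounded in-memory buffer between a source and a sink (toy model)."""

    def __init__(self, capacity: int = 1000):
        self.queue = deque()
        self.capacity = capacity

    def put(self, event) -> None:
        if len(self.queue) >= self.capacity:
            raise OverflowError("channel full")
        self.queue.append(event)

    def peek_batch(self, n: int) -> list:
        # Look at the next n events without removing them.
        return list(islice(self.queue, n))

    def commit(self, n: int) -> None:
        # Remove events only once delivery has succeeded.
        for _ in range(n):
            self.queue.popleft()


def drain(channel: MemoryChannel, deliver, batch_size: int = 2) -> int:
    """Try to deliver one batch; on failure the events stay queued for retry."""
    batch = channel.peek_batch(batch_size)
    if not batch:
        return 0
    try:
        deliver(batch)             # hypothetical sink callback; raises on failure
    except Exception:
        return 0                   # nothing committed, nothing lost
    channel.commit(len(batch))
    return len(batch)


channel = MemoryChannel()
for name in ("e1", "e2", "e3"):
    channel.put(name)

delivered = []
print(drain(channel, delivered.extend), delivered, list(channel.queue))
```

A sink that fails midway may see the same batch again on retry, so this style of delivery is at-least-once rather than exactly-once.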

  • APACHE FLUME OVERVIEW BY GETINDATA

  • LOG LAMBDA ARCHITECTURE

    Traffic anomaly detection (SARIMA)
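The slide names SARIMA for traffic anomaly detection. Fitting a SARIMA model needs a statistics library, so the Python sketch below deliberately swaps in a much simpler technique that shares the core idea: model the seasonal pattern, then flag points whose residual is unusually large. Here the seasonal baseline is the per-slot median, and the threshold of 3 standard deviations is an arbitrary assumption. This is not SARIMA and not the presenter's implementation.

```python
from statistics import median, pstdev


def seasonal_anomalies(series, period: int, threshold: float = 3.0) -> list:
    """Indices whose deviation from the seasonal median exceeds threshold * sigma."""
    # Baseline: median of all observations sharing the same seasonal slot.
    baselines = [median(series[slot::period]) for slot in range(period)]
    residuals = [x - baselines[i % period] for i, x in enumerate(series)]
    sigma = pstdev(residuals)
    if sigma == 0:
        return []                  # perfectly seasonal data: nothing to flag
    return [i for i, r in enumerate(residuals) if abs(r) > threshold * sigma]


# Hypothetical events-per-interval counts with a repeating 4-step pattern
# and one injected spike at index 13.
traffic = [10, 20, 30, 20] * 6
traffic[13] = 100
print(seasonal_anomalies(traffic, period=4))
```

A SARIMA model would additionally capture trend and autocorrelation in the residuals, which matters for real traffic where the seasonal pattern drifts over time.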