AWS Media Day - Developing Media Services with AWS AI Services (김기완...

김기완, Solutions Architect

Developing Media Services with AWS AI Services

Agenda

• The need for AI technology in the media and entertainment industry

• Introduction to AWS AI services

• AWS ML Stack

• Vision, Speech, Language

• Deep Learning frameworks / Amazon SageMaker

• Customer case: 조선일보 (Chosun Ilbo)

• AI use cases in media

The Need for AI: Media & Entertainment

There are 3,700,000,000 Internet users in 2017*

1,200,000,000 photos will be taken in 2017 (9% YoY growth)*

50% of 2016 Internet traffic was video, and will likely be 70% by 2021**

Multi-petabyte asset storage with > 1 PB MoM growth is commonplace on AWS

Sources: * InfoTrends Worldwide, **StreamingMedia.com

AI in the Media Industry

• Richer Metadata

    • Automated metadata generation for B2B and B2C

    • Use of IMDb and active use of cast information

    • Large-scale batch processing of digital archives

• Enhanced Experiences

    • Intelligent content filtering

    • Using the right content to improve the user experience

• Security and Analytics

    • Automated filtering of UGC (User Generated Content)

    • Workflow improvement

    • Viewer engagement

Vision, Speech, and Language

How Amazon Uses AI

• Fulfilment & Logistics

• Existing Products

• New Products

• Search & Discovery

ML @ AWS: Our mission

Put machine learning in the hands of every developer and data scientist

AWS ML Stack

Application Services

• Vision: Rekognition Image, Rekognition Video

• Speech: Amazon Polly, Transcribe

• Language: Lex, Translate, Comprehend

Platform Services

• Amazon SageMaker, AWS DeepLens, Amazon Machine Learning, Mechanical Turk, Spark & EMR

Frameworks & Infrastructure

• AWS Deep Learning AMI

• GPU (P3 Instances), Mobile, CPU, IoT (Greengrass)

• Apache MXNet, PyTorch, Cognitive Toolkit, Keras, Caffe2 & Caffe, TensorFlow, Gluon

Vision: Amazon Rekognition

• Object and Scene Detection

• Facial Analysis

• Face Comparison

• Facial Recognition

• Celebrity Recognition

• Image Moderation

<0.5 second response time

Up to 10M faces

Enables immediate response

New feature: Real-Time Face Search

Real-time face recognition against tens of millions of faces
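
As a rough illustration of face search, the sketch below picks the best match from a `SearchFacesByImage` response. It assumes a face collection has already been populated; the function names, the 90% similarity cut-off, and the bucket, key, and collection arguments are hypothetical and not taken from the talk.

```python
from typing import Any, Dict, Optional


def best_face_match(response: Dict[str, Any],
                    min_similarity: float = 90.0) -> Optional[Dict[str, Any]]:
    """Return the most similar face in a SearchFacesByImage response, or None."""
    matches = [m for m in response.get("FaceMatches", [])
               if m["Similarity"] >= min_similarity]
    if not matches:
        return None
    top = max(matches, key=lambda m: m["Similarity"])
    face = top["Face"]
    return {
        "face_id": face["FaceId"],
        "external_image_id": face.get("ExternalImageId"),
        "similarity": top["Similarity"],
    }


def search_face_in_collection(bucket: str, key: str,
                              collection_id: str) -> Optional[Dict[str, Any]]:
    """Search an S3 image against a collection (needs AWS credentials)."""
    import boto3  # imported lazily so the helper above works without AWS access

    client = boto3.client("rekognition")
    response = client.search_faces_by_image(
        CollectionId=collection_id,
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        FaceMatchThreshold=80.0,  # assumed threshold, tune per use case
        MaxFaces=5,
    )
    return best_face_match(response)
```

Keeping the response parsing separate from the API call makes the matching logic easy to check offline with a saved response.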

How can we apply these powerful capabilities to video?

Frame-based analysis for videos

• AWS Answers (https://aws.amazon.com/answers/media-entertainment/video-frame-based-analysis/)

• Serverless architecture: AWS Lambda, Amazon DynamoDB, AWS IoT, Amazon SNS, Amazon S3, Amazon SQS, Amazon Rekognition

• ffmpeg

• Using Live Stream?

• Scalability?

• More features?
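
The frame-based approach can be sketched as follows: ffmpeg samples still frames at a fixed rate, and each frame is then sent to an image API. The helper names, the output file pattern, and the one-frame-per-second default are assumptions for illustration only.

```python
from typing import List


def build_ffmpeg_frame_command(video_path: str, output_dir: str,
                               fps: float = 1.0) -> List[str]:
    """Build an ffmpeg command that writes `fps` JPEG frames per second of video."""
    return [
        "ffmpeg",
        "-i", video_path,
        "-vf", f"fps={fps}",               # ffmpeg's fps filter resamples the frame rate
        f"{output_dir}/frame_%06d.jpg",    # hypothetical naming pattern
    ]


def frame_timestamp_seconds(frame_number: int, fps: float = 1.0) -> float:
    """Approximate position in the source video of a 1-based frame number."""
    return (frame_number - 1) / fps
```

The command list can be passed to `subprocess.run(cmd, check=True)` on a machine with ffmpeg installed. The questions on the slide (live streams, scalability) are where this approach starts to strain.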

New Service: Amazon Rekognition Video

Video Analysis

• Object and Activity Detection

• Person Tracking

• Face Recognition

• Real-time Live Stream

• Content Moderation

• Celebrity Recognition


Object, Scene and Activity Detection

Blowing a candle, Drinking
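
A minimal sketch of how label results for a stored video might be turned into a timeline. It assumes a label detection job was already started with `start_label_detection` and has finished successfully; the function names and the 80% confidence cut-off are illustrative assumptions.

```python
from collections import defaultdict
from typing import Any, Dict, List


def labels_by_name(response: Dict[str, Any],
                   min_confidence: float = 80.0) -> Dict[str, List[int]]:
    """Map each label name to the sorted timestamps (ms) where it was detected."""
    timeline = defaultdict(list)
    for entry in response.get("Labels", []):
        label = entry["Label"]
        if label["Confidence"] >= min_confidence:
            timeline[label["Name"]].append(entry["Timestamp"])
    return {name: sorted(stamps) for name, stamps in timeline.items()}


def fetch_video_labels(job_id: str) -> Dict[str, List[int]]:
    """Page through GetLabelDetection results (needs AWS credentials)."""
    import boto3  # imported lazily so labels_by_name works without AWS access

    client = boto3.client("rekognition")
    merged: Dict[str, Any] = {"Labels": []}
    token = None
    while True:
        kwargs = {"JobId": job_id, "SortBy": "TIMESTAMP"}
        if token:
            kwargs["NextToken"] = token
        page = client.get_label_detection(**kwargs)
        merged["Labels"].extend(page.get("Labels", []))
        token = page.get("NextToken")
        if not token:
            break
    return labels_by_name(merged)
```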


Person Tracking


Live Streaming Face Recognition


Activity recognition


One Solution for All

• Stored Video (Amazon S3): Media Search Index, Unsafe Video Detection, Investigative Analysis

• Video Live Stream (Amazon Kinesis Video Streams): Public Safety Immediate Response, Home Monitoring

Rekognition Media Use Cases

• Acquisition: Preprocessing and Optimization

• Digital Supply Chain: Tag on Ingest, Live and VOD Feature Extraction, Celebrity Detection

• DAM and Archive: Auto-categorization, Metadata Augmentation

• Visual Effects and Editing: Application and Filesystem, Texture and Asset Search

• Playout and Distribution: Filtering and Quality Control

• OTT: Filtering and Quality Control

• Publishing: Value Add, API-Based Services

• Analytics: Sentiment Analysis, Other Amazon AI Services (Lex, Polly)

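Speech: Amazon Polly

Convert text into lifelike speech

• Supports 52 voices across 25 languages

• Includes Korean (voice: Seoyeon)

• Fast response times, suitable for real-time systems

• Service endpoint available in the Seoul Region

• Synthesized audio files can be freely stored, replayed, and distributed

• Unlimited use of the generated audio without a separate contract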
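
As a hedged illustration of the points above, the sketch below synthesizes Korean speech with the Seoyeon voice through the Seoul Region endpoint (`ap-northeast-2`) and stores the audio as an MP3 file. The helper names and the output path are assumptions, not part of the talk.

```python
from typing import Any, Dict


def build_polly_request(text: str, voice_id: str = "Seoyeon",
                        output_format: str = "mp3") -> Dict[str, Any]:
    """Build the keyword arguments for Polly's SynthesizeSpeech call."""
    if not text.strip():
        raise ValueError("text must not be empty")
    return {"Text": text, "VoiceId": voice_id, "OutputFormat": output_format}


def synthesize_to_file(text: str, path: str,
                       region_name: str = "ap-northeast-2") -> None:
    """Synthesize `text` and save the audio stream (needs AWS credentials)."""
    import boto3  # imported lazily so the request builder works without AWS access

    client = boto3.client("polly", region_name=region_name)
    response = client.synthesize_speech(**build_polly_request(text))
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
```

A call such as `synthesize_to_file("안녕하세요", "hello.mp3")` would produce a file that can be stored and replayed like any other audio asset.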


Convert text into lifelike speech

• US English Male (Matthew)

• German Female (Vicki)

• Indian English Female (Aditi)

• Japanese Male (Takumi)

• Korean Female (Seoyeon)


Speech marks to synchronize audio and video
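
Speech marks are returned as one JSON object per line when `synthesize_speech` is called with `OutputFormat="json"` and a `SpeechMarkTypes` list. The sketch below, with hypothetical helper names, parses that stream into word-level cues that a player could use to highlight text in time with the audio.

```python
import json
from typing import Any, Dict, List, Tuple


def parse_speech_marks(raw: str) -> List[Dict[str, Any]]:
    """Parse newline-delimited speech mark JSON into a list of dicts."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def word_cues(marks: List[Dict[str, Any]]) -> List[Tuple[int, str]]:
    """Return (offset in milliseconds, word) pairs for word-type marks only."""
    return [(mark["time"], mark["value"]) for mark in marks if mark["type"] == "word"]
```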

Customer Case: Amazon Polly reads 조선일보 (Chosun Ilbo) news aloud

• Echo Alexa Skill - Chosun Flash Briefing (조선일보)

• Create a beta service using Amazon Polly (Demo link)

• Architecture using AWS serverless services

Speech: New: Amazon Transcribe

Automatic Speech Recognition

• Time stamps and confidence scores

• Support for both regular and telephony audio

• Punctuation

• S3 integration

• English and Spanish, with more to come

Subtitles for VoD, Broadcast Closed Caption, …

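Language: Amazon Lex

• Contact center: chatbots, customer service

• Informational bots: chatbots that respond to customers' everyday requests

• Application bots: a powerful interface for mobile applications

• Enterprise productivity bots: improve the efficiency of enterprise workflows

• IoT bots: add conversational capabilities to devices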

Language: New: Amazon Translate (Preview)

Real-time translation service

• Real-time translation

• Powered by deep learning

• 12 language pairs (more to come)

• Language detection
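
A small sketch of how captions might be translated with the `translate_text` call. The client is passed in (for example `boto3.client("translate")`) so the logic can be exercised with a stand-in object; the function name and the returned dictionary keys are hypothetical, and `"auto"` asks the service to detect the source language.

```python
from typing import Any, Dict, Iterable, List


def translate_captions(client: Any, captions: Iterable[str],
                       target_language: str,
                       source_language: str = "auto") -> List[Dict[str, str]]:
    """Translate each caption and record the language the service reports."""
    results = []
    for caption in captions:
        response = client.translate_text(
            Text=caption,
            SourceLanguageCode=source_language,
            TargetLanguageCode=target_language,
        )
        results.append({
            "source": caption,
            "translated": response["TranslatedText"],
            "detected_language": response["SourceLanguageCode"],
        })
    return results
```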

Language: New: Amazon Comprehend

Natural Language Processing

• Sentiment

• Entities

• Languages

• Key phrases

• Topic modeling

Powered by deep learning
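
As an illustration, the sketch below condenses `detect_sentiment` and `detect_entities` responses into a small metadata record for an article or transcript. The function names, the record layout, and the 0.8 score cut-off are assumptions made for this example.

```python
from typing import Any, Dict


def summarize_text_metadata(sentiment_response: Dict[str, Any],
                            entities_response: Dict[str, Any],
                            min_score: float = 0.8) -> Dict[str, Any]:
    """Keep the overall sentiment and the distinct high-confidence entities."""
    entities = sorted({
        entity["Text"]
        for entity in entities_response.get("Entities", [])
        if entity["Score"] >= min_score
    })
    return {"sentiment": sentiment_response["Sentiment"], "entities": entities}


def analyze_text(text: str, language_code: str = "en") -> Dict[str, Any]:
    """Call Comprehend and summarize the results (needs AWS credentials)."""
    import boto3  # imported lazily so the summarizer works without AWS access

    client = boto3.client("comprehend")
    return summarize_text_metadata(
        client.detect_sentiment(Text=text, LanguageCode=language_code),
        client.detect_entities(Text=text, LanguageCode=language_code),
    )
```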

Customer Case : Media & Entertainment

Opportunities

• Petabytes of images

• 100+ years of content

• How can we enrich our metadata in AWS?

• How can we unleash the value of content we already own once in AWS?


Challenges

• Niche Image Categories

• Low & Ultra High Resolutions

• Artifacts & Noise

• Black and White Footage

• Historical Context

• High Accuracy Required


Digital Transformation

AWS Migration

• Storage / Archive

• Editing & Publishing

• Video Streaming

• Web Apps


Object & Scene Detection: Amazon Rekognition

Identify objects, scenes & concepts, and provide confidence scores

• Example labels: Shoe, Ramp, Person

• Example labels: Sky, Person, Eagle, Desert, Mountain

Architecture: Archive, DAM/MAM, searching metadata, and AI processing on AWS

[Architecture diagram. Components shown: Browser / API Client, CloudFront, API Gateway, Service Frontend, Lambda(s), Step Functions (Image Processing), Rekognition (Label Detection), UUID Generator, Elasticsearch (Realtime Search, Client Lookup), Metadata Service, DynamoDB (Asset Metadata), S3 Image Storage (Content Archive). Stages: Ingest, Processing, Delivery.]

Sample Rekognition response fragment shown on the slide (truncated):

{

"FaceMatches": [

{"Face": {"BoundingB

"Height": 0.2683333456516266,

"Left": 0.5099999904632568,

"Top": 0.1783333271741867,

"Width": 0.17888888716697693},
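
One way the ingest step might assemble an asset metadata record is sketched below: a UUID identifies the asset, and label confidences are converted to `Decimal`, because the boto3 DynamoDB resource API does not accept Python floats. The attribute names and the function name are hypothetical and are not taken from the architecture in the talk.

```python
import uuid
from decimal import Decimal
from typing import Any, Dict, Optional


def build_asset_item(s3_key: str, labels_response: Dict[str, Any],
                     asset_id: Optional[str] = None) -> Dict[str, Any]:
    """Combine an S3 key and DetectLabels output into a DynamoDB-ready item."""
    return {
        "asset_id": asset_id or str(uuid.uuid4()),  # generated when not supplied
        "s3_key": s3_key,
        "labels": [
            {
                "name": label["Name"],
                # Decimal(str(...)) avoids binary float artifacts in the stored value.
                "confidence": Decimal(str(label["Confidence"])),
            }
            for label in labels_response.get("Labels", [])
        ],
    }
```

The resulting item could then be written with `Table.put_item(Item=item)` on a table of your choosing and indexed separately for search.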


Back to the Challenges – Deep learning required

• Custom Concepts: NLP (Rekognition + spaCy, others)

• Specialized Categories: Transfer learning with fine-tuning

• Black & White Footage: Deep learning-based colorization

• Low Resolutions: Convolutional neural net image scaling

• Niche & Historical Context: Crowd working & OCR

Real-Time User-Guided Image Colorization with Learned Deep Priors: https://richzhang.github.io/ideepcolor


Deep Learning in the AWS Cloud

AWS ML Stack - revisited

• Application Services: Vision (Rekognition Image, Rekognition Video), Speech (Amazon Polly, Transcribe), Language (Lex, Translate, Comprehend)

• Platform Services: Amazon SageMaker, AWS DeepLens, Amazon Machine Learning, Mechanical Turk, Spark & EMR

• Frameworks & Infrastructure: AWS Deep Learning AMI; GPU (P3 Instances), Mobile, CPU, IoT (Greengrass); Apache MXNet, PyTorch, Cognitive Toolkit, Keras, Caffe2 & Caffe, TensorFlow, Gluon

Checkpoints

• AWS ML (Machine Learning) Stack

• AWS ML Applications: Vision, Speech, Language

• AWS Media Capabilities – 8 Key media workloads

• Metadata Enrichment using AWS ML applications / platform services

• Continuous Update / Refinement is important

After this session…

• Amazon AI Home Page:

https://aws.amazon.com/blogs/ai/

• Amazon Rekognition Home Page:

https://aws.amazon.com/rekognition

• Amazon Polly Home Page:

https://aws.amazon.com/polly/

Thank you
