AWS Media Day - Developing Media Services with AWS Artificial Intelligence Services (김기완...
TRANSCRIPT
김기완, Solutions Architect
Developing Media Services with AWS Artificial Intelligence Services
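Agenda
• The need for artificial intelligence in the media and entertainment industry
• Introduction to AWS artificial intelligence services
• AWS ML Stack
• Vision, Speech, Language
• Deep Learning framework / Amazon SageMaker
• Customer case: Chosun Ilbo (조선일보)
• Use cases for artificial intelligence in media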
The Need for Artificial Intelligence: Media & Entertainment
There are 3,700,000,000 Internet users in 2017*
1,200,000,000 photos will be taken in 2017 (9% YoY growth)*
50% of 2016 Internet traffic was video, and will likely be 70% by 2021**
Multi-petabyte asset storage with > 1 PB MoM growth is commonplace on AWS
Sources: * InfoTrends Worldwide, **StreamingMedia.com
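Using Artificial Intelligence in the Media Industry
• Richer Metadata
  • Automated metadata generation for B2B and B2C
  • Use of IMDb and active use of cast information
  • Large-scale batch processing of digital archives
• Enhanced Experiences
  • Intelligent content filtering
  • Using the right content to improve the user experience
• Security and Analytics
  • Automated filtering of UGC (User Generated Content)
• Workflow improvement
  • Audience engagement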
Vision, Speech, and Language
Artificial Intelligence at Amazon
Fulfilment & Logistics
Existing Products
New Products
Search & Discovery
ML @ AWS: Our mission
Put machine learning in the hands of every developer and data scientist
AWS ML Stack
Application Services
• Vision: Rekognition Image, Rekognition Video
• Speech: Amazon Polly, Transcribe
• Language: Lex, Translate, Comprehend
Platform Services
• Amazon SageMaker, AWS DeepLens, Amazon Machine Learning, Mechanical Turk, Spark & EMR
Frameworks & Infrastructure
• AWS Deep Learning AMI
• Apache MXNet, PyTorch, Cognitive Toolkit, Keras, Caffe2 & Caffe, TensorFlow, Gluon
• GPU (P3 Instances), CPU, Mobile, IoT (Greengrass)
Vision : Amazon Rekognition
Object and Scene Detection
Facial Analysis
Face Comparison
Facial Recognition
Celebrity Recognition
Image Moderation
New Feature : Text in Image
DetectText results:
| Text | Confidence |
| IT'S | 97% |
| MONDAY | 99% |
| but | 97% |
| keep | 96% |
| Smiling | 99% |
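The DetectText results above can be post-processed in a few lines. The following is a minimal sketch, not the presenter's code: it assumes the response shape of the Rekognition DetectText API (a TextDetections list with DetectedText, Type and Confidence), and the helper names, the threshold and the sample values are illustrative assumptions.

```python
from typing import Any, Dict, List, Tuple


def confident_text(response: Dict[str, Any],
                   min_confidence: float = 95.0) -> List[Tuple[str, float]]:
    """Return (text, confidence) pairs for WORD detections at or above the threshold."""
    results = []
    for detection in response.get("TextDetections", []):
        if detection.get("Type") != "WORD":
            continue  # skip LINE entries so each word is reported once
        if detection["Confidence"] >= min_confidence:
            results.append((detection["DetectedText"], detection["Confidence"]))
    return results


def detect_text_in_s3_image(bucket: str, key: str) -> Dict[str, Any]:
    """Call DetectText on an image in S3 (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so confident_text works without the SDK

    client = boto3.client("rekognition")
    return client.detect_text(Image={"S3Object": {"Bucket": bucket, "Name": key}})


# Hand-written sample in the shape of a DetectText response (not real output).
sample = {
    "TextDetections": [
        {"DetectedText": "IT'S MONDAY", "Type": "LINE", "Confidence": 98.0},
        {"DetectedText": "IT'S", "Type": "WORD", "Confidence": 97.0},
        {"DetectedText": "MONDAY", "Type": "WORD", "Confidence": 99.0},
        {"DetectedText": "keep", "Type": "WORD", "Confidence": 96.0},
    ]
}
print(confident_text(sample, min_confidence=97.0))
```

Filtering on Type keeps the line-level and word-level detections from being counted twice; the threshold is a tuning choice, not a recommended value.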
New Feature : Real-Time Face Search
Real-time face recognition against tens of millions of faces
<0.5 second response time
Up to 10M faces
Enable immediate response
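As a rough illustration of face search against a collection, the sketch below picks the strongest match from a SearchFacesByImage-style response. It assumes the documented response shape (FaceMatches entries with Similarity and a Face object); the function names, the collection ID parameter, the threshold and the sample data are hypothetical.

```python
from typing import Any, Dict, Optional


def best_face_match(response: Dict[str, Any],
                    min_similarity: float = 90.0) -> Optional[Dict[str, Any]]:
    """Return the highest-similarity match at or above the threshold, or None."""
    matches = [m for m in response.get("FaceMatches", [])
               if m["Similarity"] >= min_similarity]
    if not matches:
        return None
    top = max(matches, key=lambda m: m["Similarity"])
    return {
        "face_id": top["Face"]["FaceId"],
        "external_id": top["Face"].get("ExternalImageId"),
        "similarity": top["Similarity"],
    }


def search_collection(collection_id: str, image_bytes: bytes,
                      threshold: float = 90.0) -> Dict[str, Any]:
    """Search a face collection with an image (requires boto3 and AWS credentials)."""
    import boto3  # lazy import: best_face_match has no SDK dependency

    client = boto3.client("rekognition")
    return client.search_faces_by_image(
        CollectionId=collection_id,
        Image={"Bytes": image_bytes},
        FaceMatchThreshold=threshold,
        MaxFaces=5,
    )


# Hand-written sample response (IDs are made up for illustration).
sample_matches = {
    "FaceMatches": [
        {"Similarity": 92.5, "Face": {"FaceId": "face-a", "ExternalImageId": "anchor_01"}},
        {"Similarity": 99.1, "Face": {"FaceId": "face-b", "ExternalImageId": "anchor_02"}},
    ]
}
print(best_face_match(sample_matches))
```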
How can we apply these powerful capabilities to video?
Frame-based analysis for videos
• AWS Answers (https://aws.amazon.com/answers/media-entertainment/video-frame-based-analysis/)
• Serverless architecture: AWS Lambda, Amazon DynamoDB, AWS IoT, Amazon SNS, Amazon S3, Amazon SQS, Amazon Rekognition
• ffmpeg
• Using Live Stream?
• Scalability?
• More features?
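The frame-based approach above starts by cutting the video into still images with ffmpeg, which are then sent to Rekognition one by one. A minimal sketch of that first step follows; the function names, output file pattern and sampling rate are assumptions, and the AWS Answers solution may do this differently.

```python
import subprocess
from pathlib import Path
from typing import List


def build_frame_extract_command(video_path: str, out_dir: str,
                                fps: float = 1.0) -> List[str]:
    """Build an ffmpeg command that samples `fps` frames per second as JPEG files."""
    pattern = str(Path(out_dir) / "frame_%06d.jpg")
    return [
        "ffmpeg",
        "-i", video_path,      # input video
        "-vf", f"fps={fps}",   # video filter: sample at a fixed frame rate
        "-q:v", "2",           # JPEG quality (lower value means higher quality)
        pattern,
    ]


def extract_frames(video_path: str, out_dir: str, fps: float = 1.0) -> None:
    """Run ffmpeg; requires the ffmpeg binary to be on PATH."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_frame_extract_command(video_path, out_dir, fps), check=True)


print(" ".join(build_frame_extract_command("input.mp4", "frames", fps=1)))
```

Sampling one frame per second keeps the number of image API calls proportional to the video length, which is where the questions on the slide about live streams and scalability come from.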
Object and Activity Detection
Person Tracking
Face Recognition
Real-time Live Stream
Content Moderation
Celebrity Recognition
New Service : Amazon Rekognition Video
Video Analysis
New Service : Amazon Rekognition Video
Object, Scene and Activity Detection
Blowing a candle
Drinking
New Service : Amazon Rekognition Video
Person Tracking
New Service : Amazon Rekognition Video
Live Streaming Face Recognition
New Service : Amazon Rekognition Video
Activity recognition
New Service : Amazon Rekognition Video
One Solution for All
Stored Video
Amazon S3
Media Search Index
Unsafe Video Detection
Investigative Analysis
Video Live Stream
Amazon Kinesis Video Stream
Public Safety Immediate Response
Home Monitoring
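For the stored-video path (video in Amazon S3), Rekognition Video works as an asynchronous job: StartLabelDetection returns a job ID, and GetLabelDetection later returns labels with millisecond timestamps. The sketch below, with hypothetical helper names and sample data, groups such labels per second to feed a media search index.

```python
from collections import defaultdict
from typing import Any, Dict, List


def labels_by_second(response: Dict[str, Any],
                     min_confidence: float = 80.0) -> Dict[int, List[str]]:
    """Group label names by the second of video in which they were detected."""
    timeline = defaultdict(set)
    for item in response.get("Labels", []):
        label = item["Label"]
        if label["Confidence"] >= min_confidence:
            # Timestamp is in milliseconds from the start of the video.
            timeline[item["Timestamp"] // 1000].add(label["Name"])
    return {second: sorted(names) for second, names in sorted(timeline.items())}


def start_label_job(bucket: str, key: str) -> str:
    """Start label detection on a video in S3 (requires boto3 and AWS credentials)."""
    import boto3  # lazy import: labels_by_second has no SDK dependency

    client = boto3.client("rekognition")
    response = client.start_label_detection(
        Video={"S3Object": {"Bucket": bucket, "Name": key}},
        MinConfidence=50,
    )
    return response["JobId"]


# Hand-written sample in the shape of a GetLabelDetection response.
sample_video_labels = {
    "Labels": [
        {"Timestamp": 0, "Label": {"Name": "Person", "Confidence": 98.2}},
        {"Timestamp": 500, "Label": {"Name": "Cake", "Confidence": 91.0}},
        {"Timestamp": 1200, "Label": {"Name": "Drinking", "Confidence": 85.4}},
        {"Timestamp": 1300, "Label": {"Name": "Cup", "Confidence": 60.0}},
    ]
}
print(labels_by_second(sample_video_labels))
```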
Rekognition Media Use Cases
• Acquisition: Preprocessing and Optimization
• Digital Supply Chain: Tag on Ingest, Live and VOD Feature Extraction, Celebrity Detection
• DAM and Archive: Auto-categorization, Metadata Augmentation
• Visual Effects and Editing: Application and Filesystem, Texture and Asset Search
• Playout and Distribution: Filtering and Quality Control
• OTT: Filtering and Quality Control
• Publishing: Value Add, API-Based Services
• Analytics: Sentiment Analysis, Other Amazon AI Services (Lex, Polly)
Demo
Speech : Amazon Polly
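Convert text into life-like speech
• Supports 25 languages and 52 voices
• Includes Korean (Seoyeon)
• Fast response times, so it can be used in real-time systems
• Service endpoint available in the Seoul Region
• Converted audio files can be freely stored, played back and distributed
• Unlimited use of the generated audio without a separate contract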
Speech : Amazon Polly
Convert text into life-like speech
• US English Male (Matthew)
• German Female (Vicki)
• Indian English Female (Aditi)
• Japanese Male (Takumi)
• Korean Female (Seoyeon)
Speech : Amazon Polly
Speech marks to synchronize audio and video
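Speech marks are returned by Polly as newline-delimited JSON, one object per mark, with a time offset in milliseconds. The following sketch shows one way to turn word marks into timed cues for highlighting text in sync with the audio; the helper names, the total duration argument and the sample marks are assumptions for illustration.

```python
import json
from typing import Any, Dict, List, Tuple


def parse_speech_marks(raw: str) -> List[Dict[str, Any]]:
    """Parse newline-delimited JSON speech marks into a list of dicts."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def word_cues(marks: List[Dict[str, Any]], total_ms: int) -> List[Tuple[int, int, str]]:
    """Return (start_ms, end_ms, word); each word ends where the next one begins."""
    words = [m for m in marks if m["type"] == "word"]
    cues = []
    for i, mark in enumerate(words):
        end = words[i + 1]["time"] if i + 1 < len(words) else total_ms
        cues.append((mark["time"], end, mark["value"]))
    return cues


def fetch_speech_marks(text: str, voice_id: str = "Seoyeon") -> str:
    """Request word-level speech marks (requires boto3 and AWS credentials)."""
    import boto3  # lazy import: the parsing helpers have no SDK dependency

    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=text,
        VoiceId=voice_id,
        OutputFormat="json",        # speech marks are returned as JSON lines
        SpeechMarkTypes=["word"],
    )
    return response["AudioStream"].read().decode("utf-8")


# Hand-written sample speech marks (not real Polly output).
sample_marks = (
    '{"time": 6, "type": "word", "start": 0, "end": 5, "value": "Hello"}\n'
    '{"time": 420, "type": "word", "start": 6, "end": 11, "value": "world"}\n'
)
print(word_cues(parse_speech_marks(sample_marks), total_ms=900))
```

The audio itself is requested in a separate call with an audio output format such as mp3; the cues are then matched to playback time on the client.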
Customer Case : Amazon Polly reads Chosun Ilbo (조선일보) news to you
Echo Alexa Skill - Chosun Flash Briefing (조선일보)
Customer Case : Amazon Polly reads Chosun Ilbo (조선일보) news to you
Create a beta service using Amazon Polly
Demo link
Customer Case : Amazon Polly reads Chosun Ilbo (조선일보) news to you
Architecture using the AWS serverless services
Speech : New : Amazon Transcribe
Automatic Speech Recognition
• Time stamps and confidence scores
• Support for both regular and telephony audio
• Punctuation
• Amazon S3 integration
• English and Spanish (Hello / Hola), with more to come
Subtitles for VoD, Broadcast Closed Caption, …
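Because Transcribe returns a start and end time for every word, subtitles can be assembled from the result JSON. The sketch below groups words into SRT cues; it assumes the results.items layout of a Transcribe result file, and the grouping rule (a fixed number of words per cue), the function names and the sample data are illustrative choices only.

```python
from typing import Any, Dict, List, Tuple


def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(transcript: Dict[str, Any], max_words: int = 7) -> str:
    """Group word items into cues of at most max_words and render them as SRT."""
    cues: List[Tuple[float, float, str]] = []
    words: List[str] = []
    start = end = 0.0
    for item in transcript["results"]["items"]:
        content = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            if words:
                words[-1] += content  # punctuation items carry no timestamps
            continue
        if len(words) >= max_words:  # close the current cue before adding a word
            cues.append((start, end, " ".join(words)))
            words = []
        if not words:
            start = float(item["start_time"])
        end = float(item["end_time"])
        words.append(content)
    if words:
        cues.append((start, end, " ".join(words)))
    blocks = [f"{i}\n{srt_time(s)} --> {srt_time(e)}\n{text}"
              for i, (s, e, text) in enumerate(cues, start=1)]
    return "\n\n".join(blocks) + "\n"


# Hand-written sample in the shape of a Transcribe result file.
sample_transcript = {"results": {"items": [
    {"type": "pronunciation", "start_time": "0.0", "end_time": "0.5",
     "alternatives": [{"content": "Hello", "confidence": "0.99"}]},
    {"type": "punctuation", "alternatives": [{"content": ","}]},
    {"type": "pronunciation", "start_time": "0.6", "end_time": "1.1",
     "alternatives": [{"content": "Hola", "confidence": "0.98"}]},
    {"type": "punctuation", "alternatives": [{"content": "."}]},
]}}
print(to_srt(sample_transcript, max_words=1))
```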
Language : Amazon Lex
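• Contact center
  • Chatbots, customer service
• Information delivery / search bots
  • Chatbots that respond to customers' everyday requests
• Application bots
  • Provide a powerful interface for mobile applications
• Enterprise productivity bots
  • Improve the efficiency of enterprise workflows
• IoT bots
  • Add conversational capabilities to devices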
Language : New : Amazon Translate (Preview)
Real-time translation service
• Real-time translation
• Powered by deep learning
• 12 language pairs (more to come)
• Language detection
Language : New : Amazon Comprehend
Natural Language Processing
• Sentiment
• Entities
• Languages
• Key phrases
• Topic modeling
Powered by Deep Learning
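As one example of using Comprehend output for metadata, the sketch below turns a DetectEntities-style response into tags grouped by entity type. It assumes the documented response shape (an Entities list with Text, Type and Score); the function names, the score threshold and the sample text are hypothetical.

```python
from typing import Any, Dict, List


def entity_tags(response: Dict[str, Any], min_score: float = 0.8) -> Dict[str, List[str]]:
    """Group entity texts by entity type, dropping low scores and duplicates."""
    tags: Dict[str, List[str]] = {}
    for entity in response.get("Entities", []):
        if entity["Score"] < min_score:
            continue
        bucket = tags.setdefault(entity["Type"], [])
        if entity["Text"] not in bucket:
            bucket.append(entity["Text"])
    return tags


def detect_entities(text: str, language_code: str = "en") -> Dict[str, Any]:
    """Call Comprehend DetectEntities (requires boto3 and AWS credentials)."""
    import boto3  # lazy import: entity_tags has no SDK dependency

    client = boto3.client("comprehend")
    return client.detect_entities(Text=text, LanguageCode=language_code)


# Hand-written sample in the shape of a DetectEntities response.
sample_entities = {
    "Entities": [
        {"Text": "Seoul", "Type": "LOCATION", "Score": 0.99},
        {"Text": "Amazon", "Type": "ORGANIZATION", "Score": 0.97},
        {"Text": "Seoul", "Type": "LOCATION", "Score": 0.95},
        {"Text": "Monday", "Type": "DATE", "Score": 0.42},
    ]
}
print(entity_tags(sample_entities))
```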
Demo
Customer Case : Media & Entertainment
Opportunities
• Petabytes of images
• 100+ years of content
• How can we enrich our metadata in AWS?
• How can we unleash the value of content we already own once in AWS?
Customer Case : Media & Entertainment
Challenges
• Niche Image Categories
• Low & Ultra High Resolutions
• Artifacts & Noise
• Black and White Footage
• Historical Context
• High Accuracy Required
Customer Case : Media & Entertainment
Digital Transformation
AWS Migration
• Storage / Archive
• Editing & Publishing
• Video Streaming
• Web Apps
Customer Case : Media & Entertainment
Object & Scene Detection : Amazon Rekognition
Identify objects, scenes & concepts, and provide confidence scores
Example labels shown on the slide images: Shoe, Ramp, Person; Sky, Person, Eagle, Desert, Mountain
Customer Case : Media & Entertainment
Archive, DAM/MAM, Searching metadata, AI processing on AWS
[Architecture diagram. Stage labels: Ingest, Image Processing, Processing Service, Metadata Service, Asset Metadata, Content Archive, Realtime Search, Client Lookup, Delivery, Frontend. Components: Browser / API Client, CloudFront, API Gateway, Lambda(s), Step Functions, Rekognition (Label Detection), UUID Generator, DynamoDB, ElasticSearch, S3 Image Storage.]
JSON fragment shown on the slide (truncated):
{
"FaceMatches": [
{"Face": {"BoundingB
"Height": 0.2683333456516266,
"Left": 0.5099999904632568,
"Top": 0.1783333271741867,
"Width": 0.17888888716697693},
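The ingest path in this architecture (image in S3, labels from Rekognition, a generated UUID, metadata stored for search) can be sketched roughly as below. This is an assumption-laden illustration rather than the customer's implementation: the record fields, function names and threshold are invented for the example, and only the DetectLabels response shape (Labels with Name and Confidence) comes from the Rekognition API.

```python
import uuid
from typing import Any, Dict, Optional


def build_asset_metadata(bucket: str, key: str, labels_response: Dict[str, Any],
                         min_confidence: float = 80.0,
                         asset_id: Optional[str] = None) -> Dict[str, Any]:
    """Build a metadata record for one image from a DetectLabels-style response."""
    labels = [
        {"name": label["Name"], "confidence": round(label["Confidence"], 2)}
        for label in labels_response.get("Labels", [])
        if label["Confidence"] >= min_confidence
    ]
    return {
        "asset_id": asset_id or str(uuid.uuid4()),  # UUID generated at ingest
        "s3_uri": f"s3://{bucket}/{key}",
        "labels": labels,
    }


def detect_labels(bucket: str, key: str) -> Dict[str, Any]:
    """Call Rekognition DetectLabels (requires boto3 and AWS credentials)."""
    import boto3  # lazy import: build_asset_metadata has no SDK dependency

    client = boto3.client("rekognition")
    return client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=20,
    )


# Hand-written sample response using label names from the slides.
sample_labels = {
    "Labels": [
        {"Name": "Eagle", "Confidence": 97.31},
        {"Name": "Desert", "Confidence": 88.0},
        {"Name": "Mountain", "Confidence": 61.5},
    ]
}
record = build_asset_metadata("example-archive", "images/0001.jpg", sample_labels,
                              asset_id="demo-id")
print(record)
```

If such a record is written to DynamoDB through the boto3 resource API, the float confidences need to be converted to Decimal first, since that API rejects Python floats.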
Customer Case : Media & Entertainment
Back to the Challenges – Deep learning required
• Custom Concepts
NLP – Rekognition + spaCy, Others
• Specialized Categories
Transfer Learning w/ Finetuning
• Black & White Footage
Deep Learning-based Colorization
• Low Resolutions
Convolutional Neural Net Image Scaling
• Niche & Historical Context
Crowd working & OCR
Real-Time User-Guided Image Colorization with Learned Deep Priors
https://richzhang.github.io/ideepcolor
Customer Case : Media & Entertainment
Deep Learning in the AWS Cloud
AWS ML Stack - revisited
Checkpoints
• AWS ML (Machine Learning) Stack
• AWS ML Applications : Vision, Speech, Language
• AWS Media Capabilities – 8 Key media workloads
• Metadata Enrichment using AWS ML applications / platform services
• Continuous Update / Refinement is important
After this session…
• Amazon AI Home Page:
https://aws.amazon.com/blogs/ai/
• Amazon Rekognition Home Page:
https://aws.amazon.com/rekognition
• Amazon Polly Home Page:
https://aws.amazon.com/polly/
Thank you