guides to analyzing webkit performance
DESCRIPTION
This talk will be about approaching WebKit. Goals: (1) Show ways to approach the codebase. (2) Establish a "do not guess but measure" attitude.
TRANSCRIPT
Guides To Analyzing WebKit Performance
– Looking at the internals –
Holger Freyther
Developer, WebKit Project
Jim Huang (黃敬群)
Developer & Co-founder, 0xlab
April 28, 2011 / Android System Development Forum
Rights to copy
Attribution – ShareAlike 3.0
You are free
• to copy, distribute, display, and perform the work
• to make derivative works
• to make commercial use of the work
Under the following conditions
• Attribution. You must give the original author credit.
• Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.
For any reuse or distribution, you must make clear to others the license terms of this work.
Any of these conditions can be waived if you get permission from the copyright holder.
Your fair use and other rights are in no way affected by the above.
License text: http://creativecommons.org/licenses/by-sa/3.0/legalcode
© Copyright 2011 0xlab
http://0xlab.org/
Corrections, suggestions, contributions and translations are welcome!
Latest update: April 28, 2011
• Optimize WebKit for the Content?
• Optimize the Content for WebKit?
The Goal of This Talk (1)
• For optimizing content, see the remote inspector work
• Another example: Opera Mobile Accelerator
  http://www.opera.com/press/releases/2004/06/09/
• This talk will be about approaching WebKit
The Goal of This Talk (2)
• Take the fear out of working on a big project.
• Show ways to approach the codebase.
• Establish a "do not guess but measure" attitude.
The Goal of This Talk (3)
Agenda
(1) What is WebKit (project)?
(2) Android & WebKit
(3) How to prepare Android to measure?
What Groups are Responsible for Managing it?
[Diagram: blocks labeled Graphics, Networking, WebCore, JavaScriptCore, and the ports WebKit GTK, WebKit Mac, QtWebKit]
WebKit Architecture
[Diagram: the WebKit GTK port. Labels: cairo & pango, curl / soup, WebCore, WebKit GTK, JavaScriptCore]
(Flexible) WebKit Implementations
[Diagram. Labels: Gtk+ window, Gtk+, QtWebKit, WebCore event, Refresh the surface (expose event), cairo, "expose event" callback, Gtk+ surface]
WebKit drawn to Gtk+
[Diagram. Labels: Gtk+ program, Gtk+, QtWebKit, WebCore event, Refreshing the surface, cairo, "update-requested" callback, "update-requested" event, cairo surface (every supported cairo surface)]
WebKit drawn to Gtk+ applications
• ~80 Reviewers
• ~110 Committers
• 8 ports in the tree
• Apple and Google (Chromium) are major contributors
• Many commits per day
What Is the WebKit Project?
• Check http://webkit.org/projects/goals.html
• Goals vs. Non-Goals
• Content Engine, Security, Performance and more
• Every change needs review, no performance regression allowed
• But performance tests are private due to copyright laws
How Does WebKit Work?
• Android is not involved with the WebKit project
• Android style open source model
• Android is using Chromium as upstream
• Android does not include the data for Quality Assurance (tests)
• Who is fixing known security issues in the Android code?
Android & WebKit
[Diagram. Labels: Chromium, Android's WebKit fork, WebKit, JNI, Skia bridge, Gtk+, WebCore event, Refresh the surface (expose event), skia, Surface]
WebKit in Android
[Diagram: Android graphics stack. Labels: android.webkit.WebViewCore, android.webkit.WebView, v8, skia, libpixelflinger, SurfaceFlinger, libui (Surface, Overlay, Camera, EglWindows, format, Key / Event), OpenGL|ES, android.graphics.Canvas, android.view.Surface, android.view.View, GLSurfaceView, javax.microedition.khronos.opengles, UI Element, Graphic JNI, Surface JNI, OpenGL JNI, com.google.android.gles_jni, FrameBuffer Driver, Event Input Driver, Native Framework, JAVA Framework]
View and Canvas → Surface
Java Graphics is implemented via JNI
• What is performance?
• How to measure it on GNU/Linux?
• How to do it on Android/ARM?
Moving To Performance Now
• Amount of useful work accomplished
• Examples:
  • How fast does the page load?
  • How many frames per second are drawn?
  • How little/much bandwidth is used?
• Optimization is mostly a trade-off between memory and CPU usage
Computer Performance
• Do not assume, measure it!
• Have a manual or automatic testcase
• Observe the system while running the testcase
• Analyze the situation, make changes
• Repeat until considered good enough
Performance Experiments
• Easy to set up
• Open a site and wait, or scroll
• Good for getting an idea
• Bad for repeating and comparing results
Performance Experiments – Manual
• More difficult to create
• Requires stable content
• Should allow comparing results
• Talos and others as a framework
  • https://wiki.mozilla.org/Buildbot/Talos
Performance Experiments – Automatic
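An automatic experiment needs repeated runs whose raw timings are kept for later comparison. The sketch below is one minimal way to do that in plain shell. The helper name time_runs and the output file samples.txt are made up for illustration, and it assumes GNU date (for the %N nanosecond format).

```shell
# time_runs N CMD [ARGS...]
# Hypothetical helper: runs CMD N times and prints one wall-clock
# duration in milliseconds per line. Assumes GNU date (%N = nanoseconds).
# The body runs in a subshell so its variables do not leak to the caller.
time_runs() (
    runs=$1
    shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s%N)
        "$@" > /dev/null 2>&1 || true   # the exit status is ignored on purpose
        end=$(date +%s%N)
        echo $(( (end - start) / 1000000 ))
        i=$((i + 1))
    done
)

# Usage: five timed runs of a placeholder workload, saved for comparison.
time_runs 5 sleep 0.1 > samples.txt
```

Keeping the raw samples rather than a single number makes it possible to compare a run before and after a change.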
• perf (new way)
• oprofile (old way)
• Both are sampling profilers
• ARM Performance Counter; PMU (Performance Monitoring Unit)
How to Observe on GNU/Linux
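For perf, a session around a test program could look like the sketch below. The wrapper name profile is hypothetical, and ./tst_something is only the placeholder test binary used elsewhere in this deck. The perf options shown (record -g -o, report -i --stdio) are standard, but whether call graphs are usable depends on how the binary was built.

```shell
# profile OUTFILE CMD [ARGS...]
# Hypothetical wrapper: records CMD with call graphs when perf is installed,
# otherwise just runs CMD so the testcase still executes.
profile() {
    out=$1
    shift
    if command -v perf > /dev/null 2>&1; then
        perf record -g -o "$out" -- "$@"
    else
        echo "perf not found; running without profiling" >&2
        "$@"
    fi
}

# Usage (not executed here):
#   profile perf.data ./tst_something
#   perf report -i perf.data --stdio | head -n 40
```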
[Flowchart. Labels: Profiling Analysis Engine, Prof.out, Profiling data preparation, Trigger Profiling, Profiling, Simulate, Qualified? (Yes / No), END, Tune Performance by CPU Configuration]
Evaluating and Tuning
# prepare the setup
$ rm -rf /var/lib/oprofile
$ opcontrol --start-daemon -p library -c 10
# run the app once to force loading it from nfs into the cache
$ ./tst_something
# start profiling
$ opcontrol --start
$ ./tst_something -iterations enough
# stop profiling
$ opcontrol -h
# generate reports
$ opreport -l
$ opreport -c …
Reference oprofile usage
Profile Result (A)
Profile Result (B)
Profiler sample
http://anton.ozlabs.org/blog/2009/09/04/using-performance-counters-for-linux/
perf is powerful.
# perf record -s CMD
Error: perfcounter syscall returned with -1 (Function not implemented)
Fatal: No CONFIG_PERF_EVENTS=y kernel support configured?
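The error above points at the kernel configuration, so it can help to check for CONFIG_PERF_EVENTS before trying perf at all. The sketch below assumes the kernel config is readable from /proc/config.gz (only present when the kernel was built with CONFIG_IKCONFIG_PROC) or from a /boot/config-* file. The helper name has_perf_events is made up.

```shell
# has_perf_events FILE
# Hypothetical helper: succeeds if the kernel config FILE (plain text or
# gzip-compressed) contains CONFIG_PERF_EVENTS=y.
has_perf_events() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | grep -q '^CONFIG_PERF_EVENTS=y$'
}

# Typical locations; these are assumptions and not every system has them.
for cfg in /proc/config.gz "/boot/config-$(uname -r)"; do
    if [ -r "$cfg" ]; then
        if has_perf_events "$cfg"; then
            echo "$cfg: perf events enabled"
        else
            echo "$cfg: CONFIG_PERF_EVENTS is not enabled"
        fi
    fi
done
```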
• For native libraries:
  • Use 'perf' built without libperl and libpython
  • oprofiled and opcontrol are there, CPU data is missing
  • Binaries for ARM need frame pointers to have backtraces
• The Java part is always the performance hell.
  • traceview is a great tool for Java performance analysis.
  • JVMTI / JDWP (Java Debug Wire Protocol, normally spoken between a VM and a debugger)
How to Do On Android/ARM?
• Upload some more files
• Start oprofile with opcontrol on the device
• Run the test on the device
• Analyze with opreport on the PC
How to Do On Android/ARM?
# Overhead  Command          Shared Object     Symbol
# ........  ...............  ................  ......
#
    89.23%  system_server    2b0c6c            [.] 0x000000002b0c6c
     1.26%  MLVdo_thread     [kernel_helper]   [k] 0x0000000017aa90
     1.05%  d.process.acore  libskia.so        [.] S32A_Opaque_BlitRow32_arm
     0.83%  d.process.acore  libcutils.so      [.] android_memset32
     0.63%  system_server    libc.so           [.] memcpy
     0.63%  d.process.acore  libc.so           [.] memset
system_server is the process name of the Android Framework runtime. It consumes most of the CPU time, but it is hard to figure out the details with native tools like perf alone.
We can always optimize known performance hotspot routines such as S32A_Opaque_BlitRow32_arm, but they should be measured first.
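To see which process dominates, the per-symbol rows of a saved report can be summed per command. The sketch below does this with awk on the sample report from this slide. overhead_by_command is a hypothetical helper name. On a live system, perf report --sort comm gives a similar grouping directly.

```shell
# overhead_by_command
# Hypothetical helper: reads "perf report" text on stdin and prints the
# summed overhead per command, largest first.
overhead_by_command() {
    LC_ALL=C awk '
        /^#/ || NF < 2 { next }      # skip header, comment and blank lines
        { total[$2] += $1 + 0 }      # "89.23%" + 0 evaluates to 89.23
        END { for (cmd in total) printf "%6.2f%% %s\n", total[cmd], cmd }
    ' | LC_ALL=C sort -rn
}

overhead_by_command <<'EOF'
# Overhead  Command          Shared Object     Symbol
    89.23%  system_server    2b0c6c            [.] 0x000000002b0c6c
     1.26%  MLVdo_thread     [kernel_helper]   [k] 0x0000000017aa90
     1.05%  d.process.acore  libskia.so        [.] S32A_Opaque_BlitRow32_arm
     0.83%  d.process.acore  libcutils.so      [.] android_memset32
     0.63%  system_server    libc.so           [.] memcpy
     0.63%  d.process.acore  libc.so           [.] memset
EOF
# prints:
#  89.86% system_server
#   2.51% d.process.acore
#   1.26% MLVdo_thread
```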
• What to measure? Loading, Painting, Scrolling?
• No excellent benchmark suite available due to copyright issues
• Some frameworks are available, but mostly manual work
Picking or Creating a Testcase
• Small framework from the University of Szeged
• Can load pages and measure the time
• Provides a summary with error interval
Introducing The Methanol Framework
http://webkit.sed.hu/blog/20101216/benchmarking-qtwebkit-v8-linux
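How Methanol computes its error interval is not stated here. As an illustration of the idea only, the sketch below summarizes a list of load times (one value in milliseconds per line) as a mean with a 95% interval based on the normal approximation, 1.96 times the standard error. The helper name summarize and the sample values are made up, and for very small sample counts the approximation is rough.

```shell
# summarize
# Hypothetical helper: reads one timing (ms) per line on stdin and prints
# the sample count, the mean, and an approximate 95% error interval.
summarize() {
    LC_ALL=C awk '
        { n++; sum += $1; sumsq += $1 * $1 }
        END {
            if (n < 2) { print "need at least two samples"; exit 1 }
            mean = sum / n
            var = (sumsq - n * mean * mean) / (n - 1)   # sample variance
            if (var < 0) var = 0                        # guard against rounding
            printf "n=%d mean=%.2f ms +/- %.2f ms\n", n, mean, 1.96 * sqrt(var) / sqrt(n)
        }'
}

printf '%s\n' 100 110 90 105 95 | summarize
# prints: n=5 mean=100.00 ms +/- 6.93 ms
```

Two measurements whose intervals overlap heavily should not be read as a real difference.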
• Webpages need to be converted
• Everything loaded from the same URL
• http://gitorious.org/methanol
Issues With Methanol Framework
• Using methanol with the example page
• Executing it in the browser
• Observing it with oprofile/perf
Putting Everything Together
http://0xlab.org