Velocity 2011 - Part 2: Wednesday (1st day)

My notes on the first conference day at the Velocity Conference.

The keynotes were really good, especially the first one about the right mentality for web operations and how to build a career in web operations.

Videos are available on the Velocity 2011 Videos page; slides can be found on the Velocity 2011 Speakers Slides and Video page.

Read also about the workshops and the second day.

Keynotes Wednesday

A Career - How and Why

Theo Schlossnagle

http://www.youtube.com/watch?v=y0mHo7SMCQk&p=C394849408B5F203
What makes a career in web IT?
Career
- pursuit, a willingness to mature, patience to become exceptional
truly excellent
- treat it as a craft
- become a craftsman
how to
1. educate yourself
2. be disciplined, keep doing it all the time seriously
3. learn from & share with your peers, the only resource that evolves with you
4. be patient, experience takes time and mistakes
The Web changed everything, the Cloud made us realize that
Every modern website is SaaS (Software as a Service), provides a service to the users
Special: Each website is a single copy of a software
Web Operations is the job to operate it and make it run and keep running better
DevOps is wrong, it is incomplete, it should be *Ops
- Everyone in the organisation needs operational mentality
- the Software must be running, everybody should feel responsible for it
Virtualization is the game changer
- Provisioning new systems became cheap and instant
- Managing the risk is simple and painless if you skip it because some Ops guy does it.
Operations is like Security
- Not a feature, not "phase 2"
- It is a state of mind
- Everything we do needs to consider Operations
NEW: Everybody is in charge of Operations
- Oncall for developers
DevOps
- needs more Ops in Dev
- more Dev in Ops: Infrastructure as code
- think operationally

JavaScript & Metaperformance

Douglas Crockford

http://www.youtube.com/watch?v=HrFpqmgv2DY&p=C394849408B5F203
http://assets.en.oreilly.com/1/event/60/JavaScript%20_%20Metaperformance%20Presentation.pptx
JavaScript is a mix of Java, Scheme, Self
Early implementations were optimized for time-to-market, followed by gradual performance degradation through 15 years of patching
Now everybody wants to have the fastest engine
ECMAScript tradeoff between optimization and portability, the Web chose portability
Remember Java?
- Java specifies a VM and executable format
- Java applets were supposed to rule the world, but became the biggest failure in software development
  - write once, debug everywhere
  - horrendous UI
  - slow startup times (classloading)
  - bad security model
- No problem on servers, hence Java now very popular on servers
There is no executable JS format that
Instead, Optimize as you go (execution style adapted from Self)
How fast are the new engines?
- That is complicated
- There are no easy ways to compare, it depends on what the programs do
- Benchmarks are Lies
  - Made by people who don’t use the language but want to promote patterns that work well on their engines
Life on the Web
- Web developers have a difficult time with the browser
- Desire to produce applications that are highly responsive
- Browser platform contains some deep inefficiencies
The DOM Bottleneck
- Browsers spend most of their time in layout, painting, marshalling, styling, IO …
Optimizing JS code is not always effective to optimize a website
- People optimize JS because it is easy to do, whereas the browser itself is difficult to optimize
- Like the drunk searching under the lamppost: looking for the glasses where the light is, not where they were lost
- Web developers believe they must optimize prematurely because they are afraid they won’t be allowed to do so later
Evil patterns:
- The benchmarks prevent good coding practices
- Benchmarks do not reflect the characteristics of good programs
Douglas is the author of "JavaScript: The Good Parts"
- Discovered that there are good parts to JS
- Developed programming patterns that make use of the good parts of JS while avoiding the bad parts
- Wrote JSLint, a code quality tool, to analyze JS code to make it better
Invented new benchmark
- run JSLint analyzing jslint.js
- it demonstrates real application behaviours that are not exercised in the other benchmarks, especially prototypal and functional patterns
- It is a test only of JS performance, it does not measure browser performance
- Results:
  - Chrome 10: 2.8s
  - IE 9: 1.15s
  - …
  - Safari 5: 0.98s
  - FF 4: 0.95s
  - IE 10 preview: 0.56s
  - Chrome 13 canary: 0.54s
- Chrome was the slowest, Google took up the challenge to optimize the V8 JS engine to be the fastest for good JS programming patterns
Conclusion:
- Fastest JS engine is also possible for good programming patterns
- Browsers get closer to each other in JS performance
- http://crockford.com/javascript/performance.html

Looking at Your Data

John Rauser, Amazon

http://www.youtube.com/watch?v=coNDCIMH8bk&p=C394849408B5F203
When looking at typical web performance charts (time series plots), what information did you already lose?
- A single number represents all the information for 5 minutes!
- But is the average really the interesting part of these 5 minutes?
Often, a simple average is the worst:
- Worst compression algorithm
- Most irrelevant data
- Examples from analyzing Van Gogh painting and Moby Dick novel
Better ways:
- Histograms?
- Example: Height of humans. Histogram is actually an overlay of 2 histograms, one for men and women each.
Look at the histogram of Latency of your website
- Typically gamma distribution of latencies (peak at the average, exponential decline after)
- If you see more than one peak, you probably have different populations of users, e.g. slow modem users and DSL users. Or half the users are logged in and you show them their photo, which takes extra time.
- This is much more info than the average
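A minimal sketch of this point, assuming two made-up user populations whose latencies follow gamma distributions (the shape and scale parameters are invented for illustration, not taken from the talk). It prints the overall mean next to a text histogram, so the two peaks and the position of the mean can be compared:

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical populations (parameters are assumptions, not measured data):
# fast users with a mean around 200 ms, slow users with a mean around 1200 ms.
fast = [random.gammavariate(4.0, 50.0) for _ in range(5000)]
slow = [random.gammavariate(16.0, 75.0) for _ in range(5000)]
latencies_ms = fast + slow


def histogram(values, bin_width, max_value):
    """Count values per fixed-width bin; values >= max_value land in the last bin."""
    n_bins = int(max_value // bin_width)
    counts = [0] * n_bins
    for v in values:
        idx = min(int(v // bin_width), n_bins - 1)
        counts[idx] += 1
    return counts


mean_ms = statistics.mean(latencies_ms)
counts = histogram(latencies_ms, bin_width=100, max_value=3000)

print(f"mean latency: {mean_ms:.0f} ms")
for i, c in enumerate(counts):
    # One '#' per 50 requests in the bin.
    print(f"{i * 100:5d} ms | {'#' * (c // 50)}")
```

With these assumed parameters the mean falls between the two peaks, in a latency range that comparatively few requests actually experience.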
What do we learn from this:
- Understand the different groups of users and how to help them
How do we get good histograms?
- graph tool vendors should add the histogram info
- combined time-series plot and histogram, add color information to show the histogram for each point of the data.
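The data behind such a combined plot can be sketched as one small histogram per time window. This is only an illustration: the window size, the bin width, the function name heatmap_rows and the sample values are all assumptions.

```python
from collections import defaultdict

WINDOW_S = 300   # 5-minute windows, as in a typical time-series chart
BIN_MS = 250     # hypothetical latency bin width


def heatmap_rows(samples, window_s=WINDOW_S, bin_ms=BIN_MS):
    """Map each window start time to {latency bin start: request count}."""
    rows = defaultdict(lambda: defaultdict(int))
    for ts, latency_ms in samples:
        window = int(ts // window_s) * window_s
        bin_start = int(latency_ms // bin_ms) * bin_ms
        rows[window][bin_start] += 1
    return {w: dict(bins) for w, bins in sorted(rows.items())}


# Made-up (timestamp in seconds, latency in ms) samples for illustration.
samples = [(10, 120), (40, 180), (200, 1400), (290, 1600),
           (310, 130), (450, 140), (599, 160)]

rows = heatmap_rows(samples)
for window, bins in rows.items():
    print(window, bins)
```

A plotting tool would draw each window as a column and color each bin by its count. In the first window of this made-up data the plain average would be 825 ms, a latency that none of the four requests came close to.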
Analytical methods are sometimes difficult to apply, much effort
Better look at the data and use the actual data. Example from Amazon
- Good story how a user being redirected to an early test server got severely slowed down by those redirects.
- Analysed raw logs printed out on paper
Conclusion:
- Look at your data in detail
- You will learn much more from that than from plain average numbers

Testing and Monitoring Mobile Apps

Vik Chaudhary, Manny Gonzales

http://www.youtube.com/watch?v=C4rdHh7waVw&p=C394849408B5F203
http://assets.en.oreilly.com/1/event/60/Testing%20and%20Monitoring%20the%20Smartphone%20Experience%20%20Presentation.pdf
Keynote (Platinum Diamond Sponsor) product presentation
Why test and monitor mobile apps?
- Huge and important market
- >400k iPhone Apps, >200k Android Apps
Challenges
- remotely connect & control live smartphones everywhere
- Scripting interface purpose-built for smartphones (much more than e.g. a browser). Turning the phones, gestures on the touch screen…
- Intelligence for troubleshooting smartphone apps
"Mobile Device Perspective 5.0" Demo
- Remote interactive testing
- Scripting and Replay
- Automated Monitoring by running it 24/7
Live Demo with Fandango
Conclusion:
- Live Interactive testing with real smartphones
- What we know from the web is now also possible for mobile apps - fully automated!

Change = Mass x Velocity, and other laws of Infrastructure

Mark Burgess

http://www.youtube.com/watch?v=KOeVBanjC18&p=C394849408B5F203
cfengine author, CTO and Founder of CFEngine
Talk is about how to manage change
Rate of change = impact = MV
- V: How fast do we need to respond (don’t let things get worse)?
- M: What bulk/inertia is holding us back (what is the scale of the change)? …
Conclusion:
- Re-Humanize IT operations
- Knowledge-based Operations

Lightning Demos

Next Gen YSLOW

Marcel Duran

http://developer.yahoo.com/yslow/
YSlow is a very popular website performance tool
- ~2.6 million downloads
- but still many web developers don’t know about website performance
New Yahoo Developer Network website
- YSlow Scoremeter
  - See how various improvements would affect the YSlow score
  - Help decide what to optimize
  - Understand internals of YSlow measuring metrics
- YSlow as bookmarklet for mobile devices
coming soon: HAR importer

webpagetest.org

Patrick Meenan

http://www.webpagetest.org/
web tool for testing websites
check out the advanced tab
- scripting language to customize tests
- used by backend engine
Demo: amazon.com vs. newegg comparison
- same workflow (buy a HD)
- record movie of web site performance

Web Inspector Remote

Dave Johnson

http://pmuellr.github.com/weinre/
Remote debugging for mobile phones
Like Firebug, but for a mobile phone
WEINRE, phonegap mobile phone platform
Using webkit developer tools
Run a server somewhere (Java)
Add little script to the page to load JavaScript from WEINRE server
Desktop running WEINRE becomes a remote control for Web Inspector running on the mobile device
Live Demo with Android phone

HTTPArchive

Steve Souders

http://httparchive.org
The HTTP Archive tracks how the Web is built
automated webpagetest.org runs of thousands of sites (e.g. top 100 and top 1000 from alexa, but also others)
collect aggregate statistics of interesting web site data
compare different times or selections
discover trends
while internet archive tracks the content of the web, http archive tracks the technology behind the websites
Announcement: HTTP Archive will be a project under the Internet Archive
http://www.stevesouders.com/blog/2011/06/15/http-archive-1m-urls-internet-archive-sponsors/

Cotendo

Michael Kuperman, Ronni Zehavi

cotendo
young company, focusing on performance for mobile and web sites
Announce: Cooperation with Citrix to combine Citrix Netscaler and Cotendo cloud
Announce: Launching Mobile Acceleration Suite
SPDY Protocol now available on Cotendo’s global network

The Neustar Story From Inception to Acquisition

Patrick Lightbody

Now part of Neustar
About the BrowserMob startup
- 3 employees
- 500+ customers
- profitable in the first month of launch
- 2008/09 - initial idea
- 2010/07 - acquired by Neustar
cloud testing service
Lessons Learned:
- Building Business Plan
  - Quick and easy to set up Selenium
  - Very difficult and expensive to write equivalent JMeter scripts
  - Actually why not run all tests in Selenium via Browsers? It is cheaper to run many browsers in the Cloud than to write a JMeter script!
- Outsource Everything
- Optimize for the Cloud
  - Myth: cloud is infinitely scalable
  - Launch: limited to 20 instances at once
  - Solution: Be a good "cloud citizen" and adjust to be nice to the cloud
  - Now spin up thousands of instances every day, run them only for short time
  - Lesson: Cloud is not a black box, work also with the people running the cloud
- Benchmark differences between EC2 instances
- Finance models depend on architecture
  - Storing results in S3 buckets
  - Store duplicate data in each region (storage costs <<< bandwidth costs)
  - Use user-data to automate generic AMIs
- Processing Lots of Data
  - Don’t solve the wrong problem with the wrong product
    - Focused on self-service over analytical features
    - Dedicated MySQL database was simple yet powerful and perfect for EC2
  - Don’t build the right product the wrong way
    - Fail fast: experiment with data storage techniques
    - Stored monitoring data in many ways (in parallel + try/catch) to find out optimal structure
- Sleeping Well at Night
  - Monitoring everything
    - Script to do phone calls
    - If you get pulled out of bed you tend to write better code
    - Auto-heal everything possible
- Startup Tips
  - Alerting: PagerDuty
  - NoSQL, ideally hosted
  - Subscriptions and Billing: Zuora, Chargify
  - Performance Metrics: CloudWatch, RRDTool
  - RackSpace

O’Reilly Radar

Tim O’Reilly

http://www.youtube.com/watch?v=9Kn-RrAg9FI
Tim's take on why the work we web site people do is so important

Addressing the Scalability Challenge of Server-Sent Events (AKA Long Polling)

Stephen Ludin, Chief Architect, Akamai Technologies

http://assets.en.oreilly.com/1/event/60/Addressing%20the%20Scalability%20Challenge%20of%20Server-Sent%20Events%20Presentation.pptx
Long Poll
- Browser uses XMLHttpRequest to connect to origin and waits
- When there is data to send, the origin responds
- Variants:
  - Long Polling
  - Server-Sent Events
  - HTTP Streaming
  - Bayeux
  - BOSH
  - Comet
- Usage is growing
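The basic long-poll loop can be sketched in-process, without real HTTP. This is a simplified stand-in: a queue plays the role of the origin's pending data, and the function names, status codes and timeout are assumptions made for illustration.

```python
import queue
import threading
import time

events = queue.Queue()   # stands in for whatever the origin wants to push


def handle_long_poll(timeout_s=1.0):
    """Origin side: hold the request open until data exists or the timeout hits."""
    try:
        return {"status": 200, "data": events.get(timeout=timeout_s)}
    except queue.Empty:
        return {"status": 204, "data": None}   # nothing new, client should re-poll


def client_loop(max_messages):
    """Client side: send a request, wait for the answer, then ask again at once."""
    received = []
    while len(received) < max_messages:
        response = handle_long_poll()
        if response["status"] == 200:
            received.append(response["data"])
    return received


def producer():
    """Simulates the origin producing new data from time to time."""
    for msg in ["quote: 101.5", "quote: 101.7", "quote: 101.6"]:
        time.sleep(0.05)
        events.put(msg)


threading.Thread(target=producer, daemon=True).start()
messages = client_loop(max_messages=3)
print(messages)
```

Every waiting client occupies one blocked handler on the origin for the whole wait, which is where the connection scaling problem comes from.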
Customers start using Long Poll, but run into problems
- Too many connections
- Trading off high request rate (polling) for massive concurrent connections
- Scaling at the Origin
  - Not everyone uses event-driven Web servers (Jetty, lighttpd, nginx …)
  - Still a lot of older architectures out there
What is really desired is a "Server Push" model
- But long polling is still widespread and increasing
- simple to use, provides modern applications right now
How can a CDN help with long polling?
- Offload via edge caching or computing?
- Application of business logic (Akamai running customer business logic)
Akamai Solutions
- Subscribe/Publish model (Half Sync / Half Async)
  - CDN terminates the HTTP connection and sends an event ID with a token to origin
  - If the server wants to reply to that event subscription it connects to the CDN and sends it the event signed with the token
- Half Sync / Half-Async Benefits
  - Ability to scale
  - Enables "true" Server Push
  - Retains "real time" notification
- Token Construction
  - Info needed to get back to the edge machine (IP)
  - Customer specific code
  - User information
  - Subscription (Event) information
  - Expiration
- On the client
  - Use HTML5 Server-Sent Events
  - Use long-polling
  - No change required
- On the edge
Implementation details
- Akamai XML config
- Special handling on the origin side
- To deliver a message, origin sends a request to the CDN
- The CDN converts this request into HTTP reply to the open connection on the edge
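The flow described in this talk can be sketched as a toy model in which the edge parks the client connection and the origin only makes a short request when it has something to say. The Edge class and its method names are hypothetical and do not reflect Akamai's actual configuration or API.

```python
import queue


class Edge:
    """Hypothetical stand-in for a CDN edge node that parks client connections."""

    def __init__(self):
        self._waiting = {}     # event_id -> queue acting as the open connection
        self.origin_log = []   # what the origin is told, instead of a held socket

    def subscribe(self, event_id):
        """Client connects; the edge keeps it open and only notifies the origin."""
        conn = queue.Queue(maxsize=1)
        self._waiting[event_id] = conn
        self.origin_log.append(event_id)
        return conn

    def publish(self, event_id, payload):
        """Origin sends a short request; the edge turns it into the client reply."""
        conn = self._waiting.pop(event_id, None)
        if conn is None:
            return False       # unknown or already answered subscription
        conn.put(payload)
        return True


edge = Edge()
conn = edge.subscribe("evt-1")               # client's long poll ends at the edge
delivered = edge.publish("evt-1", "new mail")
reply = conn.get(timeout=1)                  # client finally receives its response
print(delivered, reply)
```

In this model the origin never holds an idle connection; it only sees the subscription notice and later issues one short publish request.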
Subscription Types
- Single Event (one shot)
  - Simple
- Repeatable Event
  - Origin → CDN: Multiple Requests
  - CDN → Client: HTTP Streaming
- HTTP Streaming
  - Similar to Multiple Events
  - Potential for multiplexing, e.g. sending data to several clients
Security
- Risk: Bogus Event Injection
  - SSL on all sides will help
  - Origin → CDN must be authenticated
  - The token must be secure, shared secret or asymmetric, signed, replay protection
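A minimal sketch of such a token, assuming the shared-secret variant with an HMAC-SHA256 signature, an expiration time and one-shot replay protection. The field names, the token layout and the secret are assumptions for illustration, not the format used by Akamai.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"shared-secret-for-illustration-only"   # hypothetical shared secret
_seen = set()   # tokens already redeemed, for one-shot replay protection


def make_token(edge_ip, event_id, ttl_s=60, now=None):
    """Build 'base64(body).base64(signature)' with an expiry inside the body."""
    now = time.time() if now is None else now
    body = json.dumps({"edge": edge_ip, "event": event_id, "exp": now + ttl_s},
                      sort_keys=True).encode()
    sig = hmac.new(SECRET, body, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(body).decode() + "."
            + base64.urlsafe_b64encode(sig).decode())


def verify_token(token, now=None):
    """Return the claims if the token is authentic, fresh and unused, else None."""
    now = time.time() if now is None else now
    try:
        body_b64, sig_b64 = token.split(".")
        body = base64.urlsafe_b64decode(body_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:          # wrong number of parts or invalid base64
        return None
    expected = hmac.new(SECRET, body, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):   # constant-time comparison
        return None
    claims = json.loads(body)
    if claims["exp"] < now or token in _seen:
        return None             # expired or replayed
    _seen.add(token)
    return claims


token = make_token("192.0.2.10", "evt-1")
print(verify_token(token) is not None)   # first use is accepted
print(verify_token(token) is None)       # second use is rejected as a replay
```

An asymmetric signature would follow the same structure, with the origin signing and the edge verifying against a public key.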
Some Error Cases
- If token is wrong or something does not match, drop connections and force resubscription
- Origin should detect multiple subscriptions and resolve conflicts
- Annoying routers dropping quiet connections
Mobile - Connectionless Push Friendly
- The ISPs and CDN do all the work
- Clients and Origin not involved as long as no new data
What about WebSockets
- Not a good candidate (today)
- Bi-directional
- Opaque
- Standard Acceleration techniques are ideal
- Probably over time some "standard" patterns will evolve
Use Cases
- E-Mail - Everybody wants to know when new mail arrives
- Social Networks - Friends stream / updates
- Stock Quotes
- Cloud printing - printers keep connections to print servers in the cloud
Conclusion
- Server-Sent Events are a great thing
- Introduces connection scaling problems
- CDNs can help with the scaling problem
- Offload open connections to the edge
- CDNs can offer a true server-push paradigm that is not possible without CDNs

Creating the Dev/Test/PM/Ops Supertribe: From "Visible Ops" to DevOps

Gene Kim

http://assets.en.oreilly.com/1/event/60/Creating%20the%20Dev_Test_PM_Ops%20Supertribe_%20From%20_Visible%20Ops_%20To%20DevOps%20Presentation.pptx
Very good !!!
genek at realgenekim.me

Building for the Cloud: Lessons Learned at Heroku

Mark Imbriaco, Director of Cloud Operations

Very good paradigms and lessons learned for building cloud applications !!!
Sadly, no public video and no slides

Schlomo Schapiro