2011-06-15

Velocity 2011 - Part 2: Wednesday (1st day)

My notes on the first conference day at the Velocity Conference.

The keynotes were really good, especially the first one about the right mentality for web operations and how to build a career in web operations.

Videos are available on the Velocity 2011 Videos page; slides can be found on the Velocity 2011 Speakers Slides and Video page.

See also my notes on the workshops and the second day.

Keynotes Wednesday

A Career - How and Why

Theo Schlossnagle
  • http://www.youtube.com/watch?v=y0mHo7SMCQk&p=C394849408B5F203
  • What makes a career in web IT?
  • Career
    • pursuit, a willingness to mature, patience to become exceptional
  • truly excellent
    • treat it as a craft
    • become a craftsman
  • how to
    1. educate yourself
    2. be disciplined, keep doing it all the time seriously
    3. learn from & share with your peers, the only resource that evolves with you
    4. be patient, experience takes time and mistakes
  • The Web changed everything, the Cloud made us realize that
  • Every modern website is SaaS (Software as a Service), provides a service to the users
  • Special: Each website is a single copy of a software
  • Web Operations is the job to operate it and make it run and keep running better
  • DevOps is wrong, it is incomplete, it should be *Ops
    • Everyone in the organisation needs an operational mentality
    • the Software must be running, everybody should feel responsible for it
  • Virtualization is the game changer
    • Provisioning new systems became cheap and instant
    • Managing the risk is simple and painless if you skip it because some Ops guy does it.
  • Operations is like Security
    • Not a feature, not "phase 2"
    • It is a state of mind
    • Everything we do needs to consider Operations
  • NEW: Everybody is in charge of Operations
    • On-call duty for developers
  • DevOps
    • needs more Ops in Dev
    • more Dev in Ops: Infrastructure as code
    • think operationally

JavaScript & Metaperformance

Douglas Crockford
  • http://www.youtube.com/watch?v=HrFpqmgv2DY&p=C394849408B5F203
  • http://assets.en.oreilly.com/1/event/60/JavaScript%20_%20Metaperformance%20Presentation.pptx
  • JavaScript is a mix of Java, Scheme, Self
  • Early implementations were optimized for time-to-market, followed by gradual performance degradation through 15 years of patching
  • Now everybody wants to have the fastest engine
  • ECMAScript tradeoff between optimization and portability, the Web chose portability
  • Remember Java?
    • Java specifies a VM and executable format
    • Java applets were supposed to rule the world, but became the biggest failure in software development
      • write once, debug everywhere
      • horrendous UI
      • slow startup times (classloading)
      • bad security model
    • No problem on servers, hence Java now very popular on servers
  • There is no executable JS format
  • Instead, Optimize as you go (execution style adapted from Self)
  • How fast are the new engines?
    • That is complicated
    • There are no easy ways to compare, it depends on what the programs do
    • Benchmarks are Lies
      • Made by people who don’t use the language but want to promote patterns that work well on their engines
  • Life on the Web
    • Web developers have a difficult time with the browser
    • Desire to produce applications that are highly responsive
    • Browser platform contains some deep inefficiencies
  • The DOM Bottleneck
    • Browsers spend most of their time in layout, painting, marshalling, styling, IO …
  • Optimizing JS code is not always effective to optimize a website
    • People optimize JS anyway because it is easy to do, while optimizing the browser is difficult
    • Like the drunk under the lamppost: searching for lost glasses where the light is, not where they were lost
    • Web developers believe they must optimize prematurely because they are afraid they won’t be allowed to do so later
  • Evil patterns:
    • The benchmarks prevent good coding practices
    • Benchmarks do not reflect the characteristics of good programs
  • Douglas is the author of "JavaScript: The Good Parts"
    • Discovered that there are good parts to JS
    • Developed programming patterns that make use of the good parts of JS while avoiding the bad parts
    • Wrote JSlint, a code quality tool, to analyze JS code to make it better
  • Invented new benchmark
    • run jslint analyzing jslint.js
    • it demonstrates real application behaviours that are not exercised in the other benchmarks, especially prototypal and functional patterns
    • It is a test only of JS performance, it does not measure browser performance
    • Results:
      • Chrome 10: 2.8s
      • IE 9: 1.15s
      • Safari 5: 0.98s
      • FF 4: 0.95s
      • IE 10 preview: 0.56s
      • Chrome 13 canary: 0.54s
    • Chrome was the slowest; Google took up the challenge to optimize the V8 JS engine to be the fastest for good JS programming patterns
  • Conclusion:

Looking at Your Data

John Rauser, Amazon
  • http://www.youtube.com/watch?v=coNDCIMH8bk&p=C394849408B5F203
  • When looking at typical web performance charts (time series plots), what information did you already lose?
    • A single number represents all the information for 5 minutes!
    • But is the average really the interesting part of these 5 minutes?
  • Often, a simple average is the worst:
    • Worst compression algorithm
    • Most irrelevant data
    • Examples from analyzing Van Gogh painting and Moby Dick novel
  • Better ways:
    • Histograms?
    • Example: Height of humans. The histogram is actually an overlay of 2 histograms, one each for men and women.
  • Look at the histogram of Latency of your website
    • Typically a gamma-like distribution of latencies (a single peak near the typical value, then a long, roughly exponential tail)
    • If you see more peaks then you probably have different populations of users, e.g. slow modem users and DSL users. Or half the users are logged in and you show them their photo, but this takes time.
    • This is much more info than the average
  • What do we learn from this:
    • Understand the different groups of users and how to help them
  • How do we get good histograms?
    • graph tool vendors should add the histogram info
    • combined time-series plot and histogram: add color information to show the histogram for each point of the data.
  • Analytical methods are sometimes difficult to apply, much effort
  • Better: look at the actual raw data. Example from Amazon
    • Good story about how a user being redirected to an early test server got severely slowed down by those redirects.
    • Analysed raw logs printed out on paper
  • Conclusion:
    • Look at your data in detail
    • You will learn much more from that than from plain average numbers
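
The point about averages hiding structure can be made concrete with a small sketch. Everything in it is an assumption for illustration: the data is synthetic, and the two user populations (a fast and a slow group) with their sizes and gamma parameters are made up, not numbers from the talk. The mean lands in a latency range that almost no request actually experiences, while a plain text histogram shows the two peaks.

```python
import random
import statistics


def text_histogram(samples, bin_width):
    """Count samples per bin; each key is the lower edge of a bin."""
    counts = {}
    for value in samples:
        edge = int(value // bin_width) * bin_width
        counts[edge] = counts.get(edge, 0) + 1
    return dict(sorted(counts.items()))


def make_latencies(seed=42):
    """Synthetic latencies in ms for two hypothetical user populations."""
    rng = random.Random(seed)
    # gammavariate(alpha, beta) has mean alpha * beta.
    fast = [rng.gammavariate(9.0, 20.0) for _ in range(7000)]   # mean ~180 ms
    slow = [rng.gammavariate(40.0, 30.0) for _ in range(3000)]  # mean ~1200 ms
    return fast + slow


if __name__ == "__main__":
    latencies = make_latencies()
    print(f"mean: {statistics.mean(latencies):.0f} ms")
    hist = text_histogram(latencies, bin_width=100)
    scale = max(hist.values()) / 50  # longest bar is 50 characters
    for edge, count in hist.items():
        print(f"{edge:5d} ms | {'#' * int(count / scale)}")
```

Running it prints one average followed by a two-humped histogram; the bin that contains the average is nearly empty.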

Testing and Monitoring Mobile Apps

Vik Chaudhary, Manny Gonzales

Change = Mass x Velocity, and other laws of Infrastructure

Mark Burgess
  • http://www.youtube.com/watch?v=KOeVBanjC18&p=C394849408B5F203
  • cfengine author, CTO and Founder of CFEngine
  • Talk is about how to manage change
  • Rate of change = impact = MV
    • V: How fast do we need to respond (don’t let things get worse)?
    • M: What bulk/inertia is holding us back (what is the scale of the change)? …
  • Conclusion:
    • Re-Humanize IT operations
    • Knowledge-based Operations

Lightning Demos

Next Gen YSLOW

Marcel Duran
  • http://developer.yahoo.com/yslow/
  • yslow is a very popular website performance tool
    • ~2.6 million downloads
    • but still many web developers don’t know about web site performance
  • New Yahoo Developer Network website
    • YSlow Scoremeter
      • See how various improvements would affect the YSlow score
      • Help decide what to optimize
      • Understand internals of YSlow measuring metrics
    • YSlow as bookmarklet for mobile devices
  • coming soon: HAR importer

webpagetest.org

Patrick Meenan
  • http://www.webpagetest.org/
  • web tool for testing websites
  • check out the advanced tab
    • scripting language to customize tests
    • used by backend engine
  • Demo: amazon.com vs. newegg comparison
    • same workflow (buy a HD)
    • record movie of web site performance

Web Inspector Remote

Dave Johnson
  • http://pmuellr.github.com/weinre/
  • Remote debugging for mobile phones
  • Like Firebug, but for a mobile phone
  • WEINRE, phonegap mobile phone platform
  • Using webkit developer tools
  • Run a server somewhere (Java)
  • Add little script to the page to load JavaScript from WEINRE server
  • Desktop running WEINRE becomes a remote control for Web Inspector running on the mobile device
  • Live Demo with Android phone

HTTPArchive

Steve Souders

Cotendo

Michael Kuperman, Ronni Zehavi
  • cotendo
  • young company, focusing on performance for mobile and web sites
  • Announce: Cooperation with Citrix to combine Citrix Netscaler and Cotendo cloud
  • Announce: Launching Mobile Acceleration Suite
  • SPDY Protocol now available on Cotendo’s global network

The neustar Story From Inception to Acquisition

Patrick Lightbody
  • Now part of Neustar
  • About the BrowserMob startup
    • 3 employees
    • 500+ customers
    • profitable in the first month of launch
    • 2008/09 - initial idea
    • 2010/07 - acquired by Neustar
  • cloud testing service
  • Lessons Learned:
    • Building Business Plan
      • Quick and easy to set up Selenium
      • Very difficult and expensive to write equivalent JMeter scripts
      • Actually why not run all tests in Selenium via Browsers? It is cheaper to run many browsers in the Cloud than to write a JMeter script!
    • Outsource Everything
    • Optimize for the Cloud
      • Myth: cloud is infinitely scalable
      • Launch: limited to 20 instances at once
      • Solution: Be a good "cloud citizen" and adjust to be nice to the cloud
      • Now spin up thousands of instances every day, run them only for short time
      • Lesson: Cloud is not a black box, work also with the people running the cloud
    • Benchmark differences between EC2 instances
    • Finance models depend on architecture
      • Storing results in S3 buckets
      • Store duplicate data in each region (storage costs <<< bandwidth costs)
      • Use user-data to automate generic AMIs
    • Processing Lots of Data
      • Don’t solve the wrong problem with the wrong product
        • Focused on self-service over analytical features
        • Dedicated MySQL database was simple yet powerful and perfect for EC2
      • Don’t build the right product the wrong way
        • Fail fast: experiment with data storage techniques
        • Stored monitoring data in many ways (in parallel + try/catch) to find out optimal structure
    • Sleeping Well at Night
      • Monitoring everything
        • Script to do phone calls
        • If you get pulled out of bed you tend to write better code
        • Auto-heal everything possible
    • Startup Tips
      • Alerting: PagerDuty
      • NoSQL, ideally hosted
      • Subscriptions and Billing: Zuora, Chargify
      • Performance Metrics: CloudWatch, RRDTool
      • RackSpace
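
The "fail fast" idea from the data-processing lessons above (store the same monitoring data in several ways, guarded by try/catch) might look roughly like the following. This is only a sketch of the pattern as I understood it: the store classes and the `save_everywhere` helper are hypothetical names, the in-memory lists stand in for real backends, and "in parallel" is read here as "side by side", not as concurrent execution.

```python
import json


class PrimaryStore:
    """Stand-in for the trusted store (in-memory here, hypothetical)."""

    def __init__(self):
        self.rows = []

    def save(self, record):
        self.rows.append(dict(record))


class ExperimentalJsonStore:
    """Hypothetical candidate layout: one JSON document per record."""

    def __init__(self):
        self.docs = []

    def save(self, record):
        self.docs.append(json.dumps(record, sort_keys=True))


class BrokenStore:
    """Simulates an experiment that fails at runtime."""

    def save(self, record):
        raise RuntimeError("experimental backend unavailable")


def save_everywhere(record, primary, experiments, failures):
    """Write to the primary store, then try each experimental store."""
    primary.save(record)  # errors here are real errors and propagate
    for store in experiments:
        try:
            store.save(record)
        except Exception as exc:
            # An experiment must never break production writes.
            failures.append((type(store).__name__, str(exc)))


if __name__ == "__main__":
    primary, candidate, failures = PrimaryStore(), ExperimentalJsonStore(), []
    record = {"check": "homepage", "latency_ms": 230}
    save_everywhere(record, primary, [candidate, BrokenStore()], failures)
    print(len(primary.rows), len(candidate.docs), failures)
```

The design choice is that only the primary write is allowed to fail loudly; experimental layouts collect their errors on the side so they can be compared and discarded cheaply.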

O’Reilly Radar

Tim O’Reilly

Addressing the Scalability Challenge of Server-Sent Events (AKA Long Polling)

Stephen Ludin, Chief Architect, Akamai Technologies
  • http://assets.en.oreilly.com/1/event/60/Addressing%20the%20Scalability%20Challenge%20of%20Server-Sent%20Events%20Presentation.pptx
  • Long Poll
    • Browser uses XMLHttpRequest to connect to origin and waits
    • When there is data to send, the origin responds
    • Variants:
      • Long Polling
      • Server-Sent Events
      • HTTP Streaming
      • Bayeux
      • BOSH
      • Comet
    • Usage is growing
  • Customers start using Long Poll, but get Problems
    • Too many connections
    • Trading off high request rate (polling) for massive concurrent connections
    • Scaling at the Origin
      • Not everyone uses event-driven Web servers (Jetty, lighttpd, nginx …)
      • Still a lot of older architectures out there
  • What is really desired is a "Server Push" model
    • But long polling is still widespread and increasing
    • simple to use, provides modern applications right now
  • How can a CDN help with long polling?
    • Offload via edge caching or computing?
    • Application of business logic (Akamai running customer business logic)
  • Akamai Solutions
    • Subscribe/Publish model (Half Sync / Half Async)
      • CDN terminates the HTTP connection and sends an event ID with a token to origin
      • If the server wants to reply to that event subscription it connects to the CDN and sends it the event signed with the token
    • Half Sync / Half-Async Benefits
      • Ability to scale
      • Enables "true" Server Push
      • Retains "real time" notification
    • Token Construction
      • Info needed to get back to the edge machine (IP)
      • Customer specific code
      • User information
      • Subscription (Event) information
      • Expiration
    • On the client
      • Use HTML5 Server-Sent Events
      • Use long-polling
      • No change required
    • On the edge
  • Implementation details
    • Akamai XML config
    • Special handling on the origin side
    • To deliver a message, origin sends a request to the CDN
    • The CDN converts this request into HTTP reply to the open connection on the edge
  • Subscription Types
    • Single Event (one shot)
      • Simple
    • Repeatable Event
      • Origin → CDN: Multiple Requests
      • CDN → Client: HTTP Streaming
    • HTTP Streaming
      • Similar to Multiple Events
      • Potential for multiplexing, e.g. sending data to several clients
  • Security
    • Risk: Bogus Event Injection
      • SSL on all sides will help
      • Origin → CDN must be authenticated
      • The token must be secure: shared secret or asymmetric keys, signed, with replay protection
  • Some Error Cases
    • If token is wrong or something does not match, drop connections and force resubscription
    • Origin should detect multiple subscriptions and resolve conflicts
    • Annoying routers dropping quiet connections
  • Mobile - Connectionless Push Friendly
    • The ISPs and CDN do all the work
    • Clients and Origin not involved as long as no new data
  • What about WebSockets
    • Not a good candidate (today)
    • Bi-directional
    • Opaque
    • Standard Acceleration techniques are ideal
    • Probably over time some "standard" patterns will evolve
  • Use Cases
    • E-Mail - Everybody wants to know when new mail arrives
    • Social Networks - Friends stream / updates
    • Stock Quotes
    • Cloud printing - printers keep connections to print servers in the cloud
  • Conclusion
    • Server-Sent Events are a great thing
    • Introduces connection scaling problems
    • CDNs can help with the scaling problem
    • Offload open connections to the edge
    • CDNs can offer a true server-push paradigm that is not possible without CDNs
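
To make the token idea from the notes above more tangible, here is a minimal sketch of a signed subscription token with expiration and replay protection. It is not Akamai's format: the field names, the JSON/base64 encoding, the HMAC-SHA256 shared-secret signature and the nonce set are all my assumptions, and the replay check only fits the single-event (one shot) subscription type.

```python
import base64
import hashlib
import hmac
import json
import time


def make_token(secret, edge_ip, customer, user, event, ttl_seconds, nonce, now=None):
    """Build a signed token carrying the fields listed under Token Construction."""
    issued = time.time() if now is None else now
    payload = {
        "edge": edge_ip,      # info needed to get back to the edge machine
        "cust": customer,     # customer specific code
        "user": user,         # user information
        "event": event,       # subscription (event) information
        "exp": issued + ttl_seconds,
        "nonce": nonce,       # assumption: used for replay protection
    }
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode())
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + signature


def verify_token(secret, token, seen_nonces, now=None):
    """Return the payload if the token is valid and unused, else None."""
    current = time.time() if now is None else now
    body, sep, signature = token.partition(".")
    if not sep:
        return None
    expected = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None  # forged or tampered: drop and force resubscription
    payload = json.loads(base64.urlsafe_b64decode(body))
    if payload["exp"] < current or payload["nonce"] in seen_nonces:
        return None  # expired or replayed
    seen_nonces.add(payload["nonce"])
    return payload


if __name__ == "__main__":
    secret = b"shared-secret-between-origin-and-cdn"  # hypothetical value
    seen = set()
    token = make_token(secret, "192.0.2.10", "cust-1", "alice", "new-mail", 60, "n-1")
    print(verify_token(secret, token, seen))  # payload dict on first use
    print(verify_token(secret, token, seen))  # None: replay rejected
```

In this reading, the edge would issue the token when the client subscribes and verify it when the origin publishes an event; a failed check corresponds to the "drop connections and force resubscription" error case.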

Creating the Dev/Test/PM/Ops Supertribe: From "Visible Ops" to DevOps

Gene Kim

Building for the Cloud: Lessons Learned at Heroku

Mark Imbriaco, Director of Cloud Operations
  • Very good paradigms and lessons learned for building cloud applications !!!
  • Sadly, no public video and no slides