Tagged: web

  • bixuan 11:13 on 2011-11-10
    Tags: web, guidelines, response latency

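    How fast is fast enough for web access?

    Jakob Nielsen is a well-known and highly respected expert in web usability. The passage quoted below addresses the question of "fast enough":

    The response-time guidelines for web-based applications are the same as for all other applications. These guidelines have not changed in 37 years, so they are unlikely to change as new technology appears.

    0.1 second: the limit for users to feel that they are directly manipulating objects in the UI. For example, this is the interval from the moment the user selects a column in a table until that column is highlighted or otherwise signals that it has been selected. Ideally, it is also the response time for sorting the column, in which case users feel that they are sorting the table themselves.

    1 second: the limit for users to feel that they are moving freely through the computer's command space without waiting unduly. A delay of 0.2 to 1.0 seconds is noticed by users, who therefore sense that the computer is "working" on the command, as opposed to the command being a direct response to their action. For example, if sorting a table by the selected column cannot be done within 0.1 seconds, it must be done within 1 second; otherwise users will feel that the UI is sluggish and will lose the sense of "flow" in performing their task. For delays longer than 1 second, indicate to the user that the computer is working on the problem, for example by changing the shape of the cursor.

    10 seconds: the limit for keeping users focused on the task. Any operation that takes longer than 10 seconds needs a percent-done indicator, as well as a clearly marked way for the user to interrupt the operation. Users who return to the UI after a delay of more than 10 seconds should be assumed to need to reorient themselves. In the course of users' work, delays longer than 10 seconds are acceptable only at natural breaks, such as when switching tasks.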
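
    The three limits above can be read as a small decision rule for choosing UI feedback. The following is only a sketch: the function name and the returned labels are made up for illustration, and the thresholds are the 0.1 s, 1 s, and 10 s limits from the quoted passage.

```python
def feedback_for(expected_delay_s: float) -> str:
    """Pick a feedback style from the 0.1 s / 1 s / 10 s limits quoted above."""
    if expected_delay_s <= 0.1:
        return "instant"             # feels like direct manipulation, no feedback needed
    if expected_delay_s <= 1.0:
        return "noticeable"          # the delay is noticed, but flow is preserved
    if expected_delay_s <= 10.0:
        return "busy indicator"      # e.g. change the shape of the cursor
    return "progress with cancel"    # percent-done indicator plus a way to interrupt


for delay in (0.05, 0.5, 3, 30):
    print(f"{delay:>5} s -> {feedback_for(delay)}")
```

    In practice the expected delay would have to be estimated before the operation starts, which this sketch does not attempt.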

     
  • bixuan 09:58 on 2010-11-30
    Tags: web

    Optimizing Page Load Time

    It is widely accepted that fast-loading pages improve the user experience. In recent years, many sites have started using AJAX techniques to reduce latency. Rather than round-trip through the server retrieving a completely new page with every click, often the browser can either alter the layout of the page instantly or fetch a small amount of HTML, XML, or javascript from the server and alter the existing page. In either case, this significantly decreases the amount of time between a user click and the browser finishing rendering the new content.

    However, for many sites that reference dozens of external objects, the majority of the page load time is spent in separate HTTP requests for images, javascript, and stylesheets. AJAX probably could help, but speeding up or eliminating these separate HTTP requests might help more, yet there isn’t a common body of knowledge about how to do so.

    While working on optimizing page load times for a high-profile AJAX application, I had a chance to investigate how much I could reduce latency due to external objects. Specifically, I looked into how the HTTP client implementation in common browsers and characteristics of common Internet connections affect page load time for pages with many small objects.

    I found a few things to be interesting:

    • IE, Firefox, and Safari ship with HTTP pipelining disabled by default; Opera is the only browser I know of that enables it. No pipelining means each request has to be answered and its connection freed up before the next request can be sent. This incurs average extra latency of the round-trip (ping) time to the user divided by the number of connections allowed. Or if your server has HTTP keepalives disabled, doing another TCP three-way handshake adds another round trip, doubling this latency.
    • By default, IE allows only two outstanding connections per hostname when talking to HTTP/1.1 servers or eight-ish outstanding connections total. Firefox has similar limits. Using up to four hostnames instead of one will give you more connections. (IP addresses don’t matter; the hostnames can all point to the same IP.)
    • Most DSL or cable Internet connections have asymmetric bandwidth, at rates like 1.5Mbit down/128Kbit up, 6Mbit down/512Kbit up, etc. Ratios of download to upload bandwidth are commonly in the 5:1 to 20:1 range. This means that for your users, a request takes the same amount of time to send as it takes to receive an object of 5 to 20 times the request size. Requests are commonly around 500 bytes, so this should significantly impact objects that are smaller than maybe 2.5k to 10k. This means that serving small objects might mean the page load is bottlenecked on the users’ upload bandwidth, as strange as that may sound.

    Using these, I came up with a model to guesstimate the effective bandwidth of users of various flavors of network connections when loading various object sizes. It assumes that each HTTP request is 500 bytes and that the HTTP reply includes 500 bytes of headers in addition to the object requested. It is simplistic and only covers connection limits and asymmetric bandwidth, and doesn’t account for the TCP handshake of the first request of a persistent (keepalive) connection, which is amortized when requesting many objects from the same connection. Note that this is best-case effective bandwidth and doesn’t include other limitations like TCP slow-start, packet loss, etc. The results are interesting enough to suggest avenues of exploration but are no substitute for actually measuring the difference with real browsers.
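
    The author's model itself is not given, so the following is only a rough sketch of a model in the same spirit, not a reconstruction of it. It assumes that "100ms away" means a 100 ms round trip, that connections are never pipelined, that a missing keepalive costs exactly one extra round trip per object, and that the uplink and downlink are each shared by all connections. The function name and parameters are invented for illustration.

```python
def effective_bandwidth(obj_bytes, down_bps, up_bps, rtt_s,
                        connections=2, keepalive=False,
                        request_bytes=500, reply_header_bytes=500):
    """Best-case effective bandwidth (bits/s) when loading many objects of one size."""
    up_time = request_bytes * 8 / up_bps                          # time to send one request
    down_time = (reply_header_bytes + obj_bytes) * 8 / down_bps   # time to receive one reply
    # Without pipelining each connection waits a full round trip per object;
    # without keepalives a new TCP handshake adds a second round trip.
    round_trips = 1 if keepalive else 2
    per_connection = round_trips * rtt_s + up_time + down_time
    # Parallel connections overlap their waits, but the shared uplink and
    # downlink still cap how fast objects can flow.
    time_per_object = max(per_connection / connections, up_time, down_time)
    return obj_bytes * 8 / time_per_object


# The simulated user from the text: 1.5Mbit down / 384Kbit up, 100 ms away.
for size in (1_000, 8_000, 100_000):
    base = effective_bandwidth(size, 1_500_000, 384_000, 0.100)
    tuned = effective_bandwidth(size, 1_500_000, 384_000, 0.100,
                                connections=8, keepalive=True)
    print(f"{size:>7} bytes: {base:>10.0f} -> {tuned:>10.0f} bits/s")
```

    Here `connections=2` stands for one hostname and `connections=8` for four hostnames at two connections each. The numbers it prints will not match the graphs exactly; it is only meant to make the shape of the argument concrete.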

    To show the effect of keepalives and multiple hostnames, I simulated a user on net offering 1.5Mbit down/384Kbit up who is 100ms away with 0% packet loss. This roughly corresponds to medium-speed ADSL on the other side of the U.S. from your servers. Shown here is the effective bandwidth while loading a page with many objects of a given size, with effective bandwidth defined as total object bytes received divided by the time to receive them:

    [1.5megabit 100ms graph]

    Interesting things to note:

    • For objects of relatively small size (the left-hand portion of the graph), you can see from the empty space above the plotted line how little of the user’s downstream bandwidth is being used, even though the browser is requesting objects as fast as it can. This user has to be requesting objects larger than 100k before he’s mostly filling his available downstream bandwidth.
    • For objects under roughly 8k in size, you can double his effective bandwidth by turning keepalives on and spreading the requests over four hostnames. This is a huge win.
    • If the user were to enable pipelining in his browser (such as setting Firefox’s network.http.pipelining in about:config), the number of hostnames we use wouldn’t matter, and he’d make even more effective use of his available bandwidth. But we can’t control that server-side.

    Perhaps more clearly, the following is a graph of how much faster pages could load for an assortment of common access speeds and latencies with many external objects spread over four hostnames and keepalives enabled. Baseline (0%) is one hostname and keepalives disabled.

    [Speedup of 4 hostnames and keepalives on]

    Interesting things from that graph:

    • If you load many objects smaller than 10k, both local users and ones on the other side of the world could see substantial improvement from enabling keepalives and spreading requests over 4 hostnames.
    • There is a much greater improvement for users further away.
    • This will matter more as access speeds increase. The user on 100meg ethernet only 20ms away from the server saw the biggest improvement.

    One more thing I examined was the effect of request size on effective bandwidth. The above graphs assumed 500 byte requests and 500 bytes of reply headers in addition to the object contents. How does changing that affect performance of our 1.5Mbit down/384Kbit up and 100ms away user, assuming we’re already using four hostnames and keepalives?

    [Effective bandwidth at various request sizes]

    This shows that at small object sizes, we’re bottlenecked on the upstream bandwidth. The browser sending larger requests (such as ones laden with lots of cookies) seems to slow the requests down by 40% worst-case for this user.

    As I’ve said, these graphs are based on a simulation and don’t account for a number of real-world factors. But I’ve unscientifically verified the results with real browsers on real net and believe them to be a useful gauge. I’d like to find the time and resources to reproduce these using real data collected from real browsers over a range of object sizes, access speeds, and latencies.

    Measuring the effective bandwidth of your users

    You can measure the effective bandwidth of your users on your site relatively easily, and if the effective bandwidth of users viewing your pages is substantially below their available downstream bandwidth, it might be worth attempting to improve this.

    Before giving the browser any external object references (<img src="…">, <link rel="stylesheet" href="…">, <script src="…">, etc.), record the current time. After the page load is done, subtract the time you began from the current time, and include the difference in the URL of an image you reference off of your server.

    Sample javascript implementing this:

    <html>
    <head>
    <title>...</title>
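    <script type="text/javascript">
    <!--
    // Record the start time before the browser sees any external object references.
    var began_loading = (new Date()).getTime();

    // Once the page has loaded, report the page URL and the elapsed seconds by
    // requesting a tracking image; both values end up in the web server's log.
    function done_loading() {
     (new Image()).src = '/timer.gif?u=' + self.location + '&t=' +
      (((new Date()).getTime() - began_loading) / 1000);
    }
    // -->
    </script>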
    <!--
    Reference any external javascript or stylesheets after the above block.
    // -->
    </head>
    <body onload="done_loading()">
    <!--
    Put your normal page content here.
    // -->
    </body>
    </html>

    This will produce web log entries of the form:

    10.1.2.3 - - [28/Oct/2006:13:47:45 -0700] "GET /timer.gif?u=http://example.com/page.html&t=0.971 HTTP/1.1" 200 49 ...

    in this case, showing that for this user, loading the rest of http://example.com/page.html took 0.971 seconds. And if you know that the combined size of everything referenced from that page is 57842 bytes, 57842 bytes * 8 bits per byte / 0.971 seconds = 476556 bits per second effective bandwidth for that page load. If this user should be getting 1.5Mbit downstream bandwidth, there is substantial room for improvement.
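
    The arithmetic above is easy to automate over a whole log file. The sketch below handles a single entry; the function name is made up, the log line is the sample from above, and the page size of 57842 bytes still has to come from your own knowledge of the page.

```python
import re
from urllib.parse import parse_qs

LOG_LINE = ('10.1.2.3 - - [28/Oct/2006:13:47:45 -0700] '
            '"GET /timer.gif?u=http://example.com/page.html&t=0.971 HTTP/1.1" 200 49')


def effective_bps(log_line, page_bytes):
    """Return (page URL, effective bits per second) for one timer.gif log entry."""
    request_path = re.search(r'"GET (\S+) HTTP', log_line).group(1)
    query = parse_qs(request_path.partition("?")[2])   # {'u': [...], 't': [...]}
    load_seconds = float(query["t"][0])
    return query["u"][0], page_bytes * 8 / load_seconds


url, bps = effective_bps(LOG_LINE, 57842)
print(url, round(bps))   # http://example.com/page.html 476556
```

    Note that the sample javascript does not URL-encode the page address, so a page URL that itself contains "&" would confuse this parser.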

    Tips to reduce your page load time

    After you gather some page-load times and effective bandwidth for real users all over the world, you can experiment with changes that will improve those times. Measure the difference and keep any that offer a substantial improvement.

    Try some of the following:

    • Turn on HTTP keepalives for external objects. Otherwise you add an extra round-trip to do another TCP three-way handshake and slow-start for every HTTP request. If you are worried about hitting global server connection limits, set the keepalive timeout to something short, like 5-10 seconds. Also look into serving your static content from a different webserver than your dynamic content. Having thousands of connections open to a stripped down static file webserver can happen in like 10 megs of RAM total, whereas your main webserver might easily eat 10 megs of RAM per connection.
    • Load fewer external objects. Due to request overhead, one bigger file just loads faster than two smaller ones half its size. Figure out how to globally reference the same one or two javascript files and one or two external stylesheets instead of many; if you have more, try preprocessing them when you publish them. If your UI uses dozens of tiny GIFs all over the place, consider switching to a much cleaner CSS-based design which probably won’t need so many images. Or load all of your common UI images in one request using a technique called “CSS sprites”.
    • If your users regularly load a dozen or more uncached or uncacheable objects per page, consider evenly spreading those objects over four hostnames. This usually means your users can have 4x as many outstanding connections to you. Without HTTP pipelining, this results in their average request latency dropping to about 1/4 of what it was before.

      When you generate a page, evenly spreading your images over four hostnames is most easily done with a hash function, like MD5. Rather than having all <img> tags load objects from http://static.example.com/, create four hostnames (e.g. static0.example.com, static1.example.com, static2.example.com, static3.example.com) and use two bits from an MD5 of the image path to choose which of the four hosts you reference in the <img> tag. Make sure all pages consistently reference the same hostname for the same image URL, or you’ll end up defeating caching.

      Beware that each additional hostname adds the overhead of an extra DNS lookup and an extra TCP three-way handshake. If your users have pipelining enabled or a given page loads fewer than around a dozen objects, they will see no benefit from the increased concurrency and the site may actually load more slowly. The benefits only become apparent on pages with larger numbers of objects. Be sure to measure the difference seen by your users if you implement this.

    • Possibly the best thing you can do to speed up pages for repeat visitors is to allow static images, stylesheets, and javascript to be unconditionally cached by the browser. This won’t help the first page load for a new user, but can substantially speed up subsequent ones.

      Set an Expires header on everything you can, with a date days or even months into the future. This tells the browser it is okay to not revalidate on every request, which can add latency of at least one round-trip per object per page load for no reason.

      Instead of relying on the browser to revalidate its cache, if you change an object, change its URL. One simple way to do this for static objects if you have staged pushes is to have the push process create a new directory named by the build number, and teach your site to always reference objects out of the current build’s base URL. (Instead of <img src="http://example.com/logo.gif"> you’d use <img src="http://example.com/build/1234/logo.gif">. When you do another build next week, all references change to <img src="http://example.com/build/1235/logo.gif">.) This also nicely solves problems with browsers sometimes caching things longer than they should: since the URL changed, they think it is a completely different object.

      If you conditionally gzip HTML, javascript, or CSS, you probably want to add a “Cache-Control: private” if you set an Expires header. This will prevent problems with caching by proxies that won’t understand that your gzipped content can’t be served to everyone. (The Vary header was designed to do this more elegantly, but you can’t use it because of IE brokenness.)

      For anything where you always serve the exact same content when given the same URL (e.g. static images), add “Cache-Control: public” to give proxies explicit permission to cache the result and serve it to different users. If a caching proxy local to the user has the content, it is likely to have much less latency than you; why not let it serve your static objects if it has them?

      Avoid the use of query params in image URLs, etc. At least the Squid cache refuses to cache any URL containing a question mark by default. I’ve heard rumors that other things won’t cache those URLs at all, but I don’t have more information.

    • On pages where your users are often sent the exact same content over and over, such as your home page or RSS feeds, implementing conditional GETs can substantially improve response time and save server load and bandwidth in cases where the page hasn’t changed.

      When serving static files (including HTML) off of disk, most webservers will generate Last-Modified and/or ETag reply headers for you and make use of the corresponding If-Modified-Since and/or If-None-Match mechanisms on requests. But as soon as you add server-side includes, dynamic templating, or have code generating your content as it is served, you are usually on your own to implement these.

      The idea is pretty simple: When you generate a page, you give the browser a little extra information about exactly what was on the page you sent. When the browser asks for the same page again, it gives you this information back. If it matches what you were going to send, you know that the browser already has a copy and send a much smaller 304 (Not Modified) reply instead of the contents of the page again. And if you are clever about what information you include in an ETag, you can usually skip the most expensive database queries that would’ve gone into generating the page.

    • Minimize HTTP request size. Often cookies are set domain-wide, which means they are also unnecessarily sent by the browser with every image request from within that domain. What might’ve been a 400 byte request for an image could easily turn into 1000 bytes or more once you add the cookie headers. If you have a lot of uncached or uncacheable objects per page and big, domain-wide cookies, consider using a separate domain to host static content, and be sure to never set any cookies in it.
    • Minimize HTTP response size by enabling gzip compression for HTML and XML for browsers that support it. For example, the 17k document you are reading takes 90ms of the full downstream bandwidth of a user on 1.5Mbit DSL. Or it will take 37ms when compressed to 6.8k. That’s 53ms off of the full page load time for a simple change. If your HTML is bigger and more redundant, you’ll see an even greater improvement.

      If you are brave, you could also try to figure out which set of browsers will handle compressed Javascript properly. (Hint: IE4 through IE6 ask for their javascript compressed, then break badly if you send it that way.) Or look into Javascript obfuscators that strip out whitespace, comments, etc. and usually get it down to 1/3 to 1/2 its original size.

    • Consider locating your small objects (or a mirror or cache of them) closer to your users in terms of network latency. For larger sites with a global reach, either use a commercial Content Delivery Network, or add a colo within 50ms of 80% of your users and use one of the many available methods for routing user requests to your colo nearest them.
    • Regularly use your site from a realistic net connection. Convincing the web developers on my project to use a “slow proxy” that simulates bad DSL in New Zealand (768Kbit down, 128Kbit up, 250ms RTT, 1% packet loss) rather than the gig ethernet a few milliseconds from the servers in the U.S. was a huge win. We found and fixed a number of usability and functional problems very quickly.

      To implement the slow proxy, I used the netem and HTB kernel modules available in the Linux 2.6 kernel, both of which are set up with the tc command line tool. These offer the most accurate simulation I could find, but are definitely not for the faint of heart. I’ve not used them, but supposedly Tamper Data for Firefox, Fiddler for Windows, and Charles for OSX can all rate-limit and are probably easier to set up, but they may not simulate latency properly.

    • Use Firebug for Firefox from a realistic net connection to see a graphical timeline of what it is doing during a page load. This shows where Firefox has to wait for one HTTP request to complete before starting the next one and how page load time increases with each object loaded. YSlow extends Firebug to offer tips on how to improve your site’s performance.

      The Safari team offers a tip on a hidden feature in their browser that offers some timing data too.

      Or if you are familiar with the HTTP protocol and TCP/IP at the packet level, you can watch what is going on using tcpdump, ngrep, or ethereal. These tools are indispensable for all sorts of network debugging.

    • Try benchmarking common pages on your site from a local network with ab, which comes with the Apache webserver. If your server is taking longer than 5 or 10 milliseconds to generate a page, you should make sure you have a good understanding of where it is spending its time.

      If your latencies are high and your webserver process (or CGI if you are using that) is eating a lot of CPU during this test, it is often a result of using a scripting language that needs to recompile your scripts with every request. Software like eAccelerator for PHP, mod_perl for perl, mod_python for python, etc can cache your scripts in a compiled state, dramatically speeding up your site. Beyond that, look at finding a profiler for your language that can tell you where you are spending your CPU. If you improve that, your pages will load faster and you’ll be able to handle more traffic with fewer machines.

      If your site relies on doing a lot of database work or some other time-consuming task to generate the page, consider adding server-side caching of the slow operation. Most people start with writing a cache to local memory or local disk, but that starts to fall down if you expand to more than a few web server machines. Look into using memcached, which essentially creates an extremely fast shared cache that’s the combined size of the spare RAM you give it off of all of your machines. It has clients available in most common languages.

    • (Optional) Petition browser vendors to turn on HTTP pipelining by default on new browsers. Doing so will remove some of the need for these tricks and make much of the web feel much faster for the average user. (Firefox has this disabled supposedly because some proxies, some load balancers, and some versions of IIS choke on pipelined requests. But Opera has found sufficient workarounds to enable pipelining by default. Why can’t other browsers do similarly?)
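
    The hostname-spreading tip in the list above (two bits of an MD5 of the image path choose one of four hosts) can be sketched as follows. The four hostnames come from the text; the function name and the choice of the low two bits of the first digest byte are assumptions made for illustration.

```python
import hashlib

STATIC_HOSTS = ["static0.example.com", "static1.example.com",
                "static2.example.com", "static3.example.com"]


def static_url(image_path):
    """Map an image path to one of four hostnames, always the same one."""
    digest = hashlib.md5(image_path.encode("utf-8")).digest()
    host = STATIC_HOSTS[digest[0] & 0b11]   # two bits of the MD5 pick the host
    return f"http://{host}{image_path}"


print(static_url("/images/logo.gif"))
```

    Because the host depends only on the path, every page that references the same image produces the same URL, which is what keeps browser caching intact.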

    The above list covers improving the speed of communication between browser and server and can be applied generally to many sites, regardless of what web server software they use or what language the code behind your site is written in. There is, unfortunately, a lot that isn’t covered.

    While the tips above are intended to improve your page load times, a side benefit of many of them is a reduction in server bandwidth and CPU needed for the average page view. Reducing your costs while improving your user experience seems it should be worth spending some time on.
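
    As an illustration of the conditional GET tip above, here is a minimal sketch of the server-side decision. It is not tied to any web framework: the function name and return shape are invented, the ETag is simply a hash of the body, and a real If-None-Match header may carry several validators, which this ignores.

```python
import hashlib


def respond(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for one GET, honoring If-None-Match."""
    etag = '"' + hashlib.md5(body).hexdigest() + '"'   # validator derived from the content
    headers = {"ETag": etag}
    if if_none_match == etag:
        return 304, headers, b""    # the browser's copy is current: send no body
    return 200, headers, body


status, headers, _ = respond(b"<html>home page</html>")
status_again, _, payload = respond(b"<html>home page</html>", headers["ETag"])
print(status, status_again, len(payload))   # 200 304 0
```

    Hashing the finished body still means the page was generated. To skip the expensive work as well, the ETag would have to be built from something cheaper, such as a version number or last-update timestamp of the underlying data.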

    From: http://www.die.net/musings/page_load_time/

     
  • bixuan 15:01 on 2009-01-22
    Tags: UI, web, applications, operating systems, hardware

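    S3: Elements That Make Up a Web Site

    If you do not understand the elements that make up a web site, you cannot design its architecture, just as a chef who does not know what ingredients are available cannot cook a tasty dish.
    From the outside in:

    1. Browser
    2. DNS
    3. Web UI design
    4. Application
    5. Operating system
    6. Server hardware
    7. Network

    Back to the previous chapter: "S2: Performance Parameters"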

     
  • bixuan 09:51 on 2009-01-22
    Tags: lynx, web

    Notes from "Web Performance Optimization" (《web性能优化》)

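    1. How much bandwidth do you need?
      For a web site's performance, server bandwidth is the single most important factor. The formula for working out how much bandwidth you need is actually very simple: hits/second (number of hits per second) × bits/hit (average size of each hit) = bits/second.

    2. How fast a server do you need?
      For the vast majority of web sites, the performance of serving static files is not the bottleneck.
      A complete HTTP transfer over the Internet typically takes 2 to 20 seconds, most of which is caused by modem and Internet bandwidth and latency limits. (Read more …)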
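
    The bandwidth formula is simple enough to write down directly. In this sketch the function name and the example load (50 hits per second at 20,000 bytes per hit) are hypothetical; the size is taken in bytes and converted to bits.

```python
def required_bandwidth_bps(hits_per_second, bytes_per_hit):
    """hits/second * bits/hit = bits/second."""
    return hits_per_second * bytes_per_hit * 8


# Hypothetical load: 50 hits per second, 20,000 bytes per hit on average.
print(required_bandwidth_bps(50, 20_000))   # 8000000
```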

     
  • bixuan 09:34 on 2008-12-01
    Tags: response time, web

    Key metrics of web server performance:

    1. Requests handled per second: req/s
    2. Response time of each request: response time

    A reminder to myself :)

     
  • bixuan 10:16 on 2008-09-07
    Tags: elements, web, website, site architecture

    Elements of a Web Site Architecture

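    Generally speaking, a web site architecture is made up of the following basic elements.

    Browser

    Because the web browser is standardized, simple, and ubiquitous, it can be regarded as a near-ideal graphical user interface (GUI).
    Popular browsers today include IE, Firefox, Opera, and Safari. You need to understand their characteristics, which also helps you take advantage of them in your architecture design.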

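    Load balancing

    The simplest approach is round-robin DNS (RRDNS), but it is not recommended; the following three reasons call for particular caution:
    1. Round-robin DNS cannot deliver true load balancing, although in some simple cases it does spread the load. True load balancing monitors how heavily each server is being used and assigns connections based on that usage, so that connections always go to servers with enough capacity to handle them.
    When one server in the round-robin set is much slower than the others, a condition known as "convoying" arises: users queue up behind the slower server while the faster servers sit unused. True load balancing does not have this problem.
    2. RRDNS does not attempt to deal with server failure. Users are still directed to the failed server. True load balancing improves site availability, because if one server fails, the other servers automatically take over its load.
    3. RRDNS makes it hard to keep user state, especially for services that use sessions. For example, while a user is writing a post or a reply, the application stores that user's session on the current server. When the user finishes and submits, RRDNS may send the request to a different server; because the new server does not have the user's session, it shows a warning such as "not logged in", and the submission fails.
    In many cases, when you want to remove a failed IP from DNS, you find that DNS updates propagate very slowly, because many local DNS servers do not follow the relevant standards. Many users' local DNS servers keep the failed IP in their cache, sometimes for a very long time; in China, small ISPs in particular often do this.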
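
    The difference between round-robin rotation and true load balancing can be shown with a toy sketch. The server names, connection counts, and function names here are all made up, and a real balancer would collect the load and health data itself rather than receive them as dictionaries.

```python
from itertools import cycle


def round_robin(servers):
    """Hand out servers in fixed rotation, as round-robin DNS does."""
    return cycle(servers)


def least_loaded(active, healthy):
    """Pick the healthy server that currently has the fewest connections."""
    candidates = [name for name in active if healthy[name]]
    return min(candidates, key=lambda name: active[name])


active = {"web1": 12, "web2": 3, "web3": 0}             # open connections per server
healthy = {"web1": True, "web2": True, "web3": False}   # web3 has failed

rr = round_robin(list(active))
print([next(rr) for _ in range(3)])     # ['web1', 'web2', 'web3']: still uses the failed web3
print(least_loaded(active, healthy))    # web2: skips web3 and avoids the busy web1
```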

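    IP-level load balancing
    A common software implementation here is LVS, which, as a point of pride, was developed by a Chinese engineer, Zhang Wensong (章文嵩). It is simple and efficient, though it needs to be combined with other HA software to achieve the "three H's". IP-level load balancing avoids the RRDNS drawbacks described above.

    Of course, you can also use a hardware load balancer.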

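    Web server

    Commonly used open source web servers today include Apache, Nginx, and Lighttpd.
    A web server's content and logs should be stored on separate dedicated disks so that they do not interfere with each other.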

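    Middleware

    Any software that interacts with the web server on one side and the database on the other can be called middleware. Middleware keeps the structure clear and simple, and can improve overall performance.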

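    Database

    Database tables can be defined, mirrored, partitioned, and deployed in ways that get the best performance out of them. Database optimization is a deep subject, and a good database administrator (DBA) commands a high price.
    Common databases today include MySQL and Oracle.

    Although a web site architecture basically consists of the elements above, many more factors affect web performance. As long as you keep a grip on these elements and troubleshoot and optimize them step by step, I think the result will not be bad.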

     
  • bixuan 11:23 on 2008-07-07
    Tags: analyzer, web

    Web Page Analyzer

    A site worth recommending: http://www.websiteoptimization.com/services/analyze/

    It analyzes how efficiently a web page loads and suggests many improvements. Nice!

    Thanks to laurence for the recommendation :)

     
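  • bixuan 21:50 on 2008-06-26
    Tags: browser, web

    Cross-platform, multi-browser page testing

    A bookmark for myself :)

    browsershots provides page screenshots from many browsers on four platforms: Linux, Windows, Mac OS, and BSD. Very powerful. I mostly just check the results on Windows and Mac OS, though; selecting too many browsers makes it very slow.

    (Read more …)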

     
  • bixuan 14:42 on 2007-12-29
    Tags: golden, rule, web

    Golden rule of web caching 

    The golden rule of web caching is: For the caching to be most effective, cache as close to the final product as possible.

    Original: http://gojko.net/2007/11/29/golden-rule-of-web-caching/

     