Intel: Finding Hotspots

Lab 1: Finding Hotspots


___________________________________________________________________

Developer Product Division

Finding Hotspots

Disclaimer

The information contained in this document is provided for informational
purposes only and represents the current view of Intel Corporation ("Intel") and
its contributors ("Contributors") on, as of the date of publication. Intel and
the Contributors make no commitment to update the information contained in this
document, and Intel reserves the right to make changes at any time, without
notice.

DISCLAIMER. THIS DOCUMENT, IS PROVIDED "AS IS." NEITHER INTEL, NOR THE
CONTRIBUTORS MAKE ANY REPRESENTATIONS OF ANY KIND WITH RESPECT TO PRODUCTS
REFERENCED HEREIN, WHETHER SUCH PRODUCTS ARE THOSE OF INTEL, THE CONTRIBUTORS,
OR THIRD PARTIES. INTEL, AND ITS CONTRIBUTORS EXPRESSLY DISCLAIM ANY AND ALL
WARRANTIES, IMPLIED OR EXPRESS, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES OF
MERCHANTABILITY, FITNESS FOR ANY PARTICULAR PURPOSE, NON-INFRINGEMENT, AND ANY
WARRANTY ARISING OUT OF THE INFORMATION CONTAINED HEREIN, INCLUDING WITHOUT
LIMITATION, ANY PRODUCTS, SPECIFICATIONS, OR OTHER MATERIALS REFERENCED HEREIN.
INTEL, AND ITS CONTRIBUTORS DO NOT WARRANT THAT THIS DOCUMENT IS FREE FROM
ERRORS, OR THAT ANY PRODUCTS OR OTHER TECHNOLOGY DEVELOPED IN CONFORMANCE WITH
THIS DOCUMENT WILL PERFORM IN THE INTENDED MANNER, OR WILL BE FREE FROM
INFRINGEMENT OF THIRD PARTY PROPRIETARY RIGHTS, AND INTEL, AND ITS CONTRIBUTORS
DISCLAIM ALL LIABILITY THEREFOR. INTEL, AND ITS CONTRIBUTORS DO NOT WARRANT THAT
ANY PRODUCT REFERENCED HEREIN OR ANY PRODUCT OR TECHNOLOGY DEVELOPED IN RELIANCE
UPON THIS DOCUMENT, IN WHOLE OR IN PART, WILL BE SUFFICIENT, ACCURATE, RELIABLE,
COMPLETE, FREE FROM DEFECTS OR SAFE FOR ITS INTENDED PURPOSE, AND HEREBY
DISCLAIM ALL LIABILITIES THEREFOR. ANY PERSON MAKING, USING OR SELLING SUCH
PRODUCT OR TECHNOLOGY DOES SO AT HIS OR HER OWN RISK.

Licenses may be required. Intel, its contributors and others may have patents or
pending patent applications, trademarks, copyrights or other intellectual
proprietary rights covering subject matter contained or described in this
document. No license, express, implied, by estoppels or otherwise, to any
intellectual property rights of Intel or any other party is granted herein. It
is your responsibility to seek licenses for such intellectual property rights
from Intel and others where appropriate.

Limited License Grant. Intel hereby grants you a limited copyright license to
copy this document for your use and internal distribution only. You may not
distribute this document externally, in whole or in part, to any other person or
entity.

LIMITED LIABILITY. IN NO EVENT SHALL INTEL, OR ITS CONTRIBUTORS HAVE ANY
LIABILITY TO YOU OR TO ANY OTHER THIRD PARTY, FOR ANY LOST PROFITS, LOST DATA,
LOSS OF USE OR COSTS OF PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES, OR FOR ANY
DIRECT, INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF YOUR USE OF
THIS DOCUMENT OR RELIANCE UPON THE INFORMATION CONTAINED HEREIN, UNDER ANY CAUSE
OF ACTION OR THEORY OF LIABILITY, AND IRRESPECTIVE OF WHETHER INTEL, OR ANY
CONTRIBUTOR HAS ADVANCE NOTICE OF THE POSSIBILITY OF SUCH DAMAGES. THESE
LIMITATIONS SHALL APPLY NOTWITHSTANDING THE FAILURE OF THE ESSENTIAL PURPOSE OF
ANY LIMITED REMEDY.

Intel and Intel logo are trademarks or registered trademarks of Intel
Corporation or its subsidiaries in the United States and other countries.

*Other names and brands may be claimed as the property of others.

Copyright © 2009, Intel Corporation. All Rights Reserved.

Table of Contents

Lab 1: Finding Hotspots

Developer Product Division

Disclaimer

Lab 1: Finding Hotspots

Activity 1 – Build the Application

Activity 2 – Collect Performance Data

Activity 3 – Find the Hotspot

Lab 1: Finding Hotspots

Time Required

Forty-five minutes

Objective

In this lab session, you will use Intel® VTune™ Amplifier XE to find a
performance hotspot in an application.

After successfully completing this lab's activities, you will be able to:

  • Collect performance data for an application

  • Determine an application's performance bottleneck

  • Drill down to the source code of a hotspot

Activity 1 – Build the Application

Time Required

Ten minutes

Objective

  • Build the application in preparation for finding a hotspot

  1. Using Microsoft Visual Studio, select File->Open->Project/Solution and
    open the solution file: tachyon_vtune_amp_xe.sln

  2. Maximize the Microsoft Visual Studio window by double-clicking the title
    bar.

  3. Verify that Release is the selected Solution Configuration in the
    left-hand pull-down list near the top of the Visual Studio window.

  4. Select (highlight) the find_hotspots project in the Solution Explorer
    pane.

  5. From the top Visual Studio menu, select Build->Build find_hotspots.

  6. Verify at the bottom of the Visual Studio window that the build finished
    with no errors. It should report that 2 projects were built successfully.

Review Questions

  • Did the tachyon.common project also build?

  • How many source files are in the find_hotspots project?

Activity 2 – Collect Performance Data

Time Required

Fifteen minutes

Objective

  • Run the application while collecting performance data

  1. Right-click find_hotspots in the Solution Explorer window and select "Set
    As Startup Project." The project name "find_hotspots" will then be displayed
    in bold.

  2. Click the "New Analysis" button.

  3. Select "Algorithm Tuning->Hotspots" in the analysis type pane.

  4. Click "Analyze". The tachyon application will run. As it runs, it draws an
    image of several silver balls on the screen. Notice the execution time
    displayed in the application's title bar immediately after the image is
    completely displayed.

  5. After the application completes, Intel® VTune™ Amplifier XE spends some
    time analyzing the data. When it finishes, the Hotspots pane appears, along
    with the analysis explanation pane. Read the explanation and then clear the
    pane.

    At this point the application has run to completion and
    Intel® VTune™ Amplifier XE 2011 displays the analyzed results.

Review Questions

  • What is the result screen that appears after clearing the analysis
    explanation pane?

  • Which function used the most CPU time?

Activity 3 – Find the Hotspot

Time Required

Twenty minutes

Objective

  • Find the source code for a performance hotspot

  • Identify the calling sequence into the performance hotspot that generated
    the most CPU time

  1. Make sure the "Bottom-up" tab is selected. The functions in the tachyon
    program are listed in order of execution time. Notice the Timeline View at
    the bottom of the screen. It shows the states of the program's threads and
    the number of CPUs used as the program ran. There is very little CPU usage
    or thread execution near the end of the run. This is the phase in which the
    program had finished but kept the application window visible so that the
    user has time to see the overall execution time. Notice also that only 1
    thread shows any significant execution time.

  2. Return to the function list at the top of the result screen. The functions
    grid_intersect and sphere_intersect are at the top of the list, but another
    function seems to have used a surprisingly large amount of CPU time given
    what it does: initialize_2D_buffer.

  3. Click the arrow to the left of the function name initialize_2D_buffer.
    This expands the calling sequences into that function and shows the
    relative amount of execution time attributed to each. In this case there
    appears to be only 1.

  4. Double-click the function name initialize_2D_buffer. The source code for
    that function is displayed at its hottest point, with the assembly code to
    the right. Each source and assembly statement is annotated with execution
    time on the right.

    Vertical panes to the right of the source and assembly vertical scroll bars
    show the relative position and density of execution time throughout the
    view. Scroll up and down to see this.

  5. Note the comments surrounding the nested for loops in lines 77-87, which
    contain the statement that consumed the most CPU time, line number 84.

    To demonstrate the hotspot collector, the source code has 2 different ways
    of explicitly initializing the buffer: one that references sequential
    memory locations (the "faster" method) and one that uses a slower,
    non-sequential method.

  6. Use Visual Studio to comment out the slower method, rebuild the
    application (as in Activity 1), and run it. Note the faster execution time.

  7. Rerun the Hotspots analysis in Intel® VTune™ Amplifier XE. After it runs
    and the results are shown, notice that initialize_2D_buffer is much further
    down the function list and took less time.
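
The difference between the two initialization methods mentioned in step 5 can be sketched as follows. This is only an illustration and not the lab's actual source: the `Buffer2D` type and the function names `init_sequential` and `init_strided` are hypothetical, and it assumes that the "non-sequential" method is a column-by-column walk over a buffer stored row by row.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the lab's 2D buffer: rows * cols ints stored
// row by row in one contiguous block.
struct Buffer2D {
    std::size_t rows;
    std::size_t cols;
    std::vector<int> data;

    Buffer2D(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c) {}
};

// "Faster" method: the inner loop walks along a row, so consecutive writes
// touch consecutive memory addresses.
void init_sequential(Buffer2D& buf, int value) {
    for (std::size_t y = 0; y < buf.rows; ++y)
        for (std::size_t x = 0; x < buf.cols; ++x)
            buf.data[y * buf.cols + x] = value;
}

// "Slower" method (assumed form): the inner loop walks down a column, so
// consecutive writes are buf.cols elements apart in memory.
void init_strided(Buffer2D& buf, int value) {
    for (std::size_t x = 0; x < buf.cols; ++x)
        for (std::size_t y = 0; y < buf.rows; ++y)
            buf.data[y * buf.cols + x] = value;
}
```

Both functions leave the buffer with identical contents; only the order of memory accesses differs. On a large buffer, the strided version tends to use the cache less effectively, which is the kind of cost a Hotspots analysis attributes to the store statement inside the loop.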

Review Questions

  1. How much faster was the modified application?

  2. At what point in the program does the CPU usage drop off?

  3. Which function took the most time in the optimized version?


Time: 2024-12-16 16:39:02
