GC Basics and Performance (reposted from MSDN)

Performance

Now that we have a basic model for how things are working, let's consider some things that could go wrong that would make it slow. That will give us a good idea of what sorts of things we should try to avoid to get the best performance out of the collector.

Too Many Allocations

This is really the most basic thing that can go wrong. Allocating new memory with the garbage collector is really quite fast. As you can see in Figure 2 above, all that needs to happen typically is for the allocation pointer to get moved to create space for your new object on the "allocated" side—it doesn't get much faster than that. However, sooner or later a garbage collection has to happen and, all things being equal, it's better for that to happen later than sooner. So you want to make sure when you're creating new objects that it's really necessary and appropriate to do so, even though creating just one is fast.
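The fast path can be pictured with a purely conceptual sketch in C# (this is not the runtime's actual code; the segment size and names are invented for illustration): allocating is just bumping an offset, and only when the budget runs out does any real work happen.

    // Purely conceptual sketch of the idea (not the CLR's real allocator):
    // the fast path is nothing more than moving the allocation offset forward.
    static class BumpAllocator
    {
        private const int SegmentSize = 1 << 20;   // assumed budget for the sketch
        private static int _allocOffset;           // boundary between allocated and free space

        public static int Allocate(int size)
        {
            if (_allocOffset + size > SegmentSize)
            {
                // Pretend a collection reclaimed everything; real reclamation and
                // compaction are out of scope for this sketch.
                _allocOffset = 0;
            }
            int objectStart = _allocOffset;
            _allocOffset += size;                  // the whole "allocation"
            return objectStart;
        }
    }

Because that fast path is so cheap, the real cost is the collection that each allocation brings closer, which is why the advice that follows is about allocating less, not about allocating faster.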

This may sound like obvious advice, but actually it's remarkably easy to forget that one little line of code you write could trigger a lot of allocations. For example, suppose you're writing a comparison function of some kind, and suppose that your objects have a keywords field and that you want your comparison to be case insensitive on the keywords in the order given. Now in this case you can't just compare the entire keywords string, because the first keyword might be very short. It would be tempting to use String.Split to break the keyword string into pieces and then compare each piece in order using the normal case-insensitive compare. Sounds great, right?
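In C#, that tempting version might look something like this minimal sketch (the comma delimiter and the method name are assumptions for illustration, not code from the article):

    using System;

    static class KeywordSort
    {
        // Tempting but allocation-heavy: Split creates one array plus one new
        // string per keyword, for both arguments, on every single comparison.
        public static int CompareWithSplit(string a, string b)
        {
            string[] ka = a.Split(',');
            string[] kb = b.Split(',');
            int n = Math.Min(ka.Length, kb.Length);
            for (int i = 0; i < n; i++)
            {
                int c = string.Compare(ka[i], kb[i], StringComparison.OrdinalIgnoreCase);
                if (c != 0) return c;
            }
            return ka.Length.CompareTo(kb.Length);
        }
    }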

Well, as it turns out, doing it like that isn't such a good idea. You see, String.Split is going to create an array of strings, which means one new string object for every keyword originally in your keywords string plus one more object for the array. Yikes! If we're doing this in the context of a sort, that's a lot of comparisons and your two-line comparison function is now creating a very large number of temporary objects. Suddenly the garbage collector is going to be working very hard on your behalf, and even with the cleverest collection scheme there is just a lot of trash to clean up. Better to write a comparison function that doesn't require the allocations at all.
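One way to do that, sketched under the same assumptions (comma-delimited keywords, ordinal case-insensitive ordering), is to walk both strings in place and compare one segment at a time:

    using System;

    static class KeywordSortLean
    {
        // No temporary arrays or strings: compare comma-delimited segments in
        // place using the index-based String.Compare overload.
        public static int Compare(string a, string b)
        {
            int ia = 0, ib = 0;
            while (ia < a.Length && ib < b.Length)
            {
                int ea = a.IndexOf(',', ia); if (ea < 0) ea = a.Length;  // end of a's current keyword
                int eb = b.IndexOf(',', ib); if (eb < 0) eb = b.Length;  // end of b's current keyword
                int lenA = ea - ia, lenB = eb - ib;
                int c = string.Compare(a, ia, b, ib, Math.Min(lenA, lenB),
                                       StringComparison.OrdinalIgnoreCase);
                if (c != 0) return c;
                if (lenA != lenB) return lenA - lenB;                    // shorter keyword sorts first
                ia = ea + 1;
                ib = eb + 1;
            }
            // One string ran out of keywords before the other; the shorter list sorts first.
            bool moreA = ia < a.Length, moreB = ib < b.Length;
            return moreA == moreB ? 0 : (moreA ? 1 : -1);
        }
    }

Passed as the comparison delegate to a sort, this version allocates nothing per call, so even millions of comparisons add no work for the collector.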

Too-Large Allocations

When working with a traditional allocator, such as malloc(), programmers often write code that makes as few calls to malloc() as possible because they know the cost of allocation is comparatively high. This translates into the practice of allocating in chunks, often speculatively allocating objects we might need, so that we can do fewer total allocations. The pre-allocated objects are then manually managed from some kind of pool, effectively creating a sort of high-speed custom allocator.

In the managed world this practice is much less compelling for several reasons:

First, the cost of doing an allocation is extremely low—there's no searching for free blocks as with traditional allocators; all that needs to happen is for the boundary between the free and allocated areas to move. The low cost of allocation means that the most compelling reason to pool simply isn't present.

Second, if you do choose to pre-allocate you will of course be making more allocations than are required for your immediate needs, which could in turn force additional garbage collections that might otherwise have been unnecessary.

Finally, the garbage collector will be unable to reclaim space for objects that you are manually recycling, because from the global perspective all of those objects, including the ones that are not currently in use, are still live. You might find that a great deal of memory is wasted keeping ready-to-use but not in-use objects on hand.

This isn't to say that pre-allocating is always a bad idea. You might wish to do it to force certain objects to be initially allocated together, for instance, but you will likely find it is less compelling as a general strategy than it would be in unmanaged code.
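To illustrate the reclamation point above, here is a minimal pool sketch (the buffer type and size are invented for illustration): every buffer sitting in the pool's free list stays reachable, so the collector has to treat it as live even when the program has no current use for it.

    using System.Collections.Generic;

    sealed class BufferPool
    {
        private readonly Stack<byte[]> _free = new Stack<byte[]>();

        public byte[] Rent()
        {
            // Reuse a recycled buffer when one is available; otherwise allocate a new one.
            return _free.Count > 0 ? _free.Pop() : new byte[4096];
        }

        public void Return(byte[] buffer)
        {
            // From here on the buffer is rooted by the pool, so the collector
            // must treat it as live even though the program is no longer using it.
            _free.Push(buffer);
        }
    }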

Too Many Pointers

If you create a data structure that is a large mesh of pointers you'll have two problems. First, there will be a lot of object writes (see Figure 3 below) and, second, when it comes time to collect that data structure, you will make the garbage collector follow all those pointers and, if necessary, change them all as things move around. If your data structure is long-lived and won't change much, then the collector will only need to visit all those pointers when full collections happen (at the gen2 level). But if you create such a structure on a transitory basis, say as part of processing transactions, then you will pay the cost much more often.

Figure 3. Data structure heavy in pointers

Data structures that are heavy in pointers can have other problems as well, not related to garbage collection time. Again, as we discussed earlier, when objects are created they are allocated contiguously in the order of allocation. This is great if you are creating a large, possibly complex, data structure by, for instance, restoring information from a file. Even though you have disparate data types, all your objects will be close together in memory, which in turn will help the processor to have fast access to those objects. However, as time passes and your data structure is modified, new objects will likely need to be attached to the old objects. Those new objects will have been created much later and so will not be near the original objects in memory. Even when the garbage collector does compact your memory, your objects will not be shuffled around; they merely "slide" together to remove the wasted space. The resulting disorder might get so bad over time that you may be inclined to make a fresh copy of your whole data structure, all nicely packed, and let the old disorderly one be condemned by the collector in due course.
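To make the pointer cost concrete, here is a rough sketch of two ways to hold the same sequence (the types are invented for illustration): the linked form gives the collector one object and one reference to trace, and possibly fix up, per element, while the array-backed form is a single object with no internal references at all.

    // Pointer-heavy: N objects and N references for the collector to trace and update.
    sealed class LinkedNode
    {
        public int Value;
        public LinkedNode Next;
    }

    // Pointer-light: one object and zero internal references; the elements also
    // stay contiguous in memory no matter when they were added.
    sealed class IntSequence
    {
        public int[] Values = new int[0];
    }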

Too Many Roots

The garbage collector must of course give roots special treatment at collection time—they always have to be enumerated and duly considered in turn. The gen0 collection can be fast only to the extent that you don't give it a flood of roots to consider. If you were to create a deeply recursive function that has many object pointers among its local variables, the result can actually be quite costly. This cost is incurred not only in having to consider all those roots, but also in the extra-large number of gen0 objects that those roots might be keeping alive for not very long (discussed below).
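For example, a deeply recursive walk that keeps object references alive in every frame multiplies the roots the collector must enumerate; the sketch below assumes a hypothetical Node type:

    sealed class Node
    {
        public string Name;
        public Node Parent;
    }

    static class PathBuilder
    {
        // Every frame of the recursion holds 'node' and 'parentPath' as live
        // roots for the collector to enumerate, and each level also creates
        // short-lived temporary strings.
        public static string BuildPath(Node node)
        {
            if (node.Parent == null) return node.Name;
            string parentPath = BuildPath(node.Parent);
            return parentPath + "/" + node.Name;
        }
    }

An iterative version that walks up the chain into a single StringBuilder would keep only a couple of object references on the stack, no matter how deep the chain is.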

Too Many Object Writes

Once again referring to our earlier discussion, remember that every time a managed program modifies an object pointer, the write barrier code is also triggered. This can be bad for two reasons:

First, the cost of the write barrier might be comparable to the cost of what you were trying to do in the first place. If you are, for instance, doing simple operations in some kind of enumerator class, you might find that you need to move some of your key pointers from the main collection into the enumerator at every step. This is actually something you might want to avoid, because you effectively double the cost of copying those pointers around due to the write barrier and you might have to do it one or more times per loop on the enumerator.

Second, triggering write barriers is doubly bad if you are in fact writing on older objects. As you modify your older objects you effectively create additional roots to check (discussed above) when the next garbage collection happens. If you modified enough of your old objects you would effectively negate the usual speed improvements associated with collecting only the youngest generation.

These two reasons are of course complemented by the usual reasons for not doing too many writes in any kind of program. All things being equal, it's better to touch less of your memory (read or write, in fact) so as to make more economical use of the processor's cache.
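Returning to the enumerator example above, the following sketch (with a hypothetical Item type) contrasts the two styles: the first rewrites a reference field on every step and pays the write barrier each time, while the second only advances a plain integer, which triggers no barrier, and reads the reference on demand.

    sealed class Item
    {
        public string Name;
    }

    sealed class CopyingCursor
    {
        private readonly Item[] _items;
        private Item _current;             // reference field rewritten on every step
        private int _pos = -1;

        public CopyingCursor(Item[] items) { _items = items; }

        public bool MoveNext()
        {
            if (++_pos >= _items.Length) return false;
            _current = _items[_pos];       // object-reference write => write barrier
            return true;
        }

        public Item Current => _current;
    }

    sealed class IndexingCursor
    {
        private readonly Item[] _items;
        private int _pos = -1;             // plain integer; updating it costs no barrier

        public IndexingCursor(Item[] items) { _items = items; }

        public bool MoveNext() => ++_pos < _items.Length;

        public Item Current => _items[_pos];
    }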

Too Many Almost-Long-Life Objects

Finally, perhaps the biggest pitfall of the generational garbage collector is the creation of many objects that are neither exactly temporary nor exactly long-lived. These objects can cause a lot of trouble, because they will not be cleaned up by a gen0 collection (the cheapest), as they will still be necessary, and they might even survive a gen1 collection because they are still in use, but they soon die after that.

The trouble is, once an object has arrived at the gen2 level, only a full collection will get rid of it, and full collections are sufficiently costly that the garbage collector delays them as long as is reasonably possible. So the result of having many "almost-long-lived" objects is that your gen2 will tend to grow, potentially at an alarming rate; it might not get cleaned up nearly as fast as you would like, and when it does get cleaned up it will certainly be a lot more costly to do so than you might have wished.

To avoid these kinds of objects, your best lines of defense go like this:

  1. Allocate as few objects as possible, with due attention to the amount of temporary space you are using.
  2. Keep the longer-lived object sizes to a minimum.
  3. Keep as few object pointers on your stack as possible (those are roots).

If you do these things, your gen0 collections are more likely to be highly effective, and gen1 will not grow very fast. As a result, gen1 collections can be done less frequently and, when it becomes prudent to do a gen1 collection, your medium lifetime objects will already be dead and can be recovered, cheaply, at that time.

If things are going great, then during steady-state operations your gen2 size will not be increasing at all!

Original article: http://msdn.microsoft.com/en-us/library/ms973837.aspx
