Understanding String Table Size in HotSpot

JDK-6962930[2] requested that the string table size be made configurable.  That bug was resolved on 04/25/2011, and the option is available in JDK 7.  Another JDK bug[3] requests that the default size of the string table (i.e., 1009) be increased.

In this article, we will examine the following topics:

  • What the string table is
  • How to find the number of interned strings in your applications
  • The tradeoff between memory footprint and lookup cost

String Table

In Java, string interning[1] is a method of storing only one copy of each distinct string value, which must be immutable. Interning strings makes some string processing tasks more time- or space-efficient at the cost of requiring more time when the string is created or interned. The distinct values are stored in a string intern pool, which is the string table in HotSpot.
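
As a small illustration of interning, the sketch below compares references rather than contents. The class name InternDemo and the helper sameObject are made up for this example; String.intern() is the standard API.

```java
public class InternDemo {
    /** Returns true only when both arguments are the very same object. */
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        // String literals are interned automatically.
        String literal = "hotspot";
        // new String(...) always creates a distinct object on the heap.
        String copy = new String("hotspot");

        System.out.println(literal.equals(copy));                // true: same characters
        System.out.println(sameObject(literal, copy));           // false: different objects
        System.out.println(sameObject(literal, copy.intern()));  // true: intern() returns the pooled copy
    }
}
```

Each call to intern() is a lookup in the string table, which is why the table's bucket count matters for applications that intern heavily.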

The size of the string table (i.e., a chained hash table) is configurable in JDK 7.  When the overflow chains become long, performance can degrade.  The current default size of the string table is 1009 buckets, which is too small for applications that stress the string table.  Note that the string table itself is allocated in native memory, but the strings are Java objects.

Increasing the size improves performance (i.e., reduces lookup cost) but increases the string table's footprint by 16 bytes on 64-bit systems, or 8 bytes on 32-bit systems, for every additional entry.  For example, changing the default size to 60013 increases the string table size by about 460K on 32-bit systems.
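
The arithmetic behind that figure can be checked with a few lines of Java. This is only a sketch using the per-entry costs quoted above; the class and method names are invented for the example.

```java
public class StringTableOverhead {
    // Default number of buckets mentioned in the article.
    static final int DEFAULT_BUCKETS = 1009;

    /** Extra bytes needed when growing the table from the default to newBuckets. */
    static long extraBytes(int newBuckets, int bytesPerEntry) {
        return (long) (newBuckets - DEFAULT_BUCKETS) * bytesPerEntry;
    }

    public static void main(String[] args) {
        // 32-bit: 8 bytes per additional entry
        System.out.println(extraBytes(60013, 8));   // 472032 bytes, roughly 461 KB
        // 64-bit: 16 bytes per additional entry
        System.out.println(extraBytes(60013, 16));  // 944064 bytes, roughly 922 KB
    }
}
```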

Finding Number of Interned Strings in the Applications

HotSpot provides a product-level option named PrintStringTableStatistics, which prints hash table statistics[4].  For example, for one of our applications (hereafter referred to as JavaApp), it prints the following information:

StringTable statistics:
Number of buckets       :   60013
Average bucket size     :       5
Variance of bucket size :       5
Std. dev. of bucket size:       2
Maximum bucket size     :      17

You can find the above output in your managed server's log file in the WebLogic domain.  Note that we have set the following option:

  • -XX:StringTableSize=60013

So, there are 60013 buckets in the hash table (or string table).
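
If you want to confirm the effective value from inside the application rather than from the log, one possible approach is to ask the HotSpot diagnostic MXBean. The sketch below assumes a HotSpot-based JVM (the com.sun.management API is not part of the Java SE specification), and the class name StringTableSizeCheck is made up for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

public class StringTableSizeCheck {
    /** Reads the StringTableSize VM option of the running JVM (HotSpot only). */
    static long stringTableSize() {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // getVMOption throws IllegalArgumentException if the option is unknown.
        VMOption option = bean.getVMOption("StringTableSize");
        return Long.parseLong(option.getValue());
    }

    public static void main(String[] args) {
        // Reports whatever the JVM was started with, e.g. via -XX:StringTableSize=60013.
        System.out.println("StringTableSize = " + stringTableSize());
    }
}
```

The value printed depends on the JVM version and on the command-line options, so no particular output is shown here.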

The JDK also includes a tool named jmap, which can be used to find the number of interned strings in your application.  For example, we obtained the following information using:

$ jdk-hs/bin/jmap -heap 18974
Attaching to process ID 18974, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 24.0-b43

using thread-local object allocation.
Parallel GC with 18 thread(s)

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 2147483648 (2048.0MB)
   NewSize          = 1310720 (1.25MB)
   MaxNewSize       = 17592186044415 MB
   OldSize          = 5439488 (5.1875MB)
   NewRatio         = 2
   SurvivorRatio    = 8
   PermSize         = 402653184 (384.0MB)
   MaxPermSize      = 402653184 (384.0MB)
   G1HeapRegionSize = 0 (0.0MB)

Heap Usage:
PS Young Generation
<deleted for brevity>

270145 interned Strings occupying 40429904 bytes.

Therefore, we know there are around 270K interned Strings in the table.
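
Dividing the interned-string count by the number of buckets gives a rough estimate of the average chain length. The sketch below does that for the count reported by jmap; it is only an approximation, since jmap and the statistics output may have been captured at different moments, and the class name BucketLoad is made up for this example.

```java
import java.util.Locale;

public class BucketLoad {
    /** Average number of entries per bucket for a chained hash table. */
    static double averageBucketSize(long internedStrings, long buckets) {
        return (double) internedStrings / buckets;
    }

    public static void main(String[] args) {
        long interned = 270145;  // count reported by jmap -heap above

        // With -XX:StringTableSize=60013
        System.out.println(String.format(Locale.ROOT, "%.2f", averageBucketSize(interned, 60013)));  // 4.50
        // With the default of 1009 buckets
        System.out.println(String.format(Locale.ROOT, "%.2f", averageBucketSize(interned, 1009)));   // 267.74
    }
}
```

The first figure is in line with the "Average bucket size" of 5 reported in the statistics, and the second suggests how long the chains would be with the default table size for the same number of strings.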

Tradeoff Between Memory Footprint and Lookup Cost

Out of curiosity, we set the string table size to 277331 (a prime number) to see how JavaApp performs.  Here are our findings, where a positive number is an improvement over the 60013 baseline and a negative number is a regression:

  • Average Response Time: +0.75%
  • 90% Response Time: +0.56%

However, the memory footprint has increased:

  • Total Memory Footprint: -1.03%

Finally, here are the hash table statistics for the new size (i.e., 277331):

StringTable statistics:
Number of buckets       :  277331
Average bucket size     :       1
Variance of bucket size :       1
Std. dev. of bucket size:       1
Maximum bucket size     :       8

The conclusion is that increasing the string table size from 60013 to 277331 helps JavaApp's performance a little at the expense of a larger memory footprint.  In this case, the benefit is minimal, so keeping the string table size at 60013 is good enough.

References

    1. String Interning (Wikipedia)
    2. JDK-6962930: Make the string table size configurable
    3. JDK-8009928: Increase default value for StringTableSize
    4. Java GC tuning for strings
    5. All other performance tuning articles on XML and More
    6. G1 GC Glossary of Terms
Time: 2024-08-01 18:17:18
