Batch Processing

The following example shows an anti-pattern for batch inserts: a naive way to insert 100,000 rows with Hibernate.
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for ( int i=0; i<100000; i++ ) {
  Customer customer = new Customer(.....);
  session.save(customer);
}
tx.commit();
session.close();
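On most systems this fails with an OutOfMemoryError at around the 50,000th row, because Hibernate keeps every newly inserted Customer instance in the session-level (first-level) cache.
There are several ways to avoid this problem.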

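Before doing any batch processing, enable JDBC batching by setting the property hibernate.jdbc.batch_size to an integer between 10 and 50.
Note: Hibernate transparently disables insert batching at the JDBC level if you use an identity identifier generator.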

If that approach is not suitable, you can disable the second-level cache: hibernate.cache.use_second_level_cache = false
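
As a rough sketch, the two settings above could be collected in a java.util.Properties object (or written to a hibernate.properties file) before Hibernate is bootstrapped. Only the two property names come from the text; the class name BatchSettings, the helper build(), and the value 20 are illustrative assumptions.

```java
import java.util.Properties;

public class BatchSettings {
    // Builds the batch-related settings. How they reach Hibernate depends on
    // how the application bootstraps it; that part is not shown here.
    public static Properties build(int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        Properties p = new Properties();
        // Group this many statements into one JDBC batch (10 to 50 is the suggested range).
        p.setProperty("hibernate.jdbc.batch_size", Integer.toString(batchSize));
        // Keep bulk work from interacting with the second-level cache.
        p.setProperty("hibernate.cache.use_second_level_cache", "false");
        return p;
    }

    public static void main(String[] args) {
        Properties p = build(20); // 20 is an assumed value inside the suggested range
        p.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```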

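1. Batch inserts
  When making new objects persistent, call the Session's flush() and clear() methods at regular intervals to control the size of the first-level cache.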
  Session session = sessionFactory.openSession();
  Transaction tx = session.beginTransaction();
  for ( int i=0; i<100000; i++ ) {
   Customer customer = new Customer(.....);
   session.save(customer);
   if ( (i + 1) % 20 == 0 ) { //20, same as the JDBC batch size
    //flush a batch of inserts and release memory:
    session.flush();
    session.clear();
   }
  }
  tx.commit();
  session.close();
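
To see why the periodic flush() and clear() keep memory bounded, here is a plain-Java simulation with no Hibernate dependency. SimulatedSession is a hypothetical stand-in for the first-level cache, not Hibernate's implementation; it only counts how many objects are held at once and how many were written.

```java
import java.util.ArrayList;
import java.util.List;

public class FlushClearDemo {
    // Hypothetical stand-in for a session and its first-level cache.
    static class SimulatedSession {
        private final List<Object> cache = new ArrayList<>();
        private int pending = 0; // saved but not yet written
        private int written = 0; // "sent to the database"
        private int peak = 0;    // largest cache size seen so far

        void save(Object entity) {
            cache.add(entity);
            pending++;
            peak = Math.max(peak, cache.size());
        }

        void flush() { written += pending; pending = 0; }

        void clear() { cache.clear(); }
    }

    // Returns {peak cache size, rows written}. batchSize 0 means "never flush early".
    public static int[] run(int rows, int batchSize) {
        SimulatedSession session = new SimulatedSession();
        for (int i = 0; i < rows; i++) {
            session.save(new Object());
            if (batchSize > 0 && (i + 1) % batchSize == 0) {
                session.flush(); // write the current batch
                session.clear(); // release the cached instances
            }
        }
        session.flush(); // a commit writes whatever is still pending
        return new int[] { session.peak, session.written };
    }

    public static void main(String[] args) {
        System.out.println("naive peak   = " + run(100_000, 0)[0]);  // 100000
        System.out.println("batched peak = " + run(100_000, 20)[0]); // 20
    }
}
```

Both runs write all 100,000 rows; only the number of instances held at the same time differs.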

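2. Batch updates
  When retrieving and updating data, call flush() and clear() at regular intervals. In addition, use the scroll() method to take advantage of server-side cursors for queries that return many rows.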
  Session session = sessionFactory.openSession();
  Transaction tx = session.beginTransaction();

  ScrollableResults customers = session.getNamedQuery("GetCustomers")
         .setCacheMode(CacheMode.IGNORE)
         .scroll(ScrollMode.FORWARD_ONLY);
  int count=0;
  while ( customers.next() ) {
   Customer customer = (Customer) customers.get(0);
   customer.updateStuff(...);
   if ( ++count % 20 == 0 ) {
    //flush a batch of updates and release memory:
    session.flush();
    session.clear();
   }
  }

  tx.commit();
  session.close();

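3. StatelessSession
  StatelessSession is a command-oriented API provided by Hibernate. Use it to stream data to and from the database in the form of detached objects.
  A StatelessSession has no persistence context associated with it and does not provide many of the higher-level life cycle semantics.
  Features and behaviors not provided by StatelessSession: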
   a first-level cache
   interaction with any second-level or query cache
   transactional write-behind or automatic dirty checking

  Limitations of StatelessSession:
   Operations performed using a stateless session never cascade to associated instances
   Collections are ignored by a stateless session
   Operations performed via a stateless session bypass Hibernate's event model and interceptors
   Due to the lack of a first-level cache, Stateless sessions are vulnerable to data aliasing effects
   A stateless session is a lower-level abstraction that is much closer to the underlying JDBC
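
The data aliasing limitation may be easier to see in a small plain-Java sketch. Everything below is a hypothetical stand-in, not a Hibernate API: loadWithCache mimics the identity map of a regular Session, while loadWithoutCache mimics a stateless session that builds a fresh object for every read.

```java
import java.util.HashMap;
import java.util.Map;

public class AliasingDemo {
    public static class Customer {
        final long id;
        String name;
        Customer(long id, String name) { this.id = id; this.name = name; }
    }

    // Stand-in for reading a database row: always builds a new object.
    static Customer readRow(long id) {
        return new Customer(id, "Alice");
    }

    // Identity map playing the role of a first-level cache.
    private final Map<Long, Customer> firstLevelCache = new HashMap<>();

    public Customer loadWithCache(long id) {
        return firstLevelCache.computeIfAbsent(id, key -> readRow(key));
    }

    public Customer loadWithoutCache(long id) {
        return readRow(id);
    }

    public static void main(String[] args) {
        AliasingDemo demo = new AliasingDemo();

        Customer a = demo.loadWithCache(1L);
        Customer b = demo.loadWithCache(1L);
        a.name = "Bob";
        System.out.println(a == b);  // true: one instance per row
        System.out.println(b.name);  // Bob

        Customer c = demo.loadWithoutCache(1L);
        Customer d = demo.loadWithoutCache(1L);
        c.name = "Bob";
        System.out.println(c == d);  // false: two aliases of the same row
        System.out.println(d.name);  // Alice: d does not see the change made through c
    }
}
```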

  Example of using a StatelessSession:
  StatelessSession session = sessionFactory.openStatelessSession();
  Transaction tx = session.beginTransaction();

  ScrollableResults customers = session.getNamedQuery("GetCustomers")
         .scroll(ScrollMode.FORWARD_ONLY);
  while ( customers.next() ) {
   Customer customer = (Customer) customers.get(0);
   customer.updateStuff(...);
   session.update(customer);
  }

  tx.commit();
  session.close();

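  The Customer instances returned by the query are detached immediately; they are never associated with any persistence context.
  The insert(), update(), and delete() operations of the StatelessSession interface operate directly on database rows and cause the corresponding SQL to be executed immediately.
  Their semantics differ from those of the save(), saveOrUpdate(), and delete() operations of the Session interface.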
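
The difference between immediate execution and transactional write-behind can be sketched in plain Java. The class below is a hypothetical model, not Hibernate code: save() and flush() imitate a stateful session that queues work, while insert() imitates a stateless session that executes its statement right away. A real Session may still execute some statements early, for example with certain identifier generators.

```java
import java.util.ArrayList;
import java.util.List;

public class WriteBehindDemo {
    // Statements that have "reached the database".
    public final List<String> executed = new ArrayList<>();
    // Statements queued by the stateful stand-in until flush().
    private final List<String> queued = new ArrayList<>();

    // Stateful style: save() only queues the work.
    public void save(String name) { queued.add("insert " + name); }

    // Stateful style: flush() sends everything queued so far.
    public void flush() { executed.addAll(queued); queued.clear(); }

    // Stateless style: insert() runs its statement immediately.
    public void insert(String name) { executed.add("insert " + name); }

    public static void main(String[] args) {
        WriteBehindDemo stateful = new WriteBehindDemo();
        stateful.save("a");
        System.out.println(stateful.executed.size()); // 0
        stateful.flush();
        System.out.println(stateful.executed.size()); // 1

        WriteBehindDemo stateless = new WriteBehindDemo();
        stateless.insert("a");
        System.out.println(stateless.executed.size()); // 1
    }
}
```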

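4. Hibernate Query Language for DML
  DML (Data Manipulation Language) refers to SQL statements such as INSERT, UPDATE, and DELETE. Hibernate provides a way to execute bulk SQL-style DML statements in the form of Hibernate Query Language (HQL).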

  HQL for UPDATE and DELETE
   ( UPDATE | DELETE ) FROM? EntityName (WHERE where_conditions)?
   The ? suffix marks an optional element: the FROM keyword and the WHERE clause above are both optional.

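   The FROM clause can name only a single entity, which may be aliased. If the entity name is aliased, every property reference must be qualified with that alias; if it is not aliased, property references must not be qualified.
   No joins, implicit or explicit, can be specified in a bulk HQL query. Subqueries can be used in the WHERE clause, and the subqueries themselves may contain joins.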
   Session session = sessionFactory.openSession();
   Transaction tx = session.beginTransaction();
   String hqlUpdate = "update Customer c set c.name = :newName where c.name = :oldName";
   // or String hqlUpdate = "update Customer set name = :newName where name = :oldName";
   int updatedEntities = session.createQuery( hqlUpdate )
        .setString( "newName", newName )
        .setString( "oldName", oldName )
        .executeUpdate();

   tx.commit();
   session.close();

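   In keeping with the EJB3 specification, HQL UPDATE statements by default do not affect the version or timestamp property values of the affected entities.
   You can use a versioned update to force Hibernate to reset the version or timestamp property values, by adding the VERSIONED keyword after the UPDATE keyword: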
   Session session = sessionFactory.openSession();
   Transaction tx = session.beginTransaction();
   String hqlVersionedUpdate = "update versioned Customer set name = :newName where name = :oldName";
   int updatedEntities = session.createQuery( hqlVersionedUpdate )
        .setString( "newName", newName )
        .setString( "oldName", oldName )
        .executeUpdate();

   tx.commit();
   session.close();

   Note: custom version types (org.hibernate.usertype.UserVersionType) cannot be used together with an update versioned statement.
   HQL delete statement:
   Session session = sessionFactory.openSession();
   Transaction tx = session.beginTransaction();
   String hqlDelete = "delete Customer c where c.name = :oldName";
   // or String hqlDelete = "delete Customer where name = :oldName";
   int deletedEntities = session.createQuery( hqlDelete )
        .setString( "oldName", oldName )
        .executeUpdate();

   tx.commit();
   session.close();

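   Query.executeUpdate() returns an int indicating the number of entities affected by the operation. This may or may not correlate with the number of rows affected in the database.
   An HQL bulk operation may result in multiple SQL statements being executed, for example with joined-subclass mappings.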

  HQL syntax for INSERT
   INSERT INTO EntityName properties_list select_statement
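   Only the INSERT INTO ... SELECT ... form is supported; an insert with explicit values (INSERT INTO ... VALUES ...) is not.
   properties_list: analogous to the column specification in a SQL INSERT statement. For entities involved in mapped inheritance, only properties defined directly on the given class level can be used;
       superclass properties are not allowed and subclass properties do not make sense. In other words, INSERT statements are inherently non-polymorphic.
       For the id property there are two options: specify it in properties_list, in which case its value comes from the corresponding select expression, or omit it, in which case Hibernate uses a generated value.
       Omitting it is only possible with id generators that operate in the database; with in-memory generators Hibernate throws an exception during parsing.
       Generators considered in-database: org.hibernate.id.SequenceGenerator and its subclasses, and any implementation of org.hibernate.id.PostInsertIdentifierGenerator.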

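   select_statement: any valid HQL select query, with the caveat that the return types must match the types the INSERT expects. Hibernate checks this during query compilation rather than leaving the check to the database.
       This can cause problems between Hibernate types that are equivalent rather than equal, for example a property defined as org.hibernate.type.DateType and one defined as org.hibernate.type.TimestampType,
       even though the database might not distinguish between them or might be able to handle the conversion.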

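   For properties mapped as version or timestamp, the insert statement gives you two options. You can specify the property in properties_list, in which case its value comes from the corresponding select expression, or omit it from properties_list,
   in which case the seed value defined by org.hibernate.type.VersionType is used: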
   Session session = sessionFactory.openSession();
   Transaction tx = session.beginTransaction();
   String hqlInsert = "insert into DelinquentAccount (id, name) select c.id, c.name from Customer c where ...";
   int createdEntities = session.createQuery( hqlInsert )
        .executeUpdate();
   tx.commit();
   session.close();

Time: 2024-12-14 08:41:03
