[Translation] "Java I/O, 2nd Edition" - 14.1 Copying Files with Buffers

14.1. Copying Files with Buffers

We'll start with a simple program that copies files. Its basic interface looks like this:

java FileCopier original copy

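Clearly, this program could be written with traditional stream I/O; in fact, almost any program written with NIO could be. In that sense (translator's note: this presumably refers to rewriting a traditional-I/O program with NIO), NIO cannot do anything that traditional stream I/O cannot. However, if the file being copied is very large and the operating system is sophisticated enough, the NIO version of FileCopier may be faster than the traditional I/O version.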

The structure of the program is typical:

import java.io.*;
import java.nio.*;
public class NIOCopier {
  public static void main(String[] args) throws IOException {
    FileInputStream inFile = new FileInputStream(args[0]);
    FileOutputStream outFile = new FileOutputStream(args[1]);
    // copy files here...
    inFile.close( );
    outFile.close( );
  }
}

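Rather than simply reading from the input stream and writing to the output stream, we'll do something slightly different. First, we open channels on the input and output files by calling the getChannel( ) methods of FileInputStream and FileOutputStream: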

FileChannel inChannel = inFile.getChannel( );
FileChannel outChannel = outFile.getChannel( );

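Next, we use the static factory method ByteBuffer.allocate( ) to create a 1 MB buffer. inChannel reads data from the source file and fills the buffer with it; outChannel then drains the buffer and writes the data to the target file. Just as you pass an array to an input stream's read( ) method, you pass the buffer to the channel's read( ) method to read data: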

inChannel.read(buffer);

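The read( ) method returns the number of bytes it read. As with an input stream, it is not guaranteed to fill the buffer completely: it may read fewer bytes than the buffer can hold, or no bytes at all. Once all the data has been read (end of file), it returns -1. So you typically write: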

long bytesRead = inChannel.read(buffer);
if (bytesRead == -1) break;

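Now outChannel needs to write the data from the buffer to the target file. Before it can, you must call buffer.flip( ). Flipping switches the buffer from being filled (written into) to being drained (read from). Then pass the buffer to outChannel's write( ) method to write the data: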

outChannel.write(buffer);
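What flip( ) actually changes is easier to see by printing the buffer's position and limit. The following is a minimal sketch, not part of the original text; the class name FlipDemo and the helper state( ) are made up for illustration, and the buffer is filled with put( ) in place of a channel read.

```java
import java.nio.ByteBuffer;

public class FlipDemo {
  // Hypothetical helper: formats the buffer state as "position/limit/capacity".
  public static String state(ByteBuffer b) {
    return b.position() + "/" + b.limit() + "/" + b.capacity();
  }

  public static void main(String[] args) {
    ByteBuffer buffer = ByteBuffer.allocate(8);

    // Filling the buffer (what a channel read does) advances the position.
    buffer.put((byte) 1).put((byte) 2).put((byte) 3);
    System.out.println(state(buffer)); // 3/8/8

    // flip() sets limit = old position and position = 0,
    // so draining covers exactly the bytes that were put.
    buffer.flip();
    System.out.println(state(buffer)); // 0/3/8

    // Draining (what a channel write does) stops at the limit.
    while (buffer.hasRemaining()) {
      System.out.println(buffer.get()); // 1, then 2, then 3
    }
  }
}
```

Without the flip( ) call, a write would start at position 3 and send the five unused bytes up to the limit instead of the three bytes that were just stored.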

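An output stream's write(byte[]) method guarantees that all the data in the byte array is written to the target (unless an exception is thrown). The channel's write( ) method makes no such guarantee. Like inChannel's read( ) method, write( ) may write some of the data but not all of it, or none at all, and it returns the number of bytes written. You can call it repeatedly until all the data has been written, like this: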

long bytesWritten = 0;
while (bytesWritten < bytesRead){
  bytesWritten += outChannel.write(buffer);
}

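There is actually a simpler way: the buffer's hasRemaining( ) method tells you whether all the data has been written. This code writes at most 1 MB of data. To copy larger files, we need to repeat the process in a loop: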

while (true) {
  ByteBuffer buffer = ByteBuffer.allocate(1024*1024);
  int bytesRead = inChannel.read(buffer);
  if (bytesRead == -1) break;
  buffer.flip( );
  while (buffer.hasRemaining( )) outChannel.write(buffer);
}

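Allocating a new buffer for every read is wasteful; we should reuse the buffer. Before each reuse, you must call clear( ) to reset the buffer so it can be filled again: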

ByteBuffer buffer = ByteBuffer.allocate(1024*1024);
while (true) {
  int bytesRead = inChannel.read(buffer);
  if (bytesRead == -1) break;
  buffer.flip( );
  while (buffer.hasRemaining( )) outChannel.write(buffer);
  buffer.clear( );
}
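To see why the clear( ) call matters, it helps to check how much room the buffer has after it has been drained. This is an illustrative sketch only; ClearDemo and roomAfterDrain( ) are hypothetical names, and put( ) and get( ) stand in for the channel read and write.

```java
import java.nio.ByteBuffer;

public class ClearDemo {
  // Hypothetical helper: fills and drains a 4-byte buffer, optionally calls
  // clear(), and reports how many bytes the next read could store.
  public static int roomAfterDrain(boolean callClear) {
    ByteBuffer buffer = ByteBuffer.allocate(4);
    buffer.put(new byte[] {7, 7, 7, 7}); // stands in for inChannel.read(buffer)
    buffer.flip();
    buffer.get(new byte[4]);             // stands in for outChannel.write(buffer)
    if (callClear) {
      buffer.clear();                    // position = 0, limit = capacity
    }
    return buffer.remaining();
  }

  public static void main(String[] args) {
    System.out.println(roomAfterDrain(false)); // 0
    System.out.println(roomAfterDrain(true));  // 4
  }
}
```

After draining, position equals limit, so a read into the buffer has no space to store anything. Note that clear( ) only resets position and limit; it does not erase the bytes already in the buffer, which are simply overwritten by the next read.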

Finally, close both channels to release the native resources they hold:

inChannel.close( );
outChannel.close( );

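Example 14-1 is the complete program, with a few similar statements combined. Compare it with Example 4-2 (translator's note: that program is from Chapter 4; I have placed it at the end of this article for comparison), which does the same job but copies the file using stream-oriented I/O.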

Example 14-1. Copying files using NIO

import java.io.*;
import java.nio.*;
import java.nio.channels.*;
public class NIOCopier {
  public static void main(String[] args) throws IOException {
    FileInputStream inFile = new FileInputStream(args[0]);
    FileOutputStream outFile = new FileOutputStream(args[1]);
    FileChannel inChannel = inFile.getChannel( );
    FileChannel outChannel = outFile.getChannel( );
    for (ByteBuffer buffer = ByteBuffer.allocate(1024*1024);
    inChannel.read(buffer) != -1;
    buffer.clear( )) {
      buffer.flip( );
      while (buffer.hasRemaining( )) outChannel.write(buffer);
    }
    inChannel.close( );
    outChannel.close( );
  }
}
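Example 14-1 leaves the files open if read( ) or write( ) throws an exception. One possible variant, assuming Java 7 or later, uses try-with-resources so the streams and channels are always closed. This is a sketch rather than the book's code: the class name SafeNIOCopier and the copy( ) helper, which returns the number of bytes written, are additions made here for illustration.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class SafeNIOCopier {
  // Hypothetical helper: copies source to target and returns the byte count.
  public static long copy(String source, String target) throws IOException {
    long total = 0;
    // Resources are closed in reverse order, even if an exception is thrown.
    try (FileInputStream inFile = new FileInputStream(source);
         FileOutputStream outFile = new FileOutputStream(target);
         FileChannel inChannel = inFile.getChannel();
         FileChannel outChannel = outFile.getChannel()) {
      ByteBuffer buffer = ByteBuffer.allocate(1024 * 1024);
      while (inChannel.read(buffer) != -1) {
        buffer.flip();                     // switch from filling to draining
        while (buffer.hasRemaining()) {
          total += outChannel.write(buffer);
        }
        buffer.clear();                    // make room for the next read
      }
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 2) {
      System.err.println("Usage: java SafeNIOCopier original copy");
      return;
    }
    System.out.println(copy(args[0], args[1]) + " bytes copied");
  }
}
```

The copy loop is the same as in Example 14-1; only the resource handling and the argument check differ.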

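In one informal test (CPU: dual 2.5 GHz PowerMac G5; operating system: Mac OS X 10.4.1), copying a 4.3 GB file with traditional stream I/O (using an 8192-byte buffer) took 305 seconds. Enlarging or shrinking the buffer did not reduce the time by more than 5%, and changing almost anything seemed to increase the copy time. (For example, using a 1 MB buffer as in Example 14-1 actually increased the time by 23 seconds.) The NIO implementation in Example 14-1 was about 16% faster, at 225 seconds. A straight Finder copy (translator's note: I'm not sure what this is; probably an application) took 197 seconds, and the Unix cp program took 312 seconds. So the Finder is evidently doing some surprising optimizations under the hood. (Translator's note: I wrote similar test programs with NIO and BIO on a company server. The gap did not seem as large as the author describes, but NIO was still slightly faster.)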

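For traditional file operations that make a single pass through a file from beginning to end, NIO does not help much. NIO is not a cure-all for every I/O performance problem. But there are two cases where it improves performance noticeably:

High-concurrency network servers.

Repeated random access to a large file.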
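As a rough illustration of the second case, FileChannel offers a positional read, read(ByteBuffer, long), that fetches bytes at an arbitrary offset without moving the channel's own position. The sketch below is an assumption-laden example rather than anything from the original text: PositionalReadDemo and byteAt( ) are hypothetical names, and it opens the channel with FileChannel.open( ), which requires Java 7 or later. It shows only the API, not the performance effect.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PositionalReadDemo {
  // Hypothetical helper: returns the byte stored at the given file offset.
  public static byte byteAt(FileChannel channel, long offset) throws IOException {
    ByteBuffer one = ByteBuffer.allocate(1);
    while (one.hasRemaining()) {
      // Positional read: does not change channel.position().
      if (channel.read(one, offset) == -1) {
        throw new EOFException("offset past end of file: " + offset);
      }
    }
    return one.get(0);
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("positional", ".bin");
    try {
      Files.write(tmp, new byte[] {10, 20, 30, 40, 50});
      try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.READ)) {
        System.out.println(byteAt(channel, 3)); // 40
        System.out.println(byteAt(channel, 0)); // 10
      }
    } finally {
      Files.delete(tmp);
    }
  }
}
```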

(Translator's note: the example programs mentioned in the text follow.)

Example 4-1. The FileTyper program

import java.io.*;
import com.elharo.io.*;
public class FileTyper {
  public static void main(String[] args) throws IOException {
    if (args.length != 1) {
      System.err.println("Usage: java FileTyper filename");
      return;
    }
    typeFile(args[0]);
  }
  public static void typeFile(String filename) throws IOException {
    FileInputStream fin = new FileInputStream(filename);
    try {
      StreamCopier.copy(fin, System.out);
    }
    finally {
      fin.close( );
    }
  }
}

Example 3-3. The StreamCopier class

package com.elharo.io;
import java.io.*;
public class StreamCopier {
  public static void main(String[] args) {
    try {
      copy(System.in, System.out);
    }
    catch (IOException ex) {
      System.err.println(ex);
    }
  }
  public static void copy(InputStream in, OutputStream out)
   throws IOException {
    byte[] buffer = new byte[1024];
    while (true) {
      int bytesRead = in.read(buffer);
      if (bytesRead == -1) break;
      out.write(buffer, 0, bytesRead);
    }
  }
}

Time: 2024-10-09 12:11:37
