The basic steps for handling network requests with HttpClient are as follows:
1. Obtain a Response object with a GET request.
    CloseableHttpClient httpClient = HttpClients.createDefault();
    HttpGet httpGet = new HttpGet("http://www.baidu.com/");
    CloseableHttpResponse response = httpClient.execute(httpGet);
2. Get the Entity of the Response object.
    HttpEntity entity = response.getEntity();
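Note: HttpClient wraps both the body of a Response and the body sent by a POST/PUT Request in an HttpEntity object. Methods such as entity.getContentType() and entity.getContentLength() return information about the body, but the most important method is getContent(), which returns the body as an InputStream.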
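3. Get an InputStream from the Entity, then process the returned content.
    InputStream is = entity.getContent();
    Scanner sc = new Scanner(is);
    // String filename = path.substring(path.lastIndexOf('/') + 1);
    String filename = "2.txt";
    PrintWriter os = new PrintWriter(filename);
    // nextLine() strips the line terminator, so println() is used to write it back
    while (sc.hasNextLine()) {
        os.println(sc.nextLine());
    }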
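One detail in step 3 is easy to miss: a Scanner created without a charset decodes the stream with the platform default, while the Content-Type reported for the body often names the charset the server actually used. The sketch below shows one way to pull that charset out of a Content-Type string using only the JDK. The class ContentTypeCharset and the method charsetOf are hypothetical names introduced for illustration, not part of HttpClient, and the parsing is deliberately simplified.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of HttpClient.
final class ContentTypeCharset {

    /** Returns the charset named in a Content-Type value, or the fallback if none is usable. */
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        // Parameters follow the media type, separated by ';' (e.g. "text/html; charset=utf-8").
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                // Drop optional quotes around the charset name.
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // malformed or unsupported charset name
                }
            }
        }
        return fallback;
    }
}
```

Assuming the Content-Type header is present on the entity, its value could be passed in as the first argument, and the result handed to the Scanner with new Scanner(is, charset.name()).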
The complete code for downloading a web page with HttpClient is as follows:
    package com.ljh.test;

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.PrintWriter;
    import java.util.Scanner;

    import org.apache.http.HttpEntity;
    import org.apache.http.HttpStatus;
    import org.apache.http.client.methods.CloseableHttpResponse;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;

    public class DownloadWebPage {

        public static void downloadPagebyGetMethod() throws IOException {
            // 1. Get the response object through HttpGet.
            HttpGet httpGet = new HttpGet("http://www.baidu.com/");
            // try-with-resources closes the client and the response,
            // including when the status code is not 200.
            try (CloseableHttpClient httpClient = HttpClients.createDefault();
                 CloseableHttpResponse response = httpClient.execute(httpGet)) {
                if (response.getStatusLine().getStatusCode() != HttpStatus.SC_OK) {
                    return;
                }
                // 2. Get the entity of the response.
                HttpEntity entity = response.getEntity();
                if (entity == null) {
                    return; // the response has no body
                }
                // String filename = path.substring(path.lastIndexOf('/') + 1);
                String filename = "2.txt";
                // 3. Get the InputStream and process the content.
                try (InputStream is = entity.getContent();
                     Scanner sc = new Scanner(is);
                     PrintWriter os = new PrintWriter(filename)) {
                    // nextLine() strips the line terminator, so println() writes it back
                    while (sc.hasNextLine()) {
                        os.println(sc.nextLine());
                    }
                }
            }
        }

        public static void main(String[] args) {
            try {
                downloadPagebyGetMethod();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
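The listing keeps a commented-out line that derives the file name from the URL path, and it copies the body line by line as text. As a rough sketch of an alternative, the helper below takes the last path segment as the file name and copies the raw bytes of the stream to disk with Files.copy, so nothing is decoded or re-encoded. This is a different technique from the Scanner loop, not the original author's method. The names PageSaver, fileNameFromUrl, save, and the fallback parameter are all hypothetical, and only JDK classes are used.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; uses only the JDK.
final class PageSaver {

    /** Returns the last segment of the URL path, or the fallback when the path has no file name. */
    static String fileNameFromUrl(String url, String fallback) {
        String path = URI.create(url).getPath(); // query string and fragment are excluded
        if (path == null) {
            return fallback;
        }
        String name = path.substring(path.lastIndexOf('/') + 1);
        return name.isEmpty() ? fallback : name;
    }

    /** Copies the stream to the target file byte for byte, closes it, and returns the byte count. */
    static long save(InputStream in, Path target) throws IOException {
        try (InputStream body = in) {
            return Files.copy(body, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

In the listing above, the stream returned by entity.getContent() could be passed to save, with the target built from fileNameFromUrl(url, "2.txt"), where url is assumed to hold the address that was requested.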
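Note: simply replacing HttpGet with HttpPost does not produce the expected result. Baidu returns status 302 (a redirect) and Sina refuses access. The 302 is probably visible because HttpClient's default redirect strategy follows redirects automatically only for GET and HEAD requests, so a POST gets the redirect response back as is. I suspect most websites do not accept a bare POST request to their pages.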
Time: 2024-10-05 12:21:24