Junior Year Winter Break, 15 Days: Day 5

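After yesterday's download finished, running the program produced an error. The most likely cause is that the Spark version I downloaded does not match the one used in the tutorial: the spark-core entry in pom.xml should have been different, but I had copied the tutorial's entry unchanged. I am now looking up the matching content on the site the tutorial points to, and it is downloading again; I have no idea how long it will take this time. (The version I downloaded is Spark 2.4.4.)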

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <parent>
    <artifactId>spark-parent_2.11</artifactId>
    <groupId>org.apache.spark</groupId>
    <version>2.4.4</version>
  </parent>
  <modelVersion>4.0.0</modelVersion>
  <artifactId>spark-core_2.11</artifactId>
  <name>Spark Project Core</name>
  <url>http://spark.apache.org/</url>
  <build>
    <outputDirectory>target/scala-${scala.binary.version}/classes</outputDirectory>
    <testOutputDirectory>target/scala-${scala.binary.version}/test-classes</testOutputDirectory>
    <resources>
      <resource>
        <directory>${project.basedir}/src/main/resources</directory>
      </resource>
      <resource>
        <filtering>true</filtering>
        <directory>${project.build.directory}/extra-resources</directory>
      </resource>
    </resources>
    <plugins>
      <plugin>
        <artifactId>maven-antrun-plugin</artifactId>
        <executions>
          <execution>
            <phase>generate-resources</phase>
            <goals>
              <goal>run</goal>
            </goals>
            <configuration>
              <target>
                <exec>
                  <arg />
                  <arg />
                  <arg />
                </exec>
              </target>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <artifactId>maven-dependency-plugin</artifactId>
        <executions>
          <execution>
            <id>copy-dependencies</id>
            <phase>package</phase>
            <goals>
              <goal>copy-dependencies</goal>
            </goals>
            <configuration>
              <outputDirectory>${project.build.directory}</outputDirectory>
              <overWriteReleases>false</overWriteReleases>
              <overWriteSnapshots>false</overWriteSnapshots>
              <overWriteIfNewer>true</overWriteIfNewer>
              <useSubDirectoryPerType>true</useSubDirectoryPerType>
              <includeArtifactIds>guava,jetty-io,jetty-servlet,jetty-servlets,jetty-continuation,jetty-http,jetty-plus,jetty-util,jetty-server,jetty-security,jetty-proxy,jetty-client</includeArtifactIds>
              <silent>true</silent>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
  <profiles>
    <profile>
      <id>Windows</id>
      <properties>
        <script.extension>.bat</script.extension>
      </properties>
    </profile>
    <profile>
      <id>unix</id>
      <properties>
        <script.extension>.sh</script.extension>
      </properties>
    </profile>
    <profile>
      <id>sparkr</id>
      <build>
        <plugins>
          <plugin>
            <groupId>org.codehaus.mojo</groupId>
            <artifactId>exec-maven-plugin</artifactId>
            <version>1.6.0</version>
            <executions>
              <execution>
                <id>sparkr-pkg</id>
                <phase>compile</phase>
                <goals>
                  <goal>exec</goal>
                </goals>
              </execution>
            </executions>
            <configuration>
              <executable>${project.basedir}${file.separator}..${file.separator}R${file.separator}install-dev${script.extension}</executable>
            </configuration>
          </plugin>
        </plugins>
      </build>
    </profile>
  </profiles>
  <dependencies>
    <dependency>
      <groupId>com.thoughtworks.paranamer</groupId>
      <artifactId>paranamer</artifactId>
      <version>2.8</version>
      <scope>runtime</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.avro</groupId>
      <artifactId>avro</artifactId>
      <version>1.8.2</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.avro</groupId>
      <artifactId>avro-mapred</artifactId>
      <version>1.8.2</version>
      <classifier>hadoop2</classifier>
      <scope>compile</scope>
      <exclusions>
        <exclusion>
          <artifactId>netty</artifactId>
          <groupId>io.netty</groupId>
        </exclusion>
        <exclusion>
          <artifactId>jetty</artifactId>
          <groupId>org.mortbay.jetty</groupId>
        </exclusion>
        <exclusion>
          <artifactId>jetty-util</artifactId>
          <groupId>org.mortbay.jetty</groupId>
        </exclusion>
        <exclusion>
          <artifactId>servlet-api</artifactId>
          <groupId>org.mortbay.jetty</groupId>
        </exclusion>
        <exclusion>
          <artifactId>velocity</artifactId>
          <groupId>org.apache.velocity</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
      <version>14.0.1</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>com.twitter</groupId>
      <artifactId>chill_2.11</artifactId>
      <version>0.9.3</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>com.twitter</groupId>
      <artifactId>chill-java</artifactId>
      <version>0.9.3</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.xbean</groupId>
      <artifactId>xbean-asm6-shaded</artifactId>
      <version>4.8</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.6.5</version>
      <scope>compile</scope>
      <exclusions>
        <exclusion>
          <artifactId>asm</artifactId>
          <groupId>asm</groupId>
        </exclusion>
        <exclusion>
          <artifactId>jackson-mapper-asl</artifactId>
          <groupId>org.codehaus.jackson</groupId>
        </exclusion>
        <exclusion>
          <artifactId>asm</artifactId>
          <groupId>org.ow2.asm</groupId>
        </exclusion>
        <exclusion>
          <artifactId>netty</artifactId>
          <groupId>org.jboss.netty</groupId>
        </exclusion>
        <exclusion>
          <artifactId>commons-beanutils-core</artifactId>
          <groupId>commons-beanutils</groupId>
        </exclusion>
        <exclusion>
          <artifactId>commons-logging</artifactId>
          <groupId>commons-logging</groupId>
        </exclusion>
        <exclusion>
          <artifactId>mockito-all</artifactId>
          <groupId>org.mockito</groupId>
        </exclusion>
        <exclusion>
          <artifactId>servlet-api-2.5</artifactId>
          <groupId>org.mortbay.jetty</groupId>
        </exclusion>
        <exclusion>
          <artifactId>servlet-api</artifactId>
          <groupId>javax.servlet</groupId>
        </exclusion>
        <exclusion>
          <artifactId>junit</artifactId>
          <groupId>junit</groupId>
        </exclusion>
        <exclusion>
          <artifactId>*</artifactId>
          <groupId>com.sun.jersey</groupId>
        </exclusion>
        <exclusion>
          <artifactId>*</artifactId>
          <groupId>com.sun.jersey.jersey-test-framework</groupId>
        </exclusion>
        <exclusion>
          <artifactId>*</artifactId>
          <groupId>com.sun.jersey.contribs</groupId>
        </exclusion>
        <exclusion>
          <artifactId>jets3t</artifactId>
          <groupId>net.java.dev.jets3t</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-launcher_2.11</artifactId>
      <version>2.4.4</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-kvstore_2.11</artifactId>
      <version>2.4.4</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-network-common_2.11</artifactId>
      <version>2.4.4</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-network-shuffle_2.11</artifactId>
      <version>2.4.4</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-unsafe_2.11</artifactId>
      <version>2.4.4</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>javax.activation</groupId>
      <artifactId>activation</artifactId>
      <version>1.1.1</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.curator</groupId>
      <artifactId>curator-recipes</artifactId>
      <version>2.6.0</version>
      <scope>compile</scope>
      <exclusions>
        <exclusion>
          <artifactId>netty</artifactId>
          <groupId>org.jboss.netty</groupId>
        </exclusion>
        <exclusion>
          <artifactId>jline</artifactId>
          <groupId>jline</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.zookeeper</groupId>
      <artifactId>zookeeper</artifactId>
      <version>3.4.6</version>
      <scope>compile</scope>
      <exclusions>
        <exclusion>
          <artifactId>netty</artifactId>
          <groupId>org.jboss.netty</groupId>
        </exclusion>
        <exclusion>
          <artifactId>jline</artifactId>
          <groupId>jline</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>javax.servlet</groupId>
      <artifactId>javax.servlet-api</artifactId>
      <version>3.1.0</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-lang3</artifactId>
      <version>3.5</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-math3</artifactId>
      <version>3.4.1</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>com.google.code.findbugs</groupId>
      <artifactId>jsr305</artifactId>
      <version>1.3.9</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-api</artifactId>
      <version>1.7.16</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>jul-to-slf4j</artifactId>
      <version>1.7.16</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>jcl-over-slf4j</artifactId>
      <version>1.7.16</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>1.2.17</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
      <version>1.7.16</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>com.ning</groupId>
      <artifactId>compress-lzf</artifactId>
      <version>1.0.3</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.xerial.snappy</groupId>
      <artifactId>snappy-java</artifactId>
      <version>1.1.7.3</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.lz4</groupId>
      <artifactId>lz4-java</artifactId>
      <version>1.4.0</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>com.github.luben</groupId>
      <artifactId>zstd-jni</artifactId>
      <version>1.3.2-2</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.roaringbitmap</groupId>
      <artifactId>RoaringBitmap</artifactId>
      <version>0.7.45</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>commons-net</groupId>
      <artifactId>commons-net</artifactId>
      <version>3.1</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>2.11.12</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.json4s</groupId>
      <artifactId>json4s-jackson_2.11</artifactId>
      <version>3.5.3</version>
      <scope>compile</scope>
      <exclusions>
        <exclusion>
          <artifactId>*</artifactId>
          <groupId>com.fasterxml.jackson.core</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.glassfish.jersey.core</groupId>
      <artifactId>jersey-client</artifactId>
      <version>2.22.2</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.glassfish.jersey.core</groupId>
      <artifactId>jersey-common</artifactId>
      <version>2.22.2</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.glassfish.jersey.core</groupId>
      <artifactId>jersey-server</artifactId>
      <version>2.22.2</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.glassfish.jersey.containers</groupId>
      <artifactId>jersey-container-servlet</artifactId>
      <version>2.22.2</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.glassfish.jersey.containers</groupId>
      <artifactId>jersey-container-servlet-core</artifactId>
      <version>2.22.2</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-all</artifactId>
      <version>4.1.17.Final</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty</artifactId>
      <version>3.9.9.Final</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>com.clearspring.analytics</groupId>
      <artifactId>stream</artifactId>
      <version>2.7.0</version>
      <scope>compile</scope>
      <exclusions>
        <exclusion>
          <artifactId>fastutil</artifactId>
          <groupId>it.unimi.dsi</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>io.dropwizard.metrics</groupId>
      <artifactId>metrics-core</artifactId>
      <version>3.1.5</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>io.dropwizard.metrics</groupId>
      <artifactId>metrics-jvm</artifactId>
      <version>3.1.5</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>io.dropwizard.metrics</groupId>
      <artifactId>metrics-json</artifactId>
      <version>3.1.5</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>io.dropwizard.metrics</groupId>
      <artifactId>metrics-graphite</artifactId>
      <version>3.1.5</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
      <version>2.6.7.1</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>com.fasterxml.jackson.module</groupId>
      <artifactId>jackson-module-scala_2.11</artifactId>
      <version>2.6.7.1</version>
      <scope>compile</scope>
      <exclusions>
        <exclusion>
          <artifactId>guava</artifactId>
          <groupId>com.google.guava</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.derby</groupId>
      <artifactId>derby</artifactId>
      <version>10.12.1.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.ivy</groupId>
      <artifactId>ivy</artifactId>
      <version>2.4.0</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>oro</groupId>
      <artifactId>oro</artifactId>
      <version>2.0.8</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.seleniumhq.selenium</groupId>
      <artifactId>selenium-java</artifactId>
      <version>2.52.0</version>
      <scope>test</scope>
      <exclusions>
        <exclusion>
          <artifactId>guava</artifactId>
          <groupId>com.google.guava</groupId>
        </exclusion>
        <exclusion>
          <artifactId>netty</artifactId>
          <groupId>io.netty</groupId>
        </exclusion>
        <exclusion>
          <artifactId>selenium-chrome-driver</artifactId>
          <groupId>org.seleniumhq.selenium</groupId>
        </exclusion>
        <exclusion>
          <artifactId>selenium-edge-driver</artifactId>
          <groupId>org.seleniumhq.selenium</groupId>
        </exclusion>
        <exclusion>
          <artifactId>selenium-firefox-driver</artifactId>
          <groupId>org.seleniumhq.selenium</groupId>
        </exclusion>
        <exclusion>
          <artifactId>selenium-ie-driver</artifactId>
          <groupId>org.seleniumhq.selenium</groupId>
        </exclusion>
        <exclusion>
          <artifactId>selenium-safari-driver</artifactId>
          <groupId>org.seleniumhq.selenium</groupId>
        </exclusion>
        <exclusion>
          <artifactId>selenium-support</artifactId>
          <groupId>org.seleniumhq.selenium</groupId>
        </exclusion>
        <exclusion>
          <artifactId>webbit</artifactId>
          <groupId>org.webbitserver</groupId>
        </exclusion>
        <exclusion>
          <artifactId>selenium-leg-rc</artifactId>
          <groupId>org.seleniumhq.selenium</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.seleniumhq.selenium</groupId>
      <artifactId>selenium-htmlunit-driver</artifactId>
      <version>2.52.0</version>
      <scope>test</scope>
      <exclusions>
        <exclusion>
          <artifactId>htmlunit</artifactId>
          <groupId>net.sourceforge.htmlunit</groupId>
        </exclusion>
        <exclusion>
          <artifactId>selenium-support</artifactId>
          <groupId>org.seleniumhq.selenium</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>xml-apis</groupId>
      <artifactId>xml-apis</artifactId>
      <version>1.4.01</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.hamcrest</groupId>
      <artifactId>hamcrest-core</artifactId>
      <version>1.3</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.hamcrest</groupId>
      <artifactId>hamcrest-library</artifactId>
      <version>1.3</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.mockito</groupId>
      <artifactId>mockito-core</artifactId>
      <version>1.10.19</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.scalacheck</groupId>
      <artifactId>scalacheck_2.11</artifactId>
      <version>1.13.5</version>
      <scope>test</scope>
      <exclusions>
        <exclusion>
          <artifactId>test-interface</artifactId>
          <groupId>org.scala-sbt</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.curator</groupId>
      <artifactId>curator-test</artifactId>
      <version>2.6.0</version>
      <scope>test</scope>
      <exclusions>
        <exclusion>
          <artifactId>commons-math</artifactId>
          <groupId>org.apache.commons</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>net.razorvine</groupId>
      <artifactId>pyrolite</artifactId>
      <version>4.13</version>
      <scope>compile</scope>
      <exclusions>
        <exclusion>
          <artifactId>serpent</artifactId>
          <groupId>net.razorvine</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>net.sf.py4j</groupId>
      <artifactId>py4j</artifactId>
      <version>0.10.7</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-tags_2.11</artifactId>
      <version>2.4.4</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-launcher_2.11</artifactId>
      <version>2.4.4</version>
      <classifier>tests</classifier>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-tags_2.11</artifactId>
      <version>2.4.4</version>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-crypto</artifactId>
      <version>1.0.0</version>
      <scope>compile</scope>
      <exclusions>
        <exclusion>
          <artifactId>jna</artifactId>
          <groupId>net.java.dev.jna</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.spark-project.hive</groupId>
      <artifactId>hive-exec</artifactId>
      <version>1.2.1.spark2</version>
      <scope>provided</scope>
      <exclusions>
        <exclusion>
          <artifactId>hive-metastore</artifactId>
          <groupId>org.spark-project.hive</groupId>
        </exclusion>
        <exclusion>
          <artifactId>hive-shims</artifactId>
          <groupId>org.spark-project.hive</groupId>
        </exclusion>
        <exclusion>
          <artifactId>hive-ant</artifactId>
          <groupId>org.spark-project.hive</groupId>
        </exclusion>
        <exclusion>
          <artifactId>spark-client</artifactId>
          <groupId>org.spark-project.hive</groupId>
        </exclusion>
        <exclusion>
          <artifactId>ant</artifactId>
          <groupId>ant</groupId>
        </exclusion>
        <exclusion>
          <artifactId>ant</artifactId>
          <groupId>org.apache.ant</groupId>
        </exclusion>
        <exclusion>
          <artifactId>kryo</artifactId>
          <groupId>com.esotericsoftware.kryo</groupId>
        </exclusion>
        <exclusion>
          <artifactId>commons-codec</artifactId>
          <groupId>commons-codec</groupId>
        </exclusion>
        <exclusion>
          <artifactId>commons-httpclient</artifactId>
          <groupId>commons-httpclient</groupId>
        </exclusion>
        <exclusion>
          <artifactId>avro-mapred</artifactId>
          <groupId>org.apache.avro</groupId>
        </exclusion>
        <exclusion>
          <artifactId>calcite-core</artifactId>
          <groupId>org.apache.calcite</groupId>
        </exclusion>
        <exclusion>
          <artifactId>apache-curator</artifactId>
          <groupId>org.apache.curator</groupId>
        </exclusion>
        <exclusion>
          <artifactId>curator-client</artifactId>
          <groupId>org.apache.curator</groupId>
        </exclusion>
        <exclusion>
          <artifactId>curator-framework</artifactId>
          <groupId>org.apache.curator</groupId>
        </exclusion>
        <exclusion>
          <artifactId>libthrift</artifactId>
          <groupId>org.apache.thrift</groupId>
        </exclusion>
        <exclusion>
          <artifactId>libfb303</artifactId>
          <groupId>org.apache.thrift</groupId>
        </exclusion>
        <exclusion>
          <artifactId>zookeeper</artifactId>
          <groupId>org.apache.zookeeper</groupId>
        </exclusion>
        <exclusion>
          <artifactId>slf4j-api</artifactId>
          <groupId>org.slf4j</groupId>
        </exclusion>
        <exclusion>
          <artifactId>slf4j-log4j12</artifactId>
          <groupId>org.slf4j</groupId>
        </exclusion>
        <exclusion>
          <artifactId>log4j</artifactId>
          <groupId>log4j</groupId>
        </exclusion>
        <exclusion>
          <artifactId>commons-logging</artifactId>
          <groupId>commons-logging</groupId>
        </exclusion>
        <exclusion>
          <artifactId>groovy-all</artifactId>
          <groupId>org.codehaus.groovy</groupId>
        </exclusion>
        <exclusion>
          <artifactId>jline</artifactId>
          <groupId>jline</groupId>
        </exclusion>
        <exclusion>
          <artifactId>json</artifactId>
          <groupId>org.json</groupId>
        </exclusion>
        <exclusion>
          <artifactId>javolution</artifactId>
          <groupId>javolution</groupId>
        </exclusion>
        <exclusion>
          <artifactId>apache-log4j-extras</artifactId>
          <groupId>log4j</groupId>
        </exclusion>
        <exclusion>
          <artifactId>antlr-runtime</artifactId>
          <groupId>org.antlr</groupId>
        </exclusion>
        <exclusion>
          <artifactId>ST4</artifactId>
          <groupId>org.antlr</groupId>
        </exclusion>
        <exclusion>
          <artifactId>jodd-core</artifactId>
          <groupId>org.jodd</groupId>
        </exclusion>
        <exclusion>
          <artifactId>datanucleus-core</artifactId>
          <groupId>org.datanucleus</groupId>
        </exclusion>
        <exclusion>
          <artifactId>calcite-avatica</artifactId>
          <groupId>org.apache.calcite</groupId>
        </exclusion>
        <exclusion>
          <artifactId>JavaEWAH</artifactId>
          <groupId>com.googlecode.javaewah</groupId>
        </exclusion>
        <exclusion>
          <artifactId>snappy</artifactId>
          <groupId>org.iq80.snappy</groupId>
        </exclusion>
        <exclusion>
          <artifactId>stax-api</artifactId>
          <groupId>stax</groupId>
        </exclusion>
        <exclusion>
          <artifactId>opencsv</artifactId>
          <groupId>net.sf.opencsv</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.spark-project.hive</groupId>
      <artifactId>hive-metastore</artifactId>
      <version>1.2.1.spark2</version>
      <scope>provided</scope>
      <exclusions>
        <exclusion>
          <artifactId>hive-serde</artifactId>
          <groupId>org.spark-project.hive</groupId>
        </exclusion>
        <exclusion>
          <artifactId>hive-shims</artifactId>
          <groupId>org.spark-project.hive</groupId>
        </exclusion>
        <exclusion>
          <artifactId>libfb303</artifactId>
          <groupId>org.apache.thrift</groupId>
        </exclusion>
        <exclusion>
          <artifactId>libthrift</artifactId>
          <groupId>org.apache.thrift</groupId>
        </exclusion>
        <exclusion>
          <artifactId>servlet-api</artifactId>
          <groupId>org.mortbay.jetty</groupId>
        </exclusion>
        <exclusion>
          <artifactId>guava</artifactId>
          <groupId>com.google.guava</groupId>
        </exclusion>
        <exclusion>
          <artifactId>slf4j-api</artifactId>
          <groupId>org.slf4j</groupId>
        </exclusion>
        <exclusion>
          <artifactId>slf4j-log4j12</artifactId>
          <groupId>org.slf4j</groupId>
        </exclusion>
        <exclusion>
          <artifactId>bonecp</artifactId>
          <groupId>com.jolbox</groupId>
        </exclusion>
        <exclusion>
          <artifactId>datanucleus-api-jdo</artifactId>
          <groupId>org.datanucleus</groupId>
        </exclusion>
        <exclusion>
          <artifactId>datanucleus-rdbms</artifactId>
          <groupId>org.datanucleus</groupId>
        </exclusion>
        <exclusion>
          <artifactId>commons-pool</artifactId>
          <groupId>commons-pool</groupId>
        </exclusion>
        <exclusion>
          <artifactId>commons-dbcp</artifactId>
          <groupId>commons-dbcp</groupId>
        </exclusion>
        <exclusion>
          <artifactId>jdo-api</artifactId>
          <groupId>javax.jdo</groupId>
        </exclusion>
        <exclusion>
          <artifactId>datanucleus-core</artifactId>
          <groupId>org.datanucleus</groupId>
        </exclusion>
        <exclusion>
          <artifactId>antlr-runtime</artifactId>
          <groupId>org.antlr</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.thrift</groupId>
      <artifactId>libthrift</artifactId>
      <version>0.9.3</version>
      <scope>provided</scope>
      <exclusions>
        <exclusion>
          <artifactId>slf4j-api</artifactId>
          <groupId>org.slf4j</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.thrift</groupId>
      <artifactId>libfb303</artifactId>
      <version>0.9.3</version>
      <scope>provided</scope>
      <exclusions>
        <exclusion>
          <artifactId>slf4j-api</artifactId>
          <groupId>org.slf4j</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.scalatest</groupId>
      <artifactId>scalatest_2.11</artifactId>
      <version>3.0.3</version>
      <scope>test</scope>
      <exclusions>
        <exclusion>
          <artifactId>scalactic_2.11</artifactId>
          <groupId>org.scalactic</groupId>
        </exclusion>
        <exclusion>
          <artifactId>scala-parser-combinators_2.11</artifactId>
          <groupId>org.scala-lang.modules</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.12</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>com.novocode</groupId>
      <artifactId>junit-interface</artifactId>
      <version>0.11</version>
      <scope>test</scope>
      <exclusions>
        <exclusion>
          <artifactId>test-interface</artifactId>
          <groupId>org.scala-sbt</groupId>
        </exclusion>
      </exclusions>
    </dependency>
  </dependencies>
  <properties>
    <sbt.project.name>core</sbt.project.name>
  </properties>
</project>
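
The file above appears to be the published POM of spark-core_2.11 2.4.4 itself, which an application normally does not copy. A minimal sketch of what the application's own pom.xml might declare instead is shown below, assuming the project is built with Maven against Spark 2.4.4 for Scala 2.11. The `groupId`, `artifactId`, and `version` of the project are hypothetical placeholders; only the Spark coordinates come from this post.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical project coordinates, for illustration only -->
  <groupId>example</groupId>
  <artifactId>simple-project</artifactId>
  <version>1.0</version>

  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <!-- The _2.11 suffix is the Scala binary version Spark was built for -->
      <artifactId>spark-core_2.11</artifactId>
      <!-- Should match the installed Spark release (2.4.4 in this post) -->
      <version>2.4.4</version>
    </dependency>
  </dependencies>
</project>
```

If the tutorial targets a different Spark release, both the version and the Scala suffix in `artifactId` would likely need to change together.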

Original post: https://www.cnblogs.com/my---world/p/12267255.html

Time: 2024-10-31 16:15:07
