[Big Data Series] Installing Hadoop 2.8 on Windows 10 Without Cygwin

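I. Download the installation package

Extract the package and create three folders named data, name, and tmp. The configuration files below expect them under D:\hadoop\hadoopbak.

II. Edit the configuration files

1. core-site.xml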

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/D:/hadoop/hadoopbak/tmp</value>
    </property>
    <property>
        <name>dfs.name.dir</name>
        <value>/D:/hadoop/hadoopbak/name</value>
    </property>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
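
In Hadoop 2.x, fs.default.name and dfs.name.dir are deprecated aliases of fs.defaultFS and dfs.namenode.name.dir. The old names still work, but Hadoop typically logs a deprecation notice for them. The following is only a sketch of the same settings under the current key names, assuming the same paths and port as above:

```xml
<configuration>
    <!-- Base directory for Hadoop's temporary files (same path as above). -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/D:/hadoop/hadoopbak/tmp</value>
    </property>
    <!-- Current name for the deprecated dfs.name.dir. -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/D:/hadoop/hadoopbak/name</value>
    </property>
    <!-- Current name for the deprecated fs.default.name. -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
```

The NameNode directory is an HDFS setting and is conventionally kept in hdfs-site.xml; it is left in core-site.xml here only to mirror the file above.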

2. mapred-site.xml (copy or rename the original mapred-site.xml.template, then edit it)

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
       <name>mapreduce.framework.name</name>
       <value>yarn</value>
    </property>
    <property>
       <name>mapred.job.tracker</name>
       <value>hdfs://localhost:9001</value>
    </property>
</configuration>
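
mapred.job.tracker is an MRv1 (JobTracker) setting, and its current name is mapreduce.jobtracker.address. Since mapreduce.framework.name is set to yarn, jobs go to the YARN ResourceManager rather than a JobTracker, so a reduced file along the following lines should be sufficient. This is a sketch under that assumption and has not been verified against this particular setup:

```xml
<configuration>
    <!-- Run MapReduce jobs on YARN; no JobTracker address is needed in this mode. -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
```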

3. hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <!-- Set to 1 because this is a single-node Hadoop installation -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/D:/hadoop/hadoopbak/data</value>
    </property>
</configuration>
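
dfs.data.dir is likewise a deprecated alias; its current name is dfs.datanode.data.dir. A sketch of the same file with the current key, assuming the same data folder as above:

```xml
<configuration>
    <!-- A single-node installation has only one DataNode, so keep one replica. -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <!-- Current name for the deprecated dfs.data.dir. -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/D:/hadoop/hadoopbak/data</value>
    </property>
</configuration>
```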

4. yarn-site.xml

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

<!-- Site specific YARN configuration properties -->
    <property>
       <name>yarn.nodemanager.aux-services</name>
       <value>mapreduce_shuffle</value>
    </property>
    <!-- The key must contain the aux-service name declared above (mapreduce_shuffle). -->
    <property>
       <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
       <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>

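III. Replace the bin directory of the downloaded package

The replacement is typically a bin directory built for the same Hadoop version that contains the Windows native helpers winutils.exe and hadoop.dll.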

IV. Preparing to run

1. Format the NameNode

hdfs namenode -format

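2. Change to the sbin directory and run start-all.cmd

Starting YARN reported the error "Couldn't find a package.json file", which was still unresolved at the time of writing. That message is characteristic of the Node.js Yarn package manager, so a likely cause is that a different yarn executable on the PATH is being picked up instead of Hadoop's yarn.cmd.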

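3. Test HDFS with an upload

Create a directory, upload some data, and list the files (for example with hdfs dfs -mkdir, hdfs dfs -put, and hdfs dfs -ls).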

V. View Hadoop's built-in web console (GUI)

Output of the conf page:

<configuration>
<property>
<name>yarn.ipc.rpc.class</name>
<value>org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.job.maxtaskfailures.per.tracker</name>
<value>3</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.client.max-cached-nodemanagers-proxies</name>
<value>0</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.job.speculative.retry-after-speculate</name>
<value>15000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>ha.health-monitor.connect-retry-interval.ms</name>
<value>1000</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.work-preserving-recovery.enabled</name>
<value>true</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.markreset.buffer.percent</name>
<value>0.0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.max-age-ms</name>
<value>604800000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.job.ubertask.enable</name>
<value>false</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>rpc.engine.org.apache.hadoop.yarn.server.api.ResourceTrackerPB</name>
<value>org.apache.hadoop.ipc.ProtobufRpcEngine</value>
<source>programatically</source>
</property>
<property>
<name>yarn.nodemanager.log-aggregation.compression-type</name>
<value>none</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.job.complete.cancel.delegation.tokens</name>
<value>true</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.datestring.cache.size</name>
<value>200000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.security.kms.client.authentication.retry-count</name>
<value>1</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.ssl.enabled.protocols</name>
<value>TLSv1</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>DESKTOP-H804TCF:8030</value>
<source>programatically</source>
</property>
<property>
<name>hadoop.http.cross-origin.enabled</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.container-executor.os.sched.priority.adjustment</name>
<value>0</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.proxy-user-privileges.enabled</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.shuffle.fetch.retry.enabled</name>
<value>${yarn.nodemanager.recovery.enabled}</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>io.mapfile.bloom.error.rate</name>
<value>0.005</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.resourcemanager.minimum.version</name>
<value>NONE</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.nodemanagers.heartbeat-interval-ms</name>
<value>1000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.http.cross-origin.allowed-headers</name>
<value>X-Requested-With,Content-Type,Accept,Origin</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.delete.debug-delay-sec</name>
<value>0</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>4</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.address</name>
<value>${yarn.timeline-service.hostname}:10200</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>ipc.maximum.response.length</name>
<value>134217728</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb</name>
<value>0</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.job.hdfs-servers</name>
<value>${fs.defaultFS}</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.task.profile.reduce.params</name>
<value>${mapreduce.task.profile.params}</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>ftp.stream-buffer-size</name>
<value>4096</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.http.cross-origin.allowed-methods</name>
<value>GET,POST,HEAD</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.s3a.buffer.dir</name>
<value>${hadoop.tmp.dir}/s3a</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.client.application-client-protocol.poll-interval-ms</name>
<value>200</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.leveldb-timeline-store.path</name>
<value>${hadoop.tmp.dir}/yarn/timeline</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.job.split.metainfo.maxsize</name>
<value>10000000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>fs.s3a.fast.upload.buffer</name>
<value>disk</value>
<source>core-default.xml</source>
</property>
<property>
<name>s3native.bytes-per-checksum</name>
<value>512</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.client.failover-retries-on-socket-timeouts</name>
<value>0</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.security.sensitive-config-keys</name>
<value>secret$,password$,ssl.keystore.pass$,fs.s3.*[Ss]ecret.?[Kk]ey,fs.azure.account.key.*,dfs.webhdfs.oauth2.[a-z]+.token,hadoop.security.sensitive-config-keys</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.client.retry-interval-ms</name>
<value>1000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.http.authentication.type</name>
<value>simple</value>
<source>programatically</source>
</property>
<property>
<name>mapreduce.local.clientfactory.class.name</name>
<value>org.apache.hadoop.mapred.LocalClientFactory</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>ipc.client.connection.maxidletime</name>
<value>10000</value>
<source>core-default.xml</source>
</property>
<property>
<name>ipc.server.max.connections</name>
<value>0</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.recovery.store.leveldb.path</name>
<value>${hadoop.tmp.dir}/mapred/history/recoverystore</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>fs.s3a.multipart.purge.age</name>
<value>86400</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.client.best-effort</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.job.ubertask.maxmaps</name>
<value>9</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
<value>90.0</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.ifile.readahead.bytes</name>
<value>4194304</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.admin.address</name>
<value>0.0.0.0:10033</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.uploader.server.thread-count</name>
<value>50</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>s3.client-write-packet-size</name>
<value>65536</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.am.resource.cpu-vcores</name>
<value>1</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.input.lineinputformat.linespermap</name>
<value>1</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.shuffle.input.buffer.percent</name>
<value>0.70</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.http.staticuser.user</name>
<value>dr.who</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.maxattempts</name>
<value>4</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.admin.acl</name>
<value>*</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.security.group.mapping.ldap.search.filter.user</name>
<value>(&amp;(objectClass=user)(sAMAccountName={0}))</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.workaround.non.threadsafe.getpwuid</name>
<value>true</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.map.maxattempts</name>
<value>4</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.cleaner.interval-ms</name>
<value>86400000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.entity-group-fs-store.active-dir</name>
<value>/tmp/entity-file-history/active</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.zk-retry-interval-ms</name>
<value>1000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.is.minicluster</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>fs.s3n.block.size</name>
<value>67108864</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.registry.system.acls</name>
<value>sasl:yarn@, sasl:mapred@, sasl:hdfs@</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.node-labels.provider.fetch-timeout-ms</name>
<value>1200000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.store.in-memory.check-period-mins</name>
<value>720</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>fs.s3a.multiobjectdelete.enable</name>
<value>true</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.map.skip.proc-count.auto-incr</name>
<value>true</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>true</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.security.authentication</name>
<value>simple</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.skip.proc-count.auto-incr</name>
<value>true</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.cpu.vcores</name>
<value>1</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>net.topology.node.switch.mapping.impl</name>
<value>org.apache.hadoop.net.ScriptBasedMapping</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.s3.sleepTimeSeconds</name>
<value>10</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.ttl-ms</name>
<value>604800000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.root-dir</name>
<value>/sharedcache</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.keytab</name>
<value>/etc/krb5.keytab</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.container.liveness-monitor.interval-ms</name>
<value>600000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.security.group.mapping.ldap.posix.attr.gid.name</name>
<value>gidNumber</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.am.scheduler.heartbeat.interval-ms</name>
<value>1000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.client-am.ipc.max-retries-on-timeouts</name>
<value>3</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.linux-container-executor.cgroups.hierarchy</name>
<value>/hadoop-yarn</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>s3.bytes-per-checksum</name>
<value>512</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.ssl.require.client.cert</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.output.fileoutputformat.compress</name>
<value>false</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.node-labels.provider.fetch-interval-ms</name>
<value>1800000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.webapp.delegation-token-auth-filter.enabled</name>
<value>true</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapred.child.tmp</name>
<value>/cygdrive/d/hadoop/hadoop-2.8.0/tmp</value>
<source>core-site.xml</source>
</property>
<property>
<name>mapreduce.shuffle.max.threads</name>
<value>0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms</name>
<value>1000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>s3native.client-write-packet-size</name>
<value>65536</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.client.submit.file.replication</name>
<value>10</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.am.job.committer.commit-window</name>
<value>10000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.sleep-delay-before-sigkill.ms</name>
<value>250</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.map.speculative</name>
<value>true</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.job.speculative.slowtaskthreshold</name>
<value>1.0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.linux-container-executor.cgroups.mount</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.http.policy</name>
<value>HTTP_ONLY</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>ipc.client.low-latency</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.s3a.paging.maximum</name>
<value>5000</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.jvm.system-properties-to-log</name>
<value>os.name,os.version,java.home,java.runtime.version,java.vendor,java.version,java.vm.name,java.class.path,java.io.tmpdir,user.dir,user.name</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.kerberos.min.seconds.before.relogin</name>
<value>60</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.nodemanager-connect-retries</name>
<value>10</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>fs.s3.buffer.dir</name>
<value>${hadoop.tmp.dir}/s3</value>
<source>core-default.xml</source>
</property>
<property>
<name>io.native.lib.available</name>
<value>true</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>${yarn.app.mapreduce.am.staging-dir}/history/done</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.registry.zk.retry.interval.ms</name>
<value>1000</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.job.reducer.unconditional-preempt.delay.sec</name>
<value>300</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.ssl.hostname.verifier</name>
<value>DEFAULT</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.task.timeout</name>
<value>600000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.configuration.file-system-based-store</name>
<value>/yarn/conf</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.disk-health-checker.interval-ms</name>
<value>120000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.security.groups.cache.secs</name>
<value>300</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.input.fileinputformat.split.minsize</name>
<value>0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.minicluster.control-resource-monitoring</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.fail-fast</name>
<value>${yarn.fail-fast}</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.shuffle.port</name>
<value>13562</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.rpc.protection</name>
<value>authentication</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.client.failover-proxy-provider</name>
<value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.recovery.enabled</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>ipc.client.tcpnodelay</name>
<value>true</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.s3.maxRetries</name>
<value>4</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.http.authentication.kerberos.principal</name>
<value>HTTP/_HOST@LOCALHOST</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.security.group.mapping.ldap.posix.attr.uid.name</name>
<value>uidNumber</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>${yarn.resourcemanager.hostname}:8088</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.task.profile.reduces</name>
<value>0-2</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.client.max-retries</name>
<value>30</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.am.max-attempts</name>
<value>2</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.job.end-notification.max.retry.interval</name>
<value>5000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>ipc.client.connect.retry.interval</name>
<value>1000</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.s3a.multipart.size</name>
<value>100M</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.am.command-opts</name>
<value>-Xmx1024m</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.process-kill-wait.ms</name>
<value>2000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.state-store-class</name>
<value>org.apache.hadoop.yarn.server.timeline.recovery.LeveldbTimelineStateStore</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.client.fd-clean-interval-secs</name>
<value>60</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.job.speculative.minimum-allowed-tasks</name>
<value>10</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.jetty.logs.serve.aliases</name>
<value>true</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.shuffle.fetch.retry.timeout-ms</name>
<value>30000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>fs.du.interval</name>
<value>600000</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.admin.address</name>
<value>0.0.0.0:8047</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.node-labels.provider.fetch-interval-ms</name>
<value>600000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.acl.reservation-enable</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.security.random.device.file.path</name>
<value>/dev/urandom</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.task.merge.progress.records</name>
<value>10000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.container-metrics.period-ms</name>
<value>-1</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.registry.secure</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.ssl.client.conf</name>
<value>ssl-client.xml</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.job.counters.max</name>
<value>120</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.localizer.fetch.thread-count</name>
<value>4</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>io.mapfile.bloom.size</name>
<value>1048576</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.localizer.client.thread-count</name>
<value>5</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>fs.automatic.close</name>
<value>true</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.task.profile</name>
<value>false</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.recovery.compaction-interval-secs</name>
<value>3600</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.task.combine.progress.records</name>
<value>10000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.shuffle.ssl.file.buffer.size</name>
<value>65536</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.client.job.max-retries</name>
<value>0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.am.container.log.backups</name>
<value>0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.minicluster.fixed.ports</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.app-submission.cross-platform</name>
<value>false</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.ttl-enable</name>
<value>true</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.container-monitor.procfs-tree.smaps-based-rss.enabled</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.keytab</name>
<value>/etc/krb5.keytab</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.log-aggregation.policy.class</name>
<value>org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AllContainerLogAggregationPolicy</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.client.application-client-protocol.poll-timeout-ms</name>
<value>-1</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.webapp.ui-actions.enabled</name>
<value>true</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.client-server.address</name>
<value>0.0.0.0:8045</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.webapp.cross-origin.enabled</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.runtime.linux.docker.privileged-containers.allowed</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.security.instrumentation.requires.admin</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>io.compression.codec.bzip2.library</name>
<value>system-native</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.ssl.keystores.factory.class</name>
<value>org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.task.exit.timeout</name>
<value>60000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>fs.ftp.host</name>
<value>0.0.0.0</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.am.containerlauncher.threadpool-initial-size</name>
<value>10</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>s3.blocksize</name>
<value>67108864</value>
<source>core-default.xml</source>
</property>
<property>
<name>s3native.stream-buffer-size</name>
<value>4096</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>-1</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.task.userlog.limit.kb</name>
<value>0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.security.crypto.codec.classes.aes.ctr.nopadding</name>
<value>org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec, org.apache.hadoop.crypto.JceAesCtrCryptoCodec</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.speculative</name>
<value>true</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.node-labels.fs-store.impl.class</name>
<value>org.apache.hadoop.yarn.nodelabels.FileSystemNodeLabelsStore</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.caller.context.max.size</name>
<value>128</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.client.failover-retries</name>
<value>0</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>-1</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.recovery.enable</name>
<value>false</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>nfs.exports.allowed.hosts</name>
<value>* rw</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.checksum.algo.impl</name>
<value>org.apache.hadoop.yarn.sharedcache.ChecksumSHA256Impl</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.shuffle.memory.limit.percent</name>
<value>0.25</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>file.replication</name>
<value>1</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.job.reduce.shuffle.consumer.plugin.class</name>
<value>org.apache.hadoop.mapreduce.task.reduce.Shuffle</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.task.local-fs.write-limit.bytes</name>
<value>-1</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.am.log.level</name>
<value>INFO</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.am.max-attempts</name>
<value>2</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.shuffle.connection-keep-alive.timeout</name>
<value>5</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.job.reduces</name>
<value>1</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>
hadoop.security.group.mapping.ldap.connection.timeout.ms
</name>
<value>60000</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.amrmproxy.client.thread-count</name>
<value>25</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.app.mapreduce.am.job.task.listener.thread-count
</name>
<value>30</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>s3native.replication</name>
<value>3</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.permissions.umask-mode</name>
<value>022</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.node-ip-cache.expiry-interval-secs
</name>
<value>-1</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.cluster.local.dir</name>
<value>${hadoop.tmp.dir}/mapred/local</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.client.output.filter</name>
<value>FAILED</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>true</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>ftp.replication</name>
<value>3</value>
<source>core-default.xml</source>
</property>
<property>
<name>
hadoop.security.group.mapping.ldap.search.attr.member
</name>
<value>member</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.s3a.max.total.tasks</name>
<value>5</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.fs.state-store.num-retries</name>
<value>0</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.leveldb-state-store.path</name>
<value>${hadoop.tmp.dir}/yarn/timeline</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>DESKTOP-H804TCF:8031</value>
<source>programatically</source>
</property>
<property>
<name>yarn.nodemanager.resource.pcores-vcores-multiplier</name>
<value>1.0</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.scheduler.monitor.enable</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>fs.trash.checkpoint.interval</name>
<value>0</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.registry.zk.retry.times</name>
<value>5</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.timeline-service.leveldb-timeline-store.start-time-write-cache-size
</name>
<value>10000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>s3.stream-buffer-size</name>
<value>4096</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.s3a.connection.maximum</name>
<value>15</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.security.dns.log-slow-lookups.enabled</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>file.client-write-packet-size</name>
<value>65536</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.shell.missing.defaultFs.warning</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.s3a.impl</name>
<value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.windows-container.memory-limit.enabled
</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
rpc.engine.org.apache.hadoop.yarn.api.ApplicationClientProtocolPB
</name>
<value>org.apache.hadoop.ipc.ProtobufRpcEngine</value>
<source>programatically</source>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/tmp/logs</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.shuffle.retry-delay.max.ms</name>
<value>60000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>io.map.index.interval</name>
<value>128</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.container-localizer.java.opts</name>
<value>-Xmx256m</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.ssl.server.conf</name>
<value>ssl-server.xml</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.rpc.socket.factory.class.default</name>
<value>org.apache.hadoop.net.StandardSocketFactory</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.minicluster.yarn.nodemanager.resource.memory-mb
</name>
<value>4096</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.client.max-retries</name>
<value>3</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.webapp.https.address</name>
<value>0.0.0.0:19890</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.address</name>
<value>${yarn.nodemanager.hostname}:0</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.max-log-aggregation-diagnostics-in-memory
</name>
<value>10</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
ha.failover-controller.graceful-fence.rpc-timeout.ms
</name>
<value>5000</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.delayed.delegation-token.removal-interval-ms
</name>
<value>30000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.enabled</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>ipc.maximum.data.length</name>
<value>67108864</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.security.group.mapping.providers.combined</name>
<value>true</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.security.groups.cache.warn.after.ms</name>
<value>5000</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.fs.state-store.retry-interval-ms
</name>
<value>1000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.zk-acl</name>
<value>world:anyone:rwcda</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.resource-monitor.interval-ms</name>
<value>3000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.resource.detect-hardware-capabilities
</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.app-checker.class</name>
<value>
org.apache.hadoop.yarn.server.sharedcachemanager.RemoteAppChecker
</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.timeline-service.entity-group-fs-store.retain-seconds
</name>
<value>604800</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.webapp.https.address</name>
<value>0.0.0.0:8044</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.linux-container-executor.cgroups.delete-delay-ms
</name>
<value>20</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.amrmproxy.enable</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.fs.state-store.retry-policy-spec
</name>
<value>2000, 500</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>fs.s3a.fast.upload</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.job.committer.setup.cleanup.needed</name>
<value>true</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.job.end-notification.retry.attempts</name>
<value>0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.state-store.max-completed-applications
</name>
<value>${yarn.resourcemanager.max-completed-applications}</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.map.output.compress</name>
<value>false</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.cleaner.enable</name>
<value>true</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.job.running.reduce.limit</name>
<value>0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>io.seqfile.local.dir</name>
<value>${hadoop.tmp.dir}/io/local</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.shuffle.read.timeout</name>
<value>180000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.job.queuename</name>
<value>default</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>ipc.client.connect.max.retries</name>
<value>10</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/tmp/hadoop-yarn/staging</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.client.job.retry-interval</name>
<value>2000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.linux-container-executor.resources-handler.class
</name>
<value>
org.apache.hadoop.yarn.server.nodemanager.util.DefaultLCEResourcesHandler
</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.timeline-service.leveldb-timeline-store.read-cache-size
</name>
<value>104857600</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>io.file.buffer.size</name>
<value>4096</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.webapp.cross-origin.enabled</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.am-rm-tokens.master-key-rolling-interval-secs
</name>
<value>86400</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.log.deletion-threads-count</name>
<value>4</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>ha.zookeeper.parent-znode</name>
<value>/hadoop-ha</value>
<source>core-default.xml</source>
</property>
<property>
<name>tfile.io.chunk.size</name>
<value>1048576</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.work-preserving-recovery.scheduling-wait-ms
</name>
<value>10000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.keytab</name>
<value>/etc/krb5.keytab</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.node-labels.enabled</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.acl.enable</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
hadoop.security.group.mapping.ldap.directory.search.timeout
</name>
<value>10000</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.version</name>
<value>1.0f</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.job.token.tracking.ids.enabled</name>
<value>false</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.map.output.compress.codec</name>
<value>org.apache.hadoop.io.compress.DefaultCodec</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.enabled</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>s3.replication</name>
<value>3</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.registry.zk.root</name>
<value>/registry</value>
<source>core-default.xml</source>
</property>
<property>
<name>tfile.fs.input.buffer.size</name>
<value>262144</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.http-authentication.type</name>
<value>simple</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
ha.failover-controller.graceful-fence.connection.retries
</name>
<value>1</value>
<source>core-default.xml</source>
</property>
<property>
<name>net.topology.script.number.args</name>
<value>100</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.s3n.multipart.uploads.block.size</name>
<value>67108864</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.admin.thread-count</name>
<value>1</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.recovery.dir</name>
<value>${hadoop.tmp.dir}/yarn-nm-recovery</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.ssl.enabled</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.AbstractFileSystem.ftp.impl</name>
<value>org.apache.hadoop.fs.ftp.FtpFs</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.handler-thread-count</name>
<value>10</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.container-metrics.unregister-delay-ms
</name>
<value>10000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.caller.context.enabled</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.recovery.store.class</name>
<value>
org.apache.hadoop.mapreduce.v2.hs.HistoryServerFileSystemStateStoreService
</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.log.retain-seconds</name>
<value>10800</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>DESKTOP-H804TCF:8033</value>
<source>programatically</source>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.ha.automatic-failover.zk-base-path
</name>
<value>/yarn-leader-election</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>fs.AbstractFileSystem.viewfs.impl</name>
<value>org.apache.hadoop.fs.viewfs.ViewFs</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.AbstractFileSystem.hdfs.impl</name>
<value>org.apache.hadoop.fs.Hdfs</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.reservation-system.enable</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
mapreduce.job.speculative.speculative-cap-total-tasks
</name>
<value>0.01</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>
yarn.timeline-service.generic-application-history.max-applications
</name>
<value>10000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.nm.uploader.thread-count</name>
<value>20</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.log-container-debug-info.enabled</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>fs.AbstractFileSystem.s3a.impl</name>
<value>org.apache.hadoop.fs.s3a.S3A</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.max-completed-applications</name>
<value>10000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>${yarn.log.dir}/userlogs</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.node-removal-untracked.timeout-ms
</name>
<value>60000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.linux-container-executor.nonsecure-mode.user-pattern
</name>
<value>^[_.A-Za-z0-9][-@_.A-Za-z0-9]{0,255}?[$]?$</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>ftp.blocksize</name>
<value>67108864</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.job.acl-modify-job</name>
<value></value>
<source>mapred-default.xml</source>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://127.0.0.1:9999</value>
<source>core-site.xml</source>
</property>
<property>
<name>yarn.nodemanager.node-labels.resync-interval-ms</name>
<value>120000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.http.filter.initializers</name>
<value>
org.apache.hadoop.yarn.server.security.http.RMAuthenticationFilterInitializer,org.apache.hadoop.http.lib.StaticUserWebFilter
</value>
<source>programatically</source>
</property>
<property>
<name>fs.s3n.multipart.copy.block.size</name>
<value>5368709120</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.connect.max-wait.ms</name>
<value>900000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.timeline-service.entity-group-fs-store.scan-interval-seconds
</name>
<value>60</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.security.group.mapping.ldap.ssl</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.intermediate-data-encryption.enable</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.store.class</name>
<value>
org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore
</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.fail-fast</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.admin.client.thread-count</name>
<value>1</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
hadoop.security.kms.client.encrypted.key.cache.size
</name>
<value>500</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.shuffle.log.separate</name>
<value>true</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>ipc.client.kill.max</name>
<value>10</value>
<source>core-default.xml</source>
</property>
<property>
<name>
hadoop.security.group.mapping.ldap.search.filter.group
</name>
<value>(objectClass=group)</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.AbstractFileSystem.file.impl</name>
<value>org.apache.hadoop.fs.local.LocalFs</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.http.authentication.kerberos.keytab</name>
<value>${user.home}/hadoop.keytab</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.client.nodemanager-connect.max-wait-ms</name>
<value>180000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.job.map.output.collector.class</name>
<value>org.apache.hadoop.mapred.MapTask$MapOutputBuffer</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.security.uid.cache.secs</name>
<value>14400</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.map.cpu.vcores</name>
<value>1</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.log-aggregation.retain-check-interval-seconds</name>
<value>-1</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.map.log.level</name>
<value>INFO</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx200m</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.am.hard-kill-timeout-ms</name>
<value>10000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.registry.zk.session.timeout.ms</name>
<value>60000</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.job.running.map.limit</name>
<value>0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>
yarn.sharedcache.store.in-memory.initial-delay-mins
</name>
<value>10</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.timeline-service.entity-group-fs-store.cleaner-interval-seconds
</name>
<value>3600</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.client-server.thread-count</name>
<value>50</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.local-cache.max-files-per-directory
</name>
<value>8192</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>s3native.blocksize</name>
<value>67108864</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.client.completion.pollinterval</name>
<value>5000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>fs.s3a.socket.send.buffer</name>
<value>8192</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.job.maps</name>
<value>2</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>fs.AbstractFileSystem.swebhdfs.impl</name>
<value>org.apache.hadoop.fs.SWebHdfs</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.job.acl-view-job</name>
<value></value>
<source>mapred-default.xml</source>
</property>
<property>
<name>fs.s3a.readahead.range</name>
<value>64K</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.connect.retry-interval.ms</name>
<value>30000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.timeline-service.leveldb-timeline-store.ttl-interval-ms
</name>
<value>300000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>fs.s3a.multipart.threshold</name>
<value>2147483647</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.shuffle.max.connections</name>
<value>0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.shell.safely.delete.limit.num.files</name>
<value>100</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.task.io.sort.factor</name>
<value>10</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.security.dns.log-slow-lookups.threshold.ms</name>
<value>1000</value>
<source>core-default.xml</source>
</property>
<property>
<name>ha.health-monitor.sleep-after-disconnect.ms</name>
<value>1000</value>
<source>core-default.xml</source>
</property>
<property>
<name>ha.zookeeper.session-timeout.ms</name>
<value>5000</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users
</name>
<value>true</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
mapreduce.input.fileinputformat.list-status.num-threads
</name>
<value>1</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>io.skip.checksum.errors</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.scheduler.client.thread-count</name>
<value>50</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
rpc.engine.org.apache.hadoop.ipc.ProtocolMetaInfoPB
</name>
<value>org.apache.hadoop.ipc.ProtobufRpcEngine</value>
<source>programatically</source>
</property>
<property>
<name>mapreduce.jobhistory.move.thread-count</name>
<value>3</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.zk-state-store.parent-path</name>
<value>/rmstore</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.client.fd-retain-secs</name>
<value>300</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>ipc.client.idlethreshold</name>
<value>4000</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.cleaner.initial-delay-mins</name>
<value>10</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.task.profile.params</name>
<value>
-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s
</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.keytab</name>
<value>/etc/security/keytab/jhs.service.keytab</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs
</name>
<value>86400</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.shuffle.fetch.retry.interval-ms</name>
<value>1000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>
yarn.timeline-service.entity-group-fs-store.app-cache-size
</name>
<value>10</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.user.group.static.mapping.overrides</name>
<value>dr.who=;</value>
<source>core-default.xml</source>
</property>
<property>
<name>
hadoop.security.kms.client.encrypted.key.cache.low-watermark
</name>
<value>0.3f</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.dispatcher.exit-on-error</name>
<value>true</value>
<source>programatically</source>
</property>
<property>
<name>fs.s3a.connection.ssl.enabled</name>
<value>true</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.runtime.linux.docker.capabilities</name>
<value>
CHOWN,DAC_OVERRIDE,FSETID,FOWNER,MKNOD,NET_RAW,SETGID,SETUID,SETFCAP,SETPCAP,NET_BIND_SERVICE,SYS_CHROOT,KILL,AUDIT_WRITE
</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.node-labels.fs-store.retry-policy-spec</name>
<value>2000, 500</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>fs.AbstractFileSystem.webhdfs.impl</name>
<value>org.apache.hadoop.fs.WebHdfs</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.scheduler.monitor.policies</name>
<value>
org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy
</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>ipc.server.listen.queue.size</name>
<value>128</value>
<source>core-default.xml</source>
</property>
<property>
<name>rpc.metrics.quantile.enable</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.resource.system-reserved-memory-mb
</name>
<value>-1</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds
</name>
<value>-1</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.client.nodemanager-client-async.thread-pool-max-size
</name>
<value>500</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.security.group.mapping</name>
<value>
org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback
</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.system-metrics-publisher.enabled
</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.am.liveness-monitor.expiry-interval-ms</name>
<value>600000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
<value>600000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>ftp.bytes-per-checksum</name>
<value>512</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.nested-level</name>
<value>3</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
rpc.engine.org.apache.hadoop.yarn.server.api.ResourceManagerAdministrationProtocolPB
</name>
<value>org.apache.hadoop.ipc.ProtobufRpcEngine</value>
<source>programatically</source>
</property>
<property>
<name>mapreduce.job.emit-timeline-data</name>
<value>false</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>1024</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.client.nodemanager-connect.retry-interval-ms</name>
<value>10000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.http.cross-origin.max-age</name>
<value>1800</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.timeline-service.leveldb-timeline-store.start-time-read-cache-size
</name>
<value>10000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.scheduler.include-port-in-node-name</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.job.speculative.retry-after-no-speculate</name>
<value>1000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.registry.zk.connection.timeout.ms</name>
<value>15000</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>DESKTOP-H804TCF:8032</value>
<source>programatically</source>
</property>
<property>
<name>ipc.client.rpc-timeout.ms</name>
<value>0</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.task.skip.start.attempts</name>
<value>2</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>fs.s3a.socket.recv.buffer</name>
<value>8192</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.zk-timeout-ms</name>
<value>10000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.timeline-service.entity-group-fs-store.summary-store
</name>
<value>
org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore
</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
hadoop.security.groups.cache.background.reload.threads
</name>
<value>3</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.cleaner.resource-sleep-ms</name>
<value>0</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.map.skip.maxrecords</name>
<value>0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.system-metrics-publisher.dispatcher.pool-size
</name>
<value>10</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.hostname</name>
<value>0.0.0.0</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.registry.rm.enabled</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.job.reducer.preempt.delay.sec</name>
<value>0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.node-labels.configuration-type</name>
<value>centralized</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.shuffle.ssl.enabled</name>
<value>false</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.container-manager.thread-count</name>
<value>20</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop-${user.name}</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.AbstractFileSystem.har.impl</name>
<value>org.apache.hadoop.fs.HarFs</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.localizer.cache.target-size-mb</name>
<value>10240</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.shuffle.log.backups</name>
<value>0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.minicluster.use-rpc</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.http.policy</name>
<value>HTTP_ONLY</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.webapp.https.address</name>
<value>${yarn.timeline-service.hostname}:8190</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.amlauncher.thread-count</name>
<value>50</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>tfile.fs.output.buffer.size</name>
<value>262144</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.ftp.host.port</name>
<value>21</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.task.io.sort.mb</name>
<value>100</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>
hadoop.security.group.mapping.ldap.search.attr.group.name
</name>
<value>cn</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.amrmproxy.address</name>
<value>0.0.0.0:8048</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.security.group.mapping.ldap.read.timeout.ms</name>
<value>60000</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.output.fileoutputformat.compress.type</name>
<value>RECORD</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>file.bytes-per-checksum</name>
<value>512</value>
<source>core-default.xml</source>
</property>
<property>
<name>ha.health-monitor.check-interval.ms</name>
<value>1000</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.delegation.key.update-interval
</name>
<value>86400000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.resource-tracker.client.thread-count
</name>
<value>50</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.input.buffer.percent</name>
<value>0.0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage
</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>ha.health-monitor.rpc-timeout.ms</name>
<value>45000</value>
<source>core-default.xml</source>
</property>
<property>
<name>io.bytes.per.checksum</name>
<value>512</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>8192</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.leveldb-state-store.path</name>
<value>${hadoop.tmp.dir}/yarn/system/rmstore</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.task.files.preserve.failedtasks</name>
<value>false</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.delete.thread-count</name>
<value>4</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.output.fileoutputformat.compress.codec</name>
<value>org.apache.hadoop.io.compress.DefaultCodec</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>map.sort.class</name>
<value>org.apache.hadoop.util.QuickSort</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.resource.count-logical-processors-as-cores
</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.jobname.limit</name>
<value>50</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.job.classloader</name>
<value>false</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.registry.zk.retry.ceiling.ms</name>
<value>60000</value>
<source>core-default.xml</source>
</property>
<property>
<name>io.seqfile.compress.blocksize</name>
<value>1000000</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.task.profile.maps</name>
<value>0-2</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.localizer.cache.cleanup.interval-ms
</name>
<value>600000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.http.cross-origin.allowed-origins</name>
<value>*</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.timeline-service.client.fd-flush-interval-secs
</name>
<value>10</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.security.java.secure.random.algorithm</name>
<value>SHA1PRNG</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.client.resolve.remote.symlinks</name>
<value>true</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.delegation-token-renewer.thread-count
</name>
<value>50</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.shuffle.listen.queue.size</name>
<value>128</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.disk-health-checker.min-healthy-disks
</name>
<value>0.25</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.job.end-notification.retry.interval</name>
<value>1000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.loadedjobs.cache.size</name>
<value>5</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>fs.s3a.fast.upload.active.blocks</name>
<value>4</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>${hadoop.tmp.dir}/nm-local-dir</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.task.exit.timeout.check-interval-ms</name>
<value>20000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.registry.jaas.context</name>
<value>Client</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.webapp.address</name>
<value>${yarn.timeline-service.hostname}:8188</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>0.0.0.0:10020</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>ipc.server.log.slow.rpc</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>file.blocksize</name>
<value>67108864</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.s3a.block.size</name>
<value>32M</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.timeline-service.entity-group-fs-store.leveldb-cache-read-cache-size
</name>
<value>10485760</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.cleaner.period-mins</name>
<value>1440</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.metrics.runtime.buckets</name>
<value>60,300,1440</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>ipc.client.ping</name>
<value>true</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.leveldb-state-store.compaction-interval-secs
</name>
<value>3600</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.configuration.provider-class</name>
<value>org.apache.hadoop.yarn.LocalConfigurationProvider</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.recovery.enabled</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>0.0.0.0</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>fs.s3n.multipart.uploads.enabled</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.disk-health-checker.enable</name>
<value>true</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.amrmproxy.interceptor-class.pipeline
</name>
<value>
org.apache.hadoop.yarn.server.nodemanager.amrmproxy.DefaultRequestInterceptor
</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>ha.failover-controller.cli-check.rpc-timeout.ms</name>
<value>20000</value>
<source>core-default.xml</source>
</property>
<property>
<name>ftp.client-write-packet-size</name>
<value>65536</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.shuffle.parallelcopies</name>
<value>5</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.caller.context.signature.max.size</name>
<value>40</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.principal</name>
<value>jhs/_HOST@REALM.TLD</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>
hadoop.http.authentication.simple.anonymous.allowed
</name>
<value>true</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>-1</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.rm.container-allocation.expiry-interval-ms
</name>
<value>600000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.windows-container.cpu-limit.enabled
</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.client.genericoptionsparser.used</name>
<value>true</value>
<source>programatically</source>
</property>
<property>
<name>
yarn.timeline-service.http-authentication.simple.anonymous.allowed
</name>
<value>true</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.jhist.format</name>
<value>json</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.reservation-system.planfollower.time-step
</name>
<value>1000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.job.ubertask.maxreduces</name>
<value>1</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>fs.s3a.connection.establish.timeout</name>
<value>5000</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.health-checker.interval-ms</name>
<value>600000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>fs.s3a.multipart.purge</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>
hadoop.security.kms.client.encrypted.key.cache.num.refill.threads
</name>
<value>2</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.timeline-service.store-class</name>
<value>
org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore
</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.shuffle.transfer.buffer.size</name>
<value>131072</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.zk-num-retries</name>
<value>1000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.sharedcache.store.in-memory.staleness-period-mins
</name>
<value>10080</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.webapp.address</name>
<value>${yarn.nodemanager.hostname}:8042</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.client-am.ipc.max-retries</name>
<value>3</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>ipc.ping.interval</name>
<value>60000</value>
<source>core-default.xml</source>
</property>
<property>
<name>ha.failover-controller.new-active.rpc-timeout.ms</name>
<value>60000</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.client.thread-count</name>
<value>10</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>fs.trash.interval</name>
<value>0</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.fileoutputcommitter.algorithm.version</name>
<value>1</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.skip.maxgroups</name>
<value>0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>1024</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.health-checker.script.timeout-ms</name>
<value>1200000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.client.progressmonitor.pollinterval</name>
<value>1000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.delegation.token.renew-interval
</name>
<value>86400000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.hostname</name>
<value>0.0.0.0</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.am.container.log.limit.kb</name>
<value>0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.move.interval-ms</name>
<value>180000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.http.authentication.signature.secret.file</name>
<value>${user.home}/hadoop-http-auth-signature-secret</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.nm-tokens.master-key-rolling-interval-secs
</name>
<value>86400</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.container-executor.class</name>
<value>
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor
</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.security.authorization</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.localizer.address</name>
<value>${yarn.nodemanager.hostname}:8040</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.recovery.store.fs.uri</name>
<value>${hadoop.tmp.dir}/mapred/history/recoverystore</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.shuffle.connection-keep-alive.enable</name>
<value>false</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.common.configuration.version</name>
<value>0.23.0</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.task.container.log.backups</name>
<value>0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.security.groups.negative-cache.secs</name>
<value>30</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.ifile.readahead</name>
<value>true</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.resource.percentage-physical-cpu-limit
</name>
<value>100</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.job.max.split.locations</name>
<value>10</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.registry.zk.quorum</name>
<value>localhost:2181</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.s3a.threads.keepalivetime</name>
<value>60</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.joblist.cache.size</name>
<value>20000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.job.end-notification.max.attempts</name>
<value>5</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.security.groups.cache.background.reload</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.shuffle.connect.timeout</name>
<value>180000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>0.0.0.0:19888</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>fs.s3a.connection.timeout</name>
<value>200000</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.nm.uploader.replication.factor</name>
<value>10</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.http.authentication.token.validity</name>
<value>36000</value>
<source>core-default.xml</source>
</property>
<property>
<name>ipc.client.connect.max.retries.on.timeouts</name>
<value>45</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.timeline-service.client.internal-timers-ttl-secs
</name>
<value>420</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.docker-container-executor.exec-name
</name>
<value>/usr/bin/docker</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.am.job.committer.cancel-timeout</name>
<value>60000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.log.level</name>
<value>INFO</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.shuffle.merge.percent</name>
<value>0.66</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>ipc.client.fallback-to-simple-auth-allowed</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>io.serializations</name>
<value>
org.apache.hadoop.io.serializer.WritableSerialization, org.apache.hadoop.io.serializer.avro.AvroSpecificSerialization, org.apache.hadoop.io.serializer.avro.AvroReflectSerialization
</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.s3.block.size</name>
<value>67108864</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user
</name>
<value>nobody</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.kerberos.kinit.command</name>
<value>kinit</value>
<source>core-default.xml</source>
</property>
<property>
<name>
hadoop.security.kms.client.encrypted.key.cache.expiry
</name>
<value>43200000</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.fs.state-store.uri</name>
<value>${hadoop.tmp.dir}/yarn/system/rmstore</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.dispatcher.drain-events.timeout</name>
<value>300000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.admin.acl</name>
<value>*</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.reduce.merge.inmem.threshold</name>
<value>1000</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.cluster.max-application-priority</name>
<value>0</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>net.topology.impl</name>
<value>org.apache.hadoop.net.NetworkTopology</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
<value>true</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>io.map.index.skip</name>
<value>0</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.webapp.https.address</name>
<value>${yarn.resourcemanager.hostname}:8090</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.admin-env</name>
<value>MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.security.crypto.cipher.suite</name>
<value>AES/CTR/NoPadding</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.task.profile.map.params</name>
<value>${mapreduce.task.profile.params}</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>hadoop.security.crypto.buffer.size</name>
<value>8192</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.nodemanager.aux-services.mapreduce_shuffle.class
</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.container-metrics.enable</name>
<value>true</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>fs.s3a.path.style.access</name>
<value>false</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.cluster.acls.enabled</name>
<value>false</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.uploader.server.address</name>
<value>0.0.0.0:8046</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.log-aggregation-status.time-out.ms</name>
<value>600000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>fs.s3a.threads.max</name>
<value>10</value>
<source>core-default.xml</source>
</property>
<property>
<name>fs.har.impl.disable.cache</name>
<value>true</value>
<source>core-default.xml</source>
</property>
<property>
<name>ipc.client.connect.timeout</name>
<value>20000</value>
<source>core-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir-suffix</name>
<value>logs</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>fs.df.interval</name>
<value>60000</value>
<source>core-default.xml</source>
</property>
<property>
<name>hadoop.util.hash.type</name>
<value>murmur</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.minicluster.fixed.ports</name>
<value>false</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.shuffle.log.limit.kb</name>
<value>0</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>ha.zookeeper.acl</name>
<value>world:anyone:rwcda</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.timeline-service.entity-group-fs-store.done-dir
</name>
<value>/tmp/entity-file-history/done/</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.delegation.token.max-lifetime</name>
<value>604800000</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
mapreduce.job.speculative.speculative-cap-running-tasks
</name>
<value>0.1</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.map.sort.spill.percent</name>
<value>0.80</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.nodemanager.recovery.supervised</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>file.stream-buffer-size</name>
<value>4096</value>
<source>core-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.ha.automatic-failover.embedded
</name>
<value>true</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.nodemanager.minimum.version</name>
<value>NONE</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>
yarn.resourcemanager.history-writer.multi-threaded-dispatcher.pool-size
</name>
<value>10</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.sharedcache.webapp.address</name>
<value>0.0.0.0:8788</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>1536</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>local</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>mapreduce.job.reduce.slowstart.completedmaps</name>
<value>0.05</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>yarn.resourcemanager.client.thread-count</name>
<value>50</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>
${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate
</value>
<source>mapred-default.xml</source>
</property>
<property>
<name>fs.s3a.attempts.maximum</name>
<value>20</value>
<source>core-default.xml</source>
</property>
</configuration>
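
Every property in the dump above carries a `<source>` element naming where its value came from (`core-default.xml`, `yarn-default.xml`, `mapred-default.xml`, or `programatically`, which is spelled that way in the dump itself). Checking which values do not come from the built-in defaults is a quick way to confirm that your site files were picked up. The following is a minimal sketch, not part of the original walkthrough: the helper names `group_by_source` and `non_default` are hypothetical, and `SAMPLE` holds only four entries copied from the dump, so a real run would have to parse the complete document instead.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Four entries copied from the dump above (assumption: the real input
# would be the full <configuration> document saved to a file).
SAMPLE = """<configuration>
<property>
<name>mapreduce.client.genericoptionsparser.used</name>
<value>true</value>
<source>programatically</source>
</property>
<property>
<name>
yarn.nodemanager.windows-container.cpu-limit.enabled
</name>
<value>false</value>
<source>yarn-default.xml</source>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop-${user.name}</value>
<source>core-default.xml</source>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>local</value>
<source>mapred-default.xml</source>
</property>
</configuration>"""


def group_by_source(xml_text):
    """Map each <source> label to a dict of {property name: value}."""
    root = ET.fromstring(xml_text)
    grouped = defaultdict(dict)
    for prop in root.iter("property"):
        # Long names and values are wrapped onto their own lines in the
        # dump, so surrounding whitespace is stripped before use.
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        source = (prop.findtext("source") or "unknown").strip()
        grouped[source][name] = value
    return dict(grouped)


def non_default(grouped):
    """Keep only properties whose source is not a *-default.xml file."""
    return {
        name: value
        for source, props in grouped.items()
        if not source.endswith("-default.xml")
        for name, value in props.items()
    }


if __name__ == "__main__":
    grouped = group_by_source(SAMPLE)
    for source, props in sorted(grouped.items()):
        print(f"{source}: {len(props)} properties")
    print(non_default(grouped))
```

If a property you set in `core-site.xml` or `yarn-site.xml` still shows a `*-default.xml` source in a dump like this, the site file was probably not on the classpath of the process that produced the dump.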
Time: 2024-10-10 00:37:35
