Computing matrix multiplication with the distributed cache

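1) The left matrix F is a .txt file (one row per line, elements separated by tabs) and the right matrix B is a SequenceFile. The code is as follows: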

package matrix;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.hadoop.filecache.DistributedCache; // distributed cache library

public class matrixProduct {
    public static class MyMapper extends Mapper<IntWritable, Text, IntWritable, Text> {
        private int leftMatrixRowNum;
        private int leftMatrixColNum;
        private int rightMatrixRowNum;
        private double[][] cacheMatrix; // left matrix F, loaded from the distributed cache
        private double[] valueVector;   // one vector of the right matrix B per map() call
        private StringBuilder result = new StringBuilder();

        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            super.setup(context);
            Configuration conf = context.getConfiguration();
            leftMatrixRowNum = Integer.parseInt(conf.get("leftMatrixRowNum"));
            leftMatrixColNum = Integer.parseInt(conf.get("leftMatrixColNum"));
            rightMatrixRowNum = Integer.parseInt(conf.get("rightMatrixRowNum"));
            cacheMatrix = new double[leftMatrixRowNum][leftMatrixColNum];
            valueVector = new double[rightMatrixRowNum];

            // Fail fast instead of swallowing the error and multiplying by an all-zero matrix.
            Path[] cacheFiles = DistributedCache.getLocalCacheFiles(conf);
            if (cacheFiles == null || cacheFiles.length == 0) {
                throw new IOException("left matrix not found in the distributed cache");
            }
            // try-with-resources closes the reader even if parsing fails.
            try (BufferedReader dataReader =
                    new BufferedReader(new FileReader(cacheFiles[0].toString()))) {
                String line;
                int i = 0;
                while ((line = dataReader.readLine()) != null) {
                    String[] eleStrings = line.split("\t");
                    for (int j = 0; j < eleStrings.length; ++j) {
                        cacheMatrix[i][j] = Double.parseDouble(eleStrings[j]);
                    }
                    ++i;
                }
            }
        }

        @Override
        protected void map(IntWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Do not call super.map(): the default implementation writes (key, value) unchanged,
            // which would add an extra record to the output.
            String[] valueArray = value.toString().split("\t");
            for (int i = 0; i < valueArray.length; i++) {
                valueVector[i] = Double.parseDouble(valueArray[i]);
            }

            // Reset the buffer so results from earlier records are not repeated.
            result.setLength(0);
            // Use the configured dimensions rather than a hard-coded size (originally 1043).
            for (int i = 0; i < leftMatrixRowNum; ++i) {
                double temp = 0;
                for (int j = 0; j < leftMatrixColNum; ++j) {
                    temp += cacheMatrix[i][j] * valueVector[j];
                }
                result.append(temp).append(i != leftMatrixRowNum - 1 ? "\t" : "\n");
            }
            context.write(key, new Text(result.toString()));
        }
    }

    // s1: URI of the left matrix file, s2: input path (right matrix), s3: output path.
    public static void run(String s1, String s2, String s3, String leftMatrixRowNum,
            String leftMatrixColNum, String rightMatrixRowNum) throws Exception {
        URI fileURI = new URI(s1);
        Configuration conf = new Configuration();
        conf.set("leftMatrixRowNum", leftMatrixRowNum);
        conf.set("leftMatrixColNum", leftMatrixColNum);
        conf.set("rightMatrixRowNum", rightMatrixRowNum);
        // Register the cache file before creating the Job: Job copies conf,
        // so changes made to conf afterwards are not seen by the job.
        DistributedCache.addCacheFile(fileURI, conf);
        Job job = new Job(conf, "matrix cache memory");
        job.setJarByClass(matrixProduct.class);
        job.setMapperClass(MyMapper.class);
        job.setNumReduceTasks(0); // map-only job
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(Text.class);
        job.setInputFormatClass(SequenceFileInputFormat.class);
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileInputFormat.setInputPaths(job, new Path(s2));
        FileOutputFormat.setOutputPath(job, new Path(s3));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

    public static void main(String[] args) throws Exception {
        run(args[0], args[1], args[2], args[3], args[4], args[5]);
    }
}
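
The per-record computation in the mapper can be tried without a Hadoop cluster. The sketch below is only an illustration, not part of the original job: the class `CachedMatrixSketch` and its methods `parseMatrix` and `multiplyColumn` are hypothetical names introduced here. It assumes, as the code above does, that the left matrix is given as tab-separated rows and that each record of the right matrix is one tab-separated vector whose length equals the number of columns of the left matrix.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch of what setup() and map() compute together.
public class CachedMatrixSketch {

    // Parses tab-separated lines (one matrix row per line) into a dense array,
    // mirroring how the cached left matrix is loaded.
    public static double[][] parseMatrix(List<String> lines) {
        double[][] matrix = new double[lines.size()][];
        for (int i = 0; i < lines.size(); i++) {
            String[] cells = lines.get(i).split("\t");
            matrix[i] = new double[cells.length];
            for (int j = 0; j < cells.length; j++) {
                matrix[i][j] = Double.parseDouble(cells[j]);
            }
        }
        return matrix;
    }

    // Multiplies the left matrix by one tab-separated vector of the right matrix
    // and returns the resulting vector as a tab-separated string.
    public static String multiplyColumn(double[][] left, String vectorLine) {
        String[] cells = vectorLine.split("\t");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < left.length; i++) {
            if (cells.length != left[i].length) {
                throw new IllegalArgumentException("dimension mismatch in row " + i);
            }
            double sum = 0;
            for (int j = 0; j < cells.length; j++) {
                sum += left[i][j] * Double.parseDouble(cells[j]);
            }
            if (i > 0) {
                out.append('\t');
            }
            out.append(sum);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        double[][] f = parseMatrix(Arrays.asList("1\t2", "3\t4"));
        // (1*5 + 2*6, 3*5 + 4*6) = (17, 39)
        System.out.println(multiplyColumn(f, "5\t6"));
    }
}
```

Because every record is multiplied independently against the same in-memory left matrix, no reduce phase is needed, which is why the job above sets the number of reduce tasks to 0. This only works while the left matrix fits in the memory of each map task.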

Time: 2024-10-22 05:43:20
