GenomicRangeQuery /codility/ preFix sums

首先上题目:

A DNA sequence can be represented as a string consisting of the letters A, C, G and T, which correspond to the types of successive nucleotides in the sequence. Each nucleotide has an impact factor, which is an integer. Nucleotides of types A, C, G and T have impact factors of 1, 2, 3 and 4, respectively. You are going to answer several queries of the form: What is the minimal impact factor of nucleotides contained in a particular part of the given DNA sequence?

The DNA sequence is given as a non-empty string S =S[0]S[1]...S[N-1] consisting of N characters. There are M queries, which are given in non-empty arrays P and Q, each consisting of M integers. The K-th query (0 ≤ K < M) requires you to find the minimal impact factor of nucleotides contained in the DNA sequence between positions P[K] and Q[K] (inclusive).

For example, consider string S = CAGCCTA and arrays P, Q such that:

    P[0] = 2    Q[0] = 4
    P[1] = 5    Q[1] = 5
    P[2] = 0    Q[2] = 6

The answers to these M = 3 queries are as follows:

  • The part of the DNA between positions 2 and 4 contains nucleotides G and C (twice), whose impact factors are 3 and 2 respectively, so the answer is 2.
  • The part between positions 5 and 5 contains a single nucleotide T, whose impact factor is 4, so the answer is 4.
  • The part between positions 0 and 6 (the whole string) contains all nucleotides, in particular nucleotide Awhose impact factor is 1, so the answer is 1.

Assume that the following declarations are given:

struct Results {
  int * A;
  int M;
};

Write a function:

struct Results solution(char *S, int P[], int Q[], int M);

that, given a non-empty zero-indexed string S consisting of N characters and two non-empty zero-indexed arrays P and Q consisting of M integers, returns an array consisting of M integers specifying the consecutive answers to all queries.

The sequence should be returned as:

  • a Results structure (in C), or
  • a vector of integers (in C++), or
  • a Results record (in Pascal), or
  • an array of integers (in any other programming language).

For example, given the string S = CAGCCTA and arrays P, Q such that:

    P[0] = 2    Q[0] = 4
    P[1] = 5    Q[1] = 5
    P[2] = 0    Q[2] = 6

the function should return the values [2, 4, 1], as explained above.

Assume that:

  • N is an integer within the range [1..100,000];
  • M is an integer within the range [1..50,000];
  • each element of arrays P, Q is an integer within the range [0..N − 1];
  • P[K] ≤ Q[K], where 0 ≤ K < M;
  • string S consists only of upper-case English letters A, C, G, T.

Complexity:

  • expected worst-case time complexity is O(N+M);
  • expected worst-case space complexity is O(N), beyond input storage (not counting the storage required for input arguments).

Elements of input arrays can be modified.

Copyright 2009–2015 by Codility Limited. All Rights Reserved. Unauthorized copying, publication or disclosure prohibited.

2.题目分析

这个题目看上去蛮简单,就是在一个序列中的任意一段slice中,找出出现过的最小值。

不过要求时间复杂度为O(N),便费了一些周折。

1. 首先是将str转换为int 数组,为后面的处理加速,免去每次都要遍历。

先用strlen函数获得str的长度,再通过temp 指针遍历这个str,将A转为1,C转为2,G转为3,T转为4。

获得了这个数组之后就比较好操作了~

2. 获得任何段的最小值看似很简单,遍历就可以了,不过这样时间复杂度就会变为O(N*M)。

如何减少时间复杂度呢?答案是用空间换时间。

通过prefix的想法,我先统计出了在每一个位置之前包括这个位置,1所出现的次数,2所出现的次数,3所出现的次数,4所出现的次数。

分别存入数组A,B,C,D;

每次查询时,只要判断A[end of slice]-A[Begin of Slice]是否大于零,即可判断是否1出现过。这一步的时间复杂度就从O(N)降为了O(1);

注意,需要检查str[begin of slice]是否为1,因为当这个位置是1时,是不会表现出来的。

其余查询依然,通过增加空间复杂度,降低了时间的复杂度。

找出最小的出现值即可。

3.代码如下:

  1 // you can write to stdout for debugging purposes, e.g.
  2 // printf("this is a debug message\n");
  3
  4 struct Results solution(char *S, int P[], int Q[], int M) {
  5     struct Results result;
  6     // write your code in C99
  7     int len = strlen(S);
  8     // printf("len is %d",len);
  9
 10
 11     int arr[len];
 12     int i;
 13     char* temp = S;
 14     for(i=0;i<len;i++)
 15     {
 16         if(*temp==‘C‘)
 17         {
 18             arr[i]=2;
 19         }
 20         else if(*temp==‘A‘)
 21         {
 22             arr[i]=1;
 23         }
 24         else if(*temp==‘G‘)
 25         {
 26             arr[i]=3;
 27         }
 28         else if(*temp==‘T‘)
 29         {
 30             arr[i]=4;
 31         }
 32         temp++;
 33     }
 34
 35     int A[len];
 36     int B[len];
 37     int C[len];
 38     int D[len];
 39
 40         if(arr[0]==1)
 41         {
 42             A[0]=1;
 43             B[0]=0;
 44             C[0]=0;
 45             D[0]=0;
 46         }
 47
 48         if(arr[0]==2)
 49         {
 50             A[0]=0;
 51             B[0]=1;
 52             C[0]=0;
 53             D[0]=0;
 54         }
 55
 56
 57         if(arr[0]==3)
 58         {
 59             A[0]=0;
 60             B[0]=0;
 61             C[0]=1;
 62             D[0]=0;
 63         }
 64
 65         if(arr[0]==4)
 66         {
 67             A[0]=0;
 68             B[0]=0;
 69             C[0]=0;
 70             D[0]=1;
 71         }
 72
 73     // printf("%d %d %d %d \n",A[0],B[0],C[0],D[0]);
 74     for(i=1;i<len;i++)
 75     {
 76         // printf("%d\n",arr[i]);
 77         if(arr[i]==1)
 78         {
 79             A[i]=A[i-1]+1;
 80             B[i]=B[i-1];
 81             C[i]=C[i-1];
 82             D[i]=D[i-1];
 83         }
 84
 85         if(arr[i]==2)
 86         {
 87             A[i]=A[i-1];
 88             B[i]=B[i-1]+1;
 89             C[i]=C[i-1];
 90             D[i]=D[i-1];
 91         }
 92
 93
 94         if(arr[i]==3)
 95         {
 96             A[i]=A[i-1];
 97             B[i]=B[i-1];
 98             C[i]=C[i-1]+1;
 99             D[i]=D[i-1];
100         }
101
102         if(arr[i]==4)
103         {
104             A[i]=A[i-1];
105             B[i]=B[i-1];
106             C[i]=C[i-1];
107             D[i]=D[i-1]+1;
108         }
109
110         // printf("%d %d %d %d \n",A[i],B[i],C[i],D[i]);
111     }
112     result.A = malloc(sizeof(int)*M);
113     result.M = M;
114
115     for(i=0;i<M;i++)
116     {
117         int tempP = P[i];
118         int tempQ = Q[i];
119         if(arr[tempP]==1 || (A[tempQ]-A[tempP]) > 0)
120         {
121             result.A[i]=1;
122         }
123         else if(arr[tempP]==2 || (B[tempQ]-B[tempP]) > 0)
124         {
125             result.A[i]=2;
126         }
127         else if(arr[tempP]==3 || (C[tempQ]-C[tempP]) > 0)
128         {
129             result.A[i]=3;
130         }
131         else
132         {
133             result.A[i]=4;
134         }
135     }
136
137
138     return result;
139 }
时间: 2024-12-27 21:43:48

GenomicRangeQuery /codility/ preFix sums的相关文章

CodeForces 837F - Prefix Sums | Educational Codeforces Round 26

按tutorial打的我血崩,死活挂第四组- - 思路来自FXXL /* CodeForces 837F - Prefix Sums [ 二分,组合数 ] | Educational Codeforces Round 26 题意: 设定数组 y = f(x) 使得 y[i] = sum(x[j]) (0 <= j < i) 求初始数组 A0 经过多少次 f(x) 后 会有一个元素 大于 k 分析: 考虑 A0 = {1, 0, 0, 0} A1 = {1, 1, 1, 1} -> {C(

codeforces:Prefix Sums

题目大意: 给出一个函数P,P接受一个数组A作为参数,并返回一个新的数组B,且B.length = A.length + 1,B[i] = SUM(A[0], ..., A[i]).有一个无穷数组序列A[0], A[1], ... 满足A[i]=P(A[i-1]),其中i为任意自然数.对于输入k和A[0],求一个最小的下标t,使得A[t]中包含不小于k的数值. 其中A[0].length <= 2e5, k <= 1e18,且A[0]中至少有两个正整数. 数学向的题目.本来以为是个找规律的题目

[CF1204E]Natasha,Sasha and the Prefix Sums 题解

前言 本文中的排列指由n个1, m个-1构成的序列中的一种. 题目这么长不吐槽了,但是这确实是一道好题. 题解 DP题话不多说,直接状态/变量/转移. 状态 我们定义f表示"最大prefix sum"之和 变量 f[i][j]为有i个1,j个-1的"最大prefix sum"之和 转移 我们记C[i][j]为\(\left(\begin{matrix} i \\ j\end{matrix}\right)\),那么: \[f[i][j] = \left\{\begin

E. Natasha, Sasha and the Prefix Sums

给定n个 1 m个 -1的全排 求所有排列的$f(a)=max(0,max_{1≤i≤l}∑_{j=1}^{i}a_{j})$之和 组合数,枚举 #include <bits/stdc++.h> using namespace std; typedef long long ll; const ll MOD = 998244853; int n, m; ll C[4002][4002]; ll sum; ll realSum; ll ans; void init() { for(int i=0;

CF1024E Natasha, Sasha and the Prefix Sums——DP/数学(组合数)

题面 CF1024E 解析 题意就是要求所有由$n$个$1$.$m$个$-1$构成的序列的最大前缀和的和 算法一$(DP)$ $n$, $m$都小于等于$2000$, 显然可以$DP$ 设$dp[i][j]$表示由$i$个$1$, $j$个$-1$构成的序列的最大前缀和的和 $i$个$1$, $j$个$-1$构成的序列, 可以看做是在$i-1$个$1$, $j$个$-1$的序列的最前面加一个$1$得到,也可以看做是在$i$个$1$, $j-1$个$-1$的序列最前面加一个$-1$得到 这也就意味

CodeForces - 1204E Natasha, Sasha and the Prefix Sums (组合数学,卡特兰数扩展)

题意:求n个1,m个-1组成的所有序列中,最大前缀之和. 首先引出这样一个问题:使用n个左括号和m个右括号,组成的合法的括号匹配(每个右括号都有对应的左括号和它匹配)的数目是多少? 1.当n=m时,显然答案为卡特兰数$C_{2n}^{n}-C_{2n}^{n+1}$ 2.当n<m时,无论如何都不合法,答案为0 3.当n>m时,答案为$C_{n+m}^{n}-C_{n+m}^{n+1}$,这是一个推论,证明过程有点抽象,方法是把不合法的方案数等价于从(-2,0)移动到(n+m,n-m)的方案数,

CF1204E Natasha, Sasha and the Prefix Sums

题意 给\(n\)个1和\(m\)个0,定义一个01串的权值为它所有前缀和的最大值(包括0),求可以组成的所有不同串的权值和,答案对998244853取模 思路 由于数据较小,本题有个\(O(n^2)\)比较复杂的DP做法,自行百度... 实际上本题用数学规律可以\(O(n)\)做 设\(f_i\)表示权值为\(i\)的01串数量,直接求不容易,再设\(g_i\)为权值至少为\(i\)的01串数量,那么\(f_i=g_i-g_{i+1}\) 利用求卡特兰数列的一种方法:将01串看做从坐标系\((

CF1303G Sum of Prefix Sums

原题链接 这题好毒瘤啊! 首先看到是树上查询全局某式子的最大值,珂以想到这题无外乎是链分治(\(\text{dsu on tree}\))或点分治.作为一个专业的链分治选手, 窝在比赛中想了\(30\)min也不会.比赛后想了想,好像\(\text{dsu on tree}\)不星(珂能还是窝太菜了). 点分治.因为是全局最大值,所以珂以不用在分治里容斥. 假设当前的子树的根是\(\text{u}\). 由于不用容斥,所以要确保在分治里面路径不能有重复边,这个珂以枚举\(\text{u}\)的儿

Solution of Codility

Solution of Codility codility.com is another great place to improve your programming skill. Train myself , and record here. Lesson 1: Time Complexity Lesson 2: Counting Elements Lesson 3: Prefix Sums Lesson 4: Sorting MaxProductOfThree: * Distinct: *