POJ #1080 - Human Gene Functions

A classic 2D DP problem. It is LCS in disguise, and not very hard to decode: it is about aligning two sequences, except that every matched pair (including a character matched against a gap) carries a weight.
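To make the "weight per match" idea concrete, here is a minimal sketch that scores one fixed, already-gapped alignment column by column with the score table from the problem. The names `gene_index` and `alignment_score` are made up for illustration and the input is assumed to be valid; the DP later simply searches for the gapped alignment that maximizes this sum.

```c
#include <string.h>

/* Score table from the problem statement; rows and columns are A, C, G, T, '-'. */
static const int kScore[5][5] = {
    { 5, -1, -2, -1, -3 },
    {-1,  5, -3, -2, -4 },
    {-2, -3,  5, -2, -2 },
    {-1, -2, -2,  5, -1 },
    {-3, -4, -2, -1,  0 }
};

/* Hypothetical helper: map a symbol to its row/column in kScore.
 * Assumption: anything that is not A/C/G/T is treated as a gap. */
static int gene_index(char c)
{
    switch (c)
    {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return 4;
    }
}

/* Hypothetical helper: sum the per-column scores of two gapped rows.
 * Assumption: both rows have the same length; extra symbols are ignored. */
int alignment_score(const char *row0, const char *row1)
{
    int total = 0;
    size_t n = strlen(row0);
    for (size_t k = 0; k < n && row1[k] != '\0'; k++)
    {
        total += kScore[gene_index(row0[k])][gene_index(row1[k])];
    }
    return total;
}
```

For the two alignments discussed in the problem statement, `alignment_score("AGTGAT-G", "-GT--TAG")` gives 9 and `alignment_score("AGTGATG", "-GTTA-G")` gives 14.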

The point of this problem is learning how to decode the problem statement and distill the actual model behind it. The coding is not very hard, but about 80% of my debugging time went to one stupid detail: in the two-level for loop the indices start from 1, so the character index into each string must be i - 1 and j - 1.
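One way to sidestep that off-by-one, sketched under my own naming (`sym_at` and `similarity` are hypothetical, not taken from the referenced solution): keep the DP loops 1-based, but route every string access through a single helper that applies the `- 1` in exactly one place.

```c
#include <string.h>

#define MAX_LEN 100
#define GAP 4   /* row/column of '-' in the score table */

/* Score table from the problem statement; rows and columns are A, C, G, T, '-'. */
static const int kScore[5][5] = {
    { 5, -1, -2, -1, -3 },
    {-1,  5, -3, -2, -4 },
    {-2, -3,  5, -2, -2 },
    {-1, -2, -2,  5, -1 },
    {-3, -4, -2, -1,  0 }
};

/* Assumption: input contains only A/C/G/T; anything else maps to the gap. */
static int sym_index(char c)
{
    switch (c)
    {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return GAP;
    }
}

/* 1-based accessor: the i-th symbol of s (i >= 1) lives at s[i - 1]. */
static int sym_at(const char *s, int i)
{
    return sym_index(s[i - 1]);
}

static int max3(int a, int b, int c)
{
    int m = (a > b) ? a : b;
    return (m > c) ? m : c;
}

/* Best alignment score of a and b; each is assumed to hold at most MAX_LEN symbols. */
int similarity(const char *a, const char *b)
{
    static int dp[MAX_LEN + 1][MAX_LEN + 1];
    int n = (int)strlen(a);
    int m = (int)strlen(b);

    /* A prefix aligned against the empty string is all gaps. */
    dp[0][0] = 0;
    for (int i = 1; i <= n; i++)
        dp[i][0] = dp[i - 1][0] + kScore[sym_at(a, i)][GAP];
    for (int j = 1; j <= m; j++)
        dp[0][j] = dp[0][j - 1] + kScore[GAP][sym_at(b, j)];

    for (int i = 1; i <= n; i++)
    for (int j = 1; j <= m; j++)
    {
        dp[i][j] = max3(
            dp[i - 1][j - 1] + kScore[sym_at(a, i)][sym_at(b, j)], /* a[i] vs b[j] */
            dp[i][j - 1]     + kScore[GAP][sym_at(b, j)],          /* gap  vs b[j] */
            dp[i - 1][j]     + kScore[sym_at(a, i)][GAP]);         /* a[i] vs gap  */
    }
    return dp[n][m];
}
```

With this layout, `similarity("AGTGATG", "GTTAG")` returns 14, the similarity the problem statement gives for that pair.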

Main reference: http://blog.csdn.net/xiaoxiaoluo/article/details/7366537

The AC code:

//    1080
//    http://blog.csdn.net/xiaoxiaoluo/article/details/7366537
/*
 *    Q1: What is the target value? The similarity value.
 *    Q2: What are the variables? The indices into the two strings.
 *    So, dp[i][j] = val,
 *    where i counts the characters taken from the 1st string, j those from the 2nd,
 *    and val is the best similarity of those two prefixes.
 *
 *    The key is the recurrence relations (1-based: s0[i] is in0[i-1] in the code):
 *    Eq1: s0[i] is matched with s1[j]
 *         dp[i][j] = dp[i-1][j-1] + score[s0[i]][s1[j]]
 *    Eq2: '-' is placed in s0, opposite s1[j]
 *         dp[i][j] = dp[i][j-1] + score['-'][s1[j]]
 *    Eq3: s0[i] is placed opposite '-' in s1
 *         dp[i][j] = dp[i-1][j] + score[s0[i]]['-']
 *
 *    dp[i][j] is the maximum of the three. These equations mirror the LCS ones;
 *    '-' is inserted artificially so that the two strings line up.
 */
#include <stdio.h>

#define MAX_LEN 100

int score[5][5] = {
    { 5, -1, -2, -1, -3 },
    {-1,  5, -3, -2, -4 },
    {-2, -3,  5, -2, -2 },
    {-1, -2, -2,  5, -1 },
    {-3, -4, -2, -1,  0 }
};

int Inx(char c)
{
    switch (c)
    {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    case '-': return 4;
    }
    return 4; // not reached for valid input; avoids falling off the end of a non-void function
}

int max2(int a, int b)
{
    return (a > b) ? (a) : (b);
}

int calc(int len0, char in0[MAX_LEN], int len1, char in1[MAX_LEN])
{
    int dp[MAX_LEN + 1][MAX_LEN + 1];

    //    Init: a prefix aligned against the empty string is all gaps
    dp[0][0] = 0;
    for (int i = 1; i <= len0; i ++)
    {
        dp[i][0] = dp[i - 1][0] + score[Inx(in0[i-1])][Inx('-')]; // in0[i-1] against a gap
    }
    for (int j = 1; j <= len1; j++)
    {
        dp[0][j] = dp[0][j - 1] + score[Inx('-')][Inx(in1[j-1])]; // gap against in1[j-1]
    }

    //    Go
    for (int i = 1; i <= len0; i ++)
    for (int j = 1; j <= len1; j ++)
    {
        int val0 = dp[i - 1][j - 1] + score[Inx(in0[i-1])][Inx(in1[j-1])];
        int val1 = dp[i][j - 1] + score[Inx('-')][Inx(in1[j-1])];
        int val2 = dp[i - 1][j] + score[Inx(in0[i-1])][Inx('-')];
        dp[i][j] = max2(val0, max2(val1, val2));
    }

    return dp[len0][len1];
}
int main()
{
    int n; scanf("%d", &n);
    while (n--)
    {
        int len[2] = { 0 };
        char in0[MAX_LEN + 1] = { 0 };    // +1 for the terminating '\0' when the length is MAX_LEN
        char in1[MAX_LEN + 1] = { 0 };

        scanf("%d", len);        scanf("%s", in0);
        scanf("%d", len + 1);    scanf("%s", in1);

        int ret = calc(len[0], in0, len[1], in1);
        printf("%d\n", ret);
    }
    return 0;
}

Time: 2024-12-09 08:45:07
