Spark to GemFire Task (Part 1)

ADMG-40 2.2.2.4 - Global Tax Warehouse (GTW) CoA mapping -- Both BtB and Project 1

Revision 7/15: changed all latam references to mdg

Create the GFS account to GTW mapping using FAGL_011ZC and write it to /tax/gfs_gtw_mapping.

Then go through /btb_mdg/ska1 and check that every 10-digit GL account can be mapped; any account that cannot be mapped is written to the exception region /tax/master_data_exception.

Use BTB_MDG.SKA1 (if BTB_MDG is not ingested, use BTB_LATAM.SKA1):

For every record of SKA1 (select every record from SKA1 where KTOPL = 'JNJG' and XSPEB <> 'X') -- collection A

collectionA:

Path: /btb_mdg/ska1

Filter: KTOPL = 'JNJG' and XSPEB <> 'X'

Column: saknr

(taken from the table itself)
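The collection A filter can be sketched in plain Python as follows. This is only an illustration of the selection logic, not the actual Spark job: rows are modelled as dicts with lowercase keys, the sample rows are made up, and treating a missing or empty xspeb as "not X" is an assumption.

```python
def select_collection_a(ska1_rows):
    """Return saknr for every SKA1 row with ktopl = 'JNJG' and xspeb <> 'X'."""
    return [
        row["saknr"]
        for row in ska1_rows
        # Assumption: a missing or None xspeb counts as "not blocked".
        if row.get("ktopl") == "JNJG" and (row.get("xspeb") or "").strip() != "X"
    ]


# Hypothetical sample rows, for illustration only.
sample_ska1 = [
    {"saknr": "0010000001", "ktopl": "JNJG", "xspeb": ""},
    {"saknr": "0010000002", "ktopl": "JNJG", "xspeb": "X"},
    {"saknr": "0010000003", "ktopl": "OTHR", "xspeb": ""},
]
collection_a = select_collection_a(sample_ska1)
```

Only the first sample row survives the filter: the second is excluded by XSPEB and the third by KTOPL.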

Select records in BTB_MDG.FAGL_011ZC (where VERSN = 'GTW' AND KTOPL = 'JNJG') -- collection B

collectionB:

Path: /btb_mdg/fagl_011zc

Filter: trim(versn) = 'GTW' and ktopl = 'JNJG'

Columns: vonkt (range start) to biskt (range end); ergsl is also needed later as the GTW code

(taken from the table itself)
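A rough plain-Python equivalent of the collection B filter is shown below. The point of interest is the trim on versn, since the stored value may carry padding. The row layout (dicts with lowercase keys) and the sample values are assumptions made for illustration.

```python
def select_collection_b(fagl_011zc_rows):
    """Keep rows with trim(versn) = 'GTW' and ktopl = 'JNJG'."""
    return [
        {"vonkt": row["vonkt"], "biskt": row["biskt"], "ergsl": row["ergsl"]}
        for row in fagl_011zc_rows
        # strip() plays the role of trim(versn) in the filter above.
        if row["versn"].strip() == "GTW" and row["ktopl"] == "JNJG"
    ]


# Hypothetical sample rows; note the trailing space in the first versn.
sample_fagl_011zc = [
    {"versn": "GTW ", "ktopl": "JNJG",
     "vonkt": "0010000000", "biskt": "0019999999", "ergsl": "G100"},
    {"versn": "ABC", "ktopl": "JNJG",
     "vonkt": "0020000000", "biskt": "0029999999", "ergsl": "X200"},
]
collection_b = select_collection_b(sample_fagl_011zc)
```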

Match each A.SAKNR against [B.VONKT, B.BISKT], where B.VONKT is the starting number and B.BISKT is the ending number; both numbers are included in the range. When a match is found, B.ERGSL of the matching range is the GTW code for that account.

Validate whether each item in collectionA falls within a range of collectionB.

That is, determine whether A is contained in B.

4. Get the descriptions for gfsCode (A.SAKNR) and gtwCode (B.ERGSL):

gfsCode -- in /btb_mdg/skat, look up spras:E AND ktopl:JNJG AND saknr:%A.SAKNR%; call the result D. D.txt50 is the gfsCode description.

gtwCode -- in /btb_mdg/fagl_011qt, look up versn:GTW AND spras:E AND ergsl:%gtwCode%; call the result E. E.txt45 is the gtwCode description.

gfsCode:

Select txt50 as gfsCodeDescription from /btb_mdg/skat where spras = 'E' and ktopl = 'JNJG' and saknr in collectionA.saknr

(keyed on A's saknr)

gtwCode:

Select txt45 as gtwCodeDescription from /btb_mdg/fagl_011qt where trim(versn) = 'GTW' and spras = 'E' and ergsl in collectionB.ergsl

(keyed on B's ergsl)
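One way to picture the two description lookups is as a pair of dictionaries, one keyed by saknr and one by ergsl. The sketch below is plain Python over made-up rows; the function name and the dict-based row layout are assumptions, not part of the spec.

```python
def build_description_lookups(skat_rows, fagl_011qt_rows):
    """Return (saknr -> txt50, ergsl -> txt45) using the filters above."""
    gfs_desc = {
        r["saknr"]: r["txt50"]
        for r in skat_rows
        if r["spras"] == "E" and r["ktopl"] == "JNJG"
    }
    gtw_desc = {
        r["ergsl"]: r["txt45"]
        for r in fagl_011qt_rows
        if r["versn"].strip() == "GTW" and r["spras"] == "E"
    }
    return gfs_desc, gtw_desc


# Hypothetical sample rows, for illustration only.
skat = [
    {"spras": "E", "ktopl": "JNJG", "saknr": "0010000001", "txt50": "Cash"},
    {"spras": "D", "ktopl": "JNJG", "saknr": "0010000001", "txt50": "Kasse"},
]
fagl_011qt = [
    {"versn": "GTW ", "spras": "E", "ergsl": "G100", "txt45": "Cash and equivalents"},
]
gfs_desc, gtw_desc = build_description_lookups(skat, fagl_011qt)
```

Restricting to collectionA.saknr and collectionB.ergsl then amounts to looking up only those keys in the two dictionaries.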

Save the result into /tax/gfs_gtw_mapping: <gfs:A.SAKNR, gfs_description:D.txt50, gtw:B.ERGSL, gtw_description:E.txt45>

Insert into /tax/gfs_gtw_mapping(

gfs : collectionA.saknr,

gfs_description : txt50,

gtw : collectionB.ergsl,

gtw_description : txt45

)
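Assembling one mapping record could be sketched as below. The field names follow the record layout given above; leaving a description as None when no text is found is an assumption, since the spec does not say what to do in that case.

```python
def build_mapping_record(saknr, ergsl, gfs_desc, gtw_desc):
    """Build one /tax/gfs_gtw_mapping record as a plain dict."""
    return {
        "gfs": saknr,
        "gfs_description": gfs_desc.get(saknr),  # None if no txt50 was found
        "gtw": ergsl,
        "gtw_description": gtw_desc.get(ergsl),  # None if no txt45 was found
    }


# Hypothetical values, for illustration only.
record = build_mapping_record(
    "0010000001",
    "G100",
    {"0010000001": "Cash"},
    {"G100": "Cash and equivalents"},
)
```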

Validation:

3.1 If there is no match, write exceptionCode: INVALID_GTW_MAPPING, exceptionMessage: No Match on GTW Code for GFS code %A.SAKNR%

Items in collectionA that do not fall within any range of collectionB are written to the exception region:

exceptionCode: INVALID_GTW_MAPPING

exceptionMessage: No Match on GTW Code for GFS code %A.SAKNR%
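As a small illustration, the no-match exceptions can be derived from the outcome of the range match. The sketch assumes that outcome is available as a dict from saknr to the matched GTW code, with None meaning no range matched; that intermediate structure and the function name are hypothetical.

```python
def no_match_exceptions(matches):
    """Build INVALID_GTW_MAPPING entries for every saknr mapped to None."""
    return [
        {
            "exceptionCode": "INVALID_GTW_MAPPING",
            "exceptionMessage": f"No Match on GTW Code for GFS code {saknr}",
        }
        for saknr, gtw_code in matches.items()
        if gtw_code is None
    ]


# Hypothetical match results, for illustration only.
matches = {"0010000001": "G100", "0030000000": None}
exceptions = no_match_exceptions(matches)
```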

No two ranges [B.VONKT, B.BISKT] may overlap. To check this, sort the ranges by B.VONKT and B.BISKT and make sure B.VONKT(n+1) > B.BISKT(n) for every consecutive pair; both ends are inclusive, so equal values also count as an overlap. If an invalid range is found, write exceptionCode: INVALID_GFS_MAPPING, exceptionMessage: overlapping range is found, [B(n).valfrom, B(n).valto] and [B(n+1).valfrom, B(n+1).valto] (valfrom and valto presumably correspond to VONKT and BISKT).

None of the ranges in collectionB may overlap; any that do are written to the exception region:

exceptionCode: INVALID_GFS_MAPPING

exceptionMessage: overlapping range is found, [B(n).valfrom, B(n).valto] and [B(n+1).valfrom, B(n+1).valto]
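The overlap check can be sketched in plain Python by sorting the ranges and comparing each one with its successor. This assumes ranges are given as (vonkt, biskt) tuples of equal-length, zero-padded strings, so that tuple sorting matches numeric order; the function name is hypothetical.

```python
def find_overlaps(ranges):
    """Return one message per consecutive pair of sorted ranges that overlap."""
    ordered = sorted(ranges)  # sorts by vonkt, then biskt
    messages = []
    for (lo1, hi1), (lo2, hi2) in zip(ordered, ordered[1:]):
        # Both ends are inclusive, so lo2 == hi1 is already an overlap.
        if lo2 <= hi1:
            messages.append(
                f"overlapping range is found, [{lo1}, {hi1}] and [{lo2}, {hi2}]"
            )
    return messages


# Hypothetical ranges, for illustration only.
overlaps = find_overlaps([
    ("0000000300", "0000000400"),
    ("0000000100", "0000000199"),
    ("0000000150", "0000000250"),
])
```

Comparing only consecutive pairs is enough to detect whether any overlap exists, though it does not list every overlapping pair when one wide range covers several later ones.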

6. Saving exception:

Exception should be saved to /tax/master_data_exception

{

sourceSystem: btb_mdg,

exceptionCode:

exceptionMessage:

timestamp

}
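A record with this shape could be built as in the sketch below. The timestamp format is not specified above, so ISO-8601 in UTC is an assumption, as are the function name and the optional `now` parameter (included only to make the output reproducible).

```python
from datetime import datetime, timezone


def make_exception(code, message, source_system="btb_mdg", now=None):
    """Build one /tax/master_data_exception record as a plain dict."""
    ts = now if now is not None else datetime.now(timezone.utc)
    return {
        "sourceSystem": source_system,
        "exceptionCode": code,
        "exceptionMessage": message,
        "timestamp": ts.isoformat(),  # assumption: ISO-8601 string in UTC
    }


# Hypothetical usage with a fixed time, for illustration only.
exc = make_exception(
    "INVALID_GTW_MAPPING",
    "No Match on GTW Code for GFS code 0030000000",
    now=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```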

Time: 2024-10-12 12:27:39
