Spark to GemFire Task (Part 2)

ADMG-2.2.1.3 - BRAVO CoA Mapping - TB

Revision 7/11: If the Bravo code is not numeric, look one level up in FAGL_011PC to find the 6-digit Bravo code (see the blue-colored text).

Create the GFS account to Bravo mapping using FAGL_011ZC and write it to /tax/gfs_bravo_mapping. Then go through /btb_latam/ska1 and check that every 10-digit GL account can be mapped; any account that cannot be mapped is written to the exception region /tax/master_data_exception.

Use BTB_MDG.SKA1 (if BTB_MDG is not ingested, use BTB_LATAM.SKA1):

1. For every record of SKA1 (select every record from SKA1 where KTOPL = 'JNJG' and XSPEB <> 'X'): collection A.

2. Select records in BTB_LATAM.FAGL_011ZC (where VERSN = 'JNJG' AND KTOPL = 'JNJG'): collection B.

3. Match each A.SAKNR against [B.VONKT, B.BISKT]. B.VONKT is the starting number and B.BISKT is the ending number; both are included in the range. When a match is found: if B.ERGSL is a 6-digit number, let it be bravoCode and go to 5; if B.ERGSL is not numeric, go to 4.

4. Look into /btb_mdg/fagl_011pc for ergsl:%B.ERGSL% AND versn:JNJG; once found, let it be C. C.parent is an id; use it to look up /btb_mdg/fagl_011pc with id:%C.parent% AND versn:JNJG; once found, let it be D. If D.ergsl is a 6-digit number, use it as bravoCode and go to 5.

5. Get the descriptions for gfsCode (A.SAKNR) and bravoCode (%bravoCode%):

gfsCode: /btb_mdg/skat with spras:E AND ktopl:JNJG AND saknr:%A.SAKNR%; let it be D (from here on D refers to this SKAT record, not the FAGL_011PC parent from step 4). D.txt50 is the gfsCode description.

bravoCode: /btb_mdg/fagl_011qt with versn:JNJG AND spras:E AND ergsl:%bravoCode%; let it be E. E.txt45 is the bravoCode description.

Save the result into /tax/gfs_bravo_mapping: <gfs:A.SAKNR, gfs_description:D.txt50, bravo:%bravoCode%, bravo_description:E.txt45>
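Steps 3 and 4 can be sketched as follows. This is only an illustration under assumptions: `find_by_ergsl` and `find_by_id` are hypothetical callables standing in for the two /btb_mdg/fagl_011pc queries (already filtered on versn:JNJG), records are plain dicts, and returning None when no 6-digit code is found is a choice made here, since the spec does not say what should happen in that case.

```python
import re
from typing import Callable, Optional

# A bravo code is taken to be exactly six ASCII digits.
SIX_DIGITS = re.compile(r"[0-9]{6}")


def is_bravo_code(value: Optional[str]) -> bool:
    """Return True when value is exactly six digits."""
    return value is not None and SIX_DIGITS.fullmatch(value) is not None


def resolve_bravo_code(
    ergsl: str,
    find_by_ergsl: Callable[[str], Optional[dict]],  # hypothetical: fagl_011pc by ergsl
    find_by_id: Callable[[str], Optional[dict]],     # hypothetical: fagl_011pc by id
) -> Optional[str]:
    """Step 3: use B.ERGSL directly if numeric; step 4: otherwise go one level up."""
    if is_bravo_code(ergsl):
        return ergsl
    c = find_by_ergsl(ergsl)            # record C
    if c is None or c.get("parent") is None:
        return None
    d = find_by_id(c["parent"])         # record D, the parent of C
    if d is not None and is_bravo_code(d.get("ergsl")):
        return d["ergsl"]
    return None                         # assumption: caller reports an exception
```

With in-memory dicts, the two lookups can be passed as `some_dict.get`; against the real regions they would wrap whatever query mechanism the job uses.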

Validation:

6.1 If there is no match, write exceptionCode: INVALID_BRAVO_MAPPING, exceptionMessage: No Match on Bravo for GFS code %A.SAKNR%

6.2 The [B.VONKT, B.BISKT] ranges must not overlap. Check this by sorting on B.VONKT and B.BISKT and making sure B.VONKT(n+1) > B.BISKT(n). If an invalid range is found, write exceptionCode: INVALID_GFS_MAPPING, exceptionMessage: overlapping range is found, [B(n).valfrom, B(n).valto] and [B(n+1).valfrom, B(n+1).valto]
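The overlap check can be sketched like this. It assumes the account numbers have already been parsed to integers and that each range is a (VONKT, BISKT) tuple; the function name `find_overlaps` is made up for illustration. Because both bounds are inclusive, a range that starts exactly where the previous one ends is also reported.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (VONKT, BISKT), both inclusive


def find_overlaps(ranges: List[Range]) -> List[Tuple[Range, Range]]:
    """Return pairs of ranges that overlap, after sorting by start then end."""
    ordered = sorted(ranges)
    overlaps: List[Tuple[Range, Range]] = []
    if not ordered:
        return overlaps
    # Track the range reaching furthest right so nested ranges are caught too.
    widest = ordered[0]
    for current in ordered[1:]:
        if current[0] <= widest[1]:
            overlaps.append((widest, current))
        if current[1] > widest[1]:
            widest = current
    return overlaps
```

Each returned pair could then be turned into one INVALID_GFS_MAPPING exception.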

Saving exception:

Exceptions should be saved to /tax/master_data_exception:

{
    sourceSystem: btb_latam,
    exceptionCode:
    exceptionMessage:
    timestamp
}
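A minimal sketch of building such a record before writing it to the region. The function name `build_exception` is illustrative, and the timestamp format (ISO-8601 in UTC) is an assumption, since the spec does not specify one.

```python
from datetime import datetime, timezone


def build_exception(code: str, message: str, source_system: str = "btb_latam") -> dict:
    """Build one record for /tax/master_data_exception."""
    return {
        "sourceSystem": source_system,
        "exceptionCode": code,
        "exceptionMessage": message,
        # Assumption: ISO-8601 UTC string; the spec leaves the format open.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


record = build_exception(
    "INVALID_BRAVO_MAPPING",
    "No Match on Bravo for GFS code ABCD",
)
```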

Example:

A.SAKNR = 111110000 matches B.VONKT = 111110000, B.BISKT = 111110999, because A.SAKNR >= B.VONKT and A.SAKNR <= B.BISKT; add <111110000, 111110> to the in-memory map C.

If A.SAKNR = ABCD, there is a NoMatch exception.

If [B.VONKT, B.BISKT] = [111110000, 111110999] and [111110993, 11112000], there is an Overlap Range exception.

Helper:

A possible algorithm: sort column E (to) and use the search described in this post:

https://stackoverflow.com/questions/19198586/search-sorted-listlong-for-closest-and-less-than to find the entry in column E and return column B.

Business scenario:

There are many similar cases of matching a range to a code, so it would be better to make the solution generic rather than specific to collection A and collection B.
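One way to make the range matching generic is sketched below. It is a variant of the helper idea: it sorts by the start of each range and does a "closest and less than or equal" binary search with Python's `bisect` module, then checks the inclusive end. The class name `RangeLookup` is hypothetical; the sketch assumes integer keys and non-overlapping ranges (validated separately), and non-numeric keys such as ABCD would have to be rejected by the caller before the lookup.

```python
from bisect import bisect_right
from typing import Generic, Optional, Sequence, Tuple, TypeVar

V = TypeVar("V")


class RangeLookup(Generic[V]):
    """Map an integer key to the value of the inclusive range containing it."""

    def __init__(self, entries: Sequence[Tuple[int, int, V]]):
        # Each entry is (start, end, value); ranges are assumed not to overlap.
        ordered = sorted(entries, key=lambda e: e[0])
        self._starts = [e[0] for e in ordered]
        self._ends = [e[1] for e in ordered]
        self._values = [e[2] for e in ordered]

    def find(self, key: int) -> Optional[V]:
        # Index of the last range whose start is <= key, or -1 if none.
        i = bisect_right(self._starts, key) - 1
        if i >= 0 and key <= self._ends[i]:
            return self._values[i]
        return None  # no range contains the key


lookup = RangeLookup([(111110000, 111110999, "111110")])
```

Here `lookup.find(111110000)` returns "111110", matching the example above, while a key outside every range returns None and can be reported as a no-match exception. Nothing in the class depends on SKA1 or FAGL_011ZC, so the same lookup can serve other range-to-code mappings.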

Time: 2024-10-15 14:32:07
