Demo of Python "Map Reduce Filter"

Here I share a demo of functional programming in Python with map, reduce and filter, written by me (Xiaoqiang).

Assume there are two DB tables: `file_logs`, and `expanded_attrs`, which stores additional columns that extend `file_logs`. For this demonstration, assume there is more than one file log for the same (platform_id, client_id) tuple. We need to figure out which one was most recently updated for the tuple (platform_id=1, client_id=1).

Here is the approach:

1. Filter the original file logs down to those for the tuple (platform_id=1, client_id=1).

2. Merge the attributes from the expanded table into the filtered file logs in memory, much like a join in SQL.

3. Reduce the merged file logs to find the one that was updated most recently.
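Step 2 is the least obvious of the three, so here is a minimal sketch of it in isolation. The names `attr_rows`, `fold_attr` and `attrs_for` are hypothetical, and the sample rows are only shaped like the `expanded_attrs` table described above. The idea is to fold the key/value rows of one file log into a single dict with reduce.

```python
from functools import reduce  # available in Python 2.6+ and required in Python 3

# Hypothetical sample rows, shaped like the expanded_attrs table described above.
attr_rows = [
    {'file_log_id': '1', 'attr_name': 'CLICK', 'attr_value': '100'},
    {'file_log_id': '1', 'attr_name': 'last_updated', 'attr_value': '2014-07-14'},
    {'file_log_id': '2', 'attr_name': 'CLICK', 'attr_value': '200'},
]


def fold_attr(acc, row):
    """Return a copy of acc with one attr_name -> attr_value pair added."""
    merged = dict(acc)
    merged[row['attr_name']] = row['attr_value']
    return merged


def attrs_for(file_log_id, rows):
    """Collapse all attribute rows of one file log into a single dict."""
    matching = filter(lambda r: r['file_log_id'] == file_log_id, rows)
    return reduce(fold_attr, matching, {})


attrs_1 = attrs_for('1', attr_rows)
# attrs_1 == {'CLICK': '100', 'last_updated': '2014-07-14'}
```

Copying the accumulator in `fold_attr` keeps the fold free of side effects, at the cost of one small dict copy per row.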

The demo code is shown below (written for Python 2.6 or 2.7):

BTW, you are welcome to share a more efficient approach or any issues you find. Thanks. :)

#!/usr/bin/env python

"""
Requirement:
    platform_id=1 and client_id=1 are known (pid and cid for short).
    file_logs and expanded_attrs are lists of dicts; expanded_attrs holds extra columns that extend file_logs.
    file_logs contains more than one entry for pid=1, cid=1, so we need to find the one that was updated most recently.
"""

file_logs = [
    { 'file_log_id': '1', 'platform_id': '1', 'client_id': '1', 'file': 'path/to/platform/client/j-1/stdout' },
    { 'file_log_id': '2', 'platform_id': '1', 'client_id': '1', 'file': 'path/to/platform/client/j-2/stdout' },
    { 'file_log_id': '3', 'platform_id': '2', 'client_id': '3', 'file': 'path/to/platform/client/j-3/stdout' },
]

expanded_attrs = [
    { 'file_log_id': '1', 'attr_name': 'CLICK', 'attr_value': '100' },
    { 'file_log_id': '1', 'attr_name': 'SUPPRESSION', 'attr_value': '100' },
    { 'file_log_id': '1', 'attr_name': 'last_updated', 'attr_value': '2014-07-14' },
    { 'file_log_id': '2', 'attr_name': 'CLICK', 'attr_value': '200' },
    { 'file_log_id': '2', 'attr_name': 'SUPPRESSION', 'attr_value': '200' },
    { 'file_log_id': '2', 'attr_name': 'last_updated', 'attr_value': '2014-07-15' },
    { 'file_log_id': '3', 'attr_name': 'CLICK', 'attr_value': '300' },
    { 'file_log_id': '3', 'attr_name': 'SUPPRESSION', 'attr_value': '300' },
    { 'file_log_id': '3', 'attr_name': 'last_updated', 'attr_value': '2014-07-15' },
]

platform_id = '1'
client_id = '1'

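# Step 1: keep only the file logs for the requested (platform_id, client_id) tuple.
target_scope_filelogs = filter(lambda x: x['platform_id'] == platform_id and x['client_id'] == client_id, file_logs)

# Step 2: merge each log's attributes into the log itself (in place).
# The inner reduce folds the matching attribute rows into one dict;
# dict.update() returns None, so "is None and xx" hands the accumulator back.
# This relies on Python 2's eager map(); the resulting list of None values is discarded.
map(
    lambda x:
        x.update(reduce(
            lambda xx, xy: xx.update({ xy['attr_name']: xy['attr_value'] }) is None and xx,
            filter(lambda xx: xx['file_log_id'] == x['file_log_id'], expanded_attrs),
            dict()
        )),
    target_scope_filelogs
)

# Step 3: keep whichever log has the later last_updated value.
# ISO dates (YYYY-MM-DD) compare correctly as plain strings.
print reduce(lambda x, y: x if x['last_updated'] > y['last_updated'] else y, target_scope_filelogs)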
#> {'file_log_id': '2', 'platform_id': '1', 'last_updated': '2014-07-15', 'SUPPRESSION': '200', 'file': 'path/to/platform/client/j-2/stdout', 'client_id': '1', 'CLICK': '200'}
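
The script above targets Python 2. On Python 3, reduce lives in functools and map/filter return lazy iterators, so a map() call used only for its side effects would never run. Below is a rough Python 3 sketch of the same three steps. The function name `latest_file_log` and the helper `with_attrs` are my own, the sample data is trimmed for brevity, and returning None when nothing matches is an assumption rather than something the original script handles.

```python
from functools import reduce  # reduce is no longer a builtin in Python 3

# Trimmed copies of the sample data above (an assumption, for a self-contained run).
file_logs = [
    {'file_log_id': '1', 'platform_id': '1', 'client_id': '1'},
    {'file_log_id': '2', 'platform_id': '1', 'client_id': '1'},
    {'file_log_id': '3', 'platform_id': '2', 'client_id': '3'},
]
expanded_attrs = [
    {'file_log_id': '1', 'attr_name': 'last_updated', 'attr_value': '2014-07-14'},
    {'file_log_id': '2', 'attr_name': 'last_updated', 'attr_value': '2014-07-15'},
    {'file_log_id': '3', 'attr_name': 'last_updated', 'attr_value': '2014-07-15'},
]


def latest_file_log(logs, attrs, platform_id, client_id):
    """Return the most recently updated log for the tuple, or None if no log matches."""
    # Step 1: filter. list() forces the lazy Python 3 filter object.
    in_scope = list(filter(
        lambda log: log['platform_id'] == platform_id and log['client_id'] == client_id,
        logs))

    # Step 2: map. Build a new merged dict per log instead of mutating the input.
    def with_attrs(log):
        own = {a['attr_name']: a['attr_value']
               for a in attrs if a['file_log_id'] == log['file_log_id']}
        return dict(log, **own)
    merged = list(map(with_attrs, in_scope))

    # Step 3: reduce. ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    if not merged:
        return None
    return reduce(lambda x, y: x if x['last_updated'] > y['last_updated'] else y, merged)


latest = latest_file_log(file_logs, expanded_attrs, '1', '1')
print(latest['file_log_id'], latest['last_updated'])
```

Unlike the original, this variant leaves the input dicts untouched, which makes it safe to call repeatedly with different tuples.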

Time: 2024-12-16 07:09:50
