Using Celery

1. Task scheduling with Celery

# -*- coding: utf-8 -*-
import threading

from bs4 import BeautifulSoup
from tornado import httpclient
from celery import Celery
from tornado.httpclient import HTTPClient

broker = 'redis://localhost:6379'
backend = 'redis://localhost:6379'

app = Celery('tasks', broker=broker, backend=backend)

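# URLs already crawled. This dict lives in the memory of each worker process
# and is not shared between processes.
visited = {}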

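@app.task
def get_html(url):
    # Fetch the page synchronously; return its body as text, or None on an HTTP error.
    http_client = HTTPClient()
    try:
        response = http_client.fetch(url, follow_redirects=True)
        # Decode the raw bytes so callers receive a string they can print and parse.
        return response.body.decode('utf-8', errors='replace')
    except httpclient.HTTPError:
        return None
    finally:
        http_client.close()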

def start(url):
    threads = []
    for i in range(20):
        t = threading.Thread(target=schedule, args=(url,))
        t.daemon = True
        t.start()
        threads.append(t)

    for thread in threads:
        thread.join()

def process_html(url, html):
    print(url + ": " + html)
    _add_links_to_queue(url, html)

def schedule(url):
    print("before call _work " + url)
    # delay() only enqueues the task and returns immediately.
    _worker.delay(url)
    print("after call _work " + url)

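def _add_links_to_queue(url, html):
    # Name the parser explicitly so BeautifulSoup does not have to guess one.
    soup = BeautifulSoup(html, 'html.parser')
    links = soup.find_all('a')
    for link in links:
        # Skip anchors that have no href attribute.
        _url = link.get('href')
        if not _url:
            continue

        # Note: this only prefixes a scheme; relative links such as "/about"
        # are not resolved against the page URL.
        if not _url.startswith('http'):
            _url = 'http://' + _url
        print(url + "==>" + _url)
        schedule(_url)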

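@app.task
def _worker(url):
    print(str(threading.current_thread()) + " running " + url)
    # Handle each URL at most once per worker process.
    if url in visited:
        return
    visited[url] = True
    result = get_html.delay(url)
    try:
        # Waiting on a subtask from inside a task is discouraged; recent Celery
        # versions refuse to do it unless disable_sync_subtasks=False is passed.
        html = result.get(timeout=5, disable_sync_subtasks=False)
    except Exception as e:
        print(url)
        print(e)
        return
    # get_html returns None when the fetch failed.
    if html is not None:
        process_html(url, html)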

if __name__ == '__main__':
    start("http://www.hao123.com/")

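2. Designing load balancing with Celery

Celery provides send_task for dispatching a task by name, so for load balancing you can implement your own algorithm to decide how tasks are distributed. For reference: http://blog.csdn.net/vintage_1/article/details/47664187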
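As a rough illustration of the idea, here is a minimal sketch of one possible distribution algorithm: plain round-robin over a fixed list of queues. The RoundRobinDispatcher class, the queue names crawl_a and crawl_b, and the task name tasks.get_html (which assumes the code above is saved as tasks.py) are all hypothetical. A stand-in function replaces app.send_task so the sketch runs without a broker.

```python
import itertools


class RoundRobinDispatcher:
    """Hypothetical helper: picks the next queue in turn for every task."""

    def __init__(self, send, queues):
        # `send` is any callable shaped like Celery's app.send_task,
        # i.e. send(name, args=..., queue=...).
        self._send = send
        self._queues = itertools.cycle(queues)

    def dispatch(self, task_name, *args):
        # Choose the target queue, then hand the task over by name.
        queue = next(self._queues)
        return self._send(task_name, args=args, queue=queue)


# Stand-in for app.send_task; it only records what would have been sent.
sent = []


def fake_send(name, args=(), queue=None):
    sent.append((name, args, queue))
    return queue


dispatcher = RoundRobinDispatcher(fake_send, ["crawl_a", "crawl_b"])
for url in ["http://a.example/", "http://b.example/", "http://c.example/"]:
    dispatcher.dispatch("tasks.get_html", url)
```

With a real application you would presumably pass app.send_task in place of fake_send and start one worker per queue, for example with `celery -A tasks worker -Q crawl_a`. Round-robin is only the simplest choice; a weighted or queue-length-aware policy could replace the itertools.cycle call.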

Time: 2024-12-23 04:59:55
