Statistics on Alzheimer disease papers

1. Fetch all keywords for 2016 and save them to keyword_2016.json

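import json

import pymysql

conn = pymysql.connect(
    host='localhost',
    port=3306,
    user='root',
    passwd='',
    db='python',
)
cursor = conn.cursor()

# Select the keyword string and PMC id of every 2016 paper that has keywords.
sql = "SELECT union_kwd_str, pmc_id FROM alzheimer WHERE pub_year = '2016' AND union_kwd_str != ''"
a = cursor.execute(sql)  # execute() returns the number of matching rows
print(a)
b = cursor.fetchall()  # one (union_kwd_str, pmc_id) tuple per paper; 7887 rows here

abstract_list = []  # keyword strings, one per paper (name kept from the original)
pmc_id_dict = {}    # row index -> pmc_id

for j in range(a):
    abstract_list.append(b[j][0])
    pmc_id_dict[j] = b[j][1]

cursor.close()
conn.close()

def output_to_json(data, filename):
    """Serialize data to filename as JSON and return the JSON string."""
    text = json.dumps(data)
    with open(filename, 'w') as file:  # the with block closes the file
        file.write(text)
    return text

output_data = {
    'pub_year': "2016",
    'count': a,
    'keyword': abstract_list,
}
output_to_json(output_data, 'keyword_2016.json')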
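The query above hard-codes the year inside the SQL string. A minimal sketch of the same step with the year passed as a query parameter follows. The names KEYWORD_SQL and fetch_keyword_payload are hypothetical and not part of the original script; the function only assumes a DB-API cursor such as the one returned by conn.cursor(), and "%s" is the placeholder style pymysql uses.

```python
# Hypothetical constant: the same query as above, with the year as a parameter.
KEYWORD_SQL = (
    "SELECT union_kwd_str, pmc_id FROM alzheimer "
    "WHERE pub_year = %s AND union_kwd_str != ''"
)


def fetch_keyword_payload(cursor, year):
    """Run the keyword query for one year and build the dict to be dumped.

    `cursor` is assumed to be a DB-API cursor (for example from pymysql).
    """
    cursor.execute(KEYWORD_SQL, (year,))  # the driver quotes the value
    rows = cursor.fetchall()              # (union_kwd_str, pmc_id) tuples
    return {
        "pub_year": year,
        "count": len(rows),
        "keyword": [row[0] for row in rows],
    }
```

With the connection from the script above, fetch_keyword_payload(cursor, "2016") should produce the same structure that is written to keyword_2016.json, and other years can be exported without editing the SQL text.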

Read the keywords from keyword_2016.json, count them, and pick the 25 most frequent

import re
import collections
import json

def input_from_json(filename):
    """Load and return the JSON content of filename."""
    with open(filename, 'r') as file:
        return json.load(file)

def count_word(path):
    """Count how often each comma-separated keyword occurs."""
    result = {}
    keyword_list = input_from_json(path)['keyword']
    for all_the_text in keyword_list:
        for word in all_the_text.split(','):
            if word not in result:
                result[word] = 0
            result[word] += 1
    return result

def sort_by_count(d):
    """Return the counts as an OrderedDict sorted by descending count."""
    return collections.OrderedDict(sorted(d.items(), key=lambda t: -t[1]))

if __name__ == '__main__':
    file_name = "keyword_2016.json"

    dword = count_word(file_name)
    dword = sort_by_count(dword)

    jsonlist = []
    for num, (key, value) in enumerate(dword.items(), start=1):
        if num > 25:  # keep only the top 25 keywords
            break
        key = re.sub("_", " ", key)  # replace underscores with spaces
        jsonlist.append({'name': key, 'value': value})

    # Write the top 25 as one JSON array so that the file is valid JSON.
    with open('sort_keyword_2016.json', 'w') as fobj2:
        json.dump(jsonlist, fobj2)
  

2. Find the ten countries that published the most papers

1) Save the first-author information of every paper to authorinfor.json

import json

import pymysql

conn = pymysql.connect(
    host='localhost',
    port=3306,
    user='root',
    passwd='',
    db='python',
)
cursor = conn.cursor()

# Select the author string and PMC id of every paper that has author information.
sql = "SELECT authorinfor, pmc_id FROM alzheimer WHERE authorinfor != ''"
a = cursor.execute(sql)  # execute() returns the number of matching rows
print(a)
b = cursor.fetchall()  # one (authorinfor, pmc_id) tuple per paper

authorinfor_list = []  # author strings, one per paper
pmc_id_dict = {}       # row index -> pmc_id

for j in range(a):
    authorinfor_list.append(b[j][0])
    pmc_id_dict[j] = b[j][1]

cursor.close()
conn.close()

def output_to_json(data, filename):
    """Serialize data to filename as JSON and return the JSON string."""
    text = json.dumps(data)
    with open(filename, 'w') as file:  # the with block closes the file
        file.write(text)
    return text

output_data = {
    'pub_year': "2016",
    'count': a,
    'authorinfor': authorinfor_list,
    'pmc_id': pmc_id_dict,
}
output_to_json(output_data, 'authorinfor.json')

2) Pick the top ten countries

import re
import collections
import json

def input_from_json(filename):
    """Load and return the JSON content of filename."""
    with open(filename, 'r') as file:
        return json.load(file)

def count_word(path):
    """Count papers per country, reading the country from the end of each author string."""
    result = {}
    authorinfor_list = input_from_json(path)['authorinfor']
    for all_the_text in authorinfor_list:
        country = all_the_text.split(',')[-1]  # last comma-separated field
        country = re.sub(r"\.", "", country)   # drop periods
        country = re.sub(r"\n", "", country)   # drop newlines
        country = country.strip()              # trim surrounding whitespace

        if country not in result:
            result[country] = 0
        result[country] += 1
    return result

def sort_by_count(d):
    """Return the counts as an OrderedDict sorted by descending count."""
    return collections.OrderedDict(sorted(d.items(), key=lambda t: -t[1]))

if __name__ == '__main__':
    file_name = "authorinfor.json"

    dword = count_word(file_name)
    dword = sort_by_count(dword)

    # Write the 49 most frequent countries, one JSON object per line.
    with open('sort_country.json', 'w') as fobj2:
        for num, (country, value) in enumerate(dword.items(), start=1):
            if num >= 50:
                break
            fobj2.write(json.dumps({'name': country, 'value': value}))
            fobj2.write('\n')

    # dict views are not sliceable in Python 3, so convert them to lists first.
    countrylist = list(dword.keys())
    valuelist = list(dword.values())

    # Show the leading countries and their paper counts.
    print(countrylist[:11])
    print(valuelist[:11])
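The country extraction can be isolated into a small function that is easy to check on its own. The sketch below uses the hypothetical names extract_country and top_countries. It assumes, as the script above does, that the country is the last comma-separated field of the author string. Trimming whitespace and discarding empty results are additional assumptions.

```python
import re
from collections import Counter


def extract_country(author_info):
    """Return the last comma-separated field without periods or newlines."""
    last_field = author_info.split(",")[-1]
    return re.sub(r"[.\n]", "", last_field).strip()


def top_countries(author_infos, n=10):
    """Return the n most frequent countries as (country, count) pairs."""
    counts = Counter(extract_country(info) for info in author_infos)
    counts.pop("", None)  # assumption: ignore strings that yield no country
    return counts.most_common(n)
```

Affiliation strings that end in something other than a country (an e-mail address, for example) would still be counted under that text, so the result is worth inspecting by eye.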

Time: 2024-08-06 22:47:15
