Statistics on Alzheimer disease papers

1. Get all keywords for 2016 and save them to keyword_2016.json

import json
import pymysql

# Connect to the local MySQL database that holds the paper records.
conn = pymysql.connect(
    host='localhost',
    port=3306,
    user='root',
    passwd='',
    db='python',
)
cursor = conn.cursor()

sql = "SELECT union_kwd_str, pmc_id FROM alzheimer WHERE pub_year = '2016' AND union_kwd_str != ''"
a = cursor.execute(sql)  # execute() returns the number of matching rows
print(a)
b = cursor.fetchmany(a)  # b holds 7887 rows

abstract_list = []  # keyword string of each paper
pmc_id_dict = {}    # row index -> pmc_id

for j in range(a):
    abstract_list.append(b[j][0])
    pmc_id_dict[j] = b[j][1]

def output_to_json(data, filename):
    """Serialize data to filename as JSON and return the JSON string."""
    text = json.dumps(data)
    with open(filename, 'w') as f:
        f.write(text)
    return text

output_data = {
    'pub_year': "2016",
    'count': a,
    'keyword': abstract_list
}
output_to_json(output_data, 'keyword_2016.json')

cursor.close()
conn.close()
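
The query above hard-codes the year in the SQL string. A minimal sketch of the same fetch with a bound parameter might look like the following. It uses the standard-library sqlite3 module and an in-memory table purely as a stand-in for the MySQL database, so that it runs without a server. The fetch_keywords helper and the sample rows are invented for illustration. With pymysql the placeholder would be %s instead of ?.

```python
import sqlite3

def fetch_keywords(conn, year):
    """Return (keyword_strings, pmc_ids) for one publication year."""
    cursor = conn.cursor()
    # The year is passed as a bound parameter instead of being pasted into the SQL.
    cursor.execute(
        "SELECT union_kwd_str, pmc_id FROM alzheimer "
        "WHERE pub_year = ? AND union_kwd_str != ''",
        (year,),
    )
    rows = cursor.fetchall()
    keywords = [row[0] for row in rows]
    pmc_ids = [row[1] for row in rows]
    return keywords, pmc_ids

# In-memory stand-in for the real table; these rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE alzheimer (pmc_id TEXT, pub_year TEXT, union_kwd_str TEXT)"
)
conn.executemany(
    "INSERT INTO alzheimer VALUES (?, ?, ?)",
    [
        ("PMC1", "2016", "amyloid,tau"),
        ("PMC2", "2016", ""),          # skipped: empty keyword string
        ("PMC3", "2015", "dementia"),  # skipped: different year
    ],
)
keywords, pmc_ids = fetch_keywords(conn, "2016")
print(keywords, pmc_ids)
conn.close()
```

Using fetchall() also removes the need to pass the row count back into fetchmany().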

Read the keywords from keyword_2016.json, count them, and select the top 25 keywords

import collections
import json

def input_from_json(filename):
    """Load and return the JSON content of filename."""
    with open(filename, 'r') as f:
        return json.loads(f.read())

def count_word(path):
    """Count how often each keyword occurs across all papers."""
    result = {}
    keyword_list = input_from_json(path)['keyword']
    for all_the_text in keyword_list:
        # Each entry is a comma-separated keyword string.
        for word in all_the_text.split(','):
            if word not in result:
                result[word] = 0
            result[word] += 1
    return result

def sort_by_count(d):
    """Return an OrderedDict sorted by count, highest first."""
    return collections.OrderedDict(sorted(d.items(), key=lambda t: -t[1]))

if __name__ == '__main__':
    file_name = "keyword_2016.json"

    dword = count_word(file_name)
    dword = sort_by_count(dword)

    # Write the 25 most frequent keywords as comma-separated JSON objects
    # (no enclosing brackets).
    with open('sort_keyword_2016.json', 'w') as fobj2:
        for num, (key, value) in enumerate(dword.items(), start=1):
            key = key.replace("_", " ")  # underscores stand for spaces
            data = {
                'name': key,
                'value': value
            }
            fobj2.write(json.dumps(data))
            if num == 25:
                break
            fobj2.write(',')
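
The counting and top-25 selection above can also be sketched with collections.Counter, which provides most_common() directly. This is an alternative sketch, not the script used for the results: the top_keywords helper and the sample keyword strings are hypothetical, and unlike the script above it strips surrounding whitespace, skips empty keywords, and returns a list that serializes to a valid JSON array.

```python
import json
from collections import Counter

def top_keywords(keyword_strings, n=25):
    """Return the n most frequent keywords as [{'name': ..., 'value': ...}]."""
    counts = Counter()
    for text in keyword_strings:
        for word in text.split(','):
            # Normalize: trim whitespace and turn underscores into spaces.
            word = word.strip().replace('_', ' ')
            if word:
                counts[word] += 1
    return [{'name': name, 'value': value}
            for name, value in counts.most_common(n)]

# Made-up sample input in the same comma-separated format.
sample = ["amyloid_beta,tau", "tau,dementia", "tau, amyloid_beta"]
top = top_keywords(sample, n=2)
print(json.dumps(top))
```

Because the result is an ordinary list, json.dumps(top) can be written to a file in one call and loaded back with json.loads.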

2. Get the top ten countries by number of published papers

1) Save the information of all first authors to authorinfor.json

import json
import pymysql

# Connect to the local MySQL database that holds the paper records.
conn = pymysql.connect(
    host='localhost',
    port=3306,
    user='root',
    passwd='',
    db='python',
)
cursor = conn.cursor()

sql = "SELECT authorinfor, pmc_id FROM alzheimer WHERE authorinfor != ''"
a = cursor.execute(sql)  # execute() returns the number of matching rows
print(a)
b = cursor.fetchmany(a)  # b holds 7887 rows

authorinfor_list = []  # first-author information of each paper
pmc_id_dict = {}       # row index -> pmc_id

for j in range(a):
    authorinfor_list.append(b[j][0])
    pmc_id_dict[j] = b[j][1]

def output_to_json(data, filename):
    """Serialize data to filename as JSON and return the JSON string."""
    text = json.dumps(data)
    with open(filename, 'w') as f:
        f.write(text)
    return text

output_data = {
    'pub_year': "2016",
    'count': a,
    'authorinfor': authorinfor_list,
    'pmc_id': pmc_id_dict
}
output_to_json(output_data, 'authorinfor.json')

cursor.close()
conn.close()

2) Select the top ten countries

import collections
import json

def input_from_json(filename):
    """Load and return the JSON content of filename."""
    with open(filename, 'r') as f:
        return json.loads(f.read())

def count_word(path):
    """Count papers per country, taken from the author information."""
    result = {}
    authorinfor_list = input_from_json(path)['authorinfor']
    for all_the_text in authorinfor_list:
        # The country is taken to be the last comma-separated field.
        country = all_the_text.split(',')[-1]
        country = country.replace(".", "")
        country = country.replace("\n", "")
        # Trim whitespace so that " USA" and "USA" count as the same key.
        country = country.strip()

        if country not in result:
            result[country] = 0
        result[country] += 1
    return result

def sort_by_count(d):
    """Return an OrderedDict sorted by count, highest first."""
    return collections.OrderedDict(sorted(d.items(), key=lambda t: -t[1]))

if __name__ == '__main__':
    file_name = "authorinfor.json"

    dword = count_word(file_name)
    dword = sort_by_count(dword)

    # Write the first 49 countries, one JSON object per line.
    with open('sort_country.json', 'w') as fobj2:
        for num, (country, value) in enumerate(dword.items(), start=1):
            if num >= 50:
                break
            data = {
                'name': country,
                'value': value
            }
            fobj2.write(json.dumps(data))
            fobj2.write('\n')

    countrylist = list(dword.keys())
    valuelist = list(dword.values())

    # Show the 11 most frequent countries and their paper counts.
    print(countrylist[:11])
    print(valuelist[:11])
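
The country extraction above relies on the last comma-separated field of the author information being the country. A small sketch of that step in isolation, under the same assumption, is shown below. The country_of and top_countries helpers and the sample affiliation strings are invented for illustration, and Counter.most_common() replaces the manual sort.

```python
from collections import Counter

def country_of(author_info):
    """Take the last comma-separated field and strip dots, newlines, spaces."""
    country = author_info.split(',')[-1]
    return country.replace('.', '').replace('\n', '').strip()

def top_countries(author_infos, n=10):
    """Return the n most frequent countries as (country, count) pairs."""
    counts = Counter(country_of(info) for info in author_infos)
    return counts.most_common(n)

# Made-up affiliation strings in the assumed "..., city, country." format.
sample = [
    "Dept. of Neurology, Example University, Boston, USA.",
    "Institute of Example Studies, Beijing, China.\n",
    "Example Hospital, Chicago, USA",
]
ranking = top_countries(sample, n=2)
print(ranking)
```

Affiliations that end with something other than a country (for example an e-mail address) would be miscounted by this rule, so the resulting ranking is worth checking by eye.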

Time: 2024-10-10 03:09:39
