MongoDB University 第五周作业——aggregate聚合高级查询

HOMEWORK: HOMEWORK 5.2 (HANDS ON)

Crunching the Zipcode dataset
Please calculate the average population of cities in California (abbreviation CA) and New York (NY) (taken together) with populations over 25,000.

For this problem, assume that a city name that appears in more than one state represents two separate cities.

Please round the answer to a whole number. 
Hint: The answer for CT and NJ (using this data set) is 38177.

Please note:

  • Different states might have the same city name.
  • A city might have multiple zip codes.

For purposes of keeping the Hands On shell quick, we have used a subset of the data you previously used in zips.json, not the full set. This is why there are only 200 documents (and 200 zip codes), and all of them are in New York, Connecticut, New Jersey, and California.

If you prefer, you may download the handout and perform your analysis on your machine with

> mongoimport -d test -c zips --drop small_zips.json

Once you‘ve generated your aggregation query and found your answer, select it from the choices below.

  • 448055
  • 592167
  • 935718
  • 198242
  • 693777

    /***********ANSWER**************/

db.zips.aggregate([
{
    $group:{
                "_id": {state: "$state",
                        city: "$city"},
                pop: {"$sum": "$pop"}
           }
},

{
    $match:{
                "pop":{"$gt":25000}, 
                "_id.state": {
                                "$in": ["CA", "NY"]
                              }
           }
},
{
    $group:{
                "_id":null,
                avg: {"$avg": "$pop"}
           }
}  

])

c:\Program Files\MongoDB 2.6 Standard\bin>type aggregate_right_5_2.js | mongo

MongoDB shell version: 2.6.7
connecting to: test
{ "_id" : null, "avg" : 44804.782608695656 }
bye

/***************************************************************************/

HOMEWORK: HOMEWORK 5.3 (HANDS ON)

Who‘s the easiest grader on campus?
A set of grades are loaded into the grades collection.

The documents look like this:

{
	"_id" : ObjectId("50b59cd75bed76f46522c392"),
	"student_id" : 10,
	"class_id" : 5,
	"scores" : [
		{
			"type" : "exam",
			"score" : 69.17634380939022
		},
		{
			"type" : "quiz",
			"score" : 61.20182926719762
		},
		{
			"type" : "homework",
			"score" : 73.3293624199466
		},
		{
			"type" : "homework",
			"score" : 15.206314042622903
		},
		{
			"type" : "homework",
			"score" : 36.75297723087603
		},
		{
			"type" : "homework",
			"score" : 64.42913107330241
		}
	]
}

There are documents for each student (student_id) across a variety of classes (class_id). Note that not all students in the same class have the same exact number of assessments. Some students have three homework assignments, etc.

Your task is to calculate the class with the best average student performance. This involves calculating an average for each student in each class of all non-quiz assessments and then averaging those numbers to get a class average. To be clear, each student‘s average includes only exams and homework grades. Don‘t include their quiz scores in the calculation.

What is the class_id which has the highest average student perfomance?

Hint/Strategy: You need to group twice to solve this problem. You must figure out the GPA that each student has achieved in a class and then average those numbers to get a class average. After that, you just need to sort. The class with the lowest average is the class with class_id=2. Those students achieved a class average of 37.6

If you prefer, you may download the handout and perform your analysis on your machine with

> mongoimport -d test -c grades --drop grades.json

Below, choose the class_id with the highest average student average.

  • 8
  • 9
  • 1
  • 5
  • 7
  • 0
  • 6

/*******ANSWER**********/

db.grades.aggregate([
                        { 
                          $unwind : "$scores" 
                        },
                        { 
                          $match: { 
                                    $or: [
                                            {"scores.type":"exam"},
                                            {"scores.type":"homework"}
                                         ]
                                  }
                        },
                        { 
                          $group: { 
                                    _id:{ 
                                            class_id: "$class_id",
                                            student_id: "$student_id"
                                        },
                                    grade:{ 
                                             $avg: "$scores.score"
                                          }
                                   }
                        },
                        { 
                          $group: {
                                    _id:{
                                            class_id: "$_id.class_id"
                                         },
                                   avg_grade:{
                                            $avg: "$grade"
                                             }
                                  }
                        },
                        { 
                                 $sort: { avg_grade : -1 }
                        },
                        {
                                 $limit:1
                        }
           ]).pretty()

c:\Program Files\MongoDB 2.6 Standard\bin>type aggregate_grade_5_3.js | mongo

MongoDB shell version: 2.6.7
connecting to: test
{ "_id" : { "class_id" : 1 }, "avg_grade" : 64.50642324269174 }
bye

/***************************************************************************/

HOMEWORK: HOMEWORK 5.4

Removing Rural Residents
In this problem you will calculate the number of people who live in a zip code in the US where the city starts with a digit. We will take that to mean they don‘t really live in a city. Once again, you will be using the zip code collection, which you will find in the ‘handouts‘ link in this page. Import it into your mongod using the following command from the command line:

> mongoimport -d test -c zips --drop zips.json

If you imported it correctly, you can go to the test database in the mongo shell and conform that

> db.zips.count()

yields 29,467 documents.

The project operator can extract the first digit from any field. For example, to extract the first digit from the city field, you could write this query:

db.zips.aggregate([
    {$project: 
     {
	first_char: {$substr : ["$city",0,1]},
     }	 
   }
])

Using the aggregation framework, calculate the sum total of people who are living in a zip code where the city starts with a digit. Choose the answer below.

Note that you will need to probably change your projection to send more info through than just that first character. Also, you will need a filtering step to get rid of all documents where the city does not start with a digital (0-9).

  • 298015
  • 345232
  • 245987
  • 312893
  • 158249
  • 543282

/*******ANSWER**********/

db.zips.aggregate([
                   {
                     $project: {  _id:1,
                                  city:1,
                                  state:1,
                                  pop:1,
                                  city_first_char:{
                                                    $substr : ["$city",0,1]
                                                  }
                               }
                    },
                    
                   {
                     $match:{
                              city_first_char:
                             {$in:["0","1","2","3","4","5","6","7","8","9"]}
                            }
                    },
                    
                    {
                     $group:{
                              _id: null,
                               total_pop: {
                                            $sum: "$pop"
                                           }
                            }
                    }
                  ])

c:\Program Files\MongoDB 2.6 Standard\bin>type aggregate_ciry_5_4.js | mongo

MongoDB shell version: 2.6.7
connecting to: test
{ "_id" : null, "total_pop" : 298015 }
bye
时间: 2024-10-12 13:44:38

MongoDB University 第五周作业——aggregate聚合高级查询的相关文章

MongoDB University 第三周作业——删除文档中的数组元素

前两周的作业比较简单,轻松搞定,这周的作业突然难起来了,费了好大劲儿写完,还有很多需要改进的地方,目前只能算是勉强实现了功能. 需求: HOMEWORK: HOMEWORK 3.1 Download the students.json file from the Download Handout link and import it into your local Mongo instance with this command: $ mongoimport -d school -c stude

软件项目管理第五周作业

1.psp Job Type Date Start End Total 四周总结 随笔 2016.4.4 23:00 23:23 23 站立会议 会议 2016.4.4 13:30 13:45 15 数据库 编码测试 2016.4.4 13:50 15:20 90 站立会议 会议 2016.4.5 13:00 13:15 15 摇一摇1 编码测试 2016.4.5 13:35 14:20 45 站立会议 会议 2016.4.6 13:05 13:15 10 数据库函数添加 编码测试 2016.4

《机电传动控制》第五周作业

机电传动控制第五周作业 一.传动电机或控制电机在工业或生活中的应用: 1.电气伺服传动领域 在要求速度控制和位置控制(伺服)的场合,特种电机的应用越来越广泛.开关磁阻电动机.永磁无刷直流电动机.步进电动机.永磁交流伺服电动机.永磁直流电动机等都已在数控机床.工业电气自动化.自动生产线.工业机器人以及各种军.民用装备等领域获得了广泛应用.如交流伺服电机驱动系统应用在凹版印刷机中,以其高控制精度实现了极高的同步协调性,使这种印刷设备具有自动化程度高.套准精度高.承印范围大.生产成本低.节约能源.维修

解题报告——2018级2016第二学期第五周作业排座椅

解题报告--2018级2016第二学期第五周作业 F:排座椅 描述 上课的时候总会有一些同学和前后左右的人交头接耳,这是令小学班主任十分头疼的一件事情.不过,班主任小雪发现了一些有趣的现象,当同学们的座次确定下来 之后,只有有限的D对同学上课时会交头接耳.同学们在教室中坐成了M行N列,坐在第i行第j列的同学的位置是(i,j),为了方便同学们进出,在教室中设 置了K条横向的通道,L条纵向的通道.于是,聪明的小雪想到了一个办法,或许可以减少上课时学生交头接耳的问题:她打算重新摆放桌椅,改变同学们桌椅

解题报告—— 2018级2016第二学期第五周作业 删数问题

解题报告--  2018级2016第二学期第五周作业 删数问题 描述 键盘输入一个高精度的正整数n(<=240位),去掉其中任意s个数字后剩下的数字按原左右次序将组成一个新的正整数.编程对给定的n和s,寻找一种方案,使得剩下的数字组成的新数最小. 输入ns输出最后剩下的最小数样例输入 178543 4 样例输出 13 分析: 这题题目上已表明是贪心算法:原本最容易产生的错误贪心准则是删去其中最大的数字:但通过简单举例便可得之,这种贪心准则要漏洞:通过简单的计算举例发现如果这个数是一位比一位大的话

20179214 2017-2018-2 《密码与安全新技术》第五周作业

20179214 2017-2018-2 <密码与安全新技术>第五周作业 课程:<密码与安全新技术> 班级: 201792 姓名: 刘胜楠 学号:20179214 上课教师:谢四江 上课日期:2018年3月29日 必修/选修: 选修 学习内容总结 ICO众筹 所有成功的数字货币以及区块链(本文区块链指"区块链公有链")项目无一不是社区项目.常见的ICO里,数字货币和区块链项目向早期爱好者出售项目代币.项目团队通过ICO获取技术开发和市场拓展资金:而项目爱好者通过

软件过程与项目管理(第五周作业)

协作图(第五周项目所分配的任务) 一.协作图的作用 协作图是在一种给定语境中描述协作中各个对象间的组织交互关系的空间组织结构的图形化方式,从定义中可以分析它的作用为:对象间消息的传递来反映具体的使用语境的逻辑表达,一个使用情境的逻辑可能是一个用例的一部分或是一条控制流:它的交互关联显示对象交互的空间组织结构,显示一种对象间的关系,而不注重顺序:表现一个类的操作实现,协作图中可以说明类操作中使用的参数,变量,返回值.当表现一个系统的行为时,消息编号对应了程序中嵌套调用的结构和信号传递过程. 序列图

软件工程_东师站_第五周作业

1.psp Date Type Job Start Int(min) End Total(min) 20160404 助教 团队博客 14:00 16:20 25 14:25 18:00 100 站立会议 "耐撕"站立会议 15:10 15:40 30 编码 重构 18:00 5 18:30 25 编码 选择抢答者(JSP) 18:30 10 19:30 50 20160405 编码 选择抢答者(生成抢答者圈圈) 18:10 15 19:00 35 看书 计算机网络与因特网 22:00

第五周作业。

第五周时候解决的问题. 就拿自己做的那个APP项目来说吧.由于项目需求,清明前花了一个下午时间来实现一个下拉刷新的ListView.上网看了第三方的库,发现不是很适合自己用.于是自己尝试的去实现了个一个下拉刷新的ListVIew. 项目地址: https://github.com/wukunguang/GongGong 首先,大概描述下用户使用整个下拉刷新的过程. 触摸-> 按住 -> 向下拖动 -> 松开 那么程序内部实现的操作大概可分解为: 捕获触摸动作  -> 捕获向下拖动