Statistics and Linear Algebra 5

1. Finding the row that holds a column's minimum value in pandas:

  # idxmin() returns the index label of the smallest median_income value.
  lowest_income_county = income["county"][income["median_income"].idxmin()]

  # Keep only the counties where pop_over_25 exceeds 500,000.
  high_pop_county = income[income["pop_over_25"] > 500000]

  # Among those counties, find the one with the lowest median income.
  lowest_income_high_pop_county = high_pop_county["county"][high_pop_county["median_income"].idxmin()]
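
As a small illustration of the same lookup, the sketch below uses a made-up `income` table (the county names and figures are hypothetical, not the real data set). It uses `.loc`, which makes it explicit that `idxmin()` returns an index label rather than a position.

```python
import pandas as pd

# Hypothetical data for illustration only; not the real income data set.
income = pd.DataFrame({
    "county": ["A", "B", "C", "D"],
    "median_income": [52000, 31000, 47000, 39000],
    "pop_over_25": [800000, 120000, 650000, 900000],
})

# idxmin() returns the index label of the smallest value in the column.
lowest_label = income["median_income"].idxmin()
lowest_income_county = income.loc[lowest_label, "county"]

# Filter first, then look up the minimum within the filtered rows only.
high_pop = income[income["pop_over_25"] > 500000]
lowest_income_high_pop_county = high_pop.loc[high_pop["median_income"].idxmin(), "county"]

print(lowest_income_county)           # B
print(lowest_income_high_pop_county)  # D
```

Because the filtered frame keeps its original index labels, looking up by label works the same way before and after filtering.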

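2. Seeding the random number generator: after random.seed(), the sequence of random calls that follows is reproducible. To repeat the same sequence, call random.seed() again with the same value:

  import random

  random.seed(20) # set the random seed

  new_sequence = [random.randint(0, 10) for _ in range(10)] # ten integers from 0 to 10, inclusive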
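
A minimal sketch of what the seed does, using only the standard library (the variable names `first`, `second`, and `third` are just for illustration): without re-seeding the generator keeps going, and re-seeding with the same value restarts the same sequence.

```python
import random

random.seed(20)
first = [random.randint(0, 10) for _ in range(10)]

# Without re-seeding, the generator simply continues its sequence.
second = [random.randint(0, 10) for _ in range(10)]

# Re-seeding with the same value restarts the same sequence.
random.seed(20)
third = [random.randint(0, 10) for _ in range(10)]

print(first == third)  # True
```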

3. To select a given number of samples from a data set:

  shopping_sample = random.sample(shopping, 4) # pick 4 items from the list shopping, without replacement
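
For illustration, here is the same call on a made-up `shopping` list (the items are hypothetical). The sketch shows that `random.sample` draws without replacement, so no position in the list is picked twice.

```python
import random

# Hypothetical shopping list for illustration.
shopping = ["milk", "eggs", "bread", "rice", "tea", "salt"]

random.seed(1)
shopping_sample = random.sample(shopping, 4)

# Sampling is without replacement: 4 items, none repeated.
print(len(shopping_sample))       # 4
print(len(set(shopping_sample)))  # 4, since the list has no duplicates
```

Asking for more items than the list contains raises a `ValueError`.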

4. Roll a die 10 times, with outcomes from 1 to 6, and plot the results as a histogram with 6 bins:

  import matplotlib.pyplot as plt

  def roll():
    return random.randint(1, 6) # random integer from 1 to 6, inclusive

  random.seed(1)
  small_sample = [roll() for _ in range(10)]

  plt.hist(small_sample, 6)
  plt.show()
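
Note that `plt.hist(small_sample, 6)` splits the range between the smallest and largest observed values into 6 equal-width bins, so the bins line up with the six faces only when both 1 and 6 appear in the sample. One way to cross-check the plot is to tally the faces directly with `collections.Counter`. This tally is an addition for illustration, not part of the original notes.

```python
import random
from collections import Counter

def roll():
    return random.randint(1, 6)

random.seed(1)
small_sample = [roll() for _ in range(10)]

# Count how many times each face came up.
counts = Counter(small_sample)
for face in range(1, 7):
    print(face, counts[face])  # Counter returns 0 for faces never rolled
```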

5. Roll the die num_rolls times, record the proportion of ones, and repeat the experiment num_trials times (here, 300 trials of 50 rolls each):

  def probability_of_one(num_trials, num_rolls):
    probabilities = []
    for i in range(num_trials):
      die_rolls = [roll() for _ in range(num_rolls)]
      one_prob = die_rolls.count(1) / num_rolls # proportion of rolls that came up 1
      probabilities.append(one_prob)
    return probabilities

  random.seed(1)
  small_sample = probability_of_one(300, 50)
  plt.hist(small_sample, 20)
  plt.show()
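
The histogram should cluster around 1/6, the true probability of rolling a one. The sketch below summarizes the experiment numerically instead of plotting it, and compares 50 rolls per trial with 500 rolls per trial. The second run and the use of `statistics` are additions for illustration.

```python
import random
import statistics

def roll():
    return random.randint(1, 6)

def probability_of_one(num_trials, num_rolls):
    # One estimated probability of rolling a 1 per trial.
    return [
        [roll() for _ in range(num_rolls)].count(1) / num_rolls
        for _ in range(num_trials)
    ]

random.seed(1)
few_rolls = probability_of_one(300, 50)
many_rolls = probability_of_one(300, 500)

# Both means should sit near 1/6; the spread shrinks as rolls per trial grow.
print(statistics.mean(few_rolls), statistics.stdev(few_rolls))
print(statistics.mean(many_rolls), statistics.stdev(many_rolls))
```

With more rolls per trial, each trial's estimate is less noisy, so the histogram becomes narrower.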

6. Random samples represent the population better than samples taken as consecutive slices of the data:

  mean_median_income = income["median_income"].mean()
  print(mean_median_income)

  def get_sample_mean(start, end):
    return income["median_income"][start:end].mean()

  def find_mean_incomes(row_step):
    mean_median_sample_incomes = []
    for i in range(0, income.shape[0], row_step):
      mean_median_sample_incomes.append(get_sample_mean(i, i+row_step)) # mean of rows 0-99, 100-199, 200-299, ...
    return mean_median_sample_incomes

  nonrandom_sample = find_mean_incomes(100)
  plt.hist(nonrandom_sample, 20)
  plt.show()

  def select_random_sample(count):
    random_indices = random.sample(range(0, income.shape[0]), count)
    return income.iloc[random_indices]

  random.seed(1)

  random_sample = [select_random_sample(100)["median_income"].mean() for _ in range(1000)] # mean median income of 100 randomly chosen rows, repeated 1000 times
  plt.hist(random_sample, 20)
  plt.show()
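
To see why the two histograms differ, here is a rough sketch on synthetic data. It assumes the rows are stored in some meaningful order (the notes do not say how the income file is sorted), so the `values` list below is a hypothetical, fully sorted stand-in for the income column.

```python
import random
import statistics

# Hypothetical stand-in for the income column: 1000 values in sorted order.
values = list(range(1000))

# Non-random: means of consecutive 100-row chunks.
chunk_means = [statistics.mean(values[i:i + 100]) for i in range(0, len(values), 100)]

# Random: means of 100 values drawn without replacement, repeated 1000 times.
random.seed(1)
random_means = [statistics.mean(random.sample(values, 100)) for _ in range(1000)]

print(statistics.mean(values))        # population mean
print(statistics.stdev(chunk_means))  # spread of the consecutive-chunk means
print(statistics.stdev(random_means)) # spread of the random-sample means
```

On ordered data, each consecutive chunk reflects only one part of the range, so the chunk means are scattered widely, while the random-sample means stay close to the population mean.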

7. To do calculations between the columns of a sample, operate on the sampled columns directly:

  def select_random_sample(count): # return `count` randomly chosen rows from the data set
    random_indices = random.sample(range(0, income.shape[0]), count)
    return income.iloc[random_indices]

  random.seed(1)

  mean_ratios = []
  for i in range(1000): # loop 1000 times
    sample = select_random_sample(100)
    ratio = sample["median_income_hs"] / sample["median_income_college"]
    mean_ratios.append(ratio.mean()) # mean of the row-wise ratio between the two columns

  plt.hist(mean_ratios, 20)
  plt.show()

8. Statistical significance, a way to judge whether a result reflects a real effect in the population or could be due to chance:

  significance_value = None

  count = 0
  for i in mean_ratios:
    if i > 0.675: # 0.675 is the ratio observed in another data set
      count += 1
  significance_value = count / len(mean_ratios)
  # The result is 0.014: only 1.4% of the random-sample mean ratios are higher than
  # the 0.675 ratio obtained from the salary data collected after the program. A ratio
  # that high is unlikely to arise by chance, which suggests the program was successful.
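
The counting loop above can also be written as a small helper. This is a sketch only: `fraction_above` is a hypothetical name, and the example ratios are made up for illustration.

```python
def fraction_above(values, threshold):
    """Return the fraction of values strictly greater than threshold."""
    return sum(v > threshold for v in values) / len(values)

# Hypothetical sample means for illustration only.
example_ratios = [0.60, 0.62, 0.65, 0.68, 0.70]
print(fraction_above(example_ratios, 0.675))  # 0.4
```

A small fraction means that few random samples reach the observed value, so the observed value is unlikely to be due to chance alone.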

Time: 2024-07-30 20:56:24
