I’ve earned it! https://confirm.udacity.com/TLVUZQTR
Tag Archives: Udacity
Simple Linear Regression
In this lesson, you will:
Identify Regression Applications
Learn How Regression Works
Apply Regression to Problems Using Python

Machine learning is frequently split into supervised and unsupervised learning. Regression, which you will be learning about in this lesson (and its extensions in later lessons), is an example of supervised machine learning. In supervised machine learning, you are interested in predicting […]
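As a rough illustration of what simple linear regression looks like in Python, here is a minimal sketch that fits a line to one predictor using the closed-form least-squares estimates. The data are synthetic and the names `x`, `y`, `slope`, and `intercept` are assumptions made for this example; they do not come from the lesson itself.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y is roughly 2*x + 1 plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

# Closed-form least-squares estimates for a single predictor.
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# Predict the response for a new input value.
x_new = 4.0
y_pred = intercept + slope * x_new
print(f"slope={slope:.3f}, intercept={intercept:.3f}, prediction={y_pred:.3f}")
```

The same fit can be obtained with `np.polyfit(x, y, 1)`, which returns the slope first and the intercept second.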
Case Study: A/B Tests
A/B tests are used to test changes on a web page by running an experiment where a control group sees the old version, while the experiment group sees the new version. A metric is then chosen to measure the level of engagement from users in each group. These results are then used to judge whether one version is more effective than […]
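One way to judge the two groups is to bootstrap the difference in the chosen metric and compare the observed difference with a null distribution centered at zero. The sketch below uses synthetic click data; the click-through rates, group sizes, and variable names are all assumptions for illustration, not values from the case study.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic click data (assumed for illustration): 1 = clicked, 0 = did not click.
control = rng.binomial(1, 0.10, size=2000)     # users who saw the old version
experiment = rng.binomial(1, 0.13, size=2000)  # users who saw the new version

# Observed difference in click-through rate (experiment minus control).
obs_diff = experiment.mean() - control.mean()

# Bootstrap the sampling distribution of the difference.
n_boot = 5000
diffs = np.empty(n_boot)
for i in range(n_boot):
    c = rng.choice(control, size=control.size, replace=True)
    e = rng.choice(experiment, size=experiment.size, replace=True)
    diffs[i] = e.mean() - c.mean()

# Simulate the null hypothesis: same spread, but centered at zero difference.
null_vals = rng.normal(0, diffs.std(), size=n_boot)

# One-sided p-value: share of null draws larger than the observed difference.
p_value = (null_vals > obs_diff).mean()
print(f"observed difference: {obs_diff:.4f}, p-value: {p_value:.4f}")
```

A small p-value suggests the new version's higher click-through rate is unlikely to be due to chance alone, under the assumptions of this simulation.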
Hypothesis Testing
Rules for setting up null and alternative hypotheses:
$H_0$ is what we assume to be true before collecting any data.
$H_0$ usually states there is no effect or that two groups are equal.
$H_0$ and $H_1$ are competing, non-overlapping hypotheses.
$H_1$ is what we would like to prove to be true.
$H_0$ contains an equal sign of some kind: either $=$, $\leq$, or $\geq$.
$H_1$ contains the opposition […]
Confidence Intervals – Udacity
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

np.random.seed(42)

full_data = pd.read_csv('../data/coffee_dataset.csv')
sample_data = full_data.sample(200)

# Bootstrap the difference in mean height between coffee drinkers and non-drinkers
diffs = []
for _ in range(10000):
    bootsamp = sample_data.sample(200, replace=True)
    coff_mean = bootsamp[bootsamp['drinks_coffee'] == True]['height'].mean()
    nocoff_mean = bootsamp[bootsamp['drinks_coffee'] == False]['height'].mean()
    diffs.append(coff_mean - nocoff_mean)

# 99% confidence interval: 0.5th and 99.5th percentiles of the bootstrapped differences
np.percentile(diffs, 0.5), np.percentile(diffs, 99.5)
# statistical evidence […]
Statistics – Udacity
Descriptive Statistics
Descriptive statistics is about describing our collected data using the measures discussed throughout this lesson: measures of center, measures of spread, shape of our distribution, and outliers. We can also use plots of our data to gain a better understanding.

Inferential Statistics
Inferential statistics is about using our collected data to draw conclusions to a larger […]
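To make the descriptive measures concrete, here is a minimal sketch that computes a measure of center, measures of spread, and flags outliers for a small made-up dataset. The values in `data` and the 1.5 × IQR outlier rule are assumptions chosen for illustration, not something taken from the lesson.

```python
import numpy as np

# Hypothetical sample (made up for illustration).
data = np.array([2, 4, 4, 5, 7, 9, 40])

# Measures of center.
mean = data.mean()
median = np.median(data)

# Measures of spread.
data_range = data.max() - data.min()
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1

# A common rule of thumb: points beyond 1.5 * IQR from the quartiles are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = data[(data < lower_fence) | (data > upper_fence)]

print(f"mean={mean:.2f}, median={median}, range={data_range}, IQR={iqr}")
print("outliers:", outliers)
```

Here the single large value pulls the mean well above the median, which is why the median is often preferred as a measure of center for skewed data.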
Probability – Udacity
Probability
Here you learned some fundamental rules of probability. Using notation, we could say that the outcome of a coin flip could either be T or H for the event that the coin flips tails or heads, respectively. Then the following rules are true:
$P(H) = 0.5$
$1 - P(H) = P(\text{not } H) = 0.5$
where $\text{not } H$ is the event […]
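The two rules above can be checked empirically with a quick simulation. This is a minimal sketch that assumes a fair coin and uses NumPy's random generator; the number of flips and the variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 100,000 fair coin flips: 1 = heads (H), 0 = tails (T).
flips = rng.integers(0, 2, size=100_000)

# Estimated P(H): the proportion of flips that landed heads.
p_heads = flips.mean()

# Estimated P(not H): the proportion of flips that did not land heads.
p_not_heads = (flips == 0).mean()

print(f"P(H) is about {p_heads:.3f}")
print(f"P(not H) is about {p_not_heads:.3f}")
print(f"1 - P(H) = {1 - p_heads:.3f}")
```

With this many flips, both estimates should land close to 0.5, and the two proportions always sum to 1 because every flip is either heads or not heads.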
Data Analysis Process – Case Study 2 – Udacity
Drawing Conclusions
Data Analysis Process – Case Study 1 – Udacity
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
Plotting with Pandas – Udacity
import pandas as pd
%matplotlib inline

df_census = pd.read_csv('census_income_data.csv')
df_census.info()
df_census.hist(figsize=(8, 8));
df_census['age'].hist()
df_census['age'].plot(kind='hist');
df_census['education'].value_counts()  # aggregates counts for each unique value in a column
df_census['education'].value_counts().plot(kind='bar')
df_census['education'].value_counts().plot(kind='pie', figsize=(8, 8));

df_cancer = pd.read_csv('cancer_data_edited.csv')
pd.plotting.scatter_matrix(df_cancer, figsize=(15, 15));
df_cancer.plot(x='compactness', y='concavity', kind='scatter');
df_cancer['concave_points'].plot(kind='box');

import pandas as pd
df = pd.read_csv('cancer_data_edited.csv')
df.head()
df_m = df[df['diagnosis'] == 'M']
df_m.head()
[…]