
# Week 1

## Summary: Definition of Machine Learning

Well-posed Learning Problem: A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
`T:` Classifying emails as spam or not spam.
`E:` Watching you label emails as spam or not spam.
`P:` The number (or fraction) of emails correctly classified as spam/not spam.

## Categories of Machine Learning Algorithms

### Supervised Learning

• `regression`: predict a value within continuous data
• `classification`: the key difference is the prediction target, which is discrete: yes or no, or a value from a fixed set, e.g. red, yellow, blue, green

### Unsupervised learning

• Clustering
• Non-clustering (cocktail party problem: separating two overlapping audio sources recorded by microphones)

## Cost Function

linear regression:

• `m` = number of training examples
• `(x^(i), y^(i))` → the i-th single training example, i: index

cost function (squared error function) → measures the accuracy of the hypothesis

a fancier version of an average: my take is that squaring magnifies individual errors.
(Why divide by 2m instead of m?) (nice, the next lecture's notes explain it): The mean is halved (1/2) as a convenience for the computation of the gradient descent.
(Didn't understand this at first, hoped it would come up later) → differentiating the square produces a factor of 2, which exactly cancels the 1/2.
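Written out in the standard course notation, the squared-error cost and the cancellation mentioned above:

```latex
J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
```

```latex
\frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)
  = \frac{1}{2m} \sum_{i=1}^{m} 2 \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}
  = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}
```

The factor 2 from the chain rule cancels the 1/2, leaving a clean 1/m in the gradient.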

linear & cost function

⚠️注意:
`a := b`: assignment (overwrite a's value with b's)
`a = b`: truth assertion (a claim that a equals b)

gradient descent + cost function

chain rule:
https://zs.symbolab.com/solver/derivative-calculator/%5Cfrac%7Bd%7D%7Bdx%7D%5Cleft(%5Cleft(3x%2B1%5Cright)%5E%7B2%7D%5Cright)%5E%7B2%7D%5Cright))

convex function (bowl-shaped function) → always has exactly one minimum, the global minimum
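A minimal sketch of batch gradient descent for one-variable linear regression (the update rule θ := θ − α · ∂J/∂θ; the function and variable names here are my own, not from the course code):

```python
def gradient_descent(xs, ys, alpha=0.1, num_iters=1000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(num_iters):
        # Prediction errors h(x^(i)) - y^(i) for the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Simultaneous update: compute both gradients before touching theta.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

# Data generated from y = 2x + 1, so we expect theta ≈ (1, 2).
t0, t1 = gradient_descent([0, 1, 2, 3], [1, 3, 5, 7])
```

Because the squared-error cost is convex, any reasonable α converges to the single global minimum; too large an α diverges instead.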

## Linear Algebra Review

Matrix

1. A: [1, 3] → 1 x 2 matrix
2. A: [1, 3] → A_12 = 3

Vector

Vectorization

### Matrix Multiplication Properties

1. A x B != B x A (not commutative)
2. (A x B) x C = A x (B x C) (associative)
3. identity matrix: A x I = I x A = A
<img style="max-height:300px" class="lazy" data-original="\images\blog\180707_cousera_ml\DraggedImage-15.png">
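The three properties can be checked directly; a small sketch with a hand-rolled 2x2 matrix multiply (the helper `matmul` is my own, no library assumed):

```python
def matmul(A, B):
    """Multiply matrices given as lists of rows."""
    n, k, m = len(A), len(B), len(B[0])
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [0, 2]]
I = [[1, 0], [0, 1]]  # 2x2 identity matrix

assert matmul(A, B) != matmul(B, A)                        # not commutative
assert matmul(matmul(A, B), C) == matmul(A, matmul(B, C))  # associative
assert matmul(A, I) == matmul(I, A) == A                   # identity
```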

# Week 2

MATLAB Online: https://matlab.mathworks.com/

## Multiple Features:

1 feature: size → price
n features: size, bedrooms, floors, .. → price

Multivariate linear regression:
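With n features and the convention x_0 = 1 (so θ_0 acts as the intercept), the multivariate hypothesis is the standard inner-product form:

```latex
h_\theta(x) = \theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_n x_n = \theta^T x, \qquad x_0 = 1
```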

## Normalization

Feature Scaling:

feature 1: -1 to 3
feature 2: -1000 to 1000

mean normalization + feature scaling
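A sketch of mean normalization combined with feature scaling, x := (x − μ) / s, where s is the range (max − min); the course also allows the standard deviation for s. The helper name is my own:

```python
def mean_normalize(xs):
    """Rescale one feature so values center near 0, roughly in [-0.5, 0.5]."""
    mu = sum(xs) / len(xs)      # mean of the feature
    s = max(xs) - min(xs)       # range used as the scaling factor
    return [(x - mu) / s for x in xs]

# Feature 2 from the notes, originally spanning -1000 to 1000:
scaled = mean_normalize([-1000, 0, 1000])
```

After scaling, both features occupy comparable ranges, so gradient descent converges much faster.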

## Features and Polynomial Regression

Polynomial: quadratic, cubic, and general n-th degree equations; polynomials
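Polynomial regression reuses linear regression by adding powers of a feature as new features, e.g. size, size², size³. A sketch (helper name is my own):

```python
def poly_features(x, degree):
    """Expand a single feature x into [x, x^2, ..., x^degree]."""
    return [x ** d for d in range(1, degree + 1)]

row = poly_features(2.0, 3)
```

Note that the powers have very different ranges, which is exactly why feature scaling (previous section) matters here.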

# Week 3

## Types of Classification

1. classify into 1 or 0 (binary classification problem): `{0 (negative), 1 (positive)}`
2. classify into a set of classes (multiclass classification problem): `{0, 1, 2, 3}`

## Hypothesis Representation

### Logistic Function

The value of the Logistic Function also has another meaning: it represents the probability that the output is 1:
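The logistic (sigmoid) function g(z) = 1 / (1 + e^(−z)); its output can be read as P(y = 1 | x; θ). A minimal sketch:

```python
import math

def sigmoid(z):
    """Logistic function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# At z = 0 the output is exactly 0.5: the decision boundary,
# where y = 1 and y = 0 are equally likely.
p = sigmoid(0.0)
```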

## Simplified Cost Function & distance

Taking the derivative of the cost (honestly I didn't follow this; the video only gives the result, and the derivation is probably a bit involved):
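A sketch of the derivation the video skips. Start from the simplified cost J(θ) = −(1/m) Σ [y log h + (1−y) log(1−h)] with h = g(θᵀx), and use the sigmoid identity g′(z) = g(z)(1 − g(z)):

```latex
\frac{\partial}{\partial \theta_j} \Big[ y \log h + (1-y) \log(1-h) \Big]
  = \left( \frac{y}{h} - \frac{1-y}{1-h} \right) h (1-h)\, x_j
  = \big( y(1-h) - (1-y)h \big)\, x_j
  = (y - h)\, x_j
```

so, flipping the sign from the leading minus:

```latex
\frac{\partial J(\theta)}{\partial \theta_j}
  = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
```

Notably, this has exactly the same form as the linear regression gradient, just with a different hypothesis h.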

## An interesting quote:

“When I walk around Silicon Valley, I live here in Silicon Valley, there are a lot of engineers that are frankly making a ton of money for their company using machine learning algorithms. And I know we’ve only been, you know, studying this stuff for a little while. But if you understand linear regression, logistic regression, the advanced optimization algorithms, and regularization, by now, frankly, you probably know quite a lot more machine learning right now than many of the Silicon Valley engineers out there having very successful careers. You know, making tons of money for the companies. Or building products using machine learning algorithms.”

# Week 4

## Non-linear Hypothesis:

100 features instead of two.
50 x 50 pixel images → n = 2500 (pixels)

2500 terms: x_1·x_1, x_1·x_2, x_1·x_3, ...
2499 terms: x_2·x_2, x_2·x_3, x_2·x_4, ...
...
answer = 2500 + 2499 + ... + 1 = 2500 · 2501 / 2 ≈ 3.1 million quadratic features
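The sum above is just the triangular-number formula n(n + 1)/2, checked for n = 2500:

```python
# Count the quadratic terms x_i * x_j with i <= j for n pixel features:
# 2500 + 2499 + ... + 1 = n * (n + 1) / 2.
n = 2500
quadratic_terms = n * (n + 1) // 2  # 3,126,250, i.e. ~3.1 million
```

This blow-up in feature count is the motivation for neural networks in the next section.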

## Neurons and the Brain

Neural Networks → mimic the brain

Neuron: nerve cell

(diagram of the brain analogy)

bias unit: x_0
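A single artificial neuron with the bias unit x_0 = 1 computes g(θᵀx), the same sigmoid as logistic regression. A sketch (helper names are my own; the weights resemble the AND-gate example from the lectures):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(weights, inputs):
    """One unit: prepend the bias unit x0 = 1, then apply g(theta^T x)."""
    x = [1.0] + inputs
    z = sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)

# With theta = [-30, 20, 20], the unit fires (output ~1) only when
# both inputs are 1: a logical AND.
a = neuron([-30.0, 20.0, 20.0], [1.0, 1.0])
```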

# Miscellaneous:

your coding time is the most valuable resource.

matrix, vector, scalar: a 2-D array of numbers, an n x 1 matrix, and a single number, respectively

theta: θ
alpha: α
lambda: λ

constant e: exp(1)

slope: the steepness of a line
converge: approach a fixed value
derivative: the instantaneous rate of change