Writing your own functions
計算兩點距離
1: > calDist<-function(x1, x2){
2: + d = x1 - x2;
3: + dist = d * d;
4: + sdist = sqrt( sum(dist) );
5: + sdist
6: + }
定義兩個資料點x1, x2
1: > x1<-c(1,2)
2: > x2<-c(0,0)
1: > calDist(x1,x2)
2: [1] 2.236068
回傳結果至另一個變數y
1: > y = calDist(x1,x2)
2: > y
3: [1] 2.236068
範例2: y1 = 2*x1
1: > timestwo<-function(x){
2: + y<- 2*x
3: + print(x)
4: + print(y)
5: + }
呼叫timestwo副程式, 其中第2行為印出x1, 第3行為印出y1的值
1: > y1 = timestwo(x1)
2: [1] 1 2
3: [1] 2 4
Writing a for-loop in R
> for(i in 1:9)
+ print(i)
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
寫一個99乘法表
for(i in 1:9)
for(j in 1:9)
print( c(i, j, i*j) );
跑出來如下:
產生一個1~50的奇數序列
> foo = seq(1, 100, by=2)
> foo
[1] 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47
[25] 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91 93 95
[49] 97 99
將foo的每個元素取平方, 並存在foo.squared
> foo.squared = NULL
接著跑一個 for-loop , 儲存在foo.squared[i]
> for (i in 1:50 ) {
+ foo.squared[i] = foo[i]^2
+ }
> foo.squared
[1] 1 9 25 49 81 121 169 225 289 361 441 529 625 729
[15] 841 961 1089 1225 1369 1521 1681 1849 2025 2209 2401 2601 2809 3025
[29] 3249 3481 3721 3969 4225 4489 4761 5041 5329 5625 5929 6241 6561 6889
[43] 7225 7569 7921 8281 8649 9025 9409 9801
當然, 如果不跑for-loop 也可以直接將每個元素, 即向量化(vectorization) 取平方
foo.squared2 = foo^2
> foo.squared2
[1] 1 9 25 49 81 121 169 225 289 361 441 529 625 729
[15] 841 961 1089 1225 1369 1521 1681 1849 2025 2209 2401 2601 2809 3025
[29] 3249 3481 3721 3969 4225 4489 4761 5041 5329 5625 5929 6241 6561 6889
[43] 7225 7569 7921 8281 8649 9025 9409 9801
將不必要的元素刪除提升計算速度
(1)速度快版本: 利用 !is.na(bar.squared) 找出非nan的元素, 並回存bar.squared
bar = seq(1,200000, by=2)
bar.squared = rep(NA, 200000)
for (i in 1:length(bar) ) {
bar.squared[i] = bar[i]^2
}
#get rid of excess NAs
bar.squared = bar.squared[!is.na(bar.squared)]
summary(bar.squared)
(2)速度慢版本: bar.squared含有nan, 造成進行統計時速度降低很多
bar = seq(1, 200000, by=2)
bar.squared = NULL
for (i in 1:length(bar) ) {
bar.squared[i] = bar[i]^2
}
summary(bar.squared)
Least squares
x <- c(0.02, 0.02, 0.06, 0.06, 0.11, 0.11, 0.22, 0.22, 0.56, 0.56,
1.10, 1.10)
y <- c(76, 47, 97, 107, 123, 139, 159, 152, 191, 201, 207, 200)
利用plot(x,y)繪圖
假設模型 y = p1*x/p2*x
則SSE(sum of squared error) 函式為
fn <- function(p) sum((y - (p[1] * x)/(p[2] + x))^2)
如果使用spline
> spline(x, y)
$x
[1] 0.02000000 0.08352941 0.14705882 0.21058824 0.27411765 0.33764706
[7] 0.40117647 0.46470588 0.52823529 0.59176471 0.65529412 0.71882353
[13] 0.78235294 0.84588235 0.90941176 0.97294118 1.03647059 1.10000000
$y
[1] 61.5000 118.0314 143.0677 154.2625 162.6790 171.1198 179.2518 186.7350
[9] 193.2297 198.4064 202.1812 204.7155 206.1813 206.7504 206.5948 205.8866
[17] 204.7977 203.5000
並將spline的數據和原始x,y畫在同一張, 不過這一組spline看起來的確很”卡”, 因此有必要內插一些數據讓曲線更加平滑
假設我們猜一組參數p1=200, p2 =0.1當作初始值, 並在x區間每隔0.05內插一筆, 存成xfit
xfit <- seq(.02, 1.1, .05)
plot(x, y)
xfit <- seq(.02, 1.1, .05)
yfit <- 200 * xfit/(0.1 + xfit)
lines(spline(xfit, yfit))
曲線有比較平滑不過誤差頗大, 接著利用nlm(Non-Linear Minimization)函式進行參數估測(parameter estimation)
out <- nlm(fn, p = c(200, 0.1), hessian = TRUE)
其中fn為error function
fn <- function(p) sum((y - (p[1] * x)/(p[2] + x))^2)
p為估測參數猜測的初始值200和0.1
1: > out <- nlm(fn, p = c(200, 0.1), hessian = TRUE)
2: > out
3: $minimum
4: [1] 1195.449
5:
6: $estimate
7: [1] 212.68384222 0.06412146
8:
9: $gradient
10: [1] -0.0001534978 0.0934207640
11:
12: $hessian
13: [,1] [,2]
14: [1,] 11.94725 -7661.319
15: [2,] -7661.31875 8039421.152
16:
17: $code
18: [1] 3
19:
20: $iterations
21: [1] 26
$esitmate 即為參數估測結果
1: > out$estimate[1]
2: [1] 212.6838
3: > out$estimate[2]
4: [1] 0.06412146
因此, yfit更新為
yfit <- out$estimate[1] * xfit/(out$estimate[2] + xfit)
相當於
yfit <- 212.68384222 * xfit/(0.06412146 + xfit)
(1)繪製原始數據
plot(x, y)
(2)非線性估測的yfit
lines(spline(xfit, yfit))
(1)+(2)結果如下
假設估測初始值猜測從(0,0)開始猜, 發現迭代次數反而比較少…
out <- nlm(fn, p = c(0, 0), hessian = TRUE)
$minimum
[1] 1195.449
$estimate
[1] 212.68385154 0.06412146
$gradient
[1] -4.433729e-05 2.426467e-02
$hessian
[,1] [,2]
[1,] 11.94725 -7661.319
[2,] -7661.31938 8039422.512
$code
[1] 3
$iterations
[1] 16
Dynamic graphics
R透過rggobi套件, 才有辦法進行互動
3D繪圖
rgl 套件: 3D visualization device system (OpenGL)
參考資料:
留言列表