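Download R and its add-on packages from CRAN: http://CRAN.R-project.org
R-intro.pdf is the official manual and is well worth studying for beginners.
Online Chinese edition: R 導論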
Create a vector named x; <- is the assignment operator.
> x <- c(1, 2, 3, 4, 5)
> x
[1] 1 2 3 4 5
The rightward operator -> is equivalent:
> c(6, 7, 8, 9, 10) -> x
> x
[1] 6 7 8 9 10
The assign() function does the same thing, taking the variable name as a string:
> assign("x", c(2, 3, 4))
> x
[1] 2 3 4
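As a quick check, the three assignment forms above should all bind the same object. This is a minimal sketch; the names a, b and d are illustrative and do not appear in the original notes.

```r
# Three equivalent ways to bind the same vector to a name.
a <- c(1, 2, 3)          # leftward assignment
c(1, 2, 3) -> b          # rightward assignment
assign("d", c(1, 2, 3))  # assignment by name, with the name given as a string

# identical() tests for exact equality of the two objects.
print(identical(a, b))
print(identical(a, d))
```

assign() is mostly useful when the variable name is itself computed at run time; for ordinary scripts <- is the conventional choice.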
Concatenate vectors with c(): y = [x 0 2*x]
> x
[1] 2 3 4
> y <- c(x, 0, 2*x)
> y
[1] 2 3 4 0 4 6 8
Since x has 3 elements, y has 3 + 1 + 3 = 7 elements.
> length(x)
[1] 3
> length(y)
[1] 7
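The length relationship above can be verified directly. A minimal sketch, reusing the x and y from the notes:

```r
x <- c(2, 3, 4)
y <- c(x, 0, 2 * x)   # x, then a single 0, then every element of x doubled

# c() flattens its arguments into one vector, so the lengths add up.
print(length(y) == 2 * length(x) + 1)

# The 0 sits right after the copy of x, at position length(x) + 1.
print(y[length(x) + 1])
```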
Minimum min(), maximum max(), mean mean(), sum sum(), and product prod():
> z = c(min(x), max(x), mean(x), sum(x), prod(x))
> z
[1] 2 4 3 9 24
> plot(z)
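The sample variance computed from its definition matches var(). The output below corresponds to x = c(1, 2, 3, 4, 5); with the earlier x = c(2, 3, 4) both expressions would return 1.
> x <- c(1, 2, 3, 4, 5)
> sum((x-mean(x))^2)/(length(x)-1)
[1] 2.5
> var(x)
[1] 2.5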
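The hand-written variance formula can be wrapped in a small function and compared against var(). A sketch only; sample_var is a hypothetical helper name, not part of base R or the original notes.

```r
# Sample variance from its definition: squared deviations from the mean,
# summed and divided by n - 1.
sample_var <- function(v) {
  n <- length(v)
  sum((v - mean(v))^2) / (n - 1)
}

x <- c(1, 2, 3, 4, 5)
print(sample_var(x))

# all.equal() compares with a numerical tolerance, which is safer than ==
# for floating-point results.
print(all.equal(sample_var(x), var(x)))
```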
Look up documentation with help(). sort() returns a sorted copy of x, in ascending order by default.
> help("sort")
starting httpd help server ... done
> sort(x, decreasing = FALSE)
[1] 1 2 3 4 5
> sort(x, decreasing = TRUE)
[1] 5 4 3 2 1
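Because the x above is already sorted, the effect of sort() is easier to see on unsorted input. A minimal sketch with illustrative data; order() is a related base R function that returns the sorting permutation rather than the sorted values.

```r
x <- c(3, 1, 5, 2, 4)

asc  <- sort(x)                     # ascending copy
desc <- sort(x, decreasing = TRUE)  # descending copy
idx  <- order(x)                    # positions that would sort x

print(asc)
print(desc)
print(x)        # sort() does not modify x itself
print(x[idx])   # indexing by order(x) gives the same result as sort(x)
```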
2015/01/10
pg. 1~14 sect. 2.3 Generating regular sequences
Generate the sequence 1, 2, ..., 15:
> s1 = 1:15
> s1
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
The seq() function gives the same result:
> s2 = seq(from=1, to=15)
> s2
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Generate a sequence with a step of 2:
> s3 = seq(from=1, to=15, by=2)
> s3
[1] 1 3 5 7 9 11 13 15
Generate a decreasing sequence with a step of 0.5:
> s4 = seq(from=14, to=1, by = -0.5)
> s4
[1] 14.0 13.5 13.0 12.5 12.0 11.5 11.0 10.5 10.0 9.5 9.0 8.5 8.0 7.5
[15] 7.0 6.5 6.0 5.5 5.0 4.5 4.0 3.5 3.0 2.5 2.0 1.5 1.0
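The sequence lengths follow from the endpoints and the step, which can be checked as below. A small sketch; the length.out argument of seq() is not used in the notes and is shown here only as an alternative to by.

```r
s_colon <- 1:15
s_seq   <- seq(from = 1, to = 15)
print(all(s_colon == s_seq))   # same values either way

# With a step, the number of elements is (to - from) / by + 1.
s4 <- seq(from = 14, to = 1, by = -0.5)
print(length(s4))
print((1 - 14) / -0.5 + 1)

# length.out fixes the number of points instead of the step.
s5 <- seq(from = 0, to = 1, length.out = 5)
print(s5)
```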
NA (not available) generally marks a missing value. A related special value, NaN (not a number), is produced by undefined numeric operations:
> 0/0
[1] NaN
> Inf-Inf
[1] NaN
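The difference between NA and NaN shows up in the two test functions is.na() and is.nan(). A minimal sketch with an illustrative vector v:

```r
v <- c(1, NA, 0 / 0)   # one ordinary value, one missing value, one NaN

# is.na() is TRUE for both NA and NaN.
print(is.na(v))

# is.nan() is TRUE only for NaN.
print(is.nan(v))

# na.rm = TRUE drops both kinds before computing the mean.
print(mean(v, na.rm = TRUE))
```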
Define a sequence z as follows. Comparing NA with a number yields NA:
> z = c(1:3, NA, 3:-1)
> z
[1] 1 2 3 NA 3 2 1 0 -1
> z>0
[1] TRUE TRUE TRUE NA TRUE TRUE TRUE FALSE FALSE
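Next, select the elements of z that are greater than 0. The expression inside [] is the filter: the first condition is z>0 and the second is is.finite(z). The two are combined with & and used as a logical index into z.
Elements where the result is TRUE are kept; elements where it is FALSE are dropped. is.finite(NA) is FALSE, so the NA is dropped as well.
> x = z[ z>0 & is.finite(z)]
> x
[1] 1 2 3 3 2 1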
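To see why the is.finite() condition matters, compare the filter with and without it. A short sketch; which() is shown only as an alternative way to discard NA from a logical index and is not used in the notes.

```r
z <- c(1:3, NA, 3:-1)

with_na <- z[z > 0]                   # an NA in the index yields an NA element
no_na   <- z[z > 0 & is.finite(z)]    # NA & FALSE is FALSE, so the NA is dropped
by_pos  <- z[which(z > 0)]            # which() keeps only the TRUE positions

print(with_na)
print(no_na)
print(identical(no_na, by_pos))
```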
The paste() function
The paste() function takes an arbitrary number of arguments and concatenates them one by one into character strings. When the arguments differ in length, the shorter one is recycled, as c("str1", "str2") is here:
> x = paste(c("str1", "str2"), 1:3)
> x
[1] "str1 1" "str2 2" "str1 3"
> x[1]
[1] "str1 1"
> x[2]
[1] "str2 2"
> x[3]
[1] "str1 3"
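The recycling in the example above can be made explicit by writing out the shorter vector by hand. A minimal sketch; the names labels and explicit are illustrative.

```r
# The two-element vector is reused from the start once it runs out.
labels <- paste(c("str1", "str2"), 1:3)

# The same call with the recycled values spelled out.
explicit <- paste(c("str1", "str2", "str1"), 1:3)

print(labels)
print(identical(labels, explicit))
print(length(labels))   # the result is as long as the longest argument
```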
2015/1/11 p16 2.6 Character vectors
Generate the strings X1, Y2, X3, Y4, ... with no space between the letter and the number:
> x<-paste(c("X","Y"), 1:10, sep="")
> x
[1] "X1" "Y2" "X3" "Y4" "X5" "Y6" "X7" "Y8" "X9" "Y10"
If the sep argument is omitted, a space is inserted between the letter and the number:
> y = paste(c("X","Y"), 1:10)
> y
[1] "X 1" "Y 2" "X 3" "Y 4" "X 5" "Y 6" "X 7" "Y 8" "X 9" "Y 10"
To insert an underscore "_" between the letter and the number:
> z<-paste(c("X","Y"), 1:10, sep="_")
> z
[1] "X_1" "Y_2" "X_3" "Y_4" "X_5" "Y_6" "X_7" "Y_8" "X_9" "Y_10"
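Two related base R features are worth knowing alongside sep, although the notes do not use them: paste0() is a shorthand for sep = "", and the collapse argument joins the resulting elements into a single string. A minimal sketch:

```r
a <- paste(c("X", "Y"), 1:4, sep = "")
b <- paste0(c("X", "Y"), 1:4)        # same as sep = ""
print(identical(a, b))

# collapse joins the elements of the result into one string.
joined <- paste(b, collapse = ",")
print(joined)
```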
rep() is similar to MATLAB's repmat.
Syntax: rep(x, times = N), where x is the vector to repeat
> rep( c(1,2), times=3)
[1] 1 2 1 2 1 2
rep( c(1,2,3), times=2) repeats the vector [1, 2, 3] twice; the result can then be used as an index into another vector:
> rep(c(1,2,3), times=2)
[1] 1 2 3 1 2 3
> c(1.1, 1.2, 1.3)[rep(c(1,2,3), times=2)]
[1] 1.1 1.2 1.3 1.1 1.2 1.3
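rep() also accepts an each argument, which repeats every element in place instead of repeating the whole vector. This is not covered in the notes; the sketch below simply contrasts the two.

```r
whole   <- rep(c(1, 2), times = 3)   # repeat the whole vector
in_place <- rep(c(1, 2), each = 3)   # repeat each element before moving on

print(whole)
print(in_place)

# Either result can serve as an index vector.
vals <- c(1.1, 1.2, 1.3)
print(vals[rep(1:3, each = 2)])
```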
-------------------------------------------------------------------------------------------------------------
Generate a sequence z = [0, 1, 2, ..., 9], then use the as.*() functions to convert its type: as.character() converts it to character strings, stored in z_str.
Next, convert the character vector back to a numeric vector z_num, and use z==z_num to verify that the two numeric vectors hold the same values.
> z = 0:9
> z_str = as.character(z)
> z_str
[1] "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"
> z_num = as.numeric(z_str)
> z_num
[1] 0 1 2 3 4 5 6 7 8 9
> z==z_num
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
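The round trip preserves the values but not the storage type, and strings that are not numbers become NA. A short sketch; the string "abc" is an illustrative input, and suppressWarnings() is used only to silence the coercion warning that as.numeric() would otherwise emit.

```r
z     <- 0:9
z_num <- as.numeric(as.character(z))

print(all(z == z_num))       # element-wise values agree
print(class(z))              # "integer"
print(class(z_num))          # "numeric"
print(identical(z, z_num))   # FALSE, because the types differ

# A string that cannot be parsed as a number is converted to NA.
bad <- suppressWarnings(as.numeric("abc"))
print(bad)
```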
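Create a variable x initialized as an empty numeric vector, then assign 5 to its third element. The vector grows to length 3, and the elements that were never assigned are NA.
> x <- numeric()
> x
numeric(0)
> x[3] = 5
> x
[1] NA NA 5
Use is.na() to find which elements are not available:
> is.na(x)
[1] TRUE TRUE FALSE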
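The logical vector returned by is.na() can also be used as an index, for example to count the missing elements or replace them. A minimal sketch; replacing NA with 0 is an arbitrary choice made for illustration.

```r
x <- numeric()
x[3] <- 5              # assigning past the end pads the gap with NA

print(length(x))
print(sum(is.na(x)))   # number of missing elements

x[is.na(x)] <- 0       # overwrite only the missing positions
print(x)
```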
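Index:  x   = 1 2 3 4 5 1 2 3 4 5 6
Values: val = 1 2 3 4 5 6 7 8 9 10 11
Convert the index into a factor with factor() so that it can serve as a grouping variable:
> xf <- factor(x)
> xf
[1] 1 2 3 4 5 1 2 3 4 5 6
Levels: 1 2 3 4 5 6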
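For a self-contained session, the data and the factor can be set up as follows. This is a sketch under the assumption that x and val were created from the values listed above and that xf came from factor(x), which is consistent with the printed levels.

```r
x   <- c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6)   # group index for each observation
val <- 1:11                                  # the observations themselves
xf  <- factor(x)                             # the index as a grouping factor

print(levels(xf))   # the distinct groups, as character labels
print(table(xf))    # how many observations fall in each group
```

Groups 1 to 5 each hold two observations and group 6 holds one, which is what makes the array "ragged".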
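Group the values by the factor and apply mean() to get the mean of each group. The first argument 1:11 holds the same values as val.
tapply() is a built-in function: Apply a Function Over a Ragged Array
> y<-tapply(1:11, xf, mean)
> y
  1   2   3   4   5    6
3.5 4.5 5.5 6.5 7.5 11.0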
Now try taking the max, min, and sum of each group:
> z1<-tapply(val, xf, max)
> z2<-tapply(val, xf, min)
> z1
 1  2  3  4  5  6
 6  7  8  9 10 11
> z2
 1  2  3  4  5  6
 1  2  3  4  5 11
> z3<-tapply(val, xf, sum)
> z3
 1  2  3  4  5  6
 7  9 11 13 15 11
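One way to picture what tapply() does is to split the values by group and apply the function to each piece. The sketch below uses split() and sapply(), both base R functions that the notes do not mention, and is meant as an illustration rather than a description of how tapply() is implemented.

```r
xf  <- factor(c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6))
val <- 1:11

groups <- split(val, xf)   # a list with one vector of values per level
print(groups[["1"]])       # the values that belong to group 1

by_split  <- sapply(groups, mean)   # apply mean() to each piece
by_tapply <- tapply(val, xf, mean)  # the one-step equivalent

print(all(by_split == by_tapply))
```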
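A user-defined function can also be passed to tapply(). Here it computes the standard error of the mean. Note that the name stderr masks base R's stderr() connection function in the workspace. Group 6 contains a single observation, so var() returns NA for it.
> stderr <- function(x) sqrt(var(x)/length(x))
> z4<-tapply(val, xf, stderr)
> z4
  1   2   3   4   5   6
2.5 2.5 2.5 2.5 2.5  NA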
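The same calculation can be written with a name that does not clash with base R, and the NA for group 6 can be traced to its group size. A minimal sketch; std_error is a hypothetical name chosen for this example.

```r
xf  <- factor(c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6))
val <- 1:11

# Standard error of the mean: sqrt(variance / n).
std_error <- function(v) sqrt(var(v) / length(v))

se <- tapply(val, xf, std_error)
n  <- tapply(val, xf, length)   # observations per group

print(se)
print(n)   # the group with n = 1 is the one whose standard error is NA
```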
Calling ordered() on each group returns a list with one ordered factor per level:
> z5<-tapply(val, xf, ordered)
> z5
$`1`
[1] 1 6
Levels: 1 < 6

$`2`
[1] 2 7
Levels: 2 < 7

$`3`
[1] 3 8
Levels: 3 < 8

$`4`
[1] 4 9
Levels: 4 < 9

$`5`
[1] 5 10
Levels: 5 < 10

$`6`
[1] 11
Levels: 11
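What distinguishes an ordered factor from a plain one is that its levels have a ranking, so comparisons and max() are meaningful. A short sketch with illustrative labels ("low", "mid", "high") that are not part of the original notes:

```r
# The levels argument fixes the ranking from lowest to highest.
sizes <- ordered(c("low", "high", "mid", "low"),
                 levels = c("low", "mid", "high"))

print(sizes)
print(sizes < "high")            # element-wise comparison against a level
print(as.character(max(sizes)))  # the highest level present
```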
2015/01/15 pg.24
5 Arrays and matrices