What fast- and slow-converging MCMC data look like in phylogenetics

Data that converge quickly carry a consistent signal across sites:

    49   50
T_01 ACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_02 AACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_03 AAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_04 AAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_05 AAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_06 AAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_07 AAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_08 AAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_09 AAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_10 AAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_11 AAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_12 AAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_13 AAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_14 AAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_15 AAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_16 AAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_17 AAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_18 AAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_19 AAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_20 AAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_21 AAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_22 AAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCC
T_23 AAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCC
T_24 AAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCC
T_25 AAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCC
T_26 AAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCC
T_27 AAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCC
T_28 AAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCC
T_29 AAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCC
T_30 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCC
T_31 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCC
T_32 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCC
T_33 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCC
T_34 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCC
T_35 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCC
T_36 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCC
T_37 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCC
T_38 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCC
T_39 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCC
T_40 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCC
T_41 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCC
T_42 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCC
T_43 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCC
T_44 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCC
T_45 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCC
T_46 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCC
T_47 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCC
T_48 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACC
T_49 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAC
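
The alignment above follows a simple staircase rule: taxon i carries i leading A's followed by C's, so every site supports the same nested ordering of taxa. A minimal Python sketch that regenerates it, assuming that rule; the function names `staircase_alignment` and `to_phylip` are illustrative and not part of any tool used here:

```python
def staircase_alignment(n_taxa=49, n_sites=50):
    """Return (name, sequence) pairs; taxon i has i leading A's, then C's."""
    rows = []
    for i in range(1, n_taxa + 1):
        name = f"T_{i:02d}"
        seq = "A" * i + "C" * (n_sites - i)
        rows.append((name, seq))
    return rows


def to_phylip(rows):
    """Format rows in the same layout as the listing above."""
    n_sites = len(rows[0][1])
    header = f"    {len(rows)}   {n_sites}"
    body = [f"{name} {seq}" for name, seq in rows]
    return "\n".join([header] + body)


rows = staircase_alignment()
print(to_phylip(rows))
```

Because each column splits the taxa into "T_01 .. T_(j-1)" versus "T_j .. T_49", no two columns contradict each other, which is what "consistent signal" means in this example.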

This data file converges quickly in MrBayes:

Chain results (1000000 generations requested):

0 -- [-348856.375] [...7 remote chains...]
1000 -- [-74447.065] [...7 remote chains...] -- 0:00:00
2000 -- [-45964.399] [...7 remote chains...] -- 0:08:19
3000 -- (-42577.505) [...7 remote chains...] -- 0:11:04
4000 -- (-41096.351) [...7 remote chains...] -- 0:08:18
5000 -- [-40706.751] [...7 remote chains...] -- 0:09:57

Average standard deviation of split frequencies: 0.113752

6000 -- (-40614.814) [...7 remote chains...] -- 0:11:02
7000 -- (-40580.207) [...7 remote chains...] -- 0:09:27
8000 -- (-40543.464) [...7 remote chains...] -- 0:10:20
9000 -- (-40526.129) [...7 remote chains...] -- 0:11:00
10000 -- (-40498.073) [...7 remote chains...] -- 0:09:54

Average standard deviation of split frequencies: 0.003343

11000 -- (-40515.419) [...7 remote chains...] -- 0:10:29
12000 -- (-40512.721) [...7 remote chains...] -- 0:10:58
13000 -- (-40518.564) [...7 remote chains...] -- 0:11:23
14000 -- (-40517.432) [...7 remote chains...] -- 0:10:33
15000 -- (-40498.062) [...7 remote chains...] -- 0:10:56

Average standard deviation of split frequencies: 0.000000

16000 -- (-40503.398) [...7 remote chains...] -- 0:11:16
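
The diagnostic to watch in this log is the "Average standard deviation of split frequencies" line. A small Python sketch for pulling those values out of saved log text; the name `asdsf_values` is hypothetical, and the 0.01 cutoff is a commonly quoted rule of thumb assumed here for illustration, not a result of this post:

```python
import re

# Matches the diagnostic lines exactly as they appear in the log above.
ASDSF_RE = re.compile(
    r"Average standard deviation of split frequencies:\s*([0-9]*\.?[0-9]+)"
)


def asdsf_values(log_text):
    """Return every reported ASDSF value, in the order it appears."""
    return [float(m.group(1)) for m in ASDSF_RE.finditer(log_text)]


sample_log = """\
5000 -- [-40706.751] [...7 remote chains...] -- 0:09:57

Average standard deviation of split frequencies: 0.113752

10000 -- (-40498.073) [...7 remote chains...] -- 0:09:54

Average standard deviation of split frequencies: 0.003343
"""

values = asdsf_values(sample_log)
CUTOFF = 0.01  # assumed rule-of-thumb threshold
below_cutoff = [v < CUTOFF for v in values]
print(values)
print(below_cutoff)
```

In the run above the value drops from 0.113752 to 0.003343 within 10000 generations, which is the pattern one would expect from data with a single consistent signal.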

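Now suppose we shuffle the OTU labels in this file to produce a new file, for example by simply swapping the label positions within the first 25 rows and the last 25 rows. The new file supports a phylogenetic tree completely different from the original one: T_01 and T_49 go from being separated by changes at almost every one of the 50 sites to being separated by a change at a single site. If we then merge the two files into one alignment and randomly pick 50 of its columns, we get data that converge slowly: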

    49   50
T_01 AAAACCAACCCCCACCCCCCCCCCCACCCACCCCCCCCCCCCACCCCCCC
T_02 AAAACCAACCCCCACCCCCCCCCCCACCCACCCCCCCCCCCCACCCCCCC
T_03 AAAACCAACCCCCACCCCACCCCCCACCCACCCCCCCCCCCCACCCCCCC
T_04 AAAACCAACCCCCACCCCACCCCCCACCCACCCCCCCCCCCCACCCCCCC
T_05 AAAACCAACCCCCACCCCACCCCCCACCCACCCCCCCCCCCCACCCCCCC
T_06 AAAACCAACCCCCACCCCACCCCCCCCCCACCCCCACCCCCCACCCCCCC
T_07 AAAACCAACCCCCACCCCAACCCCCCCCCACCCACACCCCCCACCCCCCC
T_08 AAAACCAACCCCCACCCCAACCCCCCCCCACCCACACCCCCCACCCCCCC
T_09 CAAACCAACCCACACCCCAACCCCCCCCCACCCACACCCCCCACCCCCCC
T_10 CAAACCAACCCACACCCCAACCCCCCCCCACCCACACCCCCCACCCCCCC
T_11 CAAACCAACCCACACCCCAACCCCCCCCCACCCACACCCACCACCCACCC
T_12 CAAACCAACCCACACCCCAACCCCCCCCCACCCACACCCACCACCCACCC
T_13 CAAACCAACCCACACCCCAACCCCCCCCCACCCACACCCACCACCCACCC
T_14 CAAACCAACCCACACCACAACCCCCCCCCACCCACACCCACCACCCACCC
T_15 CACACCAACCCACACCACAACCCCCCCCCACCCACACCCACCACCCACCC
T_16 CACACCAACCCACACCACAACCCCCCCCCACCCACACCCACCACCCACCC
T_17 CACACCAACCCACACCACAACCCCCCCCCACCCACACCCACCACCCACCC
T_18 CACACCAACCCACCCCACAACCCCCCCCCACCCACACCCACCACCCACCC
T_19 CACACCAACCCACCCCACAACCCCCCCCCACCCACACCCACCACCCACCC
T_20 CACACCAACCCACCCCACAACCCACCCCCACCAACACCCACCCCCCACCC
T_21 CACACCAACCCACCCCACAACCCACCCCCACCAACACCCACCCCCCACCC
T_22 CACACCAACCCACCCCACAACCCACCCCCACCAACACCCACCCCCCACCC
T_23 CACACCACCCCACCCCACAACCCACCCCCACCAACACCCACCCCCCACCC
T_24 CACCCCCCCCCACCCCACAACCCACCCCCACCAACACCCACCCCCCACCC
T_25 CCCCCCCCCCCACCCCACAACCCACCCCCCCCAACACCAACCCCCCACCC
T_26 AAAACAAACCCAAACCAAAAACAAAACCCACCAAAACCAACCACAAAAAA
T_27 AAAACAAACCCAAACCAAAAACAAAACCCACCAAAACCAACCACAAAAAA
T_28 AAAACAAACCAAAACCAAAAACAAAACCCACCAACACCAACCACAAAAAA
T_29 AAAAAAAACCAAAACCACAAACAAAACCCAACAACACCAACCACAAAAAA
T_30 AAAAAAAACCAAAACCACAAACAAAACCCAACAACACCAACCACAAAAAA
T_31 AAAAAAAACCAAAACCACAAACCAAACCCAACAACACCAACCACAAAAAA
T_32 AAAAAAAACCAAAACCACAAACCAAACACAACAACACCAACCACCAAAAA
T_33 AAAAACAACCAAAACCACAAACCACACACAACAACACCAACCACCAAAAC
T_34 AAAAACAACCAAAACCACAAACCACAAACAACAACACCAACCACCAAAAC
T_35 AAAAACAAACAAAACCACAAACCACAAACAACAACACCAACCACCAAAAC
T_36 AAAAACAAACAAAACCACAAACCACAAACAACAACACCAACAACCAAAAC
T_37 AAAAACAAACAAAACCACAAAACACAAACAACAACACAAACAACCAAAAC
T_38 AAAAACAAACAAAACCACAAAACACAAACAACAACACAAACAACCAAAAC
T_39 AAAAACAAACAAAACCACAAAACACAAACAACAACACAAACAACCAAAAC
T_40 AAAAACAAACAAAACAACAAAACACAAACAACAACACAAACAACCAAAAC
T_41 AAAAACAAACAACACAACAAAACACAAACAACAACACAAACAAACAAACC
T_42 AAAAACAAACAACACAACAAAACACAAACAACAACACAAACAAACAAACC
T_43 AAAAACAAACAACACAACAAAACACAAACAACAACACAAACAAACAAACC
T_44 AAAAACAAACAACACAACAAAACACAAACAACAACACAAACAAACAAACC
T_45 AAAAACAAACAACAAAACAAAACACAAACAACAACACAAACAAACAAACC
T_46 AAAAACAAACAACAAAACAACACACAAACAACAACAAAAACAAACAAACC
T_47 AAAAACAAACAACAAAACAACACACAAACAACAACAAAAACAAACAAACC
T_48 AAAAACAAACAACAAAACAACACACAAACAACAACAAAAAAAAACCAACC
T_49 AAAAACAAACAACAAAACAACACACAAACAACAACAAAAAAAAACCACCC
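
The construction described above (relabel, merge, sample columns) can be sketched in Python. The exact relabelling is an assumption: here the row order is reversed inside each half, which is one shuffle that puts T_01 and T_49 a single substitution apart. The function names and the random seed are illustrative, so the sampled columns will not match the listing above:

```python
import random


def staircase(n_taxa=49, n_sites=50):
    """Original alignment: taxon i has i leading A's, then C's."""
    return ["A" * i + "C" * (n_sites - i) for i in range(1, n_taxa + 1)]


def relabel(seqs, split=25):
    """Assumed shuffle: reverse the row order inside each half."""
    return seqs[:split][::-1] + seqs[split:][::-1]


def hamming(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def conflicting_alignment(seqs, n_keep=50, seed=0):
    """Merge original and relabelled columns, then keep a random subset."""
    shuffled = relabel(seqs)
    merged = [a + b for a, b in zip(seqs, shuffled)]  # 100 columns per row
    rng = random.Random(seed)
    cols = rng.sample(range(len(merged[0])), n_keep)
    return ["".join(row[c] for c in cols) for row in merged]


original = staircase()
shuffled = relabel(original)
mixed = conflicting_alignment(original)

print(hamming(original[0], original[-1]))  # 48
print(hamming(shuffled[0], shuffled[-1]))  # 1
for i, seq in enumerate(mixed, start=1):
    print(f"T_{i:02d} {seq}")
```

Roughly half of the sampled columns come from each source, so the sites split into two groups that favour incompatible groupings of the same taxa.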

A data set like this, with strongly conflicting phylogenetic signals, still struggles to converge after many generations:

Average standard deviation of split frequencies: 0.242210

46000 -- (-9100.801) [...7 remote chains...] -- 0:12:26
47000 -- (-9099.480) [...7 remote chains...] -- 0:12:30
48000 -- (-9118.462) [...7 remote chains...] -- 0:12:33
49000 -- (-9090.241) [...7 remote chains...] -- 0:12:36
50000 -- (-9101.250) [...7 remote chains...] -- 0:12:40

Average standard deviation of split frequencies: 0.247652

51000 -- (-9105.160) [...7 remote chains...] -- 0:12:24
52000 -- (-9101.906) [...7 remote chains...] -- 0:12:27
53000 -- (-9106.679) [...7 remote chains...] -- 0:12:30
54000 -- (-9105.292) [...7 remote chains...] -- 0:12:33
55000 -- (-9104.724) [...7 remote chains...] -- 0:12:36

Average standard deviation of split frequencies: 0.236504

56000 -- (-9104.708) [...7 remote chains...] -- 0:12:21

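Many factors can make convergence difficult, and assessing convergence is hard in its own right. My question is: is a data set that converges well a data set with better phylogenetic signal? No. It is a data set whose signal is internally consistent in a statistical sense, which is not necessarily better. In addition, this statistic (the average standard deviation of split frequencies) cannot fully assess convergence on its own, so further checks should be considered.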