2. 程式化停止 Timer
$ go tool pprof ./gb cpu-27989.pprof
Welcome to pprof! For help, type 'help'.
(pprof) top10
Total: 967 samples
61 6.3% 6.3% 61 6.3% runtime.xchg
44 4.6% 10.9% 44 4.6% runtime.memmove
34 3.5% 14.4% 34 3.5% net/textproto.canonicalMIMEHeaderKey
29 3.0% 17.4% 31 3.2% runtime.settype_flush
27 2.8% 20.2% 227 23.5% runtime.mallocgc
26 2.7% 22.9% 26 2.7% runtime.casp
25 2.6% 25.4% 87 9.0% runtime.MCache_Alloc
25 2.6% 28.0% 46 4.8% runtime.markallocated
23 2.4% 30.4% 23 2.4% bytes.IndexByte
23 2.4% 32.8% 92 9.5% selectgo
$ echo " PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND"; top -d 1 -b | grep gb
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27989 brandon 20 0 753m 176m 2844 S 57.9 1.1 0:00.59 gb
27989 brandon 20 0 1012m 234m 2852 S 116.8 1.5 0:01.78 gb
27989 brandon 20 0 1058m 280m 2852 S 107.6 1.8 0:02.87 gb
27989 brandon 20 0 1105m 327m 2852 S 107.7 2.1 0:03.96 gb
27989 brandon 20 0 1194m 346m 2852 S 116.9 2.2 0:05.14 gb
27989 brandon 20 0 1195m 347m 2852 S 107.0 2.2 0:06.22 gb
27989 brandon 20 0 1195m 347m 2852 S 103.9 2.2 0:07.27 gb
27989 brandon 20 0 1275m 348m 2852 S 116.6 2.2 0:08.45 gb
27989 brandon 20 0 1277m 350m 2852 S 111.7 2.2 0:09.58 gb
27989 brandon 20 0 1277m 350m 2852 S 110.8 2.2 0:10.70 gb
27989 brandon 20 0 1349m 351m 2852 S 113.6 2.2 0:11.85 gb
27989 brandon 20 0 1359m 353m 2852 S 115.9 2.2 0:13.02 gb
這時的 cpu profiling 已經比較分散平均,看不太出明顯熱點,另外從 top 數據來看 350m 的記憶體用量有些高,改用 heap profiler 觀察看看是否有 memory leak 的狀況
在程式內加入 Mem Profiling 程式
func main() {
SetMemProfileRate(1) //(請參考附件 profilingtool.go)
// ignored
GC()
DumpHeap()
PrintGCSummary()
}
$ ./gb -c 100 -n 200000 -k http://localhost/10k.dat
(略)
Concurrency Level: 100
Time taken for tests: 25.33 seconds
Complete requests: 200000
Failed requests: 0
HTML transferred: 2048000000 bytes
Requests per second: 7896.70 [#/sec] (mean)
Time per request: 12.664 [ms] (mean)
Time per request: 0.127 [ms] (mean, across all concurrent requests)
HTML Transfer rate: 78966.96 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Total: 0 0 76.91 6 1866
Percentage of the requests served within a certain time (ms)
50% 6
66% 8
75% 10
80% 11
90% 14
95% 16
98% 18
99% 21
100% 1866 (longest request)
NumGC:13 Pause:731.86ms Pause(Avg):604.31ms Overhead:27.92% Alloc:46.39M Sys:696.17M Alloc(Rate):33.91M/s Histogram:1.85s 1.85s 1.85s
這裡用設定的 sample rate 比較小,搜集的資料比較完整,相對也比較多效能消耗,這裡的平均停頓時間就高了許多(604.31ms vs 40.91ms)
brandon@brandon-laptop:~/workspaces/go/src/bench/gb$ echo " PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND"; top -d 1 -b | grep gb
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
29236 brandon 20 0 487m 48m 2944 S 48.1 0.3 0:00.49 gb
29236 brandon 20 0 595m 156m 2944 S 102.1 1.0 0:01.52 gb
29236 brandon 20 0 808m 233m 2956 S 123.9 1.5 0:02.77 gb
29236 brandon 20 0 916m 278m 2956 S 137.8 1.8 0:04.16 gb
29236 brandon 20 0 960m 322m 2956 S 137.7 2.0 0:05.55 gb
29236 brandon 20 0 1004m 366m 2956 S 135.7 2.3 0:06.92 gb
29236 brandon 20 0 1047m 410m 2956 S 136.7 2.6 0:08.30 gb
29236 brandon 20 0 1072m 434m 2956 S 153.2 2.7 0:09.85 gb
29236 brandon 20 0 1072m 434m 2956 S 171.4 2.7 0:11.58 gb
29236 brandon 20 0 1072m 434m 2956 S 141.8 2.7 0:13.01 gb
29236 brandon 20 0 1072m 435m 2956 S 137.7 2.8 0:14.40 gb
29236 brandon 20 0 1072m 436m 2956 S 132.6 2.8 0:15.74 gb
29236 brandon 20 0 1072m 436m 2956 S 135.7 2.8 0:17.11 gb
29236 brandon 20 0 1072m 436m 2956 S 135.3 2.8 0:18.48 gb
29236 brandon 20 0 1072m 436m 2956 S 168.4 2.8 0:20.18 gb
29236 brandon 20 0 1072m 436m 2956 S 153.7 2.8 0:21.73 gb
29236 brandon 20 0 1073m 438m 2956 S 126.6 2.8 0:23.01 gb
29236 brandon 20 0 1073m 438m 2956 S 130.8 2.8 0:24.33 gb
29236 brandon 20 0 1073m 438m 2956 S 132.8 2.8 0:25.67 gb
29236 brandon 20 0 1073m 438m 2956 S 136.6 2.8 0:27.05 gb
29236 brandon 20 0 1073m 438m 2956 S 156.5 2.8 0:28.63 gb
29236 brandon 20 0 1073m 438m 2956 S 143.8 2.8 0:30.08 gb
29236 brandon 20 0 1073m 439m 2956 S 126.5 2.8 0:31.36 gb
29236 brandon 20 0 1076m 441m 2956 S 134.6 2.8 0:32.72 gb
29236 brandon 20 0 1076m 441m 2956 S 140.6 2.8 0:34.14 gb
29236 brandon 20 0 1076m 441m 2956 S 153.5 2.8 0:35.69 gb
29236 brandon 20 0 1076m 441m 2956 S 139.6 2.8 0:37.10 gb
29236 brandon 20 0 1076m 441m 2956 S 131.7 2.8 0:38.43 gb
$ go tool pprof ./gb heap-29236-1.pprof
(pprof) top10
Total: 42.4 MB
33.6 79.1% 79.1% 35.6 83.9% time.NewTimer
2.0 4.8% 83.9% 2.0 4.8% addtimer
1.5 3.6% 87.5% 1.5 3.6% main.NewBenchmark
1.5 3.6% 91.1% 2.9 6.7% main.(*Benchmark).Run
1.2 2.8% 93.9% 1.2 2.9% main.NewHttpWorker
0.6 1.5% 95.4% 0.6 1.5% newdefer
0.6 1.4% 96.8% 0.6 1.4% bufio.NewReaderSize
0.4 1.0% 97.8% 0.4 1.0% bufio.NewWriterSize
0.3 0.7% 98.5% 1.6 3.8% net/http.(*Transport).dialConn
0.1 0.3% 98.8% 0.1 0.3% runtime.malg
程式要結束前還有 42.4 MB 沒被回收,有 memory leak
(pprof) web time.NewTimer
從圖中來看 timer.NewTimer 是 main.(*HttpWorker).Run 裡面的 time.After(d timer.Duration) 建立的
(pprof) list main.\(\*HttpWorker\).Run
Total: 42.4 MB
ROUTINE ====================== main.(*HttpWorker).Run in /home/brandon/workspaces/go/src/bench/gb/http.go
0.0 35.6 Total MB (flat / cumulative)
. . 29:
. . 30: var (
. . 31: invalidContnetSize = errors.New("invalid content size")
. . 32: )
. . 33:
---
. . 34: func (h *HttpWorker) Run() {
. . 35: <-h.start
. . 36:
. . 37: for job := range h.jobs {
. . 38:
0.0 0.0 39: executionResult := make(chan *Record)
. . 40:
. 0.0 41: go h.send(job, executionResult)
. . 42:
. . 43: select {
. . 44: case record := <-executionResult:
. . 45: h.collector <- record
. . 46:
. 35.6 47: case <-time.After(time.Duration(MAX_RESPONSE_TIMEOUT) * time.Second):
. . 48: h.collector <- &Record{Error: errors.New("execution timeout")}
. . 49:
. . 50: case <-h.stop:
. . 51: return
. . 52: }
. . 53: }
. . 54:
. . 55: }
---
. . 56:
. . 57: func (h *HttpWorker) send(request *http.Request, executionResult chan<- *Record) {
. . 58:
. . 59: record := &Record{}
. . 60: sw := &StopWatch{}
select 裡包 time.After(Duration)這種 timeout control用法,在官方文件裡有很多地方都有使用,追查過後才知道,這方法使用的 Timer 等時間過期就會被 GC 回收了,但這裡的狀況是需要對每組 http request/response 各別做 timeout control,會大量使用到 Timer 若要放給 GC 回收會有點浪費記憶體,也會額外增加 GC 掃描的時間
修正方法
func (h *HttpWorker) Run() {
<-h.start
for job := range h.jobs {
timeout := time.NewTimer(time.Duration(MAX_RESPONSE_TIMEOUT) * time.Second)
executionResult := make(chan *Record)
go h.send(job, executionResult)
select {
case record := <-executionResult:
h.collector <- record
case <-timeout.C:
h.collector <- &Record{Error: errors.New("execution timeout")}
case <-h.stop:
timeout.Stop()
return
}
timeout.Stop()
}
}
執行結果
$ ./gb -c 100 -n 200000 -k http://localhost/10k.dat
$ go tool pprof ./gb heap-6438-1.pprof
(pprof) top10
Total: 8.6 MB
1.5 17.8% 17.8% 3.0 35.3% main.(*Benchmark).Run
1.5 17.8% 35.6% 1.5 17.8% main.(*Monitor).Run
1.5 17.8% 53.4% 1.5 17.8% main.NewBenchmark
1.5 17.5% 70.9% 1.5 17.5% net/textproto.MIMEHeader.Set
1.0 11.7% 82.5% 1.0 11.7% bufio.NewReaderSize
1.0 11.7% 94.2% 1.0 11.7% bufio.NewWriterSize
0.5 5.8% 100.0% 0.5 5.8% newdefer
0.0 0.0% 100.0% 0.5 5.8% bufio.(*Writer).Flush
0.0 0.0% 100.0% 1.0 11.7% bufio.NewReader
0.0 0.0% 100.0% 1.0 11.7% bufio.NewWriter
time.NewTimer 從清單中消失了,接下來還有 8.6 MB 的空間在程式結束前沒被回收,下一篇繼續來追蹤