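In real projects, the most stubborn problems are memory leaks (along with things like panics). Memory leaks fall into two categories, user space and kernel space, and we look at each in turn.
Finding and fixing a memory leak in user space is relatively simple, and there are many methods and tools for locating the problem. Let's take a look.
1. View memory information
cat /proc/meminfo, free, cat /proc/slabinfo, etc.
2. View process status information
top, ps, cat /proc/<pid>/maps, cat /proc/<pid>/status, ls /proc/<pid>/fd, etc.
We usually start by running ps in a shell to check the state of the running processes. On embedded systems the output may contain less information.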
root@hos-machine:~# ps -uaxw
USER       PID %CPU %MEM    VSZ   RSS TTY STAT START   TIME COMMAND
root         1  0.0  0.1 119872  3328 ?   Ss   Aug10   0:24 /sbin/init splash
root         2  0.0  0.0      0     0 ?   S    Aug10   0:00 [kthreadd]
root         3  0.0  0.0      0     0 ?   S    Aug10   0:44 [ksoftirqd/0]
root         5  0.0  0.0      0     0 ?   S<   Aug10   0:00 [kworker/0:0H]
root         7  0.0  0.0      0     0 ?   S    Aug10   3:50 [rcu_sched]
root         8  0.0  0.0      0     0 ?   S    Aug10   0:00 [rcu_bh]
root         9  0.0  0.0      0     0 ?   S    Aug10   0:12 [migration/0]
root        10  0.0  0.0      0     0 ?   S    Aug10   0:01 [watchdog/0]
root        11  0.0  0.0      0     0 ?   S    Aug10   0:01 [watchdog/1]
root        12  0.0  0.0      0     0 ?   S    Aug10   0:12 [migration/1]
root        13  0.0  0.0      0     0 ?   S    Aug10   1:18 [ksoftirqd/1]
root        15  0.0  0.0      0     0 ?   S<   Aug10   0:00 [kworker/1:0H]
root        16  0.0  0.0      0     0 ?   S    Aug10   0:01 [watchdog/2]
root        17  0.0  0.0      0     0 ?   S    Aug10   0:12 [migration/2]
root        18  0.0  0.0      0     0 ?   S    Aug10   1:19 [ksoftirqd/2]
root        20  0.0  0.0      0     0 ?   S<   Aug10   0:00 [kworker/2:0H]
root        21  0.0  0.0      0     0 ?   S    Aug10   0:01 [watchdog/3]
root        22  0.0  0.0      0     0 ?   S    Aug10   0:13 [migration/3]
root        23  0.0  0.0      0     0 ?   S    Aug10   0:41 [ksoftirqd/3]
root        25  0.0  0.0      0     0 ?   S<   Aug10   0:00 [kworker/3:0H]
root        26  0.0  0.0      0     0 ?   S    Aug10   0:00 [kdevtmpfs]
root        27  0.0  0.0      0     0 ?   S<   Aug10   0:00 [netns]
root       329  0.0  0.0      0     0 ?   S<   Aug10   0:00 [ext4-rsv-conver]
root       339  0.0  0.0      0     0 ?   S<   Aug10   0:05 [kworker/1:1H]
root       343  0.0  0.0      0     0 ?   S<   Aug10   0:11 [kworker/3:1H]
root       368  0.0  0.0  39076  1172 ?   Ss   Aug10   0:10 /lib/systemd/systemd-journald
root       373  0.0  0.0      0     0 ?   S    Aug10   0:00 [kauditd]
root       403  0.0  0.0  45772    48 ?   Ss   Aug10   0:01 /lib/systemd/systemd-udevd
root       444  0.0  0.0      0     0 ?   S<   Aug10   0:09 [kworker/2:1H]
systemd+   778  0.0  0.0 102384   516 ?   Ssl  Aug10   0:04 /lib/systemd/systemd-timesyncd
root       963  0.0  0.0 191264     8 ?   Ssl  Aug10   0:00 /usr/bin/vmhgfs-fuse -o subtype=vmhgfs-fuse,allow_other /mnt/hgfs
root       987  9.6  0.0 917024     0 ?   Ssl  Aug10 416:08 /usr/sbin/vmware-vmblock-fuse -o subtype=vmware-vmblock,default_permi
root      1007  0.2  0.1 162728  3084 ?   Sl   Aug10  10:14 /usr/sbin/vmtoolsd
root      1036  0.0  0.0  56880   844 ?   S    Aug10   0:00 /usr/lib/vmware-vgauth/VGAuthService -s
root      1094  0.0  0.0 203216   388 ?   Sl   Aug10   1:48 ./ManagementAgentHost
root      1100  0.0  0.0  28660   136 ?   Ss   Aug10   0:02 /lib/systemd/systemd-logind
message+  1101  0.0  0.1  44388  2608 ?   Ss   Aug10   0:21 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile
root      1110  0.0  0.0 173476   232 ?   Ssl  Aug10   0:54 /usr/sbin/thermald --no-daemon --dbus-enable
root      1115  0.0  0.0   4400    28 ?   Ss   Aug10   0:14 /usr/sbin/acpid
root      1117  0.0  0.0  36076   568 ?   Ss   Aug10   0:01 /usr/sbin/cron -f
root      1133  0.0  0.0 337316   976 ?   Ssl  Aug10   0:00 /usr/sbin/ModemManager
root      1135  0.0  0.2 634036  5340 ?   Ssl  Aug10   0:19 /usr/lib/snapd/snapd
root      1137  0.0  0.0 282944   392 ?   Ssl  Aug10   0:06 /usr/lib/accountsservice/accounts-daemon
syslog    1139  0.0  0.0 256396   352 ?   Ssl  Aug10   0:04 /usr/sbin/rsyslogd -n
avahi     1145  0.0  0.0  44900  1092 ?   Ss   Aug10   0:11 avahi-daemon: running [hos-machine.local]
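This output comes from an Ubuntu system, so it is fairly detailed, and we can clearly compare VSZ with RSS. VSZ is the virtual address space the process has requested, while RSS is the physical memory the process actually occupies.
When a process leaks memory, its VSZ typically keeps growing and its physical memory grows along with it. If you see this, the usual first step is to check that every malloc has a matching free. Using the process ID we can view more detailed virtual-memory information. For example: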
root@hos-machine:~# cat /proc/1298/status
Name:   sshd
State:  S (sleeping)
Tgid:   1298
Ngid:   0
Pid:    1298
PPid:   1
TracerPid:      0
Uid:    0       0       0       0
Gid:    0       0       0       0
FDSize: 128
Groups:
NStgid: 1298
NSpid:  1298
NSpgid: 1298
NSsid:  1298
VmPeak:    65620 kB
VmSize:    65520 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:      5480 kB
VmRSS:      5452 kB
VmData:      580 kB
VmStk:       136 kB
VmExe:       764 kB
VmLib:      8316 kB
VmPTE:       148 kB
VmPMD:        12 kB
VmSwap:        0 kB
HugetlbPages:          0 kB
Threads:        1
SigQ:   0/7814
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000001000
SigCgt: 0000000180014005
CapInh: 0000000000000000
CapPrm: 0000003fffffffff
CapEff: 0000003fffffffff
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
Seccomp:        0
Cpus_allowed:   ffffffff,ffffffff
Cpus_allowed_list:      0-63
Mems_allowed:   00000000,00000001
Mems_allowed_list:      0
voluntary_ctxt_switches:        1307
nonvoluntary_ctxt_switches:     203
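To watch for the steady growth described above, the Vm* fields of this status dump can be sampled periodically and compared. The following is only a minimal Python sketch: the function names `parse_vm_fields` and `looks_like_leak` and the 1024 kB threshold are illustrative assumptions, not part of any existing tool, and a monotonically growing RSS is a hint rather than proof of a leak.

```python
import re


def parse_vm_fields(status_text):
    """Return {field_name: size_in_kB} for the Vm* lines of a status dump."""
    fields = {}
    for line in status_text.splitlines():
        # Matches lines such as "VmRSS:      5452 kB".
        match = re.match(r"^(Vm\w+):\s+(\d+)\s+kB$", line.strip())
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def looks_like_leak(samples, min_growth_kb=1024):
    """Heuristic (assumption): VmRSS never shrinks between samples and
    grows by at least min_growth_kb from the first sample to the last."""
    if len(samples) < 2:
        return False
    rss = [sample["VmRSS"] for sample in samples]
    never_shrinks = all(later >= earlier for earlier, later in zip(rss, rss[1:]))
    return never_shrinks and (rss[-1] - rss[0]) >= min_growth_kb
```

In practice one would read /proc/<pid>/status every few seconds, pass each dump through `parse_vm_fields`, and feed the collected list to `looks_like_leak`.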
To see how many files this process has open:
ls -l /proc/1298/fd/* | wc -l
To view a process's detailed memory mappings:
cat /proc/7393/maps
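The maps output is long, so it can help to total the mapped size per backing object and see where the address space goes. A rough Python sketch, assuming the usual six-column maps layout (address range, permissions, offset, device, inode, optional path); the function name `mapping_sizes` and the "[anon]" label for unnamed mappings are choices made for this example:

```python
from collections import defaultdict


def mapping_sizes(maps_text):
    """Sum the size in bytes of all mappings, grouped by path name."""
    sizes = defaultdict(int)
    for line in maps_text.splitlines():
        parts = line.split(None, 5)
        if len(parts) < 5:
            continue  # not a mapping line
        start, end = (int(value, 16) for value in parts[0].split("-"))
        # Mappings without a path (anonymous memory) get a placeholder label.
        name = parts[5].strip() if len(parts) == 6 else "[anon]"
        sizes[name] += end - start
    return dict(sizes)
```

Comparing two such summaries taken some time apart shows which region grew. Growth in `[heap]` or in the anonymous mappings is what one would typically expect from a malloc-based leak, though that is a rule of thumb rather than a guarantee.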
Here is what each meminfo field means (see Documentation/filesystems/proc.txt):
MemTotal: Total usable ram (i.e. physical ram minus a few reserved bits and the kernel binary code)
MemFree: The sum of LowFree+HighFree
Buffers: Relatively temporary storage for raw disk blocks; shouldn't get tremendously large (20MB or so)
Cached: in-memory cache for files read from the disk (the pagecache). Doesn't include SwapCached
SwapCached: Memory that once was swapped out, is swapped back in but still also is in the swapfile (if memory is needed it doesn't need to be swapped out AGAIN because it is already in the swapfile. This saves I/O)
Active: Memory that has been used more recently and usually not reclaimed unless absolutely necessary.
Inactive: Memory which has been less recently used. It is more eligible to be reclaimed for other purposes
HighTotal:
HighFree: Highmem is all memory above ~860MB of physical memory. Highmem areas are for use by userspace programs, or for the pagecache. The kernel must use tricks to access this memory, making it slower to access than lowmem.
LowTotal:
LowFree: Lowmem is memory which can be used for everything that highmem can be used for, but it is also available for the kernel's use for its own data structures. Among many other things, it is where everything from the Slab is allocated. Bad things happen when you're out of lowmem.
SwapTotal: total amount of swap space available
SwapFree: Memory which has been evicted from RAM, and is temporarily on the disk
Dirty: Memory which is waiting to get written back to the disk
Writeback: Memory which is actively being written back to the disk
AnonPages: Non-file backed pages mapped into userspace page tables
AnonHugePages: Non-file backed huge pages mapped into userspace page tables
Mapped: files which have been mmaped, such as libraries
Slab: in-kernel data structures cache
SReclaimable: Part of Slab, that might be reclaimed, such as caches
SUnreclaim: Part of Slab, that cannot be reclaimed on memory pressure
PageTables: amount of memory dedicated to the lowest level of page tables.
NFS_Unstable: NFS pages sent to the server, but not yet committed to stable storage
Bounce: Memory used for block device "bounce buffers"
WritebackTmp: Memory used by FUSE for temporary writeback buffers
CommitLimit: Based on the overcommit ratio ('vm.overcommit_ratio'), this is the total amount of memory currently available to be allocated on the system. This limit is only adhered to if strict overcommit accounting is enabled (mode 2 in 'vm.overcommit_memory').
The CommitLimit is calculated with the following formula: CommitLimit = ('vm.overcommit_ratio' * Physical RAM) + Swap
For example, on a system with 1G of physical RAM and 7G of swap with a `vm.overcommit_ratio` of 30 it would yield a CommitLimit of 7.3G.
For more details, see the memory overcommit documentation in vm/overcommit-accounting.
Committed_AS: The amount of memory presently allocated on the system. The committed memory is a sum of all of the memory which has been allocated by processes, even if it has not been "used" by them as of yet. A process which malloc()'s 1G of memory, but only touches 300M of it will only show up as using 300M of memory even if it has the address space allocated for the entire 1G. This 1G is memory which has been "committed" to by the VM and can be used at any time by the allocating application. With strict overcommit enabled on the system (mode 2 in 'vm.overcommit_memory'), allocations which would exceed the CommitLimit (detailed above) will not be permitted. This is useful if one needs to guarantee that processes will not fail due to lack of memory once that memory has been successfully allocated.
VmallocTotal: total size of vmalloc memory area
VmallocUsed: amount of vmalloc area which is used
VmallocChunk: largest contiguous block of vmalloc area which is free
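The CommitLimit formula quoted above is easy to misread because 'vm.overcommit_ratio' is a percentage. A small Python sketch of the arithmetic, using the 1G RAM / 7G swap / ratio 30 example from the documentation text (the function name `commit_limit_gb` is made up for this illustration):

```python
def commit_limit_gb(ram_gb, swap_gb, overcommit_ratio):
    """CommitLimit = (overcommit_ratio percent of physical RAM) + swap."""
    return ram_gb * overcommit_ratio / 100 + swap_gb


# Example from the documentation text: 1G RAM, 7G swap, ratio 30.
limit = commit_limit_gb(ram_gb=1, swap_gb=7, overcommit_ratio=30)
print(f"CommitLimit is about {limit:.1f}G")
```

So 30% of 1G contributes 0.3G, and adding the 7G of swap gives the 7.3G stated in the text.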
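We only need to watch a few of these fields: Buffers, Cached, Slab, Active, and AnonPages.
Active = Active(anon) + Active(file) (and likewise for Inactive)
AnonPages: Non-file backed pages mapped into userspace page tables
The difference between Buffers and Cached is explained clearly in the descriptions above.
Sometimes the system can crash even when nothing is leaking, for example when cache and buffers take up too much memory or too many files are open, and waiting for the system to reclaim them on its own can take a very long time.
The meminfo file under /proc summarizes current memory usage for the whole system. Roughly, available physical memory = MemFree + Buffers + Cached. When MemFree runs low, the kernel uses its writeback mechanism (the pdflush threads on older kernels) to write cached and buffered memory back to backing storage, which frees that memory for processes. The cache can also be released explicitly by hand: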
drop_caches
Writing to this will cause the kernel to drop clean caches, dentries and inodes from memory, causing that memory to become free.
To free pagecache:
echo 1 > /proc/sys/vm/drop_caches
To free dentries and inodes:
echo 2 > /proc/sys/vm/drop_caches
To free pagecache, dentries and inodes:
echo 3 > /proc/sys/vm/drop_caches
As this is a non-destructive operation and dirty objects are not freeable, the user should run `sync` first.
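The rule of thumb given earlier, available memory = MemFree + Buffers + Cached, can be checked before and after dropping caches. A minimal Python sketch that works on the text of /proc/meminfo; the function names are illustrative, and the numbers in the sample are invented for the example rather than taken from a real machine:

```python
def parse_meminfo(text):
    """Return {field: value} from meminfo text (values in kB where given)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        tokens = rest.split()
        if sep and tokens and tokens[0].isdigit():
            info[key.strip()] = int(tokens[0])
    return info


def estimate_available_kb(info):
    """Rough estimate from the article: MemFree + Buffers + Cached."""
    return info["MemFree"] + info["Buffers"] + info["Cached"]


# Invented sample values, for illustration only.
sample = """MemTotal:        2017504 kB
MemFree:          120000 kB
Buffers:           30000 kB
Cached:           500000 kB
"""
print(estimate_available_kb(parse_meminfo(sample)), "kB")
```

On kernels that report a MemAvailable field in meminfo, that value is the kernel's own estimate and is usually the better number to use; the sum above is only the approximation described in this article.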
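User-space memory checking can also be done with mtrace, which is very simple to use and was covered in an earlier article. Other well-known tools include valgrind, dmalloc, and memwatch, each with its own strengths.
Locating a kernel memory leak is more involved. First determine whether the kernel is actually leaking, then narrow down which operation triggers it, and then check the suspect modules. Kernel memory allocation mostly goes through kmalloc, that is, through the slab/slub/slob allocators, so if Slab in meminfo keeps growing, the problem is very likely in the kernel. We can look at the slab information in more detail:
cat /proc/slabinfo
If slabtop is available, even better. Between the two you can usually tell whether the kernel is leaking memory and which kind of object is involved.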
cat /proc/slabinfo
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
fuse_request           0      0    288   28    2 : tunables    0    0    0 : slabdata      0      0      0
fuse_inode             0      0    448   18    2 : tunables    0    0    0 : slabdata      0      0      0
fat_inode_cache        0      0    424   19    2 : tunables    0    0    0 : slabdata      0      0      0
fat_cache              0      0     24  170    1 : tunables    0    0    0 : slabdata      0      0      0
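Since a kernel leak shows up as a slab cache that keeps growing, comparing two slabinfo snapshots taken some time apart is a simple way to find the suspect object type. A Python sketch under the assumption of the version 2.1 column layout shown above (name, active_objs, num_objs, objsize, ...); the function names and the use of num_objs * objsize as an approximate cache size are choices made for this example:

```python
def parse_slabinfo(text):
    """Return {cache_name: approximate_bytes} using num_objs * objsize."""
    caches = {}
    for line in text.splitlines():
        if line.startswith("#") or line.startswith("slabinfo"):
            continue  # skip the version and header lines
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            num_objs, objsize = int(fields[2]), int(fields[3])
        except ValueError:
            continue  # not a data line
        caches[fields[0]] = num_objs * objsize
    return caches


def growing_caches(before, after, min_growth_bytes=0):
    """List (name, growth_in_bytes) for caches that grew, largest first."""
    growth = {name: size - before.get(name, 0) for name, size in after.items()}
    grown = [(name, delta) for name, delta in growth.items()
             if delta > min_growth_bytes]
    return sorted(grown, key=lambda item: item[1], reverse=True)
```

A cache that stays at the top of this list across repeated comparisons, while the workload is steady, is a reasonable place to start looking. Reading /proc/slabinfo normally requires root.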
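The kernel configuration already includes some options for automatic memory-leak checking (for example kmemleak, CONFIG_DEBUG_KMEMLEAK), which can be enabled for tracing and debugging.
Nothing here goes very deep; consider it a starting point for further discussion.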