Python - Notes on Garbage Collection
I read an article on Python garbage collection and am using this post to keep notes on it. Original article:
Introduction to Python Memory Management
1. reference counting
2. garbage collection
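Before Python 2.0, reference counting was the only mechanism used for memory management.
How it works: Python records how many times an object is referenced by other objects. When a reference to the object is removed, its reference count drops. Once the count reaches 0, the object is freed.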
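The counting described above can be observed with sys.getrefcount. This is a minimal sketch, assuming CPython; the Node class and variable names are made up for illustration, and the absolute numbers may differ between interpreter versions, so only the differences matter.

```python
import sys


class Node:
    """Hypothetical placeholder object, used only for this illustration."""


obj = Node()
# getrefcount reports one extra reference: the temporary one held by the call itself.
base = sys.getrefcount(obj)

alias = obj                         # a second name now refers to the same object
after_alias = sys.getrefcount(obj)  # one higher than before

del alias                           # removing the reference lowers the count again
after_del = sys.getrefcount(obj)

print(base, after_alias, after_del)
```

When the last reference disappears, the count reaches 0 and CPython frees the object immediately.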
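This approach is efficient, but it comes with some caveats. For example, it cannot handle reference cycles (which feel a bit like a deadlock): two objects that refer to each other keep each other's count above 0 even after nothing else refers to them.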
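A minimal sketch of such a cycle, assuming CPython. The Node class and the probe name are illustrative, not taken from the original article. The weak reference only observes the object without keeping it alive, and the automatic collector is switched off so it cannot interfere with the demonstration.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self):
        self.other = None


gc.disable()  # keep the automatic collector out of the demonstration
try:
    a, b = Node(), Node()
    a.other, b.other = b, a   # a -> b -> a forms a reference cycle
    probe = weakref.ref(a)    # observes the object without adding a reference

    del a, b                  # no names refer to the pair any more
    alive_after_del = probe() is not None      # counts never reached 0

    gc.collect()              # the cycle detector finds and frees the pair
    alive_after_collect = probe() is not None
finally:
    gc.enable()

print(alive_after_del, alive_after_collect)
```

Reference counting alone leaves the pair in memory after the del; only the cycle-detecting collector reclaims it.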
Automatic Garbage Collection of Cycles
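Because of the reference cycle problem above, a scheduled activity is needed to collect garbage automatically.
When the number of allocations minus the number of deallocations exceeds a threshold, the garbage collector runs automatically. The thresholds can be inspected through the gc module.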
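A small sketch of inspecting the thresholds with the gc module. The variable names are my own. The exact default values depend on the interpreter version (many CPython releases report (700, 10, 10)), so the example prints them rather than assuming them.

```python
import gc

# Three thresholds, one per generation of the collector.
thresholds = gc.get_threshold()
# Current counters that are compared against those thresholds.
counts = gc.get_count()

print("Garbage collection thresholds:", thresholds)
print("Current collection counts:", counts)
```

The thresholds can be changed with gc.set_threshold(), although the defaults are usually a reasonable starting point.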
Note, however, that running out of memory does not trigger automatic garbage collection. Instead, Python raises an exception, which you must handle or the program crashes.
"This is aggravated by the fact that the automatic garbage collection places high weight upon the NUMBER of free objects, not on how large they are. Thus any portion of your code which frees up large blocks of memory is a good candidate for running manual garbage collection."
Manual Garbage Collection
Although reference cycles should be avoided as far as possible when writing code, it is still worth having a way to deal with them. The collector can be invoked manually by calling gc.collect().
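A minimal sketch of a manual collection, assuming CPython. The Node class and the make_cycles helper are hypothetical. gc.collect() returns the number of unreachable objects it found; the exact figure varies between interpreter versions, so the example only prints it.

```python
import gc


class Node:
    """Hypothetical object that can point at a partner Node."""

    def __init__(self):
        self.partner = None


def make_cycles(n):
    """Create n pairs of objects that reference each other, then drop them."""
    for _ in range(n):
        a, b = Node(), Node()
        a.partner, b.partner = b, a


gc.disable()      # avoid an automatic run in the middle of the demonstration
try:
    gc.collect()  # start from a clean state
    make_cycles(100)
    found = gc.collect()  # manual collection of the 100 abandoned cycles
finally:
    gc.enable()

print("unreachable objects found:", found)
```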
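Manual garbage collection can be scheduled in two ways:
1. Time-based: the collector is run at a fixed time interval.
2. Event-based: the collector is run when a particular event occurs. For example, when a user disconnects from the application or when the application is known to enter an idle state.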
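The two strategies could be sketched roughly as follows. CollectionScheduler, tick and on_event are hypothetical names of my own, not part of the gc module or the original article. The clock is injectable so that the example runs without waiting.

```python
import gc
import time


class CollectionScheduler:
    """Hypothetical helper that decides when to call gc.collect()."""

    def __init__(self, interval_seconds, clock=time.monotonic):
        self.interval = interval_seconds
        self.clock = clock
        self.last_run = clock()
        self.runs = 0

    def tick(self):
        """Time-based: call regularly from the main loop."""
        now = self.clock()
        if now - self.last_run >= self.interval:
            self._collect(now)
            return True
        return False

    def on_event(self):
        """Event-based: call from a handler, e.g. on user disconnect or idle."""
        self._collect(self.clock())

    def _collect(self, now):
        gc.collect()
        self.last_run = now
        self.runs += 1


# Usage with a fake clock, so no real time has to pass.
fake_now = [0.0]
scheduler = CollectionScheduler(3600, clock=lambda: fake_now[0])
early = scheduler.tick()   # interval not yet elapsed, nothing happens
fake_now[0] = 3600.0
due = scheduler.tick()     # one hour later, a collection runs
scheduler.on_event()       # e.g. a user has just disconnected
print(early, due, scheduler.runs)
```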
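- Do not run garbage collection too freely; it can seriously hurt performance, because every memory object has to be evaluated.
- Run manual garbage collection once your application has started up and settled into steady state.
- Run manual garbage collection after infrequently run sections of code that use and then free large blocks of memory.
- When a section of code is timing-sensitive, run manual garbage collection before or after it.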
1. Do not run garbage collection too freely, as it can take considerable time to evaluate every memory object within a large system. For example, one team having memory issues tried calling gc.collect() between every step of a complex start-up process, increasing the boot time by 20 times (2000%). Running it more than a few times per day – without specific design reasons – is likely a waste of device resources.
2. Run manual garbage collection after your application has completed start-up and moves into steady-state operation. This frees potentially huge blocks of memory used to open and parse files, to build and modify object lists, and even code modules never to be used again. For example, one application reading XML configuration files was consuming about 1.5MB of temporary memory during the process. Without manual garbage collection, there is no way to predict when that 1.5MB of memory will be returned to the Python memory pools for reuse.
3. Run manual garbage collection after infrequently run sections of code which use and then free large blocks of memory. For example, consider running garbage collection after a once-per-day task which evaluates thousands of data points, creates an XML 'report', and then sends that report to a central office via FTP or SMTP/email. One application doing such daily reports was creating over 800K worth of temporary sorted lists of historical data. Piggy-backing gc.collect() on such daily chores has the nice side-effect of running it once per day for 'free'.
4. Consider manually running garbage collection either before or after timing-critical sections of code to prevent garbage collection from disturbing the timing. As an example, an irrigation application might sit idle for 10 minutes, then evaluate the status of all field devices and make adjustments. Since delays during system adjustment might affect field device battery life, it makes sense to manually run garbage collection as the gateway is entering the idle period AFTER the adjustment process – or run it every sixth or tenth idle period. This ensures that garbage collection won't be triggered automatically during the next timing-sensitive period.
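One possible way to apply the last recommendation is sketched below. Note that it goes a step beyond the quoted advice: besides collecting after the timing-critical section, it also switches automatic collection off while the section runs. gc_paused and adjust_field_devices are hypothetical names; the latter merely stands in for the irrigation adjustment work.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused(collect_after=True):
    """Hypothetical helper: suspend automatic collection for a critical block."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()       # restore the previous state
        if collect_after:
            gc.collect()      # collect now, while entering the idle period


def adjust_field_devices():
    """Stand-in for the timing-sensitive work in the irrigation example."""
    return sum(range(1000))


enabled_before = gc.isenabled()
with gc_paused():
    enabled_inside = gc.isenabled()   # automatic collection is off here
    result = adjust_field_devices()
enabled_after = gc.isenabled()

print(enabled_inside, enabled_after, result)
```

Collecting right after the critical section means the counters start low again, which makes an automatic run during the next timing-sensitive period less likely.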