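Background
I have recently been adding unit tests to an OpenResty-based application, and we chose the busted framework to run them. busted integrates luacov, so it can output code coverage with a little configuration: adding the coverage option to the .busted configuration file turns on coverage output.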
coverage = true
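With that in place, running make test produces a luacov.report.out file. It shows the line coverage of each file during the unit test run, plus a summary of overall coverage, similar to the following: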
---------------------------------------------------------
Summary
---------------------------------------------------------
File Hits Missed Coverage
--------------------------------------------------------
foo.lua 89 41 68.46%
bar.lua 29 16 64.44%
...
--------------------------------------------------------
Total 774 739 51.16%
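The Coverage column in the summary is consistent with hits / (hits + missed). The following is a minimal C sketch of that arithmetic, not luacov's own code; the function names coverage_percent and format_coverage are hypothetical, and returning 0% for a file with no counted lines is an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Coverage as a percentage: hits / (hits + missed) * 100.
 * Assumption: a file with no counted lines is reported as 0%. */
double coverage_percent(unsigned long hits, unsigned long missed) {
  unsigned long total = hits + missed;
  if (total == 0) {
    return 0.0;
  }
  return 100.0 * (double)hits / (double)total;
}

/* Write the percentage with two decimals followed by '%' into buf.
 * Returns whatever snprintf returns. */
int format_coverage(char *buf, size_t n,
                    unsigned long hits, unsigned long missed) {
  assert(buf != NULL && n > 0);
  return snprintf(buf, n, "%.2f%%", coverage_percent(hits, missed));
}
```

For the foo.lua row above, format_coverage(buf, sizeof buf, 89, 41) writes "68.46%", which matches the report.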
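Problem
At first glance the report looks fine, but a closer look at the per-line results for individual files shows that luacov's counts are not accurate. In foo.lua, for example, the line just before each closing brace is marked as not covered, although those lines clearly should be excluded from the coverage calculation, just like the line above them.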
5 FOO = {
errId = "000001",
**0 errMsg = "error foo"
5 },
5 BAR = {
errId = "000002",
**0 errMsg = 'error bar'
5 },
5 FOO_BAR = {
errId = "000003",
**0 errMsg = 'error foo bar'
5 },
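An issue on luacov's GitHub shows that other users have hit the same problem. The author of luacheck replied that, if he recalled correctly, it is related to LuaJIT, and he wrote cluacov, a C extension for luacov that improves both the accuracy of the statistics and the runtime performance.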
So I installed it:
luarocks install cluacov
The luacov source shows that if cluacov is installed, luacov uses the extension by default, so no configuration is needed:
local cluacov_ok = pcall(require, "cluacov.version")
local deepactivelines
if cluacov_ok then
deepactivelines = require("cluacov.deepactivelines")
end
Full of anticipation, I ran make test again and immediately got an error, so I ran luacov on its own to see what was happening:
luajit -lluacov xxx.lua
It reported core dumped right away, so I inspected the core dump with gdb:
(gdb) bt
#0 0x00007f787f4dd1d4 in add_activelines () from /usr/local/lib/lua/5.1/cluacov/deepactivelines.so
#1 0x00007f787f4dd342 in l_deepactivelines () from /usr/local/lib/lua/5.1/cluacov/deepactivelines.so
#2 0x0000000000407ec2 in lj_BC_FUNCC ()
#3 0x000000000040a5ff in gc_call_finalizer (g=g@entry=0x7f787f4a83f0, L=L@entry=0x7f787f4a8380, mo=<optimized out>,
o=o@entry=0x7f787f4c7630) at lj_gc.c:511
#4 0x000000000040a765 in gc_finalize (L=L@entry=0x7f787f4a8380) at lj_gc.c:558
#5 0x000000000040be48 in lj_gc_finalize_udata (L=L@entry=0x7f787f4a8380) at lj_gc.c:565
#6 0x0000000000414591 in cpfinalize (L=0x7f787f4a8380, dummy=<optimized out>, ud=<optimized out>) at lj_state.c:272
#7 0x00000000004082b8 in lj_vm_cpcall ()
#8 0x0000000000414a24 in lua_close (L=0x7f787f4a8380) at lj_state.c:298
#9 0x0000000000404df8 in main (argc=3, argv=<optimized out>) at luajit.c:584
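Because the extension had not been built with the -g debug option, the backtrace only shows that the crash happened inside deepactivelines.so, which is exactly the C extension I had just installed. It looked like trouble, so I cloned the source, rebuilt it with debugging enabled, and replaced deepactivelines.so: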
(gdb) bt
#0 0x00007f38e8c002a4 in add_activelines (L=L@entry=0x7f38e8bcb380, proto=0xf66ffbf) at src/cluacov/deepactivelines.c:58
#1 0x00007f38e8c00412 in l_deepactivelines (L=0x7f38e8bcb380) at src/cluacov/deepactivelines.c:97
#2 0x0000000000407ec2 in lj_BC_FUNCC ()
#3 0x000000000040a5ff in gc_call_finalizer (g=g@entry=0x7f38e8bcb3f0, L=L@entry=0x7f38e8bcb380, mo=<optimized out>,
o=o@entry=0x7f38e8bd5ab8) at lj_gc.c:511
#4 0x000000000040a765 in gc_finalize (L=L@entry=0x7f38e8bcb380) at lj_gc.c:558
#5 0x000000000040be48 in lj_gc_finalize_udata (L=L@entry=0x7f38e8bcb380) at lj_gc.c:565
#6 0x0000000000414591 in cpfinalize (L=0x7f38e8bcb380, dummy=<optimized out>, ud=<optimized out>) at lj_state.c:272
#7 0x00000000004082b8 in lj_vm_cpcall ()
#8 0x0000000000414a24 in lua_close (L=0x7f38e8bcb380) at lj_state.c:298
#9 0x0000000000404df8 in main (argc=3, argv=<optimized out>) at luajit.c:584
(gdb) f 0
#0 0x00007f38e8c002a4 in add_activelines (L=L@entry=0x7f38e8bcb380, proto=0xf66ffbf) at src/cluacov/deepactivelines.c:58
58 const void *lineinfo = proto_lineinfo(proto);
(gdb) p *proto
Cannot access memory at address 0xf66ffbf
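This time the crash is visible at proto_lineinfo(proto) in add_activelines. According to the code, proto is a pointer to a GCproto, but it points to an invalid address. Here is the GCproto definition in the lj2 headers bundled with cluacov: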
typedef struct GCproto {
GCHeader;
uint8_t numparams; /* Number of parameters. */
uint8_t framesize; /* Fixed frame size. */
MSize sizebc; /* Number of bytecode instructions. */
GCRef gclist;
MRef k; /* Split constant array (points to the middle). */
MRef uv; /* Upvalue list. local slot|0x8000 or parent uv idx. */
MSize sizekgc; /* Number of collectable constants. */
MSize sizekn; /* Number of lua_Number constants. */
MSize sizept; /* Total size including colocated arrays. */
uint8_t sizeuv; /* Number of upvalues. */
uint8_t flags; /* Miscellaneous flags (see below). */
uint16_t trace; /* Anchor for chain of root traces. */
/* ------ The following fields are for debugging/tracebacks only ------ */
GCRef chunkname; /* Name of the chunk this function was defined in. */
BCLine firstline; /* First line of the function definition. */
BCLine numline; /* Number of lines for the function definition. */
MRef lineinfo; /* Compressed map from bytecode ins. to source line. */
MRef uvinfo; /* Upvalue names. */
MRef varinfo; /* Names and compressed extents of local variables. */
} GCproto;
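The LuaJIT I use is the one built as part of OpenResty 1.19.3.1, which enables GC64 mode by default, whereas these headers contain no GC64-related macros, so the crash was very likely related to that. Looking at GCproto in OpenResty's LuaJIT, there is an extra conditional compilation block: with GC64 enabled, the struct has one more field.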
#if LJ_GC64
uint32_t unused_gc64;
#endif
So I replaced the corresponding files with the same-named ones from OpenResty's LuaJIT, recompiled, and replaced deepactivelines.so.
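To illustrate why a single extra field matters, here is a toy C sketch. The two structs are hypothetical, heavily simplified stand-ins and not the real GCproto layouts; the field positions and helper names are assumptions made only for the demonstration. It shows that code compiled against the layout without the extra field reads lineinfo from the wrong offset of an object laid out with it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified layout the extension was compiled against
 * (no GC64-only field). */
typedef struct ProtoAsCompiled {
  uint32_t sizebc;
  uint32_t firstline;
  uint32_t numline;
  uint32_t lineinfo;
} ProtoAsCompiled;

/* The same toy struct as the runtime lays it out: one extra field. */
typedef struct ProtoAtRuntime {
  uint32_t sizebc;
  uint32_t unused_gc64; /* present only in the runtime's layout */
  uint32_t firstline;
  uint32_t numline;
  uint32_t lineinfo;
} ProtoAtRuntime;

/* Number of bytes by which lineinfo moved because of the extra field. */
size_t lineinfo_offset_shift(void) {
  return offsetof(ProtoAtRuntime, lineinfo) -
         offsetof(ProtoAsCompiled, lineinfo);
}

/* Read lineinfo from a runtime object using the stale, compiled-in
 * offset, the way a mismatched extension effectively would. */
uint32_t read_lineinfo_with_stale_layout(const ProtoAtRuntime *p) {
  uint32_t value;
  size_t stale = offsetof(ProtoAsCompiled, lineinfo);
  assert(p != NULL);
  assert(stale + sizeof value <= sizeof *p); /* stay inside the object */
  memcpy(&value, (const unsigned char *)p + stale, sizeof value);
  return value;
}
```

In this toy, the stale read returns the object's numline rather than its lineinfo. When the misread field is used as a pointer, as lineinfo is, following it can land on an unmapped address, which fits the kind of crash seen in the backtrace.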
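Problem solved
With the newly built .so, the run indeed no longer produces a core dump. For the same test code, the coverage obtained with the cluacov extension is shown below. Compared with plain luacov, total coverage rose by 2.48 percentage points (from 51.16% to 53.64%):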
---------------------------------------------------------
Summary
---------------------------------------------------------
File Hits Missed Coverage
----------------------------------------------------------
foo.lua 89 2 97.80%
bar.lua 29 16 64.44%
...
----------------------------------------------------------
Total 774 669 53.64%
Looking at the coverage details for foo.lua, the annoying **0 markers are gone:
5 FOO = {
errId = "000001",
errMsg = "error foo"
5 },
5 BAR = {
errId = "000002",
errMsg = 'error bar'
5 },
5 FOO_BAR = {
errId = "000003",
errMsg = 'error foo bar'
5 },
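At this point the problem appears to be solved, so I deployed the new .so to the CI machines, and the reported coverage will be somewhat more accurate from now on. The code changes are on GitHub.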