Memory-Error Debugging Techniques, Part 1 -- electric-fence and a Walkthrough of Its Source

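　　Out-of-bounds memory bugs are never easy to track down, but they are also fun, a bit like detective work where the case is unravelled thread by thread. Having handled quite a few of them, I decided to collect the techniques into a series for reference. The first article is in fact already written (the introduction to GCC SSP Canary), which makes this the second. I start with electric-fence mainly because its code is simple: it is easy to read, and easy to modify and optimize yourself. A small tool used well becomes a power tool; once you can use it and then change it, you can bend it to your needs. The summary at the end mentions a team that modified tcmalloc to meet their own requirements. A tool is only a tool, and the person using it decides how much it is worth.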

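　　electric-fence is a memory debugging tool for the malloc family of functions. It reimplements memalign, malloc, free, valloc, calloc, strndup, strdup, realloc and related functions, and uses mprotect to detect out-of-bounds accesses. This design means it can only be used to debug illegal accesses (out-of-bounds and, with EF_PROTECT_FREE, use-after-free): it cannot detect memory leaks, and it cannot pinpoint problems with C++ new, new[], delete, delete[] (although it can narrow them down roughly). I still remember a senior engineer asking me to use efence to hunt a memory leak, and an object leak at that...
　　The core of efence consists of initialize, memalign and free. Once these three are understood, the whole mechanism falls into place.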

Using eFence via LD_PRELOAD and GDB

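　　If you only want the core principle, go straight to the section "The memalign function – the core of the allocator" in this article, then follow the article "Using efence to detect memory leak problems" (使用efence检测内存泄漏问题) and you should be able to use efence comfortably. Two ways of debugging with it are also described here: LD_PRELOAD and GDB.
　　For a binary that is already built but not linked against libefence, run it as "LD_PRELOAD=./libefence.so bin". This preloads libefence.so before bin starts, so the malloc-family symbols in efence override the ones in libc.
　　For gdb, add the following to .gdbinit; you can extend it with any other settings you need (see the efence Linux man page). Then, when debugging the program, run efence on to enable it.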

define efence
        set environment EF_ALLOW_MALLOC_0 0
        set environment LD_PRELOAD /usr/lib/libefence.so.0.0
        echo Enabled eFence\n
end
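To try either method, it helps to have a small program with a known overrun. The following is a minimal sketch, not taken from efence: `write_past_end` is a hypothetical helper that writes a chosen number of bytes past the end of a malloc'd block. The volatile pointer is only there so that the compiler does not optimize the writes away.

```c
#include <stdlib.h>

/*
 * Hypothetical demo helper (not part of efence): allocate `size` bytes and
 * write `size + extra` bytes into the block. With extra == 0 every write is
 * in bounds; with extra > 0 the last writes are a deliberate heap overrun.
 * Returns 0 on success, -1 if malloc fails.
 */
int write_past_end(size_t size, size_t extra)
{
    char *buf = malloc(size);
    volatile char *p = buf;   /* volatile: keep the stores in the binary */
    size_t i;

    if (buf == NULL)
        return -1;
    for (i = 0; i < size + extra; i++)
        p[i] = 'A';           /* out of bounds once i >= size */
    free(buf);
    return 0;
}
```

Calling `write_past_end(10, 3)` from a `main` and running the binary under LD_PRELOAD should end in a segmentation fault at the offending store. With the default EF_ALIGNMENT of sizeof(int) (assumed to be 4 here), the 10-byte request is rounded up to 12, so `write_past_end(10, 1)` would likely go unnoticed unless EF_ALIGNMENT is set to 0.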

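The initialize function – configuring run-time options and creating the memory index

　　As the name suggests, initialize sets up the library and its run-time configuration. It has two parts: the first reads the feature switches from environment variables, and the second allocates a block of at least 1 MB and sets up the bookkeeping data structures.

Feature switches

　　Each switch is a global variable. If the code leaves it at -1, initialize reads the environment variable of the same name to turn the feature on or off. The variables are roughly as follows.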

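EF_DISABLE_BANNER: whether to suppress the library version banner (the banner is printed when the value is 0).

EF_ALIGNMENT: the alignment, in bytes, of memory returned by efence malloc. The default is sizeof(int). Because the requested size is rounded up to a multiple of this value, it also bounds the smallest overrun that efence can reliably detect.

EF_PROTECT_BELOW: by default efence detects overruns past the high end of a buffer; setting this to 1 makes it detect underruns below the low end instead.

EF_PROTECT_FREE: enables use-after-free detection; freed memory is never reallocated and stays inaccessible.

EF_ALLOW_MALLOC_0: whether malloc(0) is allowed. When it is 0 (the default), malloc(0) is treated as a probable bug and aborts.

EF_FREE_WIPES: whether to fill a block with 0xbd when it is freed.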
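The "-1 means not decided yet" convention used by these switches can be illustrated on its own. The following is a minimal sketch, assuming a hypothetical helper named `resolve_switch`; efence itself repeats the same few lines per variable inside initialize rather than calling a helper, and EF_DISABLE_BANNER and EF_ALIGNMENT keep the raw atoi value instead of normalising it to 0 or 1.

```c
#include <stdlib.h>

/*
 * Hypothetical helper mirroring the pattern in initialize(): a switch that
 * the code left at -1 is resolved from the environment; any other value was
 * fixed by the program and is kept as is.
 */
int resolve_switch(int current, const char *env_name, int fallback)
{
    const char *text;

    if (current != -1)
        return current;          /* already decided in code */
    text = getenv(env_name);
    if (text == NULL)
        return fallback;         /* variable not set: use the default */
    return atoi(text) != 0;      /* normalise to 0 or 1 */
}
```

For example, `EF_PROTECT_FREE = resolve_switch(EF_PROTECT_FREE, "EF_PROTECT_FREE", 0);` would reproduce the behaviour described above for that switch. A program that wants to pin a setting regardless of the environment simply assigns 0 or 1 to the variable before the first allocation.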

/*
 * initialize sets up the memory allocation arena and the run-time
 * configuration information.
 */
static void
initialize(void)
{
        /* Banner and feature-switch configuration */
       if ( EF_DISABLE_BANNER == -1 ) {
               if ( (string = getenv("EF_DISABLE_BANNER")) != 0 )
                       EF_DISABLE_BANNER = atoi(string);
               else
                       EF_DISABLE_BANNER = 0;
       }

       if ( EF_DISABLE_BANNER == 0 )
               EF_Print(version); // print the efence library version banner

 /*
  * Import the user's environment specification of the default
  * alignment for malloc(). We want that alignment to be under
  * user control, since smaller alignment lets us catch more bugs,
  * however some software will break if malloc() returns a buffer
  * that is not word-aligned.
  *
  * I would like
  * alignment to be zero so that we could catch all one-byte
  * overruns, however if malloc() is asked to allocate an odd-size
  * buffer and returns an address that is not word-aligned, or whose
  * size is not a multiple of the word size, software breaks.
  * This was the case with the Sun string-handling routines,
  * which can do word fetches up to three bytes beyond the end of a
  * string. I handle this problem in part by providing
  * byte-reference-only versions of the string library functions, but
  * there are other functions that break, too. Some in X Windows, one
  * in Sam Leffler's TIFF library, and doubtless many others.
  */

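    /*
     * Alignment: as the English comment above explains, this sets the
     * alignment granularity of malloc. In principle a smaller value catches
     * more illegal accesses, but malloc may then return addresses that are
     * not word-aligned, which breaks some software.
     */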
 if ( EF_ALIGNMENT == -1 ) {
  if ( (string = getenv("EF_ALIGNMENT")) != 0 )
   EF_ALIGNMENT = (size_t)atoi(string);
  else
   EF_ALIGNMENT = sizeof(int);
 }

 /*
  * See if the user wants to protect the address space below a buffer,
  * rather than that above a buffer.
  */
    /*
     * Whether the guard page goes below or above the buffer, i.e. whether
     * underruns (low address) or overruns (high address) are detected.
     */
 if ( EF_PROTECT_BELOW == -1 ) {
  if ( (string = getenv("EF_PROTECT_BELOW")) != 0 )
   EF_PROTECT_BELOW = (atoi(string) != 0);
  else
   EF_PROTECT_BELOW = 0;
 }

 /*
  * See if the user wants to protect memory that has been freed until
  * the program exits, rather than until it is re-allocated.
  */
    /*
     * Switch for use-after-free detection.
     */
 if ( EF_PROTECT_FREE == -1 ) {
  if ( (string = getenv("EF_PROTECT_FREE")) != 0 )
   EF_PROTECT_FREE = (atoi(string) != 0);
  else
   EF_PROTECT_FREE = 0;
 }

 /*
  * See if the user wants to allow malloc(0).
  */
 if ( EF_ALLOW_MALLOC_0 == -1 ) {
  if ( (string = getenv("EF_ALLOW_MALLOC_0")) != 0 )
   EF_ALLOW_MALLOC_0 = (atoi(string) != 0);
  else
   EF_ALLOW_MALLOC_0 = 0;
 }

 /*
  * See if the user wants us to wipe out freed memory.
  */
 if ( EF_FREE_WIPES == -1 ) {
         if ( (string = getenv("EF_FREE_WIPES")) != 0 )
                 EF_FREE_WIPES = (atoi(string) != 0);
         else
                 EF_FREE_WIPES = 0;
 }

    /* Memory allocation and initialization; see the next section. */
}

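The memory-setup part of initialize

　　The memory setup in initialize proceeds as follows:

On the first allocation, efence asks the operating system via mmap for at least MEMORY_CREATION_SIZE bytes, which the code sets to 1 MB; the value is rounded up to the nearest page-aligned size.

The first page of this memory holds the array of slot structures. A slot records the information for one managed memory block, such as the address returned to the user, the internal address, and the sizes, as shown below.

slot[0] describes the slot array itself: its start address and its current size.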

struct _Slot {
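 void *	userAddress; // address returned to the user
 void *	internalAddress; // start of the underlying memory region
 size_t	userSize; // size requested by the user
 size_t	internalSize; // size actually reserved internally
 Mode	mode; // state of this slot, see below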
};

enum _Mode {
 NOT_IN_USE = 0,	/* Available to represent a malloc buffer. */
 FREE,	/* A free buffer. */
 ALLOCATED,	/* A buffer that is in use. */
 PROTECTED,	/* A freed buffer that can not be allocated again. */
 INTERNAL_USE	/* A buffer used internally by malloc(). */
};

The code, annotated:

size_t  size = MEMORY_CREATION_SIZE; //  MEMORY_CREATION_SIZE = 1024*1024;
    
/*
 * Get the run-time configuration of the virtual memory page size.
 */
    // get the current virtual memory page size
bytesPerPage = Page_Size();
/*
 * Figure out how many Slot structures to allocate at one time.
 */
    // how many Slot structures fit in one page
slotCount = slotsPerPage = bytesPerPage / sizeof(Slot);
allocationListSize = bytesPerPage;
if ( allocationListSize > size )
    size = allocationListSize; // if a page is larger than 1 MB, grow the request to one page
if ( (slack = size % bytesPerPage) != 0 )
    size += bytesPerPage - slack;  // round the request up to a multiple of the page size
/*
 * Allocate memory, and break it up into two malloc buffers. The
 * first buffer will be used for Slot structures, the second will
 * be marked free.
 */
    // create size bytes (N * bytesPerPage) of memory with mmap
slot = allocationList = (Slot *)Page_Create(size);
memset((char *)allocationList, 0, allocationListSize);
    // the first page holds the slot array; slot[0] records that page
    // allocateMoreSlots in eFence does not seem to update this entry, which may be a bug
slot[0].internalSize = slot[0].userSize = allocationListSize;
slot[0].internalAddress = slot[0].userAddress = allocationList;
slot[0].mode = INTERNAL_USE;
    // if size is larger than one page, the rest is kept as free memory for user allocations
if ( size > allocationListSize ) {
    slot[1].internalAddress = slot[1].userAddress
     = ((char *)slot[0].internalAddress) + slot[0].internalSize;
    slot[1].internalSize
     = slot[1].userSize = size - slot[0].internalSize;
    slot[1].mode = FREE;
}
/*
 * Deny access to the free page, so that we will detect any software
 * that treads upon free memory.
 */
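 // mprotect the free part of the new memory as PROT_NONE to catch illegal reads and writes
     // arguably the memory behind slot[0] should be protected here too, but it is not; I have not seen this cause a problem
Page_DenyAccess(slot[1].internalAddress, slot[1].internalSize);
/*
 * Account for the two slot structures that we've used.
 */
    // two slots are now in use, so subtract two; efence hardcodes that when
    // unUsedSlots < 7 it allocates one more page to hold new slots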
unUsedSlots = slotCount - 2;
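The sizing arithmetic above can be checked in isolation. The following is a minimal sketch with hypothetical helper names (`round_up_to_page`, `initial_arena_size`, `initial_unused_slots`); the page size and the slot size are passed in as parameters, so no particular platform is assumed.

```c
#include <stddef.h>

/* Round `size` up to a multiple of `page` (page must be non-zero). */
size_t round_up_to_page(size_t size, size_t page)
{
    size_t slack = size % page;
    return slack ? size + (page - slack) : size;
}

/*
 * Size of the first arena: at least `creation_size`, at least one page
 * (needed for the slot array), and always a whole number of pages.
 */
size_t initial_arena_size(size_t creation_size, size_t page)
{
    size_t size = creation_size;
    if (page > size)
        size = page;
    return round_up_to_page(size, page);
}

/* Unused slots after setup: one page of slots minus slot[0] and slot[1]. */
size_t initial_unused_slots(size_t page, size_t slot_size)
{
    return page / slot_size - 2;
}
```

With a 4096-byte page and a Slot of 40 bytes (an assumption that matches two pointers, two size_t fields and an enum on a typical 64-bit build), one page holds 102 slots, so 100 remain unused after initialization.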

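The memalign function – the core of the allocator

　　memalign is the heart of efence. Every block it hands out is a whole number of pages (a constraint of mprotect): the smallest number of pages that holds the user's request, plus one. That extra page is what makes eFence work. For example, when the user requests less than one page, memalign actually reserves two pages, called Page 0 and Page 1 here.

When EF_PROTECT_BELOW is 0 (detecting overruns at the high end), the allocator sets Page 1 to PROT_NONE and Page 0 to read/write, and returns the start address of Page 1 minus the (alignment-rounded) user size, which lies in Page 0.

When EF_PROTECT_BELOW is non-zero (detecting underruns at the low end), Page 0 is set to PROT_NONE and the start address of Page 1 is returned.

With this layout, an out-of-bounds access reads or writes the PROT_NONE page, which raises a page fault (SIGSEGV) and a core dump. That is the whole principle behind efence.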
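The guard-page idea fits in a few lines of POSIX code. The following is a minimal sketch under stated assumptions, not efence's implementation: `guarded_alloc` is a hypothetical function that maps fresh pages for every request, applies no alignment rounding (comparable to EF_ALIGNMENT=0), keeps no slot bookkeeping, and never releases the mapping.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical sketch: reserve whole pages for `size` bytes plus one guard
 * page, and make the guard page PROT_NONE.
 *   protect_below == 0: guard page after the block (catches overruns).
 *   protect_below != 0: guard page before the block (catches underruns).
 * Returns NULL on failure or when size is 0.
 */
void *guarded_alloc(size_t size, int protect_below)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t live = (size + page - 1) / page * page;  /* accessible bytes */
    char *base;
    char *guard;

    if (size == 0)
        return NULL;
    base = mmap(NULL, live + page, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    guard = protect_below ? base : base + live;
    if (mprotect(guard, page, PROT_NONE) != 0) {
        munmap(base, live + page);
        return NULL;
    }
    /* Below: block starts right after the guard page.
     * Above: block ends exactly where the guard page begins. */
    return protect_below ? base + page : base + live - size;
}
```

With `char *p = guarded_alloc(10, 0);`, the bytes p[0] to p[9] are usable and a store to p[10] lands on the guard page, so the process receives SIGSEGV at the exact faulting instruction. With `protect_below` set, the returned pointer is page-aligned and a store to p[-1] faults instead.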

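　　memalign is implemented in the steps below. Clearly, in workloads that allocate many small blocks, memory becomes badly fragmented if the blocks are not freed promptly.

Preprocess the requested size: check for malloc(0), round the size up to the alignment argument, add one page, then round up to a whole number of pages (this is internalSize).

Make sure there are unused slots to record the allocation. If fewer than 7 unused slots remain, allocate a new slot array one page larger than the current one.

Allocate memory: search the FREE entries of the slot array for a block that satisfies internalSize. There are three cases:

1) A block of exactly internalSize exists: use it directly.
2) One or more blocks are larger than the request: pick the smallest, split it in two, return one part to the user, and record the remainder as FREE in a previously NOT_IN_USE slot for later requests.
3) Every FREE block is smaller than the request: create a new page-aligned region of at least 1 MB, record it in an unused (NOT_IN_USE) slot, then proceed as in case 2.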
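The size preprocessing and the best-fit search can be sketched separately from the page protection. The block below is a simplified illustration, assuming hypothetical names (`DemoSlot`, `demo_internal_size`, `demo_best_fit`) and a reduced slot record that keeps only the size and the mode; it does not perform the split or create new memory.

```c
#include <stddef.h>

typedef enum { DEMO_NOT_IN_USE = 0, DEMO_FREE, DEMO_ALLOCATED } DemoMode;

/* Reduced slot record: only the fields the search needs. */
typedef struct {
    size_t   internalSize;
    DemoMode mode;
} DemoSlot;

/* Step 1: round to the alignment, add the guard page, round to pages. */
size_t demo_internal_size(size_t userSize, size_t alignment, size_t page)
{
    size_t slack;
    size_t internal;

    if (alignment > 1 && (slack = userSize % alignment) != 0)
        userSize += alignment - slack;
    internal = userSize + page;
    if ((slack = internal % page) != 0)
        internal += page - slack;
    return internal;
}

/*
 * Step 3: index of the smallest FREE slot whose internalSize is at least
 * `needed`, or -1 when every FREE slot is too small (case 3).
 */
long demo_best_fit(const DemoSlot *slots, size_t count, size_t needed)
{
    long best = -1;
    size_t i;

    for (i = 0; i < count; i++) {
        if (slots[i].mode != DEMO_FREE || slots[i].internalSize < needed)
            continue;
        if (best < 0 || slots[i].internalSize < slots[best].internalSize)
            best = (long)i;
        if (slots[i].internalSize == needed)
            break;               /* exact fit (case 1): stop searching */
    }
    return best;
}
```

A 10-byte request with 4-byte alignment and 4096-byte pages gives an internalSize of 8192, which is the two-page case described earlier. A return value of -1 from `demo_best_fit` corresponds to case 3, where the real allocator creates a new region before splitting it.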

/*
 * This is the memory allocator. When asked to allocate a buffer, allocate
 * it in such a way that the end of the buffer is followed by an inaccessable
 * memory page. If software overruns that buffer, it will touch the bad page
 * and get an immediate segmentation fault. It's then easy to zero in on the
 * offending code with a debugger.
 *
 * There are a few complications. If the user asks for an odd-sized buffer,
 * we would have to have that buffer start on an odd address if the byte after
 * the end of the buffer was to be on the inaccessable page. Unfortunately,
 * there is lots of software that asks for odd-sized buffers and then
 * requires that the returned address be word-aligned, or the size of the
 * buffer be a multiple of the word size. An example are the string-processing
 * functions on Sun systems, which do word references to the string memory
 * and may refer to memory up to three bytes beyond the end of the string.
 * For this reason, I take the alignment requests to memalign() and valloc()
 * seriously, and
 *
 * Electric Fence wastes lots of memory. I do a best-fit allocator here
 * so that it won't waste even more. It's slow, but thrashing because your
 * working set is too big for a system's RAM is even slower.
 */
extern C_LINKAGE void *
memalign(size_t alignment, size_t userSize)
{
    register Slot * slot;
    register size_t count;
    Slot *      fullSlot = 0;
    Slot *      emptySlots[2];
    size_t      internalSize;
    size_t      slack;
    char *      address;
    if ( allocationList == 0 )
        initialize();
    // malloc(0) is rejected when EF_ALLOW_MALLOC_0 is 0
    if ( userSize == 0 && !EF_ALLOW_MALLOC_0 )
        EF_Abort("Allocating 0 bytes, probably a bug.");
    /*
     * If EF_PROTECT_BELOW is set, all addresses returned by malloc()
     * and company will be page-aligned.
     */
    /*
     *   Without low-address protection, round userSize up to a multiple
     *   of the alignment.
     */
    if ( !EF_PROTECT_BELOW && alignment > 1 ) {
        if ( (slack = userSize % alignment) != 0 )
            userSize += alignment - slack;
    }
    /*
     * The internal size of the buffer is rounded up to the next page-size
     * boudary, and then we add another page's worth of memory for the
     * dead page.
     */
     /*
      * Round the request up to the following number of pages (the final
      * + 1 is the dead page):
      * (userSize / bytesPerPage) + (int)(userSize % bytesPerPage != 0) + 1
      */
    internalSize = userSize + bytesPerPage;
    if ( (slack = internalSize % bytesPerPage) != 0 )
        internalSize += bytesPerPage - slack;
    /*
     * These will hold the addresses of two empty Slot structures, that
     * can be used to hold information for any memory I create, and any
     * memory that I mark free.
     */
    /*
     * Keep track of up to two NOT_IN_USE slots for cases 2 and 3 above.
     * case 2: emptySlots[0] records the surplus (FREE) left after the split.
     * case 3: emptySlots[0] records the newly created memory, and
     * emptySlots[1] then plays the case-2 role.
     */
     emptySlots[0] = 0;
     emptySlots[1] = 0;
    /*
     * The internal memory used by the allocator is currently
     * inaccessable, so that errant programs won't scrawl on the
     * allocator's arena. I'll un-protect it here so that I can make
     * a new allocation. I'll re-protect it before I return.
     */
    if ( !noAllocationListProtection )
        Page_AllowAccess(allocationList, allocationListSize);
    /*
     * If I'm running out of empty slots, create some more before
     * I don't have enough slots left to make an allocation.
     */
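    /* If fewer than 7 unused slots remain, allocate one more page to grow the
     * slot array. internalUse guards against re-entry: the call chain is
     * memalign -> allocateMoreSlots (sets internalUse = 1) -> malloc ->
     * memalign, and without this check it would recurse until the stack
     * overflows. Going through malloc here costs an extra page, which is
     * one possible optimization.
     */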
    
    if ( !internalUse && unUsedSlots < 7 ) {
        allocateMoreSlots();
    }
    /*
     * Iterate through all of the slot structures. Attempt to find a slot
     * containing free memory of the exact right size. Accept a slot with
     * more memory than we want, if the exact right size is not available.
     * Find two slot structures that are not in use. We will need one if
     * we split a buffer into free and allocated parts, and the second if
     * we have to create new memory and mark it as free.
     *
     */
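    /* Scan the slot array for a FREE slot at least as large as needed, keeping
     * the smallest one found (best fit). The loop ends early only on an exact fit:
     * 1) an exact-size FREE slot is found and emptySlots[0] is already set; or
     * 2) an exact fit is already held, both emptySlots are set, and another
     *    NOT_IN_USE slot is reached.
     * Otherwise the whole array is scanned.
     */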
    for ( slot = allocationList, count = slotCount ; count > 0; count-- ) {
        if ( slot->mode == FREE
         && slot->internalSize >= internalSize ) {
            if ( !fullSlot
             ||slot->internalSize < fullSlot->internalSize){
                fullSlot = slot;
                if ( slot->internalSize == internalSize
                 && emptySlots[0] )
                    break;  /* All done, */
            }
        }
        else if ( slot->mode == NOT_IN_USE ) {
            if ( !emptySlots[0] )
                emptySlots[0] = slot;
            else if ( !emptySlots[1] )
                emptySlots[1] = slot;
            else if ( fullSlot
             && fullSlot->internalSize == internalSize )
                break;  /* All done. */
        }
        slot++;
    }
    if ( !emptySlots[0] )
        internalError();
    /*
     * If no FREE slot is large enough (fullSlot is still 0), create a new
     * region of at least 1 MB, record it in emptySlots[0], then move
     * emptySlots[1] into emptySlots[0] for the split further down.
     */
    if ( !fullSlot ) {
        /*
         * I get here if I haven't been able to find a free buffer
         * with all of the memory I need. I'll have to create more
         * memory. I'll mark it all as free, and then split it into
         * free and allocated portions later.
         */
        size_t  chunkSize = MEMORY_CREATION_SIZE;
        if ( !emptySlots[1] )
            internalError();
        if ( chunkSize < internalSize )
            chunkSize = internalSize;
        if ( (slack = chunkSize % bytesPerPage) != 0 )
            chunkSize += bytesPerPage - slack;
        /* Use up one of the empty slots to make the full slot. */
        fullSlot = emptySlots[0];
        emptySlots[0] = emptySlots[1];
        fullSlot->internalAddress = Page_Create(chunkSize);
        fullSlot->internalSize = chunkSize;
        fullSlot->mode = FREE;
        unUsedSlots--;
    }
    /*
     * If I'm allocating memory for the allocator's own data structures,
     * mark it INTERNAL_USE so that no errant software will be able to
     * free it.
     */
    if ( internalUse )
        fullSlot->mode = INTERNAL_USE;
    else
        fullSlot->mode = ALLOCATED;
    /*
     * If the buffer I've found is larger than I need, split it into
     * an allocated buffer with the exact amount of memory I need, and
     * a free buffer containing the surplus memory.
     */
    /*
     * If the block found above is larger than needed, record the surplus in
     * emptySlots[0] for later requests. This is one of the author's measures
     * against wasting memory.
     */
    if ( fullSlot->internalSize > internalSize ) {
        emptySlots[0]->internalSize
         = fullSlot->internalSize - internalSize;
        emptySlots[0]->internalAddress
         = ((char *)fullSlot->internalAddress) + internalSize;
        emptySlots[0]->mode = FREE;
        fullSlot->internalSize = internalSize;
        unUsedSlots--;
    }
    
    // The protection layout itself, as described in the text above; ASCII diagrams follow in the comments.
    if ( !EF_PROTECT_BELOW ) {
        /*
         * Arrange the buffer so that it is followed by an inaccessable
         * memory page. A buffer overrun that touches that page will
         * cause a segmentation fault.
         */
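       /* The code produces the layout below. An overrun toward high addresses
          hits the PROT_NONE page, which raises a page fault and a core dump.
          Note that the smaller the user block (MemForUser), the more memory is wasted.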
      low      +---------------+-----------------+
       |       |   WastedMem   |                 |
       |       +---------------+       page1     |
       |       |   MemForUser  |                 |
       |       +---------------+-----------------+
       |       |   PROT_NONE   |       page2     |
      high     +---------------+-----------------+
       */
        address = (char *)fullSlot->internalAddress;
        /* Set up the "live" page. */ // make the page1 part readable and writable
        if ( internalSize - bytesPerPage > 0 )
                Page_AllowAccess(
                 fullSlot->internalAddress
                ,internalSize - bytesPerPage);
        address += internalSize - bytesPerPage;
        /* Set up the "dead" page. */ // make page2 inaccessible
        Page_DenyAccess(address, bytesPerPage);
        /* Figure out what address to give the user. */
        address -= userSize;
    }
    else {  /* EF_PROTECT_BELOW != 0 */
        /*
         * Arrange the buffer so that it is preceded by an inaccessable
         * memory page. A buffer underrun that touches that page will
         * cause a segmentation fault.
         */
       /* A write toward low addresses hits the PROT_NONE page, which raises a page fault and a core dump.
      low      +---------------+-----------------+
       |       |   PROT_NONE   |      page2      |
       |       +---------------+-----------------+
       |       |   MemForUser  |                 |
       |       +---------------+      page1      |
       |       |   WastedMem   |                 |
      high     +---------------+-----------------+
       */
        address = (char *)fullSlot->internalAddress;
        /* Set up the "dead" page. */
        Page_DenyAccess(address, bytesPerPage);
        address += bytesPerPage;
        /* Set up the "live" page. */
        if ( internalSize - bytesPerPage > 0 )
            Page_AllowAccess(address, internalSize - bytesPerPage);
    }
    fullSlot->userAddress = address;
    fullSlot->userSize = userSize;
    /*
     * Make the pool's internal memory inaccessable, so that the program
     * being debugged can't stomp on it.
     */
    if ( !internalUse )
        Page_DenyAccess(allocationList, allocationListSize);
    return address;
}

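The free function – dealing with fragmentation

　　After walking through memalign, it is fairly easy to guess what free has to do.

If EF_PROTECT_FREE (use-after-free detection) is enabled, set slot->mode = PROTECTED so the block is never handed out again; otherwise set it to FREE for later reuse.

If EF_FREE_WIPES (memory poisoning) is enabled, memset the block to 0xbd.

mprotect the freed region as PROT_NONE.

　　What I did not expect is that the author also addresses the fragmentation mentioned above: on free, the block is merged with the adjacent freed blocks before and after it to reduce fragmentation. The merge does look buggy, though.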
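The coalescing step can be illustrated on a simplified block table. The following is a minimal sketch, not efence's code: `DemoBlock`, `demo_find_free` and `demo_free` are hypothetical names, blocks are identified by offsets rather than real addresses, every block is assumed to have a non-zero size, and the PROTECTED state and page protection are left out.

```c
#include <stddef.h>

typedef struct {
    size_t start;    /* offset of the block */
    size_t size;     /* length in bytes, assumed > 0 */
    int    in_use;   /* 1 = allocated, 0 = free */
    int    valid;    /* 0 = record unused, like NOT_IN_USE */
} DemoBlock;

/* Find a free block that starts (match_end == 0) or ends (1) at addr. */
static DemoBlock *demo_find_free(DemoBlock *b, size_t n, size_t addr,
                                 int match_end)
{
    size_t i;
    for (i = 0; i < n; i++) {
        size_t key = match_end ? b[i].start + b[i].size : b[i].start;
        if (b[i].valid && !b[i].in_use && key == addr)
            return &b[i];
    }
    return NULL;
}

/*
 * Free block `idx` and merge it with free neighbours. Returns the number
 * of records released by merging (0, 1 or 2), or -1 for an invalid or
 * double free.
 */
int demo_free(DemoBlock *blocks, size_t n, size_t idx)
{
    DemoBlock *cur, *prev, *next;
    int released = 0;

    if (idx >= n || !blocks[idx].valid || !blocks[idx].in_use)
        return -1;
    cur = &blocks[idx];
    cur->in_use = 0;

    prev = demo_find_free(blocks, n, cur->start, 1);
    if (prev != NULL) {                  /* grow prev over cur */
        prev->size += cur->size;
        cur->valid = 0;
        cur = prev;
        released++;
    }
    next = demo_find_free(blocks, n, cur->start + cur->size, 0);
    if (next != NULL) {                  /* swallow the following block */
        cur->size += next->size;
        next->valid = 0;
        released++;
    }
    return released;
}
```

Freeing a block that sits between two free neighbours leaves one record covering all three and releases two records, which is the effect the `unUsedSlots++` lines have in the real free. Merging purely by address adjacency is also where the concern raised above comes from: two regions obtained from separate mmap calls can be adjacent by coincidence.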

extern C_LINKAGE void free(void * address)
{
    Slot *  slot;
    Slot *  previousSlot = 0;
    Slot *  nextSlot = 0;
    lock();
    if ( address == 0 ) {
        unlock();
        return;
    }
    if ( allocationList == 0 )
        EF_Abort("free() called before first malloc().");
    if ( !noAllocationListProtection )
        Page_AllowAccess(allocationList, allocationListSize);
    // find the slot that owns this address
    slot = slotForUserAddress(address);
    if ( !slot )
        EF_Abort("free(%a): address not from malloc().", address);
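    /*
     *  1) With internalUse == 1, i.e. allocateMoreSlots() -> free(), an
     *  INTERNAL_USE block is freed normally without aborting. With
     *  internalUse == 0, freeing an INTERNAL_USE block means someone is
     *  illegally freeing library-internal memory, so abort.
     *  2) Any other mode that is not ALLOCATED is a double free, because
     *  PROTECTED and FREE are only set by free.
     */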
    if ( slot->mode != ALLOCATED ) {
        if ( internalUse && slot->mode == INTERNAL_USE )
            /* Do nothing. */;
        else {
            EF_Abort(
             "free(%a): freeing free memory."
            ,address);
        }
    }
    if ( EF_PROTECT_FREE )
        slot->mode = PROTECTED;
    else
        slot->mode = FREE;
    if ( EF_FREE_WIPES )
      memset(slot->userAddress, 0xbd, slot->userSize);
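    /*
     *  The author's effort to reduce fragmentation: look up the blocks
     *  immediately before and after this one. If a neighbour is recorded and
     *  is PROTECTED or FREE, merge the two blocks and release one of the
     *  slot structures.
     */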
    previousSlot = slotForInternalAddressPreviousTo(slot->internalAddress);
    nextSlot = slotForInternalAddress(
     ((char *)slot->internalAddress) + slot->internalSize);
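    /*
     * bug case: if separate mmap calls happen to return contiguous regions,
     * previousSlot or nextSlot may adjoin the current slot while being FREE
     * when the current slot is PROTECTED. Merging them then taints memory
     * that was still available for use.
     */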
    if ( previousSlot
     && (previousSlot->mode == FREE || previousSlot->mode == PROTECTED) ) {
        /* Coalesce previous slot with this one. */
        previousSlot->internalSize += slot->internalSize;
        if ( EF_PROTECT_FREE )
            previousSlot->mode = PROTECTED;
        slot->internalAddress = slot->userAddress = 0;
        slot->internalSize = slot->userSize = 0;
        slot->mode = NOT_IN_USE;
        slot = previousSlot;
        unUsedSlots++;
    }
    if ( nextSlot
     && (nextSlot->mode == FREE || nextSlot->mode == PROTECTED) ) {
        /* Coalesce next slot with this one. */
        slot->internalSize += nextSlot->internalSize;
        nextSlot->internalAddress = nextSlot->userAddress = 0;
        nextSlot->internalSize = nextSlot->userSize = 0;
        nextSlot->mode = NOT_IN_USE;
        unUsedSlots++;
    }
    slot->userAddress = slot->internalAddress;
    slot->userSize = slot->internalSize;
    /*
     * Free memory is _always_ set to deny access. When EF_PROTECT_FREE
     * is true, free memory is never reallocated, so it remains access
     * denied for the life of the process. When EF_PROTECT_FREE is false,
     * the memory may be re-allocated, at which time access to it will be
     * allowed again.
     */
    // protect FREE and PROTECTED memory to catch out-of-bounds access and use-after-free
    Page_DenyAccess(slot->internalAddress, slot->internalSize);
    if ( !noAllocationListProtection )
        Page_DenyAccess(allocationList, allocationListSize);
    unlock();
}

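Summary

　　From the analysis above, efence is clearly not perfect and leaves room for optimization. It also has the following limitations:

1. malloc and free both search the slot array linearly, O(n), so performance is poor.
2. Memory consumption is high: every allocation takes at least two pages, so frequent small allocations use memory very inefficiently.
3. It cannot check for overruns and underruns on the same block at the same time.

　　Based on my understanding of this version of the eFence code and the questions that came up while reading it, I made a few modifications (see the commit); they still need to be verified. I would also like to try optimizing this implementation, partly as practice and partly as a new challenge, although efence is admittedly a world away from ASAN. What prompted this was seeing a team customize tcmalloc to get very good control over memory; it made me want to learn more open-source solutions, so that when a similar problem comes up I can tailor one and work more efficiently. The example is this answer on Zhihu: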

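Colleagues on our team built an ultimate solution for all kinds of memory problems (memory leaks, memory corruption and so on), and it works very well. In one sentence: modify tcmalloc and add audit information. The changes cover two aspects:
1. On every allocation, request 12 extra bytes to record the allocator's thread ID, stack ID, a flag for whether the operation is an allocation or a free, the allocation time, and so on.
2. Attach a ring buffer to tcmalloc; on every allocation, record the address, thread ID, stack ID, length, and the allocate/free flag.
The extra memory this takes is actually very small.
As for the memory corruption the asker describes, in our experience most cases are caused by wild pointers. When it happens, analyze the core file and look up the last few generations of allocate/free records for that address in the ring buffer, and the cause is clear at a glance.
Source: [How to track down a coredump caused by an out-of-bounds memory write in a large C program?](https://www.zhihu.com/question/51735480/answer/127297709)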
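The ring-buffer half of that idea is easy to picture in code. The following is a minimal sketch under stated assumptions: the quoted answer gives no implementation details, so every name here (`AuditEvent`, `AuditRing`, `audit_record`, `audit_history`), the field widths and the tiny capacity are hypothetical. The sketch is not thread-safe and is not wired into tcmalloc.

```c
#include <stddef.h>
#include <stdint.h>

#define AUDIT_CAPACITY 8         /* deliberately tiny for the demo */

typedef struct {
    uintptr_t address;           /* block address */
    uint32_t  thread_id;
    uint32_t  stack_id;
    size_t    length;
    int       is_free;           /* 0 = allocation, 1 = free */
} AuditEvent;

typedef struct {
    AuditEvent events[AUDIT_CAPACITY];
    size_t     next;             /* total number of events recorded */
} AuditRing;

/* Append one event, overwriting the oldest once the ring is full. */
void audit_record(AuditRing *ring, AuditEvent ev)
{
    ring->events[ring->next % AUDIT_CAPACITY] = ev;
    ring->next++;
}

/*
 * Copy up to `max` of the most recent events for `address` into `out`,
 * newest first. Returns the number of events copied.
 */
size_t audit_history(const AuditRing *ring, uintptr_t address,
                     AuditEvent *out, size_t max)
{
    size_t stored = ring->next < AUDIT_CAPACITY ? ring->next : AUDIT_CAPACITY;
    size_t found = 0;
    size_t back;

    for (back = 1; back <= stored && found < max; back++) {
        const AuditEvent *ev =
            &ring->events[(ring->next - back) % AUDIT_CAPACITY];
        if (ev->address == address)
            out[found++] = *ev;
    }
    return found;
}
```

When analyzing a core file, the equivalent of `audit_history` would be run over the ring found in the dump, giving the last few allocate/free generations for the corrupted address together with the thread and stack that performed each one. A real version would need a much larger capacity and some form of locking or per-thread rings.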