Wednesday, 5 February 2014

Use of Assembly in packages

Today we're looking at two packages, Lua and Samba, to determine whether either of them uses assembly and, if so, what its purpose is.

Samba

According to its website, "Samba is the standard Windows interoperability suite of programs for Linux and Unix." In essence, it allows machines running different operating systems (Windows, Unix and Linux in particular) to communicate seamlessly over a network.

While searching for assembly code, I found quite a bit in a file called inffas86.c, located in /lib/zlib/contrib/inflate86/. The assembly is architecture-specific: each variant sits inside its own #if block and runs to about 300 lines. The routine loops over the compressed input, decoding literal, length and distance codes and writing out the resulting literal and match bytes. Here is the brief explanation found in the file, followed by a snippet of the code:


/*
   Decode literal, length, and distance codes and write out the
   resulting literal and match bytes until either not enough input
   or output is available, an end-of-block is encountered, or a data
   error is encountered. When large enough input and output buffers
   are supplied to inflate(), for example, a 16K input buffer and a
   64K output buffer, more than 95% of the inflate execution time is
   spent in this routine.
*/

#if defined( __GNUC__ ) && defined( __amd64__ ) && ! defined( __i386 )
    __asm__ __volatile__ (
"        leaq    %0, %%rax\n"
"        movq    %%rbp, 8(%%rax)\n"       /* save regs rbp and rsp */
"        movq    %%rsp, (%%rax)\n"
"        movq    %%rax, %%rsp\n"          /* make rsp point to &ar */
"        movq    16(%%rsp), %%rsi\n"      /* rsi  = in */
"        movq    32(%%rsp), %%rdi\n"      /* rdi  = out */
"        movq    24(%%rsp), %%r9\n"       /* r9   = last */
"        movq    48(%%rsp), %%r10\n"      /* r10  = end */
"        movq    64(%%rsp), %%rbp\n"      /* rbp  = lcode */
"        movq    72(%%rsp), %%r11\n"      /* r11  = dcode */
"        movq    80(%%rsp), %%rdx\n"      /* rdx  = hold */
"        movl    88(%%rsp), %%ebx\n"      /* ebx  = bits */
"        movl    100(%%rsp), %%r12d\n"    /* r12d = lmask */
"        movl    104(%%rsp), %%r13d\n"    /* r13d = dmask */
                                          /* r14d = len */
                                          /* r15d = dist */
"        cld\n"
"        cmpq    %%rdi, %%r10\n"
"        je      .L_one_time\n"           /* if only one decode left */
"        cmpq    %%rsi, %%r9\n"
"        je      .L_one_time\n"
"        jmp     .L_do_loop\n"

".L_one_time:\n"
"        movq    %%r12, %%r8\n"           /* r8 = lmask */
"        cmpb    $32, %%bl\n"
"        ja      .L_get_length_code_one_time\n"

"        lodsl\n"                         /* eax = *(uint *)in++ */
"        movb    %%bl, %%cl\n"            /* cl = bits, needs it for shifting */
"        addb    $32, %%bl\n"             /* bits += 32 */
"        shlq    %%cl, %%rax\n"
"        orq     %%rax, %%rdx\n"          /* hold |= *((uint *)in)++ << bits */
"        jmp     .L_get_length_code_one_time\n"
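For readers less used to AT&T syntax, the refill step under .L_one_time can be paraphrased in C. This is only a sketch of what the snippet above appears to do; the bitbuf struct and the refill32 name are made up for illustration and are not part of zlib.

```c
#include <stdint.h>

/* Hypothetical stand-in for the registers used by the asm:
   in ~ rsi, hold ~ rdx, bits ~ ebx. */
struct bitbuf {
    const unsigned char *in;
    uint64_t hold;
    unsigned bits;
};

/* If at most 32 bits are buffered, pull in 4 more input bytes
   (little-endian, as lodsl reads them on x86) above the existing bits. */
static void refill32(struct bitbuf *b)
{
    if (b->bits <= 32) {
        uint64_t word = (uint64_t)b->in[0]
                      | (uint64_t)b->in[1] << 8
                      | (uint64_t)b->in[2] << 16
                      | (uint64_t)b->in[3] << 24;
        b->in += 4;                  /* lodsl advances rsi by 4 */
        b->hold |= word << b->bits;  /* shlq %cl, %rax; orq %rax, %rdx */
        b->bits += 32;               /* addb $32, %bl */
    }
}
```

The 64-bit hold register is what lets the assembly fetch four bytes at a time: with at most 32 bits already buffered, another 32 always fit.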

Within this file there is no fallback for other architectures: if none of the supported x86 targets is detected, compilation stops with an #error directive:

#else
#error "x86 architecture not defined"
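By contrast, a guard that degrades to portable C rather than stopping the build could be sketched as follows. This is purely illustrative and is not taken from zlib or Samba; the function add_u32 is hypothetical, chosen only because it is small enough to show both branches.

```c
#include <stdint.h>

/* Hypothetical example: inline asm on x86-64 with GCC-compatible
   compilers, plain C everywhere else. */
static uint32_t add_u32(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && defined(__x86_64__)
    /* "+r": a is read and written; "r": b is a read-only register input;
       "cc": the instruction modifies the flags register. */
    __asm__ ("addl %1, %0" : "+r" (a) : "r" (b) : "cc");
    return a;
#else
    return a + b;   /* portable fallback instead of #error */
#endif
}
```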

Another file that contains assembly is byteorder.h, located in /lib/util/. Here the assembly is defined specifically for PowerPC and uses the byte-reversing load and store instructions (lhbrx and sthbrx in the 16-bit case) to read and write little-endian short or int values.

#if (defined(__powerpc__) && defined(__GNUC__))
static __inline__ uint16_t ld_le16(const uint16_t *addr)
{
 uint16_t val;
 __asm__ ("lhbrx %0,0,%1" : "=r" (val) : "r" (addr), "m" (*addr));
 return val;
}

static __inline__ void st_le16(uint16_t *addr, const uint16_t val)
{
 __asm__ ("sthbrx %1,0,%2" : "=m" (*addr) : "r" (val), "r" (addr));
}
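Where this inline assembly is not available, the same little-endian access can be written in portable C. The following is a minimal sketch of that idea, not Samba's actual fallback code; the _portable names are invented here.

```c
#include <stdint.h>

/* Read a 16-bit little-endian value, independent of host byte order. */
static inline uint16_t ld_le16_portable(const void *addr)
{
    const uint8_t *p = (const uint8_t *)addr;
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Write a 16-bit value to memory in little-endian byte order. */
static inline void st_le16_portable(void *addr, uint16_t val)
{
    uint8_t *p = (uint8_t *)addr;
    p[0] = (uint8_t)(val & 0xFF);   /* low byte first */
    p[1] = (uint8_t)(val >> 8);     /* high byte second */
}
```

On a big-endian PowerPC this takes several instructions per access, which is presumably why a single byte-reversing load or store is attractive there.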

Lua

Lua is a lightweight, powerful and embeddable scripting language used in a wide variety of applications and in several well-known games.

A quick search of the Lua sources for anything resembling assembly returns just a couple of lines in a single file. On closer inspection, it resembles what we saw in byteorder.h from Samba: a short inline snippet used for a conversion. In this case it relies on Microsoft's inline assembler for x86 and converts a floating-point number to an integer. When the trick is not enabled, C code is used instead. Here is what the assembly looks like:

#if defined(MS_ASMTRICK) || defined(LUA_MSASMTRICK) /* { */
/* trick with Microsoft assembler for X86 */

#define lua_number2int(i,n)  __asm {__asm fld n   __asm fistp i}
#define lua_number2integer(i,n)  lua_number2int(i, n)
#define lua_number2unsigned(i,n)  \
  {__int64 l; __asm {__asm fld n   __asm fistp l} i = (unsigned int)l;}


1 comment:

  1. Eugen - Good post. I'm not sure I agree with your comment about no fallbacks for the Samba file inffas86.c though -- it looks like it's a replacement for inffast.c for x86 only (i.e., inffast.c itself would be used in most cases). Also, quickly grepping through the Makefiles etc. it looks like it might not even be built on x86 unless you take some manual steps (I'd want to check that more carefully before saying for certain). It would be worthwhile looking at how some of the distros build this.
