Introduction
In this article I will show how to write a file packer/unpacker and how to make a self-extracting version of the archive (SFX).
Please note that this article and its code have been written for learning purposes, not for complex functionality, so the following limitations apply:
- Only packing of files (binding them into one file) and no compression
- Packer doesn't pack files in subdirectories
- Packer header is not really optimized - just enough for our purposes
- All code presented here compiles as a console application and no GUI version is provided
The Archive File Format
The idea is to build a structure/format that will allow us to hold a file list and file contents in one file in such a way that we will be able to restore the files to their original state.
This leads to the following pack header design:
Signature
- Offset 0x00/DWORD
This occupies the first 4 bytes of the header. It contains a simple signature that allows us to identify our packed files.
NumOfFiles
- Offset 0x04/DWORD
Here we store a DWORD holding the number of files in the archive.
FilesInfo
- Offset 0x08/sizeof(packdata_t)
Here we start storing the file information in a sequence defined as the array packdata_t FileInfo[NumOfFiles].
The packdata_t structure is defined as:
struct packdata_t
{
    char filename[MAX_PATH];
    long filesize;
};
As you can see, we simply save the file's name and size. The packdata_t structure is not the optimal way of storing file names or information, because we could have used a variable-length packdata_t struct defined as:
struct packdata_t
{
    long filesize;
    // Other file info, such as creation date, attributes, ...
    char filenameLength;
    char FileName[1];
};
But, of course, managing this last struct is beyond the scope of this article.
After the pack header we have the files' contents stored in sequence. So the whole archive file format will look like this:
Signature |
NumOfFiles |
packdata_t Files[NumOfFiles] |
File1 content |
File2 content |
. |
. |
. |
File(NumOfFiles) content |
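To make the layout concrete, here is a minimal sketch that computes where a given file's content starts inside an archive. It is only an illustration: `PackEntry`, `headerSize()` and `contentOffset()` are hypothetical names that do not appear in the article's code, `MAX_PATH` is assumed to be 260, and the `int32_t` field assumes the 4-byte `long` of a Win32 build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: MAX_PATH is 260, as in the Windows headers.
const std::size_t kMaxPath = 260;

// Hypothetical mirror of packdata_t with a fixed-width size field.
struct PackEntry {
    char         filename[kMaxPath];
    std::int32_t filesize;
};

// Size of the pack header: Signature + NumOfFiles + the entry table.
std::size_t headerSize(std::size_t numFiles)
{
    return 2 * sizeof(std::int32_t) + numFiles * sizeof(PackEntry);
}

// Offset (from the start of the archive) of the content of file `index`:
// the header size plus the sizes of all files stored before it.
std::size_t contentOffset(const std::vector<PackEntry> &entries, std::size_t index)
{
    assert(index < entries.size());
    std::size_t offset = headerSize(entries.size());
    for (std::size_t i = 0; i < index; ++i)
        offset += static_cast<std::size_t>(entries[i].filesize);
    return offset;
}
```

Under these assumptions each entry takes 264 bytes, so for three files of 10, 20 and 30 bytes the header occupies 8 + 3 * 264 = 800 bytes and the second file's content begins at offset 810.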
Writing the Packer
In order to make the code a little extensible, I have defined a structure that will hold callback functions triggered from inside the packer/unpacker routines. These callbacks are used for visual notifications and updates.
The callback struct is defined as:
typedef struct
{
    void (*newfile)(char *name, long size);
    void (*fileprogress)(long pos);
} packcallbacks_t;
The newfile() callback is called whenever the packer/unpacker starts processing a new file. It is passed the file's name and size.
The fileprogress() callback is called while a file is being processed. It is passed the current position within that file.
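As an illustration of how a console front end might use these callbacks, the sketch below prints each file name and a running percentage. The struct is repeated so the example stands alone; `percentDone()`, `onNewFile()`, `onProgress()` and the global `g_currentSize` are hypothetical helpers, not part of the article's code.

```cpp
#include <cassert>
#include <cstdio>

// Same shape as the article's packcallbacks_t.
typedef struct
{
    void (*newfile)(char *name, long size);
    void (*fileprogress)(long pos);
} packcallbacks_t;

// Size of the file currently being processed; set by onNewFile()
// so that onProgress() can turn a position into a percentage.
static long g_currentSize = 0;

// Converts a position within a file into a percentage (0..100).
int percentDone(long pos, long total)
{
    assert(pos >= 0 && pos <= total);
    if (total == 0)
        return 100; // empty file: nothing left to do
    return static_cast<int>((pos * 100LL) / total);
}

void onNewFile(char *name, long size)
{
    g_currentSize = size;
    std::printf("\n%s (%ld bytes)\n", name, size);
}

void onProgress(long pos)
{
    // '\r' returns to the start of the line so the figure is overwritten
    std::printf("\r%3d%%", percentDone(pos, g_currentSize));
}
```

A caller would fill the struct with `packcallbacks_t cb = { onNewFile, onProgress };` and pass `&cb` as the last argument of the packer or unpacker.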
Now, let us define the packfilesEx() function prototype:
int packfilesEx(char *path, char *mask, char *archive,
                packcallbacks_t *pcb = NULL);
- We need a path that designates the source directory.
- The mask tells us which files to search for and pack.
- The archive holds the archive file name.
- An optional pcb holds the callbacks used for visual notifications.
Before going to the code, here is the packfilesEx() code flow:
1. Build the packdata_t array of all files to be packed (storing their names and sizes)
2. Create the archive file and write the Signature and file count into it
3. Write the packdata_t array into the archive
4. Read a file and write its content into the archive
5. Loop (4) until all files are stored
6. Close the archive file
This operation is enough to pack all files into one single archive file. Now we go straight to the code:
int packfilesEx(char *path, char *mask, char *archive, packcallbacks_t *pcb)
{
    TCHAR szCurDir[MAX_PATH];

    // vector that will hold the packdata_t array;
    // STL vectors are stored in contiguous memory
    std::vector<packdata_t> filesList;

    // save the current directory, then switch to the source directory
    // (this also verifies that the source directory is valid)
    GetCurrentDirectory(MAX_PATH, szCurDir);
    if (!SetCurrentDirectory(path))
        return packerrorPath;

    WIN32_FIND_DATA fd;
    HANDLE findHandle;
    packdata_t pdata;

    findHandle = FindFirstFile(mask, &fd);
    if (findHandle == INVALID_HANDLE_VALUE)
    {
        SetCurrentDirectory(szCurDir);
        return packerrorNoFiles;
    }

    long lTemp;

    // this loop stores the file headers only; directories are omitted
    do
    {
        // skip directory entries
        if (fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
            continue;

        // clear record, then fill and save the packdata entry
        memset(&pdata, 0, sizeof(pdata));
        strcpy(pdata.filename, fd.cFileName);
        pdata.filesize = fd.nFileSizeLow;
        filesList.push_back(pdata);
    } while (FindNextFile(findHandle, &fd));
    FindClose(findHandle);

    // the mask matched directories only?
    if (filesList.empty())
    {
        SetCurrentDirectory(szCurDir);
        return packerrorNoFiles;
    }

    FILE *fpArchive = fopen(archive, "wb");
    if (!fpArchive)
    {
        SetCurrentDirectory(szCurDir);
        return packerrorCannotCreateArchive;
    }

    // write signature
    lTemp = 'KCPL'; // lallous pack! (L-PCK)
    fwrite(&lTemp, sizeof(lTemp), 1, fpArchive);

    // write entries count
    lTemp = (long)filesList.size();
    fwrite(&lTemp, sizeof(lTemp), 1, fpArchive);

    // store file entries (std::vector stores elements in a linear manner)
    fwrite(&filesList[0], sizeof(pdata), filesList.size(), fpArchive);

    // process all files to copy
    for (unsigned int cnt = 0; cnt < filesList.size(); cnt++)
    {
        FILE *inFile = fopen(filesList[cnt].filename, "rb");
        if (!inFile)
        {
            // an input file cannot be read, so the archive cannot be completed
            fclose(fpArchive);
            SetCurrentDirectory(szCurDir);
            return packerrorCannotCreateArchive;
        }
        long size = filesList[cnt].filesize;

        // if callback assigned then trigger it
        if (pcb && pcb->newfile)
            pcb->newfile(filesList[cnt].filename, size);

        // copy file contents
        long pos = 0;
        while (size > 0)
        {
            char buffer[4096];
            long toread = size > (long)sizeof(buffer) ? (long)sizeof(buffer) : size;
            fread(buffer, toread, 1, inFile);
            fwrite(buffer, toread, 1, fpArchive);
            pos += toread;
            size -= toread;
            if (pcb && pcb->fileprogress)
                pcb->fileprogress(pos);
        }
        fclose(inFile);
    }

    // close archive and restore working directory
    fclose(fpArchive);
    SetCurrentDirectory(szCurDir);
    return packerrorSuccess;
}
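The copy loop in the packer does not look at the return values of fread() and fwrite(), so a short read or a full disk would go unnoticed. As a hedged sketch of one way to tighten this, the helper below copies a fixed number of bytes and reports failure; `copyBytes()` is a hypothetical name and is not part of the article's code.

```cpp
#include <cassert>
#include <cstdio>

// Copies exactly `size` bytes from `in` to `out` in 4 KB chunks.
// Returns true on success, false if a read or write comes up short.
// `progress` may be NULL; if set, it receives the bytes copied so far.
bool copyBytes(std::FILE *in, std::FILE *out, long size, void (*progress)(long pos))
{
    assert(in != NULL && out != NULL && size >= 0);
    char buffer[4096];
    long pos = 0;
    while (pos < size)
    {
        long remaining = size - pos;
        std::size_t chunk = remaining > (long)sizeof(buffer)
                                ? sizeof(buffer)
                                : (std::size_t)remaining;

        // with an element size of 1, fread/fwrite return a byte count
        if (std::fread(buffer, 1, chunk, in) != chunk)
            return false;
        if (std::fwrite(buffer, 1, chunk, out) != chunk)
            return false;

        pos += (long)chunk;
        if (progress)
            progress(pos);
    }
    return true;
}
```

The same helper would serve the unpacker as well, since both directions copy a known number of bytes from one open file to another.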
Writing the Unpacker
As the packing process has been explained in detail, the unpacking part becomes more obvious; therefore, only the code flow will be presented:
- Open archive file
- Read pack header
- Verify signature - if not valid - report and exit
- Having read the pack header (Signature, NumOfFiles, packdata_t array), start extracting the files
- Create a new file named packdata_t[idx].filename and write its contents from the archive file
- Process next file
- Close archive file and exit
int unpackfileEx(char *archive, char *dest, packcallbacks_t *pcb, long startPos)
{
    FILE *fpArchive = fopen(archive, "rb");

    // failed to open archive?
    if (!fpArchive)
        return packerrorCouldNotOpenArchive;

    long nFiles = 0;

    // skip to the start of the archive data (non-zero for an SFX)
    if (startPos)
        fseek(fpArchive, startPos, SEEK_SET);

    // read signature
    fread(&nFiles, sizeof(nFiles), 1, fpArchive);
    if (nFiles != 'KCPL')
        return (fclose(fpArchive), packerrorNotAPackedFile);

    // read file entries count
    nFiles = 0;
    fread(&nFiles, sizeof(nFiles), 1, fpArchive);

    // no files (or an invalid count)?
    if (nFiles <= 0)
        return (fclose(fpArchive), packerrorNoFiles);

    // read all file entries
    std::vector<packdata_t> filesList(nFiles);
    fread(&filesList[0], sizeof(packdata_t), nFiles, fpArchive);

    // loop over all files
    for (unsigned int i = 0; i < filesList.size(); i++)
    {
        FILE *fpOut;
        char Buffer[4096];
        packdata_t *pdata = &filesList[i];

        // trigger callback
        if (pcb && pcb->newfile)
            pcb->newfile(pdata->filename, pdata->filesize);

        // build the output path; dest must end with a path separator
        strcpy(Buffer, dest);
        strcat(Buffer, pdata->filename);
        fpOut = fopen(Buffer, "wb");
        if (!fpOut)
            return (fclose(fpArchive), packerrorExtractError);

        // copy the file contents in chunks of sizeof(Buffer) bytes
        long size = pdata->filesize;
        long pos = 0;
        while (size > 0)
        {
            long toread = size > (long)sizeof(Buffer) ? (long)sizeof(Buffer) : size;
            fread(Buffer, toread, 1, fpArchive);
            fwrite(Buffer, toread, 1, fpOut);
            pos += toread;
            size -= toread;
            if (pcb && pcb->fileprogress)
                pcb->fileprogress(pos);
        }
        fclose(fpOut);
    }
    fclose(fpArchive);
    return packerrorSuccess;
}
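The unpacker largely trusts whatever header it reads. The following sketch shows how the header-reading steps could be separated out and validated before any file is extracted. It is an illustration under stated assumptions, not the article's implementation: `PackEntry`, `kSignature`, `readPackHeader()` and the `maxFiles` limit are hypothetical, fields are assumed to be 4 bytes wide as in a Win32 build, and `0x4B43504C` is the value common compilers produce for the multi-character constant `'KCPL'`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <vector>

// Assumptions: MAX_PATH == 260 and a 4-byte size field (Win32 layout).
struct PackEntry {
    char         filename[260];
    std::int32_t filesize;
};

// Value common compilers give the multi-character constant 'KCPL'.
const std::int32_t kSignature = 0x4B43504C;

// Reads the pack header found at `startPos` and fills `entries`.
// Returns false if anything cannot be read or looks invalid.
// `maxFiles` is an arbitrary sanity limit chosen for this sketch.
bool readPackHeader(std::FILE *fp, long startPos, std::vector<PackEntry> &entries,
                    std::int32_t maxFiles = 65536)
{
    assert(fp != NULL && startPos >= 0);
    std::int32_t signature = 0, count = 0;

    if (std::fseek(fp, startPos, SEEK_SET) != 0) return false;
    if (std::fread(&signature, sizeof(signature), 1, fp) != 1) return false;
    if (signature != kSignature) return false;
    if (std::fread(&count, sizeof(count), 1, fp) != 1) return false;
    if (count <= 0 || count > maxFiles) return false;

    entries.resize(count);
    if (std::fread(&entries[0], sizeof(PackEntry), count, fp) != (std::size_t)count)
        return false;

    for (std::size_t i = 0; i < entries.size(); ++i)
    {
        // force NUL termination and reject negative sizes
        entries[i].filename[sizeof(entries[i].filename) - 1] = '\0';
        if (entries[i].filesize < 0) return false;
    }
    return true;
}
```

Because the function takes a start position, it covers both a plain archive (position 0) and an archive embedded in an SFX.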
Writing the Self-Extractor (SFX)
The SFX is simply a special version of the unpacker (we will call it UnpackerStub) that, instead of taking the archive file name from the command line, looks for an archive that is embedded in its own executable.
If you are a math geek you can think of an SFX as "UnpackerStub.exe + Archive.bin = UnpackerArchive.exe".
Now, how do we embed the archive file into the unpacker to form an SFX?
In order to do that we need to write some information into the UnpackerStub that will help it locate the Archive.bin body.
For this purpose I use the e_res2 field in the IMAGE_DOS_HEADER to store a pointer to the archive data inside the unpacker stub.
Every executable has a well documented and defined format that tells the OS how to load/run it. The IMAGE_DOS_HEADER (defined in WINNT.H) is located at offset zero of every executable and has the following fields:
typedef struct _IMAGE_DOS_HEADER { // DOS .EXE header
WORD e_magic; // Magic number
WORD e_cblp; // Bytes on last page of file
WORD e_cp; // Pages in file
WORD e_crlc; // Relocations
WORD e_cparhdr; // Size of header in paragraphs
WORD e_minalloc; // Minimum extra paragraphs needed
WORD e_maxalloc; // Maximum extra paragraphs needed
WORD e_ss; // Initial (relative) SS value
WORD e_sp; // Initial SP value
WORD e_csum; // Checksum
WORD e_ip; // Initial IP value
WORD e_cs; // Initial (relative) CS value
WORD e_lfarlc; // File address of relocation table
WORD e_ovno; // Overlay number
WORD e_res[4]; // Reserved words
WORD e_oemid; // OEM identifier (for e_oeminfo)
WORD e_oeminfo; // OEM information; e_oemid specific
WORD e_res2[10]; // Reserved words
LONG e_lfanew; // File address of new exe header
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;
I store the file offset of the archive in the e_res2 field, which is large enough to hold a DWORD. The archive content itself is appended to the UnpackerStub so that it starts at exactly that offset.
Two functions have been written to get/store the pointer to the archive data:
int SfxSetInsertPos(char *filename, long pos)
{
    FILE *fp = fopen(filename, "rb+");
    if (fp == NULL)
        return packerrorCouldNotOpenArchive;

    IMAGE_DOS_HEADER idh;

    // read dos header
    if (fread((void *)&idh, sizeof(idh), 1, fp) != 1)
        return (fclose(fp), packerrorCouldNotOpenArchive);

    // store the position value in an unused MZ field
    *(long *)&idh.e_res2[0] = pos;

    // update header
    rewind(fp);
    fwrite((void *)&idh, sizeof(idh), 1, fp);
    fclose(fp);
    return packerrorSuccess;
}
This function stores the pointer. It first reads the header, updates the e_res2 field, then writes the header back.
int SfxGetInsertPos(char *filename, long *pos)
{
    FILE *fp = fopen(filename, "rb");
    if (fp == NULL)
        return packerrorCouldNotOpenArchive;

    IMAGE_DOS_HEADER idh;

    // read dos header
    if (fread((void *)&idh, sizeof(idh), 1, fp) != 1)
        return (fclose(fp), packerrorCouldNotOpenArchive);
    fclose(fp);

    // the position is kept in the first four bytes of e_res2
    *pos = *(long *)&idh.e_res2[0];
    return packerrorSuccess;
}
This function will read the header and extract the value from the e_res2 field.
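The same idea can also be expressed without `<windows.h>` by addressing the field through its byte offset. The sketch below is only an illustration: `DosHeader`, `kInsertPosOffset`, `setInsertPos()` and `getInsertPos()` are hypothetical names, and the mirror struct simply repeats the field layout listed earlier so that the offset of e_res2 (40 bytes) can be checked at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical mirror of IMAGE_DOS_HEADER using fixed-width types.
struct DosHeader {
    std::uint16_t e_magic, e_cblp, e_cp, e_crlc, e_cparhdr, e_minalloc,
                  e_maxalloc, e_ss, e_sp, e_csum, e_ip, e_cs, e_lfarlc, e_ovno;
    std::uint16_t e_res[4];
    std::uint16_t e_oemid, e_oeminfo;
    std::uint16_t e_res2[10];
    std::int32_t  e_lfanew;
};

static_assert(sizeof(DosHeader) == 64, "unexpected DOS header size");
static_assert(offsetof(DosHeader, e_res2) == 40, "unexpected e_res2 offset");

const long kInsertPosOffset = offsetof(DosHeader, e_res2);

// Stores `pos` in the first four bytes of e_res2; only those bytes change.
// Returns false on I/O failure.
bool setInsertPos(std::FILE *fp, std::int32_t pos)
{
    assert(fp != NULL && pos >= 0);
    return std::fseek(fp, kInsertPosOffset, SEEK_SET) == 0 &&
           std::fwrite(&pos, sizeof(pos), 1, fp) == 1 &&
           std::fflush(fp) == 0;
}

// Reads the stored position back. Returns false on I/O failure.
bool getInsertPos(std::FILE *fp, std::int32_t *pos)
{
    assert(fp != NULL && pos != NULL);
    return std::fseek(fp, kInsertPosOffset, SEEK_SET) == 0 &&
           std::fread(pos, sizeof(*pos), 1, fp) == 1;
}
```

Writing only the four bytes at the field's offset avoids rewriting the whole header, at the cost of hard-wiring the layout, which is why the two `static_assert` lines are there.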
In short, the unpacker stub works like this:
- Call SfxGetInsertPos() to get the position of the archive file
- Call unpackfileEx(), passing the position of the archive (the start of the embedded archive.bin) and the archive filename, which is the stub's own path (obtained by calling GetModuleFileName(NULL, ...))
Now I continue to describe how the Packer builds the SFX:
// check if unpackerstub.exe exists
if (GetFileAttributes(sfxStubFile) == (DWORD)-1)
{
    printf("SFX stub file not found!\n");
    return 1;
}

// open archive file
FILE *fpArc = fopen(argv[3], "rb");
if (!fpArc)
{
    printf("Failed to open archive!\n");
    return 1;
}

// get archive size
fseek(fpArc, 0, SEEK_END);
long arcSize = ftell(fpArc);
rewind(fpArc);

// form output sfx file name
char sfxName[MAX_PATH];
strcpy(sfxName, argv[3]);
strcat(sfxName, ".sfx.exe");

// take a copy of the SFX stub
if (!CopyFile(sfxStubFile, sfxName, FALSE))
{
    fclose(fpArc);
    printf("Could not create SFX file!\n");
    return 1;
}

// append data to SFX
FILE *fpSfx = fopen(sfxName, "rb+");
if (!fpSfx)
{
    fclose(fpArc);
    printf("Could not open SFX file!\n");
    return 1;
}
fseek(fpSfx, 0, SEEK_END);

// get SFX size before appending the archive;
// this is where the archive data will start
long sfxSize = ftell(fpSfx);

// append the archive file to the end of the SFX file
char Buffer[4096 * 2];
while (arcSize > 0)
{
    long rw = arcSize > (long)sizeof(Buffer) ? (long)sizeof(Buffer) : arcSize;
    fread(Buffer, rw, 1, fpArc);
    fwrite(Buffer, rw, 1, fpSfx);
    arcSize -= rw;
}
fclose(fpArc);
fclose(fpSfx);

// mark archive data position inside SFX
SfxSetInsertPos(sfxName, sfxSize);

// delete archive file while keeping only the SFX
DeleteFile(argv[3]);
printf("SFX created: %s\n", sfxName);
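The key point in the code above is that the size of the stub before appending is exactly the offset at which the archive begins. As a small sketch of that step in isolation, the helper below appends one open file to another and returns that offset; `appendFile()` is a hypothetical name, not part of the article's code.

```cpp
#include <cassert>
#include <cstdio>

// Appends everything readable from `src` (starting at its current position)
// to the end of `dst`. Returns the size `dst` had before appending, which is
// the offset where the appended data starts, or -1 on failure.
long appendFile(std::FILE *dst, std::FILE *src)
{
    assert(dst != NULL && src != NULL);
    if (std::fseek(dst, 0, SEEK_END) != 0)
        return -1;
    long insertPos = std::ftell(dst);
    if (insertPos < 0)
        return -1;

    char buffer[8192];
    std::size_t got;
    while ((got = std::fread(buffer, 1, sizeof(buffer), src)) > 0)
    {
        if (std::fwrite(buffer, 1, got, dst) != got)
            return -1;
    }
    // a read error (as opposed to end of file) also counts as failure
    return std::ferror(src) ? -1 : insertPos;
}
```

The returned value is what would then be handed to SfxSetInsertPos() so that the stub can find its payload at run time.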
That's all!