Sunday, August 31, 2008

How to Migrate a Subversion Repository

A version control system such as Subversion is a very important tool for developers. It tracks every change that you make to the code base. However, at some point an svn checkout followed by an svn commit is not enough, for example when you want to back up the whole repository along with its tracked changes and then move it to a different svn server. With the svnadmin tool, this kind of task is almost a walk in the park, but you have to pay attention to some details.

Now, let's see a simple real-world example. Let's say you have downloaded an important piece of code and created a local svn repository on your laptop because your central svn server is not reachable. You have worked with this local repository and have reached revision 20. Later on, you plan to migrate this repository to the central svn server along with the changes recorded in the local repository. The keyword here is migration. This process is explained in the Subversion Book for Subversion 1.2, Chapter 5 (Repository Administration), in the section Repository Maintenance. The tool to migrate the local repository to the central server is svnadmin. These are the steps in detail.

1. Using your current version of svnadmin, dump your repositories to dump files. This is accomplished with:

$ svnadmin dump path_to_repository > dump_filename

The output of this command would be similar to:

$ svnadmin dump path_to_repository > dump_filename
* Dumped revision 0.
* Dumped revision 1.
* Dumped revision 2.

* Dumped revision 19.
* Dumped revision 20.


2. Create a new empty repository on the central server with:

$ svnadmin create path_to_new_repository


3. Transfer the dump file to the central svn server.

4. Using svnadmin on the central server, load your dump files into their respective, just-created repositories.

$ svnadmin load path_to_new_repository < dump_filename

The output of this command would be similar to:

$ svnadmin load path_to_new_repository < dump_filename
<<< Started new txn, based on original revision 1
* adding path : A ... done.
* adding path : A/B ... done.
...
<<< Started new txn, based on original revision 20
* adding path : A/Z/zeta ... done.
* editing path : A/mu ... done.
------- Committed new rev 20 (loaded from original rev 20) >>>


If you are using a different version of Subversion on the central server, pay attention to the following issue.

Be sure to copy any customizations from your old repositories to the new ones, including DB_CONFIG files and hook scripts. You'll want to pay attention to the release notes for the new release of Subversion to see if any changes since your last upgrade affect those hooks or configuration options.


It's a good idea to test the just-migrated svn repository by checking it out with svn co and then compiling the source code to ensure everything went OK.

Now, say you are away again with your laptop and have committed some changes to the local repository because you cannot access the central server. Then you want the central server to reflect the changes you have made since the last time you migrated the svn repository. This can be accomplished with:

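$ svnadmin dump path_to_repository --revision 21:25 --incremental > dump_filename

In the command above, you tell svnadmin to dump only the changes from revision 21 to revision 25, because the central server already holds everything up to revision 20. The --incremental option makes the first dumped revision a set of changes against the previous revision instead of a full copy of the tree. You can then load these changes using the svnadmin load command as before.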

That's it. Now you can work offline on your laptop and keep the local svn repository synchronized with your central server with the help of svnadmin.

Tuesday, August 26, 2008

The Incompatibility of 64-bit GCC With 32-bit Packed Data Structures

There are times when we C programmers take it for granted that when you declare:

unsigned long x;

you get a 32-bit unsigned variable. You would expect that you need to declare the variable as:

unsigned long long x;

to get a 64-bit unsigned variable.

Most of the time these assumptions cause no problems. But when you have a packed data structure, say a structure that describes the header of a binary file, you're in big trouble if the seemingly innocent code:

unsigned long x;

turns out to declare a 64-bit unsigned long instead of the expected 32-bit unsigned long.
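
A quick way to see which case applies to a given compiler and set of flags is to print the widths directly. The following is only a minimal sketch; the macro WIDTH_IN_BITS and the function print_integer_widths are hypothetical names made up for this illustration.

```c
#include <limits.h>   /* CHAR_BIT */
#include <stdio.h>

/* Width of a type in bits (hypothetical helper macro for this sketch). */
#define WIDTH_IN_BITS(type) ((unsigned) (sizeof(type) * CHAR_BIT))

/* Print the widths of the integer types discussed above. */
static void print_integer_widths(void)
{
    printf("unsigned int       : %u bits\n", WIDTH_IN_BITS(unsigned int));
    printf("unsigned long      : %u bits\n", WIDTH_IN_BITS(unsigned long));
    printf("unsigned long long : %u bits\n", WIDTH_IN_BITS(unsigned long long));
}
```

Calling print_integer_widths() from main() in a program built with 64-bit GCC on Linux typically reports 64 bits for unsigned long, and 32 bits when the same program is built with -m32.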

Let's look at a real-life example. This code:

struct img_header {
unsigned char signature[SIGNATURE_LEN];
unsigned long startAddr;
unsigned long burnAddr;
unsigned long len;
}__attribute__((packed));

typedef struct img_header IMG_HEADER_T, *IMG_HEADER_Tp;

would create a 16-byte IMG_HEADER_T structure with the 32-bit GCC compiler. On the other hand, it would create a 28-byte IMG_HEADER_T structure with the 64-bit GCC compiler on Linux, where unsigned long is 64 bits wide. This is very dangerous when dealing with firmware binaries. The workaround with a 64-bit GCC compiler is to force it into the 32-bit "compatibility" mode with the "-m32" compiler flag. Most 64-bit GCC compilers in 64-bit Linux distributions support this "compatibility" mode. Usually, the compiler flags are placed in the Makefile. You can force the intended behavior there, like this:

...
CC=gcc
CFLAGS = -m32
...
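
As an alternative to the -m32 workaround (and not what the SDK in question did), the header could be declared with the fixed-width types from <stdint.h>, so that its size no longer depends on the width of unsigned long. The sketch below is an assumption-laden illustration: SIGNATURE_LEN is taken to be 4 because that matches the 16-byte total quoted above, the structure name img_header_fixed is hypothetical, and byte order is not addressed at all.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint32_t */

#define SIGNATURE_LEN 4   /* assumed value; it matches the 16-byte total quoted above */

/* Hypothetical fixed-width variant of the image header. */
struct img_header_fixed {
    unsigned char signature[SIGNATURE_LEN];
    uint32_t startAddr;
    uint32_t burnAddr;
    uint32_t len;
} __attribute__((packed));

/* Fail the build if the layout ever drifts from the expected on-disk size. */
static_assert(sizeof(struct img_header_fixed) == SIGNATURE_LEN + 3 * sizeof(uint32_t),
              "img_header_fixed must be 16 bytes");
```

With this declaration the structure is 16 bytes whether the compiler targets 32-bit or 64-bit code, and the static assertion stops the build if that ever changes.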


It took me more than one month to spot this bug :-(, which is a pity. I found it only after I wrote a very simple 8-bit checksum utility, which revealed an excess of 12 bytes in the header of an intermediate file in the SDK that I worked with. A couple of lessons learned from this incident:

1. Never trust your seemingly innocent Makefile when you're working on a 64-bit system with a 64-bit or multilib compiler. Use explicit "force" options to get the intended output from the compiler, because in most cases you don't know what the default is unless you run some tests.

2. Always build test stubs to verify intermediate results when something goes awry in the output file of an SDK. This will save you development time.

3. Use common sense to track down the bug. Add debug statements to watch the output of a binary utility in an SDK when using it on 64-bit systems, because you can't know in advance whether it will behave as intended. Most of today's SDKs have been tested only on 32-bit systems and assume they will run on that same architecture.
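
One very small test of the kind these lessons call for is to compare the size of the compiled header structure with the size the file format expects. The sketch below assumes a SIGNATURE_LEN of 4 and a 16-byte on-disk header, which is consistent with the sizes quoted earlier; EXPECTED_HEADER_SIZE and header_size_excess are hypothetical names for this illustration.

```c
#define SIGNATURE_LEN 4            /* assumed; consistent with the 16-byte header */
#define EXPECTED_HEADER_SIZE 16L   /* assumed size of the header in the firmware file */

/* Same layout as the header in the example, declared with plain unsigned long. */
struct img_header {
    unsigned char signature[SIGNATURE_LEN];
    unsigned long startAddr;
    unsigned long burnAddr;
    unsigned long len;
} __attribute__((packed));

/*
 * Number of bytes by which the compiled structure exceeds the on-disk
 * header. Zero means the compiler settings match the file format.
 */
static long header_size_excess(void)
{
    return (long) sizeof(struct img_header) - EXPECTED_HEADER_SIZE;
}
```

With a 64-bit unsigned long the function returns 12, the same excess mentioned above, and with -m32 it returns 0. A build step or a startup check could refuse to continue when the result is not zero.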

This was a very hard lesson for me because spotting the bug wasted time and a lot of resources. I should have built the "test stub" application, which takes very little time to create, at the very beginning. The "test stub":

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h> /* read(), close() */

#define DEBUG
#undef DEBUG /* remove this line to enable the debug output */


/*
 * TODO:
 *
 * 1. Add default end offset handling (i.e., EOF equ end offset)
 * 2. Add more robust error handling
 * 3. Add GNU gettext handler for input parameters
 *
 */


/*
 * @param buf pointer to buffer to be 8-bit summed
 * @param len number of bytes to be summed
 * @param start starting offset in buf to calculate the sum
 */
static unsigned char sum(const unsigned char buf[], unsigned long len,
                         unsigned long start)
{
    unsigned char sum;
    unsigned long i;

    sum = 0;

    for (i = 0; i < len; i++) {
        sum = sum + buf[start + i]; /* wraps around modulo 256 */
    }

    return sum;
}

static void show_help(char* argv[])
{
    printf("Usage: %s filename start_offset_in_file(hex) "
           " end_offset_in_file(hex) \n", argv[0]); // TODO: use GNU's gettext
    exit(1);
}

int main(int argc, char* argv[])
{
    int stream;
    unsigned char* buf;
    struct stat st;
    unsigned long start_offset, end_offset;

    if (argc != 4) {
        show_help(argv);
    }

    stream = open(argv[1], O_RDONLY);

    if (stream == -1) {
        printf("Error opening input file!\n");
        exit(1);
    }

    // get file size
    if (fstat(stream, &st) != 0) {
        printf("Error, unable to get file size!\n");
        close(stream);
        exit(1);
    }

    // st_size is an off_t, so cast it to match the format specifier
    printf("Input file size = 0x%lX bytes\n", (unsigned long) st.st_size);

    // allocate buffer for the file size obtained above
    buf = malloc(st.st_size);
    if (buf == NULL) {
        printf("Unable to allocate memory for file buffer!\n");
        close(stream);
        exit(1);
    }

    // read the opened file to buffer and make sure all of it arrived
    if (read(stream, buf, st.st_size) != (ssize_t) st.st_size) {
        printf("Error reading input file!\n");
        free(buf);
        close(stream);
        exit(1);
    }

    // parse the hexadecimal start and end offsets
    start_offset = strtoul(argv[2], NULL, 16);
    end_offset = strtoul(argv[3], NULL, 16);

    if (start_offset >= end_offset) {
        printf("Error! Wrong parameter. "
               "end_offset should be bigger than start_offset\n");

        free(buf);
        close(stream);
        exit(1);
    }

    // never sum past the end of the buffer
    if (end_offset > (unsigned long) st.st_size) {
        printf("Error! Wrong parameter. "
               "end_offset is beyond the end of the file\n");

        free(buf);
        close(stream);
        exit(1);
    }

#ifdef DEBUG // check string conversion routine
    printf("start_offset = 0x%lX\n", start_offset);
    printf("end_offset = 0x%lX\n", end_offset);
#endif

    // calculate checksum, passing in the buffer, length and start offset
    printf("File checksum (from offset 0x%lX to 0x%lX)= 0x%X\n",
           start_offset, end_offset,
           (unsigned) sum(buf, end_offset - start_offset, start_offset));

    free(buf);
    close(stream);

    return 0;
}
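
To make the 8-bit checksum itself easier to follow, here is a minimal in-memory sketch of the same idea, separate from the file handling. The function name sum8 is hypothetical and the sketch assumes 8-bit bytes. The total is kept in an unsigned char, so it wraps around modulo 256: the bytes 0xFF, 0x02 and 0x10 add up to 0x111, and the 8-bit result is 0x11.

```c
#include <stddef.h>   /* size_t */

/* Sum len bytes of buf, keeping only the low 8 bits of the total. */
static unsigned char sum8(const unsigned char *buf, size_t len)
{
    unsigned char total = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        /* The cast discards any carry out of the low 8 bits. */
        total = (unsigned char) (total + buf[i]);
    }

    return total;
}
```

Because only the low 8 bits are kept, any extra bytes inside the summed range change the result unless they happen to add up to a multiple of 256.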

This bug won't bite you when you develop on 64-bit Linux/Unix systems, as long as you pay attention to it.


Never take anything for granted when developing on 64-bit or multilib systems


This document is a good source of information on 64-bit/multilib systems portability, particularly for programmers and advanced sysadmins.