Wednesday, December 7, 2016

How to "interface" Runtime Error Handling between C++11 (Modern C++) and C Code

This post explores the idea of interfacing runtime error handling between C++11 and C. This matters because most OS APIs are exposed via C libraries. Moreover, there are countless C libraries out there that use C-style runtime error handling as well.

Anyway, before moving further I want to emphasize that assertion has a different goal than runtime error handling. Assertions are meant to catch logic errors in your code during development, before the program/executable is released into an operational environment. Therefore, this post doesn't concern the use of assertions. It focuses on runtime errors caused by an "invalid" state of system resources, such as a non-existent file, a failure to allocate heap memory, etc. Preliminary information on when to use assertions can be found over at MSDN: Errors and Exception Handling (Modern C++). The MSDN article is rather Microsoft-centric, but the principles it explains are applicable to any C/C++ code.

Let's get back to the main theme: the runtime error handling interface between C++11 and C code. MSDN provides a sample solution to the problem as well: How to: Interface Between Exceptional and Non-Exceptional Code. Unfortunately, the MSDN sample still doesn't use a C++11 smart pointer to manage the file HANDLE resource it uses. Moreover, it's Windows-centric. Therefore, it's not a "pure" C++11 solution yet. However, the idea presented by the MSDN sample is sound, and it has been adopted in the code for my previous post about using a custom deleter.

The basic idea for runtime error handling in C and C++ is different:
  • In C, you have the errno variable from the standard C library, or, if your code runs on Windows, you can query the error code via GetLastError(). Because I'm trying to be platform independent, let's focus on using errno. In C, your code checks the value of the errno variable after a call to a C library function to detect a runtime error. Side note: this C approach is akin to "side-band" signaling in a hardware protocol, because you don't get the full picture from the return value of the called function; instead, you need to check another variable via different means.
  • In C++, your code should use exceptions as the mechanism to propagate an error "up the stack" until there is a "handler" that can handle it. If the error is unhandled, the standard behavior is to call std::terminate, which normally terminates the application.
Looking at these two different mechanisms for runtime error handling in C and C++, you have probably come up with the answer already: wrap the C error code in a C++ exception. I provide sample code that shows how to do this at https://bitbucket.org/pinczakko/custom-c-11-deleter--it's an updated version of my C++11 custom deleter sample code. Feel free to clone it. The rest of this post explains the code at that Bitbucket URL.

The steps to wrap a C error code in a C++11 exception are:
  1. Create an exception class that derives from runtime_error class.
  2. Store the error code/number in that exception class.
  3. Create a method in that exception class that transforms the error code into a human-readable error message (string).
  4. Throw an object of the exception class type in places where a runtime error might occur.
  5. Catch the exception in the right place in your code. 
The preceding steps are not difficult. Let's examine the sample code in more detail to understand them.

The exception class is FileHandlerException. It's defined as follows:
class FileHandlerException : public runtime_error
{
public:
    explicit FileHandlerException(int errNo, const string& msg):
        runtime_error(FormatErrorMessage(errNo, msg)), mErrorNumber(errNo) {}

    int GetErrorNo() const {
        return mErrorNumber;
    }

private:
    int mErrorNumber;
};
The FileHandlerException class derives from the runtime_error class, which is part of the C++11 standard library. The FileHandlerException class uses its mErrorNumber member to store the error code (errno value) obtained from C. The FormatErrorMessage() function is a custom function that transforms an errno value into a human-readable string via the C strerror() function. This is the FormatErrorMessage() implementation:
string FormatErrorMessage(int errNo, const string& msg)
{
    static const int BUF_LEN = 1024;
    vector<char> buf(BUF_LEN);   // zero-initialized, so the copy below stays null-terminated
    // Note: strerror() is not guaranteed to be thread-safe.
    strncpy(buf.data(), strerror(errNo), BUF_LEN - 1);

    return string(buf.data()) + "   (" + msg + ")   ";
}
As you can see, it's not difficult to implement the wrapper for a C runtime error code. Let's proceed to see how the exception class is used.
explicit FileHandler (const char* path, const char* mode)
try:
        mPath{path}, mMode {mode}
    {
        cout << "FileHandler constructor" << endl;

        FILE* f = fopen(path, mode);

        if (f != NULL) {
            unique_ptr<FILE, int (*)(FILE*)> file{f, closeFile};
            mFile = std::move(file);

        } else {
            throw FileHandlerException(errno, "Failed to open " + string(path));
        }

    } catch (FileHandlerException& e) {
        // Note: a constructor's function-try-block rethrows automatically when the
        // handler finishes; a plain `throw;` would also work and avoids copying `e`.
        throw e;
    }
In the preceding code, if the call to fopen() fails to produce a usable FILE pointer, an exception object of type FileHandlerException is initialized and thrown. The catch block simply re-throws the exception object higher up the stack. The code that finally catches the exception is shown below.
try {
        DeleterTest::FileHandler f(argv[1], "r");
        //.. irrelevant code omitted
    } catch (DeleterTest::FileHandlerException& e) {

        cout << "Error!!!" << endl;
        cout << e.what() << endl;
        cout << "errno: " << e.GetErrorNo() << endl;
    }
The final "handler" of the exception object simple shows the error string associated with the runtime error, i.e. what causes the failure to obtain a valid FILE pointer.

One final note about exception support in C++11: there is no comprehensive support for the Unicode character set yet. I've searched the web for explanations on the matter, and all of them reach the same conclusion. Please comment below if you know a better answer or an update on the problem.

Hopefully, this post is useful for those doing mixed C and C++ code development.

Friday, November 25, 2016

Dealing with Opaque C Pointer in C++11

This post explains how to deal with opaque C pointers in C++11--read: how to interface with C in C++11. This is important for those working on real-world C++11 code, because many C libraries out there use an opaque pointer as their "interface" to other code--in this particular case, a C++11 application. It's also important to know because the way C++11 implements an opaque "interface" is different.

Now, let's detour a bit to the "native" C++11 interface. The "native" C++11 interface is known as Compilation Firewalls, an opaque C++ interface "on steroids". These are the relevant articles on it:

Just read both of the links to learn more about the C++11-style "native" interface. I'm not going to talk about it here.

Let's get back to our problem. Let's restate it as a more manageable question: how do you wrap an opaque C pointer in a C++11 smart pointer (specifically, unique_ptr)? Short answer: use the opaque C pointer's object type as a parameter to the unique_ptr template and provide a custom deleter. That's it. If you understand the short answer, you're done. If it's still unclear, read on.

There are two ways to use the unique_ptr template: with one template parameter (because the second parameter has a default value) or with two template parameters--see: http://en.cppreference.com/w/cpp/memory/unique_ptr. You need the two-parameter form to wrap an opaque C pointer, because you need to provide a custom deleter. A custom deleter is a function that finalizes/deallocates the resource managed by the unique_ptr when the unique_ptr is destroyed. Let's look at an example:

unique_ptr<FILE, int (*)(FILE*)> mFile{nullptr, closeFile};

The preceding code snippet shows that the second unique_ptr template parameter is a function pointer, initialized here with the closeFile function name. In this case, the closeFile function pointer is the custom deleter. closeFile() is a simple function which logs a message to the screen and then calls fclose(), as shown in the following code snippet:

int closeFile(FILE* f) {
    cout << "Calling fclose()" << endl;
    return fclose(f);
}
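As a side note, instead of spelling out the function-pointer type by hand, you can let the compiler deduce it from the deleter function itself with decltype. A minimal sketch (the file name is made up):
#include <cstdio>
#include <memory>

// decltype(&std::fclose) denotes the same function-pointer type as
// int (*)(FILE*); this variant skips the logging wrapper and calls fclose directly.
std::unique_ptr<FILE, decltype(&std::fclose)> file{std::fopen("data.txt", "r"),
                                                   &std::fclose};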

You might be wondering how to obtain a valid FILE pointer in the first place, before disposing of it with fclose(). With a call to fopen(), of course. This is how I do it:

public:
    explicit FileHandler (const char* path, const char* mode)
try:
        mPath{path}, mMode {mode}
    {
        cout << "FileHandler constructor" << endl;

        FILE* f = fopen(path, mode);

        if (f != NULL) {
            unique_ptr <FILE, int (*)(FILE*)> file{f, closeFile};
            mFile = std::move(file);

        } else {
            throw FileHandlerException(errno, "Failed to open " + string(path));
        }
    } catch (FileHandlerException& e) {

        throw e;
    }

The preceding code snippet shows that a valid FILE pointer is passed as a parameter to the unique_ptr object that will manage it. Then the object is moved into the equivalent class member, which manages the FILE pointer from that point on. You can clone the complete code at: https://bitbucket.org/pinczakko/custom-c-11-deleter

The sample shows how to wrap the opaque FILE pointer in a unique_ptr smart pointer. The FILE pointer is an opaque "standard" C library pointer, but the technique is equally applicable to any other opaque C pointer, which usually acts as the interface to a third-party C library that you want to use in your C++11 application.
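The same pattern generalizes to any C library that exposes a create/destroy function pair. Below is a sketch using a made-up third-party API--widget_t, widget_create() and widget_destroy() are hypothetical names standing in for whatever your library provides:
#include <memory>

// Hypothetical C library interface (normally pulled in from the library's header):
extern "C" {
    typedef struct widget widget_t;      // opaque type: layout is hidden from us
    widget_t* widget_create(void);
    void      widget_destroy(widget_t*);
}

// RAII wrapper: widget_destroy() runs automatically when the pointer goes away.
using WidgetPtr = std::unique_ptr<widget_t, void (*)(widget_t*)>;

WidgetPtr makeWidget()
{
    return WidgetPtr{widget_create(), widget_destroy};
}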

NOTE: The syntax of the FileHandler class constructor might look a bit alien to you if you're not familiar with C++11. It's called a function-try-block, a way to wrap the whole function inside a try-catch block. The complete explanation is at: http://en.cppreference.com/w/cpp/language/function-try-block; an additional heavily commented sample is at: https://msdn.microsoft.com/en-us/library/e9etx778(v=vs.120).aspx.

I hope this post is helpful to those looking to unleash the power of C libraries in C++11.

Monday, October 31, 2016

Cross Compiling Unicode Windows Application with Mingw-w64

Cross compiling a Unicode Windows application on Linux is quite straightforward if you are using the mingw-w64 cross compiler. All you have to do is turn on the -municode compiler switch. Other than that, you need to change your program's entry point from main() to wmain() if your program is a command-line program. I provide complete sample code over at https://bitbucket.org/pinczakko/windows-event-object-test that you can read and use.

Now, let's look at the most important parts of the sample code with respect to Unicode support. First, the CMakeLists.txt file. These are the necessary changes to support Unicode:
if (MINGW)
 message(STATUS " ** MINGW detected.. **")
 set(CMAKE_CXX_FLAGS_RELEASE "-municode ${CMAKE_CXX_FLAGS_RELEASE}")
 set(CMAKE_C_FLAGS_RELEASE "-municode ${CMAKE_C_FLAGS_RELEASE}")

 set(CMAKE_CXX_FLAGS_DEBUG "-municode ${CMAKE_CXX_FLAGS_DEBUG}")
 set(CMAKE_C_FLAGS_DEBUG "-municode ${CMAKE_C_FLAGS_DEBUG}")

# ..
endif()
The preceding code snippet shows that the C++ and C compiler flags are modified to include -municode when the MinGW compiler is detected.

Second, the entry point of the command line program is also modified:
int wmain( void )
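For a command-line program that takes arguments, the wide-character entry point receives wchar_t strings instead of char strings. A minimal sketch:
#include <wchar.h>

int wmain(int argc, wchar_t* argv[])
{
    for (int i = 0; i < argc; i++) {
        wprintf(L"arg %d: %ls\n", i, argv[i]);
    }
    return 0;
}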

The third change is that the code also makes use of the wprintf() function in place of printf() in some places. This is one example:
wprintf(L"All threads ended, cleaning up for application exit...\n");
wprintf() is the "wide-character" version of printf(). You could argue for using the "TCHAR" version and so on; however, the point of this short article is just to show what mingw-w64 supports, not to be exhaustively correct.

Thursday, October 27, 2016

What Does operator int() Mean in C++?

Short answer to the title of this post: it's a user-defined conversion to int.
Long answer: read on ;-). The code below is a sample of a user-defined conversion.
class MyClass {
public:
   //..
   operator HANDLE() const { return(m_hCJ); }
   //..


private: 
   HANDLE m_hCJ;          // handle to volume
   //..

};

The code above returns the MyClass object's m_hCJ member value whenever a conversion to the HANDLE type is requested. For example:
MyClass testClass;
HANDLE testHandle = testClass;
In the preceding code, the compiler selects testClass's operator HANDLE() for the implicit conversion; at runtime the call simply returns testClass.m_hCJ's value.

For more comprehensive example and explanation, see: http://en.cppreference.com/w/cpp/language/cast_operator.

I brought this issue up because it uses the C++ operator keyword, which is usually used for operator overloading. It could confuse those who haven't seen C++ code that uses user-defined conversions.
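Worth knowing: an implicit conversion like the one above can fire in surprising places. C++11 lets you mark a conversion function explicit, so it only happens when explicitly requested. A minimal sketch (SafeHandle is a hypothetical class name):
class SafeHandle {
public:
    explicit operator int() const { return mFd; }   // conversion must be requested
private:
    int mFd = -1;
};

// int a = SafeHandle{};                    // error: no implicit conversion
// int b = static_cast<int>(SafeHandle{});  // OK: explicit request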

Sunday, October 9, 2016

"Replacing" C++ Virtual Function with Template (and More)

There are several ways to replace a C++ virtual function with a template. These are some related resources on how to accomplish the task (a minimal sketch of the first technique follows the list):
Curiously recurring template pattern

http://stackoverflow.com/questions/16988450/c-using-templates-instead-of-virtual-functions#16988933

https://en.wikipedia.org/wiki/Barton-Nackman_trick

https://en.wikipedia.org/wiki/Bounded_quantification#F-bounded_quantification

https://en.wikipedia.org/wiki/Template_metaprogramming#Static_polymorphism
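As promised, here is a minimal sketch of the first technique, the curiously recurring template pattern (CRTP); the Shape/Circle/Square names are hypothetical. The base class knows its derived type at compile time, so the "virtual-like" dispatch is resolved statically, without a vtable:
#include <iostream>

template <class Derived>
class Shape
{
public:
    void draw() { static_cast<Derived*>(this)->drawImpl(); }
};

class Circle : public Shape<Circle>
{
public:
    void drawImpl() { std::cout << "Circle::drawImpl()" << std::endl; }
};

class Square : public Shape<Square>
{
public:
    void drawImpl() { std::cout << "Square::drawImpl()" << std::endl; }
};

template <class T>
void render(Shape<T>& s) { s.draw(); }  // statically dispatched "polymorphism"

int main()
{
    Circle c;
    Square q;
    render(c);  // prints Circle::drawImpl()
    render(q);  // prints Square::drawImpl()
}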

However, I find the philosophy of using virtual functions itself quite "flawed" once one already uses templates in one's C++ code. Why? Because the Standard Template Library (STL), Boost, and other C++ template libraries take a very different approach to programming than the Object Oriented (OO) philosophy. Most if not all of them are meant to provide generic programming in C++ (as opposed to OO)--generic as in Ada (https://en.wikibooks.org/wiki/Ada_Programming/Generics) and in the general sense of https://en.wikipedia.org/wiki/Generic_programming. I might be taking a hard line here, but nonetheless, that is what the STL was written for. See for yourself what Alexander Stepanov (the most prominent STL author) wrote: http://www.stepanovpapers.com/notes.pdf.

What I stated in the previous paragraph means that, in order to use templates in a substantial C++ code base, we need a paradigm shift. Instead of looking at the solution as related objects, I think we need to look at it as "interfaces to generic algorithms". You should gain more understanding of what I mean once you've read Stepanov's remarks in his notes. This is an important excerpt:
It is essential to know what can be done effectively before you can start your design. Every programmer has been taught about the importance of top-down design. While it is possible that the original software engineering considerations behind it were sound, it came to signify something quite nonsensical: the idea that one can design abstract interfaces without a deep understanding of how the implementations are supposed to work. It is impossible to design an interface to a data structure without knowing both the details of its implementation and details of its use. The first task of good programmers is to know many specific algorithms and data structures. Only then they can attempt to design a coherent system. Start with useful pieces of code. After all, abstractions are just a tool for organizing concrete code.
If I were using top-down design to design an airplane, I would quickly decompose it into three significant parts: the lifting device, the landing device and the horizontal motion device. Then I would assign three different teams to work on these devices. I doubt that the device would ever fly. Fortunately, neither Orville nor Wilbur Wright attended college and, therefore, never took a course on software engineering. The point I am trying to make is that in order to be a good software designer you need to have a large set of different techniques at your fingertips. You need to know many different low-level things and understand how they interact.
The most important software system ever developed was UNIX. It used the universal abstraction of a sequence of bytes as the way to dramatically reduce the systems’ complexity. But it did not start with an abstraction. It started in 1969 with Ken Thompson sketching a data structure that allowed relatively fast random access and the incremental growth of files. It was the ability to have growing files implemented in terms of fixed size blocks on disk that lead to the abolition of record types, access methods, and other complex artifacts that made previous operating systems so inflexible. (It is worth noting that the first UNIX file system was not even byte addressable – it dealt with words – but it was the right data structure and eventually it evolved.) Thompson and his collaborators started their system work on Multics – a grand all-encompassing system that was designed in a proper top-down fashion. Multics introduced many interesting abstractions, but it was a still-born system nevertheless. Unlike UNIX, it did not start with a data structure!
One of the reasons we need to know about implementations is that we need to specify the complexity requirements of operations in the abstract interface. It is not enough to say that a stack provides you with push and pop. The stack needs to guarantee that the operations are taking a reasonable amount of time – it will be important for us to figure out what “reasonable” means. (It is quite clear, however, that a stack for which the cost of push grows linearly with the size of the stack is not really a stack – and I have seen at least one commercial implementation of a stack class that had such a behavior – it reallocated the entire stack at every push.) One cannot be a professional programmer without being aware of the costs of different operations. While it is not necessary, indeed, to always worry about every cycle, one needs to know when to worry and when not to worry. In a sense, it is this constant interplay of considerations of abstractness and efficiency that makes programming such a fascinating activity. 
I need to emphasize the last paragraph of Stepanov's note because I recently encountered a not-so-"miserable" failure very closely related to what he says there. I had to clean up some leftover code that was supposed to provide an abstraction for a kind of file system operation on two very different OSes. Unfortunately, the previous code failed "quite" miserably to provide a good abstraction of the task, precisely because it wasn't designed from the ground up on both OSes, as Stepanov suggests; it was only designed from the ground up to work well on one of them. Therefore, the design leans toward that one. Fortunately, not all hope is lost, because I think the abstraction can still be salvaged through several iterations of fixes. I say "quite" miserably because the state of the matter can still be fixed somehow. It's not a total disaster.

Let's put the theory aside and look at one alternative for replacing a C++ virtual function with its template analog. The code below illustrates one approach you can use.
#include <iostream>

using namespace std;

template <class T> class Compute
{
public:
    T multiply(T x, T y);
    T add(T x, T y);
};

template <class T> T Compute<T>::multiply(T x,T y)
{
    cout << "Inside function: " << __func__ << "()" << endl;

    return x*y;
}

template <> double Compute<double>::multiply(double x,double y)
{
    cout << "Inside function: " << __func__ << "() -- double version" << endl;

    return x*y;
}

template <class T> T Compute<T>::add(T x, T y)
{
    cout << "Inside function: " << __func__ << "()" << endl;

    return x+y;
}

int main()
{
    Compute <int> test;
    Compute <double> testFp;

    cout << "12 x 3 = " << test.multiply(12, 3) << endl;
    cout << "1.25 x 3 = " << testFp.multiply(1.25, 3) << endl;
}

The output of the code above is as follows:
Inside function: multiply()
12 x 3 = 36
Inside function: multiply() -- double version
1.25 x 3 = 3.75

The code above demonstrates the use of "function overloading" with C++ templates, as explained by Herb Sutter over at http://www.gotw.ca/publications/mill17.htm. The "specialized"/"overloaded" version of the multiply() method is called to handle the double data type. This "overloaded" implementation behaves slightly differently from the generic version handled by the template: it shows a different string in the program's output.
Anyway, you could replace double with your own custom data type, as long as that data type implements the required operators, i.e. + and * in the example above. You can use and extend the technique shown above to handle many cases that previously required a virtual function in C++. The basic philosophy: instead of inheriting from parent class(es) and implementing virtual function(s), use a class "instance" that behaves as required based on the template instantiation parameter(s)--or simply put: the template parameter(s).

On another note, it's rather disappointing that the present C++ standard doesn't yet impose adequate template-instantiation error checking. The most promising avenue for addressing this is the so-called C++ concepts proposal, which could be helpful but hasn't yet been ratified into the standard.

Last but not least, I hope this post is good food for thought for C++ programmers out there.

Tuesday, September 27, 2016

What are 0xDEADBEEF, 0xFEEEFEEE, 0xCAFEFEED & co. ?

If you stumbled upon this post looking for a detailed answer about any of the values mentioned in the title, then without further ado, there are more complete explanations at:


But, if you want to know the big picture, read on ;-)

Chances are, you stumbled here after some hardcore debugging and found yourself baffled at the values that showed up in the CPU registers or in heap/stack memory. I found the first two values in the title (0xDEADBEEF and a variant of the second, i.e. 0xFEEEFEEEFEEEFEEE) while debugging two different systems: the 0xDEADBEEF was on a System i (POWER5) system, and the second was on a 64-bit Windows machine.

All of these values are debugging-aid values, so to speak. They are highly visible in the debugger (to those who already know them). Their purpose is to signal that something went wrong and to give an idea of what, i.e. where the error possibly comes from, with just a glance at the debugger. For example, 0xDEADBEEF could mean either that the program accessed uninitialized (heap?) memory or that a NULL pointer was encountered (pointing to uninitialized memory). Either way, it means something is wrong with one of your pointers. A similar situation is indicated by 0xFEEEFEEE or its 64-bit variant.

These "readable" hexadecimal values are categorized as hexspeak because it looks like a "language" despite being hexadecimal value, i.e. you can read them aloud in English or other intended human language. The most hilarious of them all is 0xB16B00B5 ("Big Boobs"). I wonder who was the Hyper-V project manager at the time this Linux guest signature was determined at Microsoft LoL.

Tuesday, September 6, 2016

Debugging Cross-Compiled Windows Application (Executable and DLL)

I explained how to cross-compile Windows applications and DLLs on Arch Linux in another post. Now let's proceed to techniques you can use to debug the result of the cross compilation. The general steps are as follows:

  1. Test the cross-compilation result in Wine (running on Linux, of course). If the executable runs in Wine, or the DLL can be loaded and (at least) partially executed, then you may proceed to the next step. Otherwise, double-check your cross-compiler, as it may emit the wrong kind of executable.
  2. Run the executable (and, if required, all the DLLs) on Windows. First run without a debugger, then within a debugger should an anomaly (or more) be found during the run(s).
  3. In the event that you need a debugger, make sure the cross-compiled version of the code contains debugging symbols. You can use the "-g" switch in gcc/g++ to generate the debugging symbols with your GNU cross compiler.
  4. In the event that you need a debugger, make sure your Windows debugger is recent enough to parse the debugging symbols in your cross-compiled executables and/or DLLs. Also make sure it can handle local variables; missing local-variable debugging support, or an inability to display function parameter values, indicates that your debugger version probably isn't compatible with the cross-compiler. This is particularly true for the gcc/g++ and gdb combination. For the gcc/g++ cross compiler, you can use gdb from the nuwen "distribution", which ships a very recent GDB version. Note: I was caught off-guard by an older version of gdb on Windows before, because it was still quite usable.
To validate your gdb version, make sure your debugger output is similar to this:
Valid GDB output
As you can see in the screenshot above, you can inspect all local variables while stopped at a breakpoint in a function that clearly has local variables. The debugger also shows the values of the parameters passed to the function (where you set the breakpoint), including the function's implicit this parameter. If you can't see any of that, you are using a gdb that is incompatible with the gcc/g++ cross-compiler used to create the executable/DLL. Try a newer gdb version than the one you're currently using.

You can use gdb "script" to carry-out semiautomatic debugging. The screenshot above shows how to use a gdb script, i.e. by using the source command in gdb. The source command basically tell gdb to parse the command file, i.e. the debugging script as if you're typing the debugging command yourself in gdb. See: https://sourceware.org/gdb/onlinedocs/gdb/Command-Files.html for more info on using command file in gdb. This is the gdb command file used in the screenshot above:
b main.cc:23
b main.cc:24
b main.cc:11
b main.cc:12
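Assuming the breakpoint commands above are saved in a file named debug.gdb (a hypothetical name), a debugging session would start roughly like this:
$ gdb test_app.exe
(gdb) source debug.gdb
(gdb) run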

Hopefully, this post is helpful for those cross compiling applications to Windows from Linux.

Wednesday, August 17, 2016

Cross Compiling Windows Application and DLLs in (Arch) Linux

Cross compiling 32-bit and 64-bit Windows applications on Linux is much easier these days than in the past, thanks to the Mingw-w64 project. It's even a little easier on Arch Linux, because most of what you need--including an extensive collection of libraries--is already in AUR. For starters, install the cross compiler: https://www.archlinux.org/packages/community/x86_64/mingw-w64-gcc/. Then you can continue to install all the other stuff (libraries and their dependencies) that you need. In most cases, you can just build and install a package by using its PKGBUILD file from AUR directly (via: cd ${src_dir}; makepkg -sri). However, in some cases you need to make adjustments to the PKGBUILD file.

Let's focus on mingw-w64 in Arch Linux. There are several important matters you need to take care of when cross-compiling open-source projects that use CMake on Arch Linux to build Windows executables and DLLs:
  • Open-source projects that use the CMake build system need to use the mingw-w64-specific cmake (look at the example build script below).
  • You need to set the include path to the cross-compiler toolchain's include path, not the host include path.
This is an example build script (PKGBUILD-style) for a simple Hello World application that uses Boost. It assumes you have built and installed the cross-compiled Boost DLLs in your Arch Linux mingw-w64 environment.
#!/bin/bash

_architectures="x86_64-w64-mingw32 i686-w64-mingw32"

rm -rvf build-*

for _arch in ${_architectures}; do
 mkdir -p build-${_arch} && pushd build-${_arch}
 CMAKE_INCLUDE_PATH="/usr/"${_arch}"/include"
 echo "CMAKE_INCLUDE_PATH = "${CMAKE_INCLUDE_PATH}
 export CMAKE_INCLUDE_PATH
   ${_arch}-cmake ..
 make VERBOSE=1
 popd
done 
The example above is the build script for the sample Hello World project. You can clone the project over at: https://bitbucket.org/pinczakko/cross_hello_world.
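For reference, a minimal CMakeLists.txt for such a Boost-based Hello World project might look like the sketch below; the real file in the repository may differ:
cmake_minimum_required(VERSION 2.8)
project(cross_hello_world CXX)

# Boost headers come from the cross-compiled mingw-w64 environment,
# located via the CMAKE_INCLUDE_PATH exported by the build script above.
find_package(Boost REQUIRED)
include_directories(${Boost_INCLUDE_DIRS})

add_executable(hello main.cc)
target_link_libraries(hello ${Boost_LIBRARIES})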

There are also some things to take care of if you cross-compile open-source projects that use autotools on Arch Linux to build Windows executables and DLLs:
  • Open-source projects that use the autotools build system need to use the mingw-w64-specific configure script (look at the example PKGBUILD below).
  • In some cases, you need to "fool" the libtool script into passing its "dynamic/static library integrity" check. You needn't be afraid of this step, because you can always verify the compiler output with the Linux file utility, along with Wine, before testing/using it on a real Windows installation.
This is an example PKGBUILD file for popt library:
# Maintainer: Sebastian Morr 
# Modified by Pinczakko for Mingw-w64 cross compilation to 64-bit Windows

pkgname=mingw-w64-popt
_pkgname=popt
pkgver=1.16
pkgrel=1
arch=('any')
pkgdesc="A commandline option parser (mingw-w64)"
makedepends=('mingw-w64-gcc')
license=('custom')
url="http://rpm5.org"
options=(!strip !buildflags staticlibs)
source=("http://rpm5.org/files/${_pkgname}/${_pkgname}-${pkgver}.tar.gz"
        "0001-nl_langinfo.mingw32.patch"
        "197416.all.patch"
        "217602.all.patch"
        "278402-manpage.all.patch"
        "318833.all.patch"
        "356669.all.patch"
        "367153-manpage.all.patch"
        "get-w32-console-maxcols.mingw32.patch"
        "no-uid-stuff-on.mingw32.patch"
        )
sha1sums=('cfe94a15a2404db85858a81ff8de27c8ff3e235e'
          '62640c0a0845cea5f3cd5646d26fd681ea36cadf'
          'bd7c8872f0bb80ec2a8b78596eb3ba5706795133'
          '977fbbe108cf817103f706dd314236e6bace7557'
          '18d169ff43b6ef4ee613272fdb2bbdc01df1f166'
          'a446c763439fe97459c6ea9bea22054a69ea9cc6'
          '2664b32cd6882e3c7da2d1ed3d40b14807a2c604'
          '63e5fdae8160445794458b03fc5a61e7354efada'
          '6599adf3797d7bfb4534bc910372c431fc0efced'
          '4c3b7b302044bd45decec78f7f7d4ece15d9f3f7')

_architectures="i686-w64-mingw32 x86_64-w64-mingw32"

prepare() {
  cd "$srcdir/${_pkgname}-$pkgver"
  patch -p1 -i ../0001-nl_langinfo.mingw32.patch
  patch -p1 -i ../197416.all.patch
  patch -p1 -i ../217602.all.patch
  patch -p1 -i ../278402-manpage.all.patch
  patch -p1 -i ../318833.all.patch
  patch -p1 -i ../356669.all.patch
  patch -p1 -i ../367153-manpage.all.patch
  patch -p1 -i ../get-w32-console-maxcols.mingw32.patch
  patch -p1 -i ../no-uid-stuff-on.mingw32.patch
}

build() {
  # We assume that libtool check on 64-bit Windows DLL is broken
  # in mingw-w64 Linux cross compiler. So, force it to pass all checks
  export lt_cv_deplibs_check_method='pass_all'

  cd "$srcdir/${_pkgname}-$pkgver"
  for _arch in ${_architectures}; do
    mkdir -p build-${_arch} && pushd build-${_arch}
 ${_arch}-configure --enable-shared --enable-static 
    make
 popd
  done
}

package () {
  for _arch in ${_architectures}; do
    cd "${srcdir}/${_pkgname}-${pkgver}/build-${_arch}"
    make install DESTDIR="${pkgdir}"
    rm -rf "${pkgdir}/usr/${_arch}/share/man"
    ${_arch}-strip -x -g "${pkgdir}/usr/${_arch}/bin/"*.dll
    ${_arch}-strip -g "${pkgdir}/usr/${_arch}/lib/"*.a
  done

  install -D -m644 "${srcdir}/${_pkgname}-${pkgver}/COPYING" "$pkgdir/usr/share/licenses/$pkgname/LICENSE"
}

You can clone the files required to "cross-build" the popt library at: https://bitbucket.org/pinczakko/cross_mingw-w64-popt.

Hopefully, this is useful for those developing Windows application in Linux.

Wednesday, July 27, 2016

Java JAR Reverse Engineering Walkthrough

There are many ways to reverse engineer a Java JAR file. However, I found the following steps are the fastest for me to understand the inner workings of the Java code I'm trying to understand:
  1. Extract the target *.class file(s) from the JAR file with the jar -x command.
  2. If the class file(s) are recent (>= Java 1.5), use jadretro to condition the class file(s) before passing them through the jad Java decompiler. jadretro is at: http://jadretro.sourceforge.net.
  3. Decompile the Java class(es) with jad. You can download jad at: http://varaneckas.com/jad/.
  4. Use doxygen (http://www.stack.nl/~dimitri/doxygen/) plus graphviz (http://www.graphviz.org/) to generate the class inheritance and function call graph(s). This should give you an overview of how the class(es) work.
  5. Read the decompilation results as needed. I found that step 4 makes this step easier, as it gives you hints as to where to start reading the code.
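For example, steps 1 and 3 boil down to commands like these (target.jar and the package path are hypothetical):
$ mkdir extracted && cd extracted
$ jar -xf ../target.jar
$ jad com/example/*.class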
Another approach is to use Radare2 (http://radare.org/r/), but I've never used Radare2 for Java decompilation, so I don't know how mature its support is.

Anyway, sometimes interoperability needs force us to rely on reverse engineering to gain insight into how things work. This applies to Java too.

Monday, June 13, 2016

GraphViz Tutorial for The Uninitiated

This is not a tutorial per se, but rather an example of how a complex graph can be generated with GraphViz DOT. Head over to https://github.com/pinczakko/GraphViz-Samples for the source code. For the impatient, this is the result:
Rather complex graph generated from GraphViz DOT
At least this sample shows you how powerful GraphViz is after investing even just a couple of hours in learning the ropes. As a bonus, you can embed GraphViz DOT code in your Doxygen comments and generate the graph in your code documentation. Isn't that powerful? Head over to http://www.stack.nl/~dimitri/doxygen/manual/diagrams.html for that.
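As an illustration of that bonus, here is a minimal sketch of embedding a DOT graph in a Doxygen comment (the function and node names are made up):
/**
 * Processes one queue item.
 *
 * \dot
 * digraph flow {
 *   rankdir=LR;
 *   dequeue -> validate -> process;
 *   validate -> reject [label="invalid"];
 * }
 * \enddot
 */
void process_item();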

Hopefully, this eases the pain of creating your code documentation ;-)


Monday, June 6, 2016

Arch Linux cpupower Missing Library Temporary Fix

If you are experiencing the issue described at https://bbs.archlinux.org/viewtopic.php?id=213404 (copied here for your convenience--courtesy of bartbkr):
After a recent upgrade, when I attempt to use cpupower, I get the following
message:
    cpupower: /usr/lib/libpci.so.3: verion `LIBPCI_3.5' not found (required by cpupower)
I haven't changed any of the setting for cpupower recently and everything was
running smoothly before. Now I can't query the cpu settings any longer.
$ ls /usr/lib/libpci.so*
        /usr/lib/libpci.so
        /usr/lib/libpci.so.3
        /usr/lib/libpci.so.3.4.1

Then the temporary solution is to downgrade cpupower to version 4.6-1. Follow the general downgrade guide at https://wiki.archlinux.org/index.php/Downgrading_packages. For comparison's sake, this is the log of the failed cpupower at start-up on my machine:
root@jeez /var/cache/pacman/pkg
 # systemctl status cpupower.service
— cpupower.service - Apply cpupower configuration
   Loaded: loaded (/usr/lib/systemd/system/cpupower.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Mon 2016-06-06 22:07:38 SGT; 44min ago
  Process: 328 ExecStart=/usr/lib/systemd/scripts/cpupower (code=exited, status=1/FAILURE)
 Main PID: 328 (code=exited, status=1/FAILURE)

Jun 06 22:07:37 jeez systemd[1]: Starting Apply cpupower configuration...
Jun 06 22:07:37 jeez cpupower[328]: cpupower: /usr/lib/libpci.so.3: version `LIBPCI_3.5'
Jun 06 22:07:38 jeez systemd[1]: cpupower.service: Main process exited, code=exited, sta
Jun 06 22:07:38 jeez systemd[1]: Failed to start Apply cpupower configuration.
Jun 06 22:07:38 jeez systemd[1]: cpupower.service: Unit entered failed state.
Jun 06 22:07:38 jeez systemd[1]: cpupower.service: Failed with result 'exit-code'.

This is an excerpt of the steps I took to fix the issue via a package downgrade:
root@jeez /var/cache/pacman/pkg
 # pacman -U cpupower-4.6-1-x86_64.pkg.tar.xz
loading packages...
warning: downgrading package cpupower (4.6-2 => 4.6-1)
resolving dependencies...
looking for conflicting packages...

Packages (1) cpupower-4.6-1

Total Installed Size:   0.41 MiB
Net Upgrade Size:      -0.10 MiB

:: Proceed with installation? [Y/n] Y
(1/1) checking keys in keyring                            [###############################] 100%
(1/1) checking package integrity                          [###############################] 100%
(1/1) loading package files                               [###############################] 100%
(1/1) checking for file conflicts                         [###############################] 100%
(1/1) checking available disk space                       [###############################] 100%
:: Processing package changes...
(1/1) downgrading cpupower                                [###############################] 100%
:: Running post-transaction hooks...
(1/1) Updating manpage index...
root@jeez /var/cache/pacman/pkg
 # systemctl restart cpupower.service
root@jeez /var/cache/pacman/pkg
 # journalctl -xe
Jun 06 22:51:51 jeez systemd[1]: Starting Apply cpupower configuration...
-- Subject: Unit cpupower.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit cpupower.service has begun starting up.
Jun 06 22:51:52 jeez systemd[1]: Started Apply cpupower configuration.
-- Subject: Unit cpupower.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit cpupower.service has finished starting up.
--
-- The start-up result is done.
pinczakko@jeez Mon Jun 06 10:52:07pm
~/ systemctl status cpupower.service
— cpupower.service - Apply cpupower configuration
   Loaded: loaded (/usr/lib/systemd/system/cpupower.service; enabled; vendor preset: disabled)
   Active: active (exited) since Mon 2016-06-06 22:51:52 SGT; 26s ago
  Process: 2672 ExecStart=/usr/lib/systemd/scripts/cpupower (code=exited, status=0/SUCCESS)
 Main PID: 2672 (code=exited, status=0/SUCCESS)

Jun 06 22:51:51 jeez systemd[1]: Starting Apply cpupower configuration...
Jun 06 22:51:52 jeez systemd[1]: Started Apply cpupower configuration.

As you can see in the shell log above, after downgrading cpupower to version 4.6-1, everything went back to normal. This is only a temporary fix until libpci 3.5 is promoted from testing to stable. We can go back to cpupower version 4.6-2 then.

Monday, May 9, 2016

Using DBX Debugger on AIX -- How to pass program arguments to DBX

DBX is the default debugger on IBM AIX. Its user interface is rather unusual compared to other debuggers. It is a command-line debugger, just like GNU GDB, but it has a different philosophy. The user guide for DBX is at: https://www.ibm.com/support/knowledgecenter/ssw_aix_61/com.ibm.aix.cmds2/dbx.htm. The user guide is too exhaustive to be read all at once. I recommend focusing on your goal, i.e. your debugging requirements, and reading the parts of the user guide that suit those requirements.

Let's start with some basic requirements:

  • You have a command-line program.
  • The program has several arguments that must be passed at startup.
Now, let's look at the steps to fulfill the requirements above, starting with the DBX philosophy. The DBX philosophy is as follows:

  1. Running dbx without any arguments in the shell only starts the debugging environment, nothing more, nothing less.
  2. Running dbx with only the executable (program) file name loads the executable into memory but doesn't run the program. This step also doesn't pass any arguments to the program.
  3. DBX has so-called "subcommands": "commands" that you type in the DBX debugging environment to instruct the debugger to do something. Another way to pass subcommands to DBX is via a text file known as a "command file". A command file contains subcommands and the parameters/arguments required by each subcommand.

The gist of the philosophy is that useful things are done mostly via DBX subcommands. The following diagram illustrates this philosophy.
Figure 1 IBM AIX DBX debugger principle of working
The DBX Command Line Interface (CLI) is similar to GNU GDB's, so I'm not going to explain it here. I'll proceed to the DBX scripting interface, which is invoked via DBX's "-c" flag. This is the excerpt from the DBX user guide:
-c CommandFile    Runs the dbx subcommands in the file before reading from standard input. The specified file in the $HOME directory is processed first; then the file in the current directory is processed. The command file in the current directory overrides the command file in the $HOME directory. If the specified file does not exist in either the $HOME directory or the current directory, a warning message is displayed. The source subcommand can be used once the dbx program is started.
I'll show you how to use the -c flag to pass your program's arguments at the start of a DBX debugging session. The first thing to do before we can use the -c flag is to prepare the command file. If you just want to pass your program's arguments, then the content of the command file is simply the run subcommand followed by your program's arguments. Below is an example of a valid command file; let's name it my_cmd.
run -a 10.10.10.254 -p 8000 -n 2  
In the command file above, the arguments of the program being debugged start at -a; the run statement at the beginning refers to the DBX run subcommand. The following diagram shows this:
Figure 2 Using run subcommand in your command file
This is the verbatim explanation of the run subcommand from the DBX user guide:
run Subcommand
run [ Arguments ] [ <File ] [ >File ] [ > >File ] [ 2>File ] [ 2> >File ] [ >&File ] [ > >&File ]
The run subcommand starts the object file. The Arguments are passed as command-line arguments.
Flags
Item        Description
<File       Redirects input so that input is received from File.
>File       Redirects output to File.
2>File      Redirects standard error to File.
> >File     Appends redirected output to File.
2> >File    Appends redirected standard error to File.
>&File      Redirects output and standard error to File.
> >&File    Appends output and standard error to File.
Example
To run the application with the arguments blue and 12, enter:
run blue 12

Therefore, to start DBX with the my_cmd command file above, we enter this in the shell:
$ dbx -c my_cmd [your_program_name]
The -c flag instructs dbx to use my_cmd as the command file; after that, you just need to enter your program's executable name. Once dbx has parsed the command file, it runs your program as if you had typed the run subcommand in a DBX debugging session.

That's it. I hope this post helps those who have just started using DBX on AIX or another AIX-like environment.

Thursday, April 21, 2016

Blank Character is Not Null Character

Perhaps it is partly because I'm not a native English speaker, and partly because I forgot that the devil is in the details, that I inadvertently wrote code that was supposed to initialize a variable with blank characters but initialized it with null characters instead.

Blank characters are whitespace characters (https://en.wikipedia.org/wiki/Whitespace_character), not the null character (https://en.wikipedia.org/wiki/Null_character). In many cases, the acceptable blank character for program input is the space, which has a value of 20h in ASCII and 40h in EBCDIC. This was not obvious to me at first, until I found myself debugging some code in an EBCDIC environment.

So, the next time you read API documentation that says *BLANK*, it doesn't refer to the NULL ('\0') character; it refers to one of the whitespace characters, which in many cases means SPACE (' ').
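A minimal illustration of the difference in C:
#include <string.h>

void init_fields(void)
{
    char field[8];

    /* Blank-initialize: fill with spaces (20h in ASCII, 40h in EBCDIC) */
    memset(field, ' ', sizeof(field));

    /* Null-initialize: fill with '\0' -- NOT the same thing! */
    memset(field, '\0', sizeof(field));
}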

As a bonus, this is a usable character conversion table:
http://www.astrodigital.org/digital/ebcdic.html (this one explicitly states the SPACE character as BLANK)

Sunday, March 20, 2016

[How-to] Copy Contents of Tmux Pane to File

There are many cases where you might want to copy part (or all) of the contents of a tmux pane to a file for further use. This post explains how to do that. Before we proceed, be aware that my tmux key bindings are probably different from yours; see: http://darmawan-salihun.blogspot.co.id/2015/02/zsh-tmux-configuration-for-arch-linux.html. I'm using vi mode keys.

The steps to copy contents of a tmux pane are as follows:
  1. Enter copy mode. The key combo to enter copy mode depends on your tmux configuration, but the principle never changes: press your tmux prefix key combo--in my case, Ctrl+A--then press your tmux copy-mode key--in my case, Escape. Given my tmux configuration file linked above, I press Ctrl+A, then Escape, to enter tmux copy mode.
  2. Select the text that you want to copy. I use the v key to select text because that's how my tmux key binding is configured (~/.tmux.conf: bind -t vi-copy 'v' begin-selection). This is how it looks once I have selected some text in copy mode with vi-like keys (the selected text has a yellow background):
    tmux copy mode
  3. Copy the selected text into a tmux buffer. I use the y (yank) key to copy the selection because that's how my tmux key binding is configured (~/.tmux.conf: bind -t vi-copy 'y' copy-selection). Now the selected text is in a tmux buffer.
You can "save" contents of the tmux buffer to a file with this command:
$ tmux save-buffer -b [the_buffer_yo_want_to_save] -a [name_of_the_file_you_want_to_append_the_buffer_to]
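For instance, to append the contents of the most recently created buffer to a dump file (the path here is made up):
$ tmux save-buffer -a /tmp/pane_dump.txt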
If you want to simply overwrite the contents of the "target" file with the tmux buffer contents, omit the -a flag. You can select the "source" tmux buffer before saving its contents by using your shell's command-completion key (in my case TAB; I'm using zsh). There may be more than one tmux buffer if you previously "copied" something into a buffer, either inadvertently or on purpose. This is how the tmux buffers look in my shell as I'm trying to save one of the entries:
Tmux "copy buffer" number 0 to number 10

Anyway, you can also bind keys to the tmux save-buffer command to make the text available to GUI applications, as in my tmux.conf shown in the link above. This is the configuration snippet (I'm using the ICCCM clipboard for that):
# extra commands for interacting with the ICCCM clipboard
bind C-c run "tmux save-buffer - | xclip -i -sel clipboard"
bind C-v run "tmux set-buffer \"$(xclip -o -sel clipboard)\"; tmux paste-buffer"
That's it. I hope the explanation is clear enough, because many explanations on the web are not quite well-suited for newbies.

Thursday, February 18, 2016

C Macro 101: Stringizing Operator and Token Pasting Operator

The present-day C language has evolved to the point where we can be quite productive writing code in it. Perhaps many still view writing code in C as tedious compared to writing code in higher-level languages. I think that's a subjective view. Perhaps only functional languages offer higher productivity than C for most non-system programming.

Two of the "productivity features" in C that I found indispensable at the moment are the stringizing operator (#) and token-pasting operator (a.k.a concatenation operator) (##). Both of these operators can be used in C macros only, you cannot use it outside of C macros. However, both are very powerful tools to create function templates in C. Yes, you read that right. Function templates are not just for C++ programmers. C programmers also has a sort of function template via C macros, despite it's a little "rudimentary" compared to C++.

Most stringizing and token-pasting tutorials out there don't provide code snippets that show the "real" power of these operators. This post aims to fill that gap. Without further ado, let's get to the code. You can clone the complete sample code used in this post from https://github.com/pinczakko/sample_token_pasting
#include <stdio.h>
#include <assert.h>

typedef enum {
 PEEK_REQUEST_ITEM,
 PEEK_REPLY_ITEM,
 MOD_REQUEST_ITEM,
 MOD_REPLY_ITEM
}ITEM_TYPE;

struct queue_item {
 ITEM_TYPE type;
 char payload[32];
};

struct handler {
 int identifier;
 int (*process_data) (void* data);
};

#define PRINT_FUNC_NAME \
do { \
 printf("In function: %s() \n", __func__); \
} while (0);

static inline int process_peek_request(const struct queue_item *const
          peek_req,
          struct handler *p)
{
 /** Algorithm A ...  */
 PRINT_FUNC_NAME
 return 0;
}

static inline int process_peek_reply(const struct queue_item *const
        peek_rep,
        struct handler *p)
{
 /** Algorithm B ...  */
 PRINT_FUNC_NAME
 return 0;
}

static inline int process_modification_request(const struct queue_item *const
          modification_req,
          struct handler *p)
{
 /** TODO: Invalidate cached items taking part in the MOD transaction **/

 /** TODO: Enqueue the MOD request to egress_port_output_queue */

 /** TODO: Notify egress_port thread to consume the MOD request */

 PRINT_FUNC_NAME

 return 0;/** Success */

 error:
 return -1;/** Failed */
}

static inline int process_modification_reply(const struct queue_item *const
        modification_rep,
        struct handler *p)
{
 /** TODO: Enqueue the MOD reply to ingress_port_output_queue */

 /** TODO: Notify ingress_port thread to consume the MOD reply */

 PRINT_FUNC_NAME

 return 0;/** Success */

 error:
 return -1;/** Failed */
}

#define PROCESS_DEQUEUED_ITEM(MESSAGE, TYPE) \
static inline int process_dequeued_##MESSAGE(const struct queue_item *const MESSAGE,\
         struct handler *p) \
{ \
 assert((MESSAGE != NULL) && (p != NULL)); \
 \
 assert((MESSAGE->type == PEEK_##TYPE) || \
        (MESSAGE->type == MOD_##TYPE)); \
 \
 PRINT_FUNC_NAME \
 \
 if (MESSAGE->type == PEEK_##TYPE) { \
  printf("Processing PEEK " #MESSAGE "\n"); \
  return process_peek_##MESSAGE(MESSAGE, p); \
 \
 } else if (MESSAGE->type == MOD_##TYPE) { \
  printf("Processing MOD " #MESSAGE "\n"); \
  return process_modification_##MESSAGE(MESSAGE, p); \
 \
 } else { \
  printf("Warning: Unknown " #MESSAGE " type!\n"); \
  return -1; /** Failed */ \
 } \
}

/** Token-pasted function instance to handle request message */
PROCESS_DEQUEUED_ITEM(request, REQUEST_ITEM)

/** Token-pasted function instance to handle reply message */
PROCESS_DEQUEUED_ITEM(reply, REPLY_ITEM)

int main (int argc, char * argv[])
{
 int i; 

 struct queue_item req_item[2], rep_item[2];
 struct handler h;

 req_item[0].type = PEEK_REQUEST_ITEM;
 req_item[1].type = MOD_REQUEST_ITEM;

 rep_item[0].type = PEEK_REPLY_ITEM;
 rep_item[1].type = MOD_REPLY_ITEM;

 for (i = 0; i < 2; i++) {
  process_dequeued_request(&req_item[i], &h);
 }

 for (i = 0; i < 2; i++) {
  process_dequeued_reply(&rep_item[i], &h);
 }

 return 0;
}
The code above produces two different functions, process_dequeued_request() and process_dequeued_reply(), to handle requests and replies respectively. The algorithm used by both functions is very similar; the differences are only in the function names, parameter names and constant names. Therefore, it is natural to use the token-pasting and stringizing operators in this code. In C++, you would use a C++ template; you can achieve the same thing in C with the token-pasting (##) and stringizing (#) operators.

The stringizing operator (#) basically creates a C string from a C macro parameter. For example, if you pass reply as a parameter to a C macro, the C preprocessor will produce "reply" (a C string, including the double quotes) as output wherever the stringizing operator is applied to that macro parameter. Perhaps that's a bit hard to grasp, so let's look at the sample code above. In this line:
PROCESS_DEQUEUED_ITEM(reply, REPLY_ITEM)
we ask the preprocessor to instantiate the process_dequeued_reply() function. Inside process_dequeued_reply(), the code uses the stringizing operator like so:
printf("Processing PEEK " #MESSAGE "\n");
After GCC preprocessing stage, this function call becomes:
printf("Processing PEEK " "reply" "\n");
As you can see, the reply macro parameter is transformed into "reply", i.e. stringized.
Perhaps you ask: how can I obtain the preprocessor output? Well, in most compilers you can obtain the preprocessor output via certain compiler switches. In GCC, you can use the -save-temps switch to do so. The GCC preprocessor output is a *.i file with the same name as the source file. In my sample code, the Makefile uses this switch to instruct GCC to place the preprocessor output in the source code directory. I used the indent utility (indent -linux sample_token_pasting.i) to beautify the preprocessor output.
This is an example snippet of the "beautified" preprocessor output from the sample_token_pasting.i file:
static inline int process_dequeued_reply(const struct queue_item *const reply,
      struct handler *p)
{
#107 "sample_token_pasting.c" 3 4
 ((
#107 "sample_token_pasting.c"
   (reply !=
#107 "sample_token_pasting.c" 3 4
    ((void *)0)
#107 "sample_token_pasting.c"
   ) && (p !=
#107 "sample_token_pasting.c" 3 4
         ((void *)0)
#107 "sample_token_pasting.c"
   )
#107 "sample_token_pasting.c" 3 4
  )? (void)(0) : __assert_fail(
#107 "sample_token_pasting.c"
          "(reply != ((void *)0)) && (p != ((void *)0))"
#107 "sample_token_pasting.c" 3 4
          , "sample_token_pasting.c", 107,
          __PRETTY_FUNCTION__))
#107 "sample_token_pasting.c"
     ;
#107 "sample_token_pasting.c" 3 4
 ((
#107 "sample_token_pasting.c"
   (reply->type == PEEK_REPLY_ITEM)
   || (reply->type == MOD_REPLY_ITEM)
#107 "sample_token_pasting.c" 3 4
  )? (void)(0) : __assert_fail(
#107 "sample_token_pasting.c"
          "(reply->type == PEEK_REPLY_ITEM) || (reply->type == MOD_REPLY_ITEM)"
#107 "sample_token_pasting.c" 3 4
          , "sample_token_pasting.c", 107,
          __PRETTY_FUNCTION__))
#107 "sample_token_pasting.c"
     ;
 do {
  printf("In function: %s() \n", __func__);
 } while (0);
 if (reply->type == PEEK_REPLY_ITEM) {
  printf("Processing PEEK " "reply" "\n");
  return process_peek_reply(reply, p);
 } else if (reply->type == MOD_REPLY_ITEM) {
  printf("Processing MOD " "reply" "\n");
  return process_modification_reply(reply, p);
 } else {
  printf("Warning: Unknown " "reply" " type!\n");
  return -1;
 }
}
It's a bit unwieldy. However, sometimes you need to look at the preprocessor output to be sure you haven't made any silly mistakes in your C macros.

Let's move to the other operator: the token-pasting operator. This operator basically "pastes and concatenates" the macro parameter onto a C token "fragment" in your macro to create the "target" C token. If you don't yet fully understand what a C language token is, please read http://www.help2engg.com/c_tokens and https://msdn.microsoft.com/en-us/library/c6sb2c6b.aspx. The sample code uses the token-pasting operator to create "configurable" C function names and constants. This code:
PROCESS_DEQUEUED_ITEM(reply, REPLY_ITEM)
produces three C tokens: the process_dequeued_reply function name, the PEEK_REPLY_ITEM constant and the MOD_REPLY_ITEM constant. You can see the process clearly in the GCC preprocessor output snippet above. The process_dequeued_ C token "fragment" is concatenated with the value of the MESSAGE macro parameter, which in this macro invocation:
PROCESS_DEQUEUED_ITEM(reply, REPLY_ITEM)
has a value equal to reply. Therefore, the concatenated ("target") C token is process_dequeued_reply. The constants undergo a similar transformation via the TYPE macro parameter.

Anyway, this is the output of the program (compiled from the sample code):
In function: process_dequeued_request() 
Processing PEEK request
In function: process_peek_request() 
In function: process_dequeued_request() 
Processing MOD request
In function: process_modification_request() 
In function: process_dequeued_reply() 
Processing PEEK reply
In function: process_peek_reply() 
In function: process_dequeued_reply() 
Processing MOD reply
In function: process_modification_reply() 
The output just shows which functions are invoked and their order of invocation, to clarify the inner workings of both the stringizing and the token-pasting operators.

Hopefully, the explanation in this post clarifies the power of the C stringizing and token-pasting operators.

Tuesday, January 19, 2016

Using Boost C++ Library from C

This post is related to another post: Building C++ Application with Boost Library and Autotools in Linux. If you don't yet know how to use the Boost C++ library in an autotools project, please read that post first.

The purpose of this post is to explain what you need to do to use Boost from your C language code. The following steps are required to use Boost from C:
  1. Decide which part of Boost that you require in your C application.
  2. Wrap that part of Boost as "convenience" C library.
  3. Link your C application code to the "convenience" library. 
The steps above are easier said than done. Don't worry, I have provided a sample project over at github: https://github.com/pinczakko/boost_spsc_queue_c_wrapper. Just download the code and try to make sense of it.
DISCLAIMER
-----------
- The code assumes that the platform in which it runs has a working pthread implementation.
- The code is not production quality code. Use it at your own risk.
The sample code basically wraps the Boost SPSC (single producer single consumer) lock-free queue into a convenience C library, and the sample application links to that convenience library.

Anyway, the most important part of the code that you need to understand is the part that "returns" a C++ class as an "opaque" C structure. It's probably quite an alien concept. But, I assure you that this alien concept is the core of C<-->C++ interoperability. Below are the relevant code snippets:


//---- START spsc_interface.hpp file -----------------
#ifndef  SPSC_WRAPPER_H
#include "spsc_wrapper.h"
#endif //SPSC_WRAPPER_H 

class spsc_interface {
public:
    explicit spsc_interface();
    explicit spsc_interface(const spsc_interface &);
    ~spsc_interface() {};
//...

};
//---- END spsc_interface.hpp file -----------------

//---- START spsc_wrapper.h file -----------------
#ifdef __cplusplus
extern "C" {
#endif
//...
 typedef struct spsc_interface spsc_interface;

 spsc_interface *create_spsc_interface();

//...
#ifdef __cplusplus
}
#endif
//---- END spsc_wrapper.h file -----------------

//---- START spsc_wrapper.cc file -----------------

#include "spsc_interface.hpp"
#include "spsc_wrapper.h"

#ifdef __cplusplus
extern "C" {
#endif

spsc_interface * create_spsc_interface()
{
    return new spsc_interface();
}

...
#ifdef __cplusplus
}
#endif
//---- END spsc_wrapper.cc file -----------------

As you can see above, the wrapper function create_spsc_interface() returns a pointer to an opaque object of type struct spsc_interface. In fact, this opaque object is a C++ class in disguise, as you can see in the spsc_wrapper.cc snippet above. If you ask: how's that possible? Well, the answer lies in the linking process. Let's see the relevant snippets from src/Makefile.am:

wrapper_test_SOURCES = wrapper_test.c 
### Below is just a trick to force using C++ linker
nodist_EXTRA_wrapper_test_SOURCES = dummy.cc 

As you can see, we're not using a C linker to link the entire project; we use a C++ linker instead (by tricking autotools ;) . A C++ linker "understands" the idiom used in spsc_wrapper.cc above.
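To make the pattern concrete, this is roughly what the calling C code looks like. Note that destroy_spsc_interface() is a hypothetical name for the clean-up counterpart of create_spsc_interface(); check spsc_wrapper.h in the sample project for the real function names.

//---- hypothetical usage from plain C (wrapper_test.c style) ----
#include "spsc_wrapper.h"

int main(void)
{
    /* The opaque handle is really a C++ object under the hood */
    spsc_interface *iface = create_spsc_interface();

    /* ... push/pop through the other wrapper functions here ... */

    /* destroy_spsc_interface() is a hypothetical clean-up function;
       it would call delete on the underlying C++ object */
    destroy_spsc_interface(iface);
    return 0;
}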

Monday, January 18, 2016

Little Known C Language "Feature"s

There are several useful "feature"s in the C language which are not widely known. Only those who dig deep enough into C libraries or kernel code know these "feature"s. These are some of them:
  • The C offsetof() macro. In many cases, this macro is utilized to help implement a sort of object-oriented relationship (akin to sub-class<-->super-class in C++) between related C structures (or objects if you prefer). You can use it to "obtain" a pointer to the super-class given a pointer to the sub-class and vice versa (see the first sketch after this list). Some useful explanations of this feature: Offsetof (Wikipedia), Greg KH's explanation of offsetof in Linux Kernel code. Anyway, in some cases you can use a compiler extension (built-in) to achieve the same effect. In GCC there is the __builtin_offsetof built-in that you can use. In fact, offsetof() translates to __builtin_offsetof in GCC.
  • The const keyword can be used to designate input and output arguments in a C/C++ function. You can use this keyword to force the compiler to warn you, or to outright refuse to compile, when some code inadvertently changes the value of a function argument designated as const. Basically, you use it to enforce whether the function may modify the argument or not. An argument which is strictly used as input (in the sense that it must not be modified) can be designated as such with const (see the second sketch after this list). This is an excellent writeup on the matter: The C++ 'const' Declaration: Why & How.
  • The pointer aliasing issue, i.e. a situation where more than one pointer refers to the same object. Understanding this "feature" is important when you want to write high-performance C/C++ code. It's a quite complicated but important matter because it gives you the capability to "tell" the C/C++ compiler about your intentions and avoid expensive memory accesses (see the third sketch after this list).
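Below is a small self-contained sketch of the offsetof() idiom from the first bullet. The struct names are made up for illustration, and the container_of() macro mirrors the one in the Linux kernel.

#include <stddef.h> /* offsetof */
#include <stdio.h>

struct animal {            /* "super-class" */
    int legs;
};

struct dog {               /* "sub-class" embeds the super-class */
    struct animal base;
    const char *name;
};

/* Recover the containing struct from a pointer to one of its members,
   the same trick as the Linux kernel's container_of() macro */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

int main(void)
{
    struct dog d = { { 4 }, "Rex" };
    struct animal *a = &d.base;                           /* to super-class */
    struct dog *back = container_of(a, struct dog, base); /* and back */
    printf("%s has %d legs\n", back->name, a->legs);
    return 0;
}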
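The second bullet in action: a sketch where const marks the input argument, so any accidental write through it becomes a compile-time error. The struct and function names here are made up.

struct record { int id; };

void copy_record(const struct record *in, struct record *out)
{
    out->id = in->id;    /* fine: out is the writable output argument */
    /* in->id = 42; */   /* uncommenting this fails to compile:
                            assignment of member in read-only object */
}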
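And for the third bullet, the C99 restrict qualifier is the standard way to "tell" the compiler that pointers don't alias; a minimal sketch (the function name is made up):

#include <stddef.h>

/* restrict promises the compiler that dst and src never overlap,
   so it doesn't have to re-read src after every write to dst */
void scale(float *restrict dst, const float *restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * 2.0f;
}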
I think knowing these "feature"s is critical to creating high-quality and high-performance C/C++ code.

Wednesday, January 13, 2016

Sanitizing Your C/C++ Code

GCC has had the capability to help with sanitizing your C/C++ code since version 4.9. This is probably one of the less widely known GCC capabilities. Sanitizing in this context means detecting possible runtime errors, just in case you're not yet familiar with the term. See: Defensive_programming for an introduction.
This is a good introduction to the GCC C/C++ "sanitizers": GCC Undefined Behavior Sanitizer – ubsan. For some details from the GCC documentation (complete, though not exhaustively explained), see GCC Debugging Options -- in particular the section on the -fsanitize=.. options.

Anyway, using the sanitizer options is not quite straightforward because you have to install the corresponding libraries. These libraries provide replacements for the C library functions related to each sanitizer; for example, a malloc() replacement that enables memory-leak detection. Aside from GCC version >= 4.9, there are two libraries you must install: libasan (the address sanitizer library) and libubsan (the undefined behavior sanitizer library). I'll give you an example here on CentOS 7.

CentOS 7 comes with GCC 4.8 by default, so sanitizer support (especially libubsan) is missing and we need to upgrade. To do so, add Fedora 23 repo data to the CentOS repo configuration, so that we can upgrade to GCC 5.1.1, like so:
# cat << EOF > /etc/yum.repos.d/Fedora-Core23.repo
[warning:fedora]
name=fedora
mirrorlist=http://mirrors.fedoraproject.org/mirrorlist?repo=fedora-23&arch=$basearch
enabled=1
gpgcheck=1
gpgkey=https://getfedora.org/static/34EC9CBA.txt
EOF
# yum update gcc gcc-c++
# yum install libasan
# yum install libubsan
Now you have updated the C and C++ compilers and installed the required libubsan and libasan. You can proceed to use the -fsanitize option in GCC to add sanitizer instrumentation to your code, as the small demo below shows.
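To verify that the setup works, you can compile a deliberately broken program; UBSan should report a signed integer overflow at runtime. The file name is just for illustration.

/* ubsan_demo.c -- deliberately triggers signed integer overflow,
   which is undefined behavior in C.
   Compile: gcc -g -O0 -fsanitize=undefined ubsan_demo.c -o ubsan_demo */
#include <limits.h>
#include <stdio.h>

int main(void)
{
    int x = INT_MAX;
    x = x + 1;    /* UBSan reports this as signed integer overflow */
    printf("%d\n", x);
    return 0;
}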
Credits go to XakRu and sm1Ly for the script/command to add Fedora repo to CentOS (see: How To install gcc 5.2 on centos 7.1? [closed]).

Now let's proceed to learn how to use the flags in our build script. If you are using autotools, you can do these steps:
  1. Add -fsanitize=.. to your CFLAGS and/or CXXFLAGS
  2. Add -fsanitize=.. to your LDFLAGS as needed, taking into account the GCC Debugging Options explanation.
  3. In some cases, you need to remove memory allocation function checks from configure.ac
This is an example build script (named build_debug.sh) which invokes the configure script generated by autotools:
../configure CFLAGS="-DDEBUG -g -O0 -fsanitize=undefined -fsanitize=leak" \
 CXXFLAGS="-DDEBUG -g -O0 -fsanitize=undefined -fsanitize=leak" \
 LDFLAGS="-fsanitize=leak" --enable-debug \
 && make V=1
The build always failed when I enabled the memory allocation function checks in configure.ac. Therefore, they need to be disabled, like so:
     ##AC_FUNC_MALLOC
     ##AC_FUNC_REALLOC
This happens because AC_FUNC_MALLOC and AC_FUNC_REALLOC are based on runtime tests. See: https://github.com/LLNL/ior/issues/4

Hopefully, this is helpful for those developing software in C/C++ with GCC.