Monday, February 13, 2017

Fix for systemd v232 build failure when using GNU gperf 3.1

You might run into the build failure named in the title of this post if you are the kind of person who builds your own systemd on Linux. I encountered it while building the systemd package for my Arch Linux SELinux variant.

The culprit is a mismatch between the declarations of the lookup functions (the hash functions) generated by GNU gperf version 3.1 and the corresponding function declarations in systemd version 232. I managed to complete the build after creating and applying this patch: https://github.com/pinczakko/systemd-gperf-3.1-patch. As for whether the patch works, it works without problems on my machine. Then again, it is a very minor patch.

UPDATE:
------------
This issue has just been fixed upstream in systemd, after the v232 release. See: https://github.com/systemd/systemd/commit/c9f7b4d356a453a01aa77a6bb74ca7ef49732c08

UPDATE 2:
---------------
You can add the upstream fix as a cherry-picked git commit in the PKGBUILD to fix this issue in the Arch Linux SELinux package. This is the diff (or patch):
diff --git a/PKGBUILD b/PKGBUILD
index 47d82d1..1e57ec7 100644
--- a/PKGBUILD
+++ b/PKGBUILD
@@ -61,6 +61,7 @@ _backports=(
   'cfed63f60dd7412c199652825ed172c319b02b3c'  # nspawn: fix exit code for --help and --version (#4609)
   '3099caf2b5bb9498b1d0227c40926435ca81f26f'  # journal: make sure to initially populate the space info cache (#4807)
   '3d4cf7de48a74726694abbaa09f9804b845ff3ba'  # build-sys: check for lz4 in the old and new numbering scheme (#4717)
+  'c9f7b4d356a453a01aa77a6bb74ca7ef49732c08'  # build-sys: add check for gperf lookup function signature (#5055)
 )
  _validate_tag() {
Hopefully, this temporary fix helps until the official fix is included in the main Arch Linux package.

Wednesday, January 18, 2017

64-bit Software Development on IBM AIX

In this post I'll talk about software development on IBM AIX by means of open source software tools in concert with native AIX development tools.

Using GCC as the compiler for your application on AIX is just fine. However, the GNU ld linker (ld-gcc) is not suitable as the linker. Linking on AIX is rather tricky, and apparently only the native AIX linker (ld-xlc) works reliably. You can read more about this issue in "Using the GNU C/C++ compiler on AIX" and "AIX Linking and Loading Mechanism".

AIX also has its own set of binary utilities (binutils). They are basically the analogs of GNU binutils: a native ar archiver, a native ld linker (the one from the AIX xlc compiler suite), and the dump utility, which is the analog of objdump in GNU binutils.

Now, let's see what you need to do to create 64-bit applications in AIX by using the GCC compiler and the native binutils.

  • Pass the -maix64 parameter to the GCC C compiler to instruct it to emit a 64-bit AIX object file that the native AIX linker can handle.
  • Pass the -b64 parameter to the native linker via GCC, using GCC's -Wl switch. The overall parameter becomes -Wl,-b64 (with no space after the comma). The IBM AIX ld command reference explains the parameter in detail.
  • Pass the -X64 parameter to the native ar archiver to build a 64-bit AIX library. The IBM AIX ar command reference explains the parameter in detail.
Once you have built the 64-bit executable or library, you may want to examine it. On Linux or BSD Unix, you would use objdump for that. On AIX, you can use the native dump utility. You need to pass the -X64 parameter to dump to instruct it to work in 64-bit mode, i.e. to treat the input executable/library as a 64-bit binary. For example, the command to show the dependencies of a 64-bit AIX application is: dump -X64 -H followed by the file name. Refer to the IBM AIX dump command reference for more details.

Listening to Multicast Netlink Socket

Netlink is the recommended way to communicate with the Linux kernel from a user-space application. In many cases, the communication is unicast, i.e. only one user-space application uses the netlink socket to communicate with a kernel subsystem that provides the netlink interface. But what if the kernel subsystem provides a multicast netlink socket and you want to listen to the multicast kernel messages through netlink? You can do that by bind()-ing to the multicast address provided by the kernel. I'm not going to provide complete sample code here, just the most important code snippets.

First, you should head over to this netlink discussion to get a sense of the overall netlink architecture.

Once you have grasped the netlink architecture, you may follow these steps to "listen" to the multicast address(es) provided by the kernel subsystem through netlink:
  1. Initialize a netlink socket to the kernel subsystem you wish to access. Remember to use #include <linux/[subsystem_header].h>.
  2. Carry out further initialization on the socket if needed.
  3. Bind the socket to the multicast address provided by the kernel subsystem. The multicast address is basically a combination of the following: 
    • The netlink address family, i.e. AF_NETLINK.
    • The netlink multicast group, which you can find in the kernel header. For example, the multicast group for the audit subsystem (a constant) is in the audit header file, i.e. <linux/audit.h>.
    • Your application's Process ID (PID). 
  4. Read from the socket when there is data coming in. You might want to use an event-based library here, such as libev or libevent. In many cases, the kernel only provides a multicast "read-only" channel, i.e. you can only read from it. It's not meant to be used to "write" to the kernel.
Step 3 above is probably rather vague. The code below clarifies the multicast address that I talked about in that step. Look at the s_addr variable in the code below: it is the multicast address used by bind() to listen to kernel messages. The PID is included in the multicast address because the kernel needs to know to which process the message should be sent.
 // ..
 struct sockaddr_nl s_addr;
 memset(&s_addr, 0, sizeof(s_addr));
 s_addr.nl_family = AF_NETLINK;
 s_addr.nl_pad = 0;
 s_addr.nl_pid = getpid();
 // nl_groups is a bitmask: multicast group N is represented by bit (N - 1).
 s_addr.nl_groups = 1U << (AUDIT_NLGRP_READLOG - 1);

 retval = bind(fd, (struct sockaddr *)&s_addr, sizeof(s_addr));
 if (retval != 0) {
  PRINT_ERR_MSG("Failed binding to kernel multicast address");
  return -1;
 }
 // ..
Anyway, because the channel used by the code is a multicast channel, multiple user-space applications can "listen" to the same kernel subsystem simultaneously. The scenario explained here is not the norm, but some use cases require this approach.

Monday, January 9, 2017

The Importance of C/C++ Program Exit Status in Unix/Linux

The return value from main() in C/C++ programs, a.k.a. the exit status, is often overlooked by less experienced Unix/Linux programmers. Nevertheless, it's important to keep the exit status of your C/C++ code in mind because it will help in the long run. There are at least two scenarios where the exit status is important:

  1. When you're using a shell script to automate processing by using several programs to perform "sub-tasks". In this case, the shell script very possibly needs to make a logical decision based on your program's exit status. 
  2. When your C/C++ program is part of a multiprocess program in which your C/C++ program is called/executed (a.k.a. fork-ed and exec-ed) by the parent process. In many cases, the parent process needs to know whether your program executed successfully or not.

Now, let's be more concrete. Let's say you anticipate that your C/C++ program will be invoked by a bash-compatible shell. In that case, your code's exit status must make sense to bash. Therefore, you should:


Following the rules above doesn't necessarily mean your C/C++ program will be bug free, because some exit statuses are ambiguous. Nevertheless, it should make your program easier to combine into a larger system and easier to debug.

To close, let's look at a very simple piece of code that uses sysexits.h below.
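#include <stdio.h>
#include <sysexits.h>

/**
 * Usage:
 *  test_code -x param1 -z param2
 */
int main(int argc, char *argv[])
{
    /* Program name plus four arguments: -x param1 -z param2 */
    if (argc != 5) {
        /* Usage errors go to stderr so they don't mix with normal output. */
        fprintf(stderr, "Usage: %s -x param1 -z param2\n", argv[0]);
        return EX_USAGE; /* 64: command line usage error */
    }

    //... irrelevant code omitted

    return EX_OK; /* 0: successful termination */
}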