Friday, November 20, 2015

Openssh Hiccups and Fix on IBM System i (AS/400)

Let me start with the symptoms:
  • Logging in to the AS/400 (PASE) via openssh always failed despite the username, password and all directory/file permission has been triple-checked and confirmed to be OK.
  • From SSH log, it seems that the login is successful but the connection immediately "kicked out" for some reason. 
This is how I fix this problem:
  1. Run ssh client with most verbose flag.
  2. Run the shell (default shell) invoked by sshd in the server upon login.
  3. Look for clues from 1 and 2.
  4. Fix the problem based on the clue.
Running ssh -vvv in the ssh client machine produces this log:
debug3: no such identity: /home/pinczakko/.ssh/id_ecdsa: No such file or directory
debug1: Trying private key: /home/pinczakko/.ssh/id_ed25519
debug3: no such identity: /home/pinczakko/.ssh/id_ed25519: No such file or directory
debug2: we did not send a packet, disable method
debug3: authmethod_lookup keyboard-interactive
debug3: remaining preferred: password
debug3: authmethod_is_enabled keyboard-interactive
debug1: Next authentication method: keyboard-interactive
debug2: userauth_kbdint
debug2: we sent a keyboard-interactive packet, wait for reply
debug1: Authentications that can continue: publickey,password,keyboard-interactive
debug3: userauth_kbdint: disable: no info_req_seen
debug2: we did not send a packet, disable method
debug3: authmethod_lookup password
debug3: remaining preferred: 
debug3: authmethod_is_enabled password
debug1: Next authentication method: password
xxx@10.10.10.10's password: 
debug2: we sent a password packet, wait for reply
debug1: Authentication succeeded (password).
Authenticated to 10.10.10.10 ([10.10.10.10]:22).
debug1: channel 0: new [client-session]
debug3: ssh_session2_open: channel_new: 0
debug2: channel 0: send open
debug1: Entering interactive session.
debug2: callback start
debug2: fd 3 setting TCP_NODELAY
debug3: ssh_packet_set_tos: set IP_TOS 0x08
debug2: client_session2_setup: id 0
debug2: channel 0: request shell confirm 1
debug2: callback done
debug2: channel 0: open confirm rwindow 0 rmax 32768
debug2: channel 0: rcvd adjust 2097152
debug2: channel_input_status_confirm: type 99 id 0
debug2: shell request accepted on channel 0
debug1: client_input_channel_req: channel 0 rtype exit-signal reply 0
debug2: channel 0: rcvd eof
debug2: channel 0: output open -> drain
debug2: channel 0: obuf empty
debug2: channel 0: close_write
debug2: channel 0: output drain -> closed
debug2: channel 0: rcvd close
debug2: channel 0: close_read
debug2: channel 0: input open -> closed
debug3: channel 0: will not send data after close
debug2: channel 0: almost dead
debug2: channel 0: gc: notify user
debug2: channel 0: gc: user detached
debug2: channel 0: send close
debug2: channel 0: is dead
debug2: channel 0: garbage collecting
debug1: channel 0: free: client-session, nchannels 1
debug3: channel 0: status: The following connections are open:
  #0 client-session (t4 r0 i3/0 o3/0 fd -1/-1 cc -1)

Transferred: sent 3328, received 2568 bytes, in 0.5 seconds
Bytes per second: sent 6346.9, received 4897.5
debug1: Exit status -1

As you can see in the log, nothing particularly revealing. You need to run it in real time to see the connection immediately disconnected (starting at debug1: client_input_channel_req: channel 0 rtype exit-signal reply 0). This makes it clear that the client has successfully authenticated but somehow the shell on the server "died" or some permission problem on the user default login directory or other kinds of permission problem. I have triple checked the permission on all related paths without clear leads.

The next step I did was to check which shell executable is the default shell, i.e. which shell is invoked by openssh when you finished logging in to the machine running openssh. In System i version 6.1 (its AS400 PASE), openssh configuration file is located in QOpenSys/QIBM/UserData/SC1/OpenSSH/openssh-3.8.1p1/etc/sshd_config. Unfortunately, there is no default shell variable set in there. However, I have other System i machine that runs openssh with all default settings just fine. Cross-checking the latter machine I found out the default shell is QOpenSys/usr/bin/bsh. Therefore, I need to check whether bsh is working fine in the problematic System i machine.

I found out the bsh executable in the problematic System i machine is somehow broken. When I run bsh from PASE shell via 5250 terminal (via CALL QP2TERM command), bsh immediately stopped. Basically, the shell said bsh is "killed". I tried other shell, i.e. csh (QOpenSys/usr/bin/csh) and this shell worked. Therefore, what I need to fix the problem is a way to force openssh to use csh as the login shell.

Now, we arrived to the FIX. Forcing openssh to use certain shell when logging in via openssh in System i (at least in version 6.1) can be done via the ibmpaseforishell "magic" keyword (see: http://www-01.ibm.com/support/docview.wss?uid=nas8N1011555). These are the steps:
  1. Login to the System i machine via 5250 terminal application. 
  2. Open QOpenSys/QIBM/UserData/SC1/OpenSSH/openssh-3.8.1p1/etc/sshd_config via EDTF. This is the command: EDTF 'QOpenSys/QIBM/UserData/SC1/OpenSSH/openssh-3.8.1p1/etc/sshd_config'
  3. Add the ibmpaseforishell "magic" keyword near the end of the configuration file. This is the result for me:
    #no default banner path 
    #Banner /some/path 
    
    #ibm pase for IBM i shell 
    ibmpaseforishell /QOpenSys/usr/bin/csh 
    
    #override default of no subsystems 
    Subsystem sftp /QOpenSys/QIBM/ProdData/SC1/OpenSSH/openssh-3.8.1p1/libexec/sftp-server 
    
    
    As you can see, I changed the default shell to csh via the ibmpaseforishell keyword.
  4. Check that ssh client can now connect to the ssh server (daemon) in AS400. When I carried out this step, I finally get a working shell, i.e. the csh shell. 
The key point here is you must run ssh in full verbose (via the -vvv switch) to help debug the connection problem. When I was sure that openssh is just fine, then I moved up the chain by checking the default shell. It turns out the default shell is the culprit.

NOTE:
----------
- It seems that not all System i machine supports ibmpaseforishell keyword. It seems that the machine has to have this PTF. However, the keyword works in the problematic System i machine that I worked with.

Post a Comment

No comments: