GNU Screen with scrolling history under X window system

Screen program is really a good piece of software for every one in the console. We use it every day to execute programs in the “background” or if we want to end our ssh session to the server without stopping our executed program. We need a tiny configuration to use our favorite scrolling of the console program under the X windowing systems!

So if you are under let’s say KDE or Gnome (especially Ubuntu and CentOS) and use konsole or xterm and you want to be able to scroll the history of your ssh sessions, when you start something under gnu screen you must set the following file at your home directory

/home/<your_user>/.screenrc

with this configuration:

termcapinfo xterm* ti@:te@

Resume installation after a package build error, when emerging firefox under Gentoo

If you recently tried to emerge the latest Firefox in Gentoo (firefox 58.0/58.0.1) and you ended up with a compilation error of this kind:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 142-144: ordinal not in range(128)

35:47.22 0 compiler warnings present.
35:47.31 Exception when writing resource usage file: u'compile'
35:47.32 We know it took a while, but your build finally finished successfully!
To view resource usage of the build, run |mach resource-usage|.
Error running mach:

    ['build', '-v']

The error occurred in code that was called by the mach command. This is either
a bug in the called code itself or in the way that mach is calling it.

You should consider filing a bug for this issue.

If filing a bug, please include the full output of mach, including this error
message.

The details of the failure are as follows:

KeyError: u'compile'

The thing is that the build reports successful completion as you can see below from the report, but the emerge process is terminated with error, so in this situation, when the build process is finished successfully we can manually install the package with:

[root@local ]# cd /usr/portage/www-client/firefox
[root@local firefox]# ebuild firefox-58.0.1.ebuild install && ebuild firefox-58.0.1.ebuild qmerge

* The whole output of the error

Exception in thread ProcessReader:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1/testing/mozbase/mozprocess/mozprocess/processhandler.py", line 986, in _read
    callback(line.rstrip())
  File "/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1/testing/mozbase/mozprocess/mozprocess/processhandler.py", line 904, in __call__
    e(*args, **kwargs)
  File "/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1/python/mach/mach/mixin/process.py", line 86, in handleLine
    line_handler(line)
  File "/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1/python/mozbuild/mozbuild/controller/building.py", line 720, in on_line
    self.log(logging.INFO, 'build_output', {'line': line}, '{line}')
  File "/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1/python/mach/mach/mixin/logging.py", line 54, in log
    extra={'action': action, 'params': params})
  File "/usr/lib64/python2.7/logging/__init__.py", line 1231, in log
    self._log(level, msg, args, **kwargs)
  File "/usr/lib64/python2.7/logging/__init__.py", line 1286, in _log
    self.handle(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 1296, in handle
    self.callHandlers(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 1336, in callHandlers
    hdlr.handle(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 759, in handle
    self.emit(record)
  File "/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1/python/mozbuild/mozbuild/controller/building.py", line 548, in emit
    self.fh.write(msg)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 142-144: ordinal not in range(128)

35:47.22 0 compiler warnings present.
35:47.31 Exception when writing resource usage file: u'compile'
35:47.32 We know it took a while, but your build finally finished successfully!
To view resource usage of the build, run |mach resource-usage|.
Error running mach:

    ['build', '-v']

The error occurred in code that was called by the mach command. This is either
a bug in the called code itself or in the way that mach is calling it.

You should consider filing a bug for this issue.

If filing a bug, please include the full output of mach, including this error
message.

The details of the failure are as follows:

KeyError: u'compile'

  File "/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1/python/mozbuild/mozbuild/mach_commands.py", line 170, in build
    mach_context=self._mach_context)
  File "/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1/python/mozbuild/mozbuild/controller/building.py", line 1232, in build
    telemetry_data = monitor.get_resource_usage()
  File "/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1/python/mozbuild/mozbuild/controller/building.py", line 433, in get_resource_usage
    o['tiers'] = self.tiers.tiered_resource_usage()
  File "/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1/python/mozbuild/mozbuild/controller/building.py", line 150, in tiered_resource_usage
    self.add_resources_to_dict(t_entry, phase=tier)
  File "/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1/python/mozbuild/mozbuild/controller/building.py", line 159, in add_resources_to_dict
    end=end, phase=phase, per_cpu=False)
  File "/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1/testing/mozbase/mozsystemmonitor/mozsystemmonitor/resourcemonitor.py", line 468, in aggregate_cpu_percent
    data = self.phase_usage(phase)
  File "/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1/testing/mozbase/mozsystemmonitor/mozsystemmonitor/resourcemonitor.py", line 427, in phase_usage
    time_start, time_end = self.phases[phase]
 * ERROR: www-client/firefox-58.0.1::gentoo failed (compile phase):
 *   (no error message)
 * 
 * Call stack:
 *     ebuild.sh, line 124:  Called src_compile
 *   environment, line 5006:  Called die
 * The specific snippet of code:
 *       MOZ_MAKE_FLAGS="${MAKEOPTS}" SHELL="${SHELL:-${EPREFIX}/bin/bash}" MOZ_NOSPAM=1 ./mach build -v || die
 * 
 * If you need support, post the output of `emerge --info '=www-client/firefox-58.0.1::gentoo'`,
 * the complete build log and the output of `emerge -pqv '=www-client/firefox-58.0.1::gentoo'`.
 * The complete build log is located at '/var/tmp/portage/www-client/firefox-58.0.1/temp/build.log'.
 * The ebuild environment file is located at '/var/tmp/portage/www-client/firefox-58.0.1/temp/environment'.
 * Working directory: '/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1'
 * S: '/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1'

>>> Failed to emerge www-client/firefox-58.0.1, Log file:

>>>  '/var/tmp/portage/www-client/firefox-58.0.1/temp/build.log'

 * Messages for package www-client/firefox-58.0.1:

 * You are enabling official branding. You may not redistribute this build
 * to any users on your network or the internet. Doing so puts yourself into
 * a legal problem with Mozilla Foundation
 * You can disable it by emerging firefox _with_ the bindist USE-flag
 * LINGUAS value bg is not enabled using L10N use flags
 * ERROR: www-client/firefox-58.0.1::gentoo failed (compile phase):
 *   (no error message)
 * 
 * Call stack:
 *     ebuild.sh, line 124:  Called src_compile
 *   environment, line 5006:  Called die
 * The specific snippet of code:
 *       MOZ_MAKE_FLAGS="${MAKEOPTS}" SHELL="${SHELL:-${EPREFIX}/bin/bash}" MOZ_NOSPAM=1 ./mach build -v || die
 * 
 * If you need support, post the output of `emerge --info '=www-client/firefox-58.0.1::gentoo'`,
 * the complete build log and the output of `emerge -pqv '=www-client/firefox-58.0.1::gentoo'`.
 * The complete build log is located at '/var/tmp/portage/www-client/firefox-58.0.1/temp/build.log'.
 * The ebuild environment file is located at '/var/tmp/portage/www-client/firefox-58.0.1/temp/environment'.
 * Working directory: '/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1'
 * S: '/var/tmp/portage/www-client/firefox-58.0.1/work/firefox-58.0.1'

xmr-stak starts with MEMORY ALLOC FAILED

Let’s start to mine some worthy coins, we load our xmr-stak and get the error of memory allocation field:

root@ubuntu-System-Product-Name:~/xmr-stak/build/bin# ./xmr-stak 
[2018-01-26 16:41:02] : MEMORY ALLOC FAILED: mmap failed
[2018-01-26 16:41:02] : MEMORY ALLOC FAILED: mmap failed
[2018-01-26 16:41:02] : MEMORY ALLOC FAILED: mmap failed
[2018-01-26 16:41:02] : MEMORY ALLOC FAILED: mmap failed
[2018-01-26 16:41:02] : MEMORY ALLOC FAILED: mmap failed
-------------------------------------------------------------------
xmr-stak 2.2.0 2ae7260

[2018-01-26 16:41:02] : Start mining: MONERO
[2018-01-26 16:41:02] : Starting NVIDIA GPU thread 0, no affinity.
[2018-01-26 16:41:02] : MEMORY ALLOC FAILED: mmap failed
[2018-01-26 16:41:02] : WARNING: No AMD OpenCL platform found. Possible driver issues or wrong vendor driver.
[2018-01-26 16:41:02] : WARNING: backend AMD disabled.
[2018-01-26 16:41:02] : Starting 1x thread, affinity: 0.
[2018-01-26 16:41:02] : hwloc: memory pinned
[2018-01-26 16:41:02] : Starting 1x thread, affinity: 1.
[2018-01-26 16:41:02] : MEMORY ALLOC FAILED: mmap failed
[2018-01-26 16:41:02] : hwloc: memory pinned
[2018-01-26 16:41:02] : Starting 1x thread, affinity: 2.
[2018-01-26 16:41:02] : MEMORY ALLOC FAILED: mmap failed
[2018-01-26 16:41:03] : hwloc: memory pinned

Ok, no problem, just increase the number of the huge pages available to the system with:

sysctl -w vm.nr_hugepages=128

And now you can start xmr-stak without errors of this kind!

PHP posix_kill missing in CentOS7

So you think you PHP code is running ok and your script for killing bad guys is perfect. Let’s assume you made a script to kill processes according to some criteria, put you script on your good all CentOS 7 server, but suddenly you encounter the error:

PHP Fatal error: Call to undefined function posix_kill() in ./kill-process.php on line 64
Killed process id: 21899
PHP Fatal error: Call to undefined function posix_kill() in ./kill-process.php on line 64
Killed process id: 21899
PHP Fatal error: Call to undefined function posix_kill() in ./kill-process.php on line 64

So you just missed a php plugin and there it is: “php-process”. Just install it with yum!

yum install -y php-process

And the error will disappear and the next time you want to kill some bad process you’ll sure do it!

bacula fatal error – Unable to connect to Storage daemon

Bacula is an open software enterprise backup system! Check out the official site here
Complex but useful software, which could automate the whole backup process of all your servers.
Some errors are easy to track some are not, so here is one error with a misleading error message if you do not know or forget the details of how the daemons works.

Here is the error extracted from the logs:

01-Sep 00:45 backup01-de-dir JobId 8789: No prior Full backup Job record found.
01-Sep 00:45 backup01-de-dir JobId 8789: No prior or suitable Full backup found in catalog. Doing FULL backup.
01-Sep 00:45 backup01-de-dir JobId 8789: Job srv123us.2017-09-01_00.45.28_34 waiting 103 seconds for scheduled start time.
01-Sep 00:47 backup01-de-dir JobId 8789: Start Backup JobId 8789, Job=srv123us.2017-09-01_00.45.28_34
01-Sep 00:47 backup01-de-dir JobId 8789: Using Device "web" to write.
01-Sep 00:51 srv123us-fd JobId 8789: Warning: bsock.c:112 Could not connect to Storage daemon on 1.1.1.1:9103. ERR=Connection timed out
01-Sep 01:17 srv123us-fd JobId 8789: Fatal error: bsock.c:118 Unable to connect to Storage daemon on 1.1.1.1:9103. ERR=Interrupted system call
01-Sep 01:17 srv123us-fd JobId 8789: Fatal error: job.c:1893 Failed to connect to Storage daemon: 1.1.1.1:9103
01-Sep 01:17 backup01-de-dir JobId 8789: Fatal error: Bad response to Storage command: wanted 2000 OK storage
01-Sep 01:17 backup01-de-dir JobId 8789: Error: Bacula backup01-de-dir 7.0.5 (28Jul14):
 Build OS:               x86_64-pc-linux-gnu ubuntu 16.04
  JobId:                  8789
  Job:                    srv123us.2017-09-01_00.45.28_34
  Backup Level:           Full (upgraded from Incremental)
  Client:                 "srv123us" 7.0.5 (28Jul14) x86_64-pc-linux-gnu,ubuntu,16.04
  FileSet:                "web" 2017-11-07 17:19:45
  Pool:                   "web-full" (From Job FullPool override)
  Catalog:                "ucdn" (From Client resource)
  Storage:                "web" (From Job resource)
  Scheduled time:         01-Sep-2018 00:47:11
  Start time:             01-Sep-2018 00:47:11
  End time:               01-Sep-2018 01:17:23
  Elapsed time:           30 mins 12 secs
  Priority:               10
  FD Files Written:       0
  SD Files Written:       0
  FD Bytes Written:       0 (0 B)
  SD Bytes Written:       0 (0 B)
  Rate:                   0.0 KB/s
  Software Compression:   None
  VSS:                    no
  Encryption:             no
  Accurate:               no
  Volume name(s):         
  Volume Session Id:      4719
  Volume Session Time:    1510075534
  Last Volume Bytes:      0 (0 B)
  Non-fatal FD errors:    2
  SD Errors:              0
  FD termination status:  Error
  SD termination status:  Waiting on FD
  Termination:            *** Backup Error ***

But when we check the status of client from “bconsole” (Bacula’s management Console), everything seems OK, the backup server (Director daemon = bacula-dir) connects and get the report from the client daemon (Bacula File service = bacula-fd) in the server, even when you run a backup job, the status report is OK, the backup is running on the client, here is the output:

srv@local ~ # bconsole
Connecting to Director localhost:9101
1000 OK: 1 backup01-de-dir Version: 7.0.5 (28 July 2014)
Enter a period to cancel a command.
*status
Status available for:
     1: Director
     2: Storage
     3: Client
     4: Scheduled
     5: All
Select daemon type for status (1-5): 3
The defined Client resources are:
     1: srv1us
     2: srv2us
     3: srv123us
Select Client (File daemon) resource (1-3): 3
Connecting to Client srv123us at 108.61.250.36:9102
srv123us-fd Version: 7.0.5 (28 July 2014)  x86_64-pc-linux-gnu ubuntu 16.04
Daemon started 23-Feb-17 00:43. Jobs: run=1 running=0.
 Heap: heap=98,304 smbytes=571,344 max_bytes=571,361 bufs=97 max_bufs=97
 Sizes: boffset_t=8 size_t=8 debug=0 trace=0 mode=0,0 bwlimit=0kB/s
 Plugin: bpipe-fd.so 

Running Jobs:
JobId 8789 Job srv123us.2017-09-01_00.45.28_34 is running.
    Incremental Backup Job started: 01-Sep-17 00:45
    Files=0 Bytes=0 AveBytes/sec=0 LastBytes/sec=0 Errors=0
    Bwlimit=0
    Files: Examined=5 Backed up=0
    SDReadSeqNo=6 fd=5
Director connected at: 01-Sep-17 01:10
====

Terminated Jobs:
====

As you can see, everything seems OK of the status, there was a running job in the client server and it seemed the backup process had been running without errors for more then 20 minutes, but then suddenly got Fatal error (the first log):

01-Sep 00:51 srv123us-fd JobId 8789: Warning: bsock.c:112 Could not connect to Storage daemon on 1.1.1.1:9103. ERR=Connection timed out
01-Sep 01:17 srv123us-fd JobId 8789: Fatal error: bsock.c:118 Unable to connect to Storage daemon on 1.1.1.1:9103. ERR=Interrupted system call
01-Sep 01:17 srv123us-fd JobId 8789: Fatal error: job.c:1893 Failed to connect to Storage daemon: 1.1.1.1:9103
01-Sep 01:17 backup01-de-dir JobId 8789: Fatal error: Bad response to Storage command: wanted 2000 OK storage

And the problem is that, the Director (backup server) connects to the File Service of the client (the daemon on the client), but the opposite connection is not possible! When the backup is ready, the client daemon bacula file service connects to the bacula storage service (which could be on the same server with the director, but it could be on another server) to send the backup files and here is the problem! Client could not connect to the storage! So always check the two way connections: backup server -> client server-port:9102 and backup server-port:9103 (or storage server) <- client.
In the world of bacula:

bacula-dir -> bacula-fd:9102

bacula-sd:9103 -> bacula-fd

Misleading error on causal look it seems like bacula-sd is returning error to bacula-fd (which would mean that bacula-fd could connect to bacula-sd after all), but in reality bacula-dir received and logged that bacula-fd did not connect to bacula-sd resulting in Fatal error.

In our situation the firewall of the backup server was denying the connections from the client, but it could be a DNS resolve issue or another network problem. Most common problems are firewall or DNS resolve issues. The solution – just add accept rule for the IP of the client to connect to port 9103 of the backup (storage) server.

Busybox ash, Debian dash and simulating bash arrays

Busybox ash (Almquist shell) shell and Debian dash (Debian Almquist shell) are lightweight Unix shell and they are a variant of System V.4 variant of the Bourne shell. Ash/dash shell is known to be very small and is used mainly in embedded (ash) devices and installation scripts (Debian/Ubuntu setup).
Unfortunately they do not support arrays, which could be really a problem in many cases. But we can simulate the arrays with eval function.
So if you need to write a ash/dash script let’s say for an installation script of Ubuntu or Debian or a script for an embedded device, which uses busybox or even you do not want to use arrays in bash, you can follow the consepts below – create variable with a “name” concatenated with a number.

  • 1) Set a variable

    It can be done with two ways:

    1. for myi in 0 1 2 ; do
          setvar mvar$myi "Payload: $myi"
      done
      
    2. for myi in 0 1 2 ; do
          eval mvar$myi=\"Payload: $myi\"
      done
      

    This will create variables with names:

    mvar1, mvar2, mvar3

    and they can be used in any place of your script after the creation of the variables using “eval” or accessing them with the names.

    * bash shell do not support the command “setvar”, so for bash scripts use only eval version.

  • 2) Use a variable

    1. using “eval”
      for myi in 0 1 2 ; do
          eval echo \$mvar$myi
      done
      
      myi=1
      eval newvar="\$mvar$myi"
      echo $newvar
      
    2. direct access
      echo $mvar2
      $mvar2="Payload 20"
      echo $mvar2
      

tmpfs mount on /dev/shm in LXC container or chroot environment

When using LXC containers booting the lxc container would not populate it as the normal boot process. Or when you create a chroot jail /dev is not mounted or just some devices are created.
There is an option to populate (when using LXC containers) it with minimal required devices:

lxc.autodev = 1

which will create a tmpfs mount under /dev and create some basic devices, it will ensure /dev/shm to be mounted on with tmpfs!
If you omit this option, the /dev directory won’t be populated and will stay with the devices you made or copied when you made the LXC container (or the chroot jail) and /dev/shm will not be mounted using tmps, which could create numerous bad issues.
If you get errors like

 * configure has detected that the sem_open function is broken.
 * Please ensure that /dev/shm is mounted as a tmpfs with mode 1777.

You could mount the /dev/shm of the LXC container or the chroot jail (usually you can tune the size half of the server’s RAM) with

mkdir -p /dev/shm
mount -t tmpfs -o nodev,nosuid,noexec,mode=1777,size=6144m tmpfs /dev/shm

Or reboot your LXC container with a new configuration (probably in the “/var/lxc/[lxc_name]/config”) adding the following line:

lxc.mount.entry = none dev/shm tmpfs nodev,nosuid,noexec,mode=1777,create=dir 0 0

Thus you ensure the /dev/shm to be mounted on tmpfs and all semaphore functions to work properly.

* Real output of Gentoo failed compilation of python package:
 * configure has detected that the sem_open function is broken.
 * Please ensure that /dev/shm is mounted as a tmpfs with mode 1777.
 * ERROR: dev-lang/python-3.3.4-r1::gentoo failed (configure phase):
 *   Broken sem_open function (bug 496328)
 * 
 * Call stack:
 *     ebuild.sh, line 124:  Called src_configure
 *   environment, line 3542:  Called die
 * The specific snippet of code:
 *           die "Broken sem_open function (bug 496328)";
 * 
 * If you need support, post the output of `emerge --info '=dev-lang/python-3.3.4-r1::gentoo'`,
 * the complete build log and the output of `emerge -pqv '=dev-lang/python-3.3.4-r1::gentoo'`.
 * The complete build log is located at '/var/tmp/portage/dev-lang/python-3.3.4-r1/temp/build.log'.
 * The ebuild environment file is located at '/var/tmp/portage/dev-lang/python-3.3.4-r1/temp/environment'.
 * Working directory: '/var/tmp/portage/dev-lang/python-3.3.4-r1/work/x86_64-pc-linux-gnu'
 * S: '/var/tmp/portage/dev-lang/python-3.3.4-r1/work/Python-3.3.4'

>>> Failed to emerge dev-lang/python-3.3.4-r1, Log file:

SUPERMICRO IPMI/KVM module tips – reset the unit and the admin password

After the previous howto “SUPERMICRO IPMI to use one of the one interfaces or dedicated LAN port” (in the howto is showed how to install the needed tool for managing the IPMI/KVM unit under console) of setting the network configuration there are a couple of interesting and important tips when working with the IPMI/KVM module. Here are they are:

  1. Reset IPMI/KVM module – sometimes it happen the keyboard or mouse not to work when the Console Redirection is loaded, it is easy to reset the unit from the web interface, but there are case when the web interface is not working – so ssh to your server and try one of the following commands:
    * warm reset – it’s like a reboot, inform the IPMI/KVM to reboot itself.

    ipmitool -I open bmc reset warm
    

    It does not work in all situations! So try a cold reset
    * cold reset – resets the IPMI/KVM, it’s like unplug and plug the power to the unit.

    ipmitool -I open bmc reset cold
    
  2. Reset the configuration of an IPMI/KVM module to factory defaults. It is useful when something goes wrong when upgrading the firmware of the unit and the old configuration is not supported or it says it is, but at the end the unit does not work properly. In rare cases it might help when the KVM (Keyboard, Video, Monitor part aka Console redirection does not work)
    Here is the command for resetting to factory defaults:

    ipmitool -I open raw 0x3c 0x40
    
  3. Reset admin password – reset the password for the administrator login of the IPMI/KVM unit. It’s trivial losing the password so with the help of the local console to the server you can reset the password to a simple one and then change it from the web interface.
    ipmitool -I open user set password 2 ADMIN
    

    The number “2” is the ID of the user, check it with:

    [root@srv0 ~]# ipmitool -I open user list
    ID  Name             Callin  Link Auth  IPMI Msg   Channel Priv Limit
    1                    true    false      false      Unknown (0x00)
    2   ADMIN            true    false      false      Unknown (0x00)
    3                    true    false      false      Unknown (0x00)
    4                    true    false      false      Unknown (0x00)
    5                    true    false      false      Unknown (0x00)
    6                    true    false      false      Unknown (0x00)
    7                    true    false      false      Unknown (0x00)
    8                    true    false      false      Unknown (0x00)
    9                    true    false      false      Unknown (0x00)
    10                   true    false      false      Unknown (0x00)
    

    Sometimes if a hacker got to your IPMI/KVM you could see the user table with the above command. There was a serious bug aka backdoor in some of these units, the ID of the ADMIN user or even the username could be changed, so you should use the list command to list the current user table.
    Use set name to set the username of the user.

    ipmitool -I open user set name 2 ADMIN
    
  4. Set a new network configuration. It’s worth mentioning again the howto for this purpose – “SUPERMICRO IPMI to use one of the one interfaces or dedicated LAN port

All commands using the network option of the ipmitool

ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN bmc reset warm
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN bmc reset cold
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN raw 0x3c 0x40
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN user set password 2 ADMIN
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN user list

The IP 192.168.7.150 is the IP of your IPMI/KVM module, which you want to change with the above commands.

Tunneling the IPMI/KVM ports over ssh (supermicro ipmi ports)

The best security for the remote management unit in your server such as IPMI/KVM is to have local IP. All IPMI/KVM IP should be switched to a separated switch and a local sub-network used for the LAN Settings. So to be able to connect to the IPMI/KVM module you need a VPN connection to gain access to the local sub-network used for your servers’ management modules. However, sometimes the VPN cannot be used or it just happened the server is down, or you are at a place restricting unknown ports (or ports above 1024), which your VPN uses (that’s why the VPN server should use only one port from the most popular – 80, 443, but that’s a thing for another howto…) and so on. So you end with no ability to connect to the VPN server or you think you do not need at all a VPN server, because you always could use

openssh

to do the trick of tunneling ports from your computer to the IPMI/KVM module of your server through a server, which has an access to the local sub-network of the IPMI/KVM modules.

So here is what you need to get to the remote management of your server just using ssh for tunneling:

STEP 1) A server, which has access to the IP network of the IPMI/KVM modules.

Let’s say you set to all your servers’ IPMI/KVM modules IPs from network 192.168.7.0/24, so your server must have an IP from 192.168.7.0/24, for example 192.168.7.1, add it as an alias or to a dedicated LAN connected to the switch, in which of all your IPMI/KVM modules are plugged in. This server will be used as a transfer point to a selected IPMI/KVM IP.

STEP 2) Tunnel local selected ports using ssh to the server from STEP 1)

Use this command:

ssh -N -L 127.0.0.1:80:[IPMI-IP]:80 -L 127.0.0.1:443:[IPMI-IP]:443 -L 127.0.0.1:5900:[IPMI-IP]:5900 -L 127.0.0.1:623:[IPMI-IP]:623 root@[SERVER-IP]

For example using 192.168.7.150 for an IPMI/KVM IP:

[root@srv0 ~]# ssh -N -L 127.0.0.1:80:192.168.7.150:80 -L 127.0.0.1:443:192.168.7.150:443 -L 127.0.0.1:5900:192.168.7.150:5900 -L 127.0.0.1:623:192.168.7.150:623 root@example-server.com

With the above command you can use the web interface (https://127.0.0.1/, you could replace 127.0.0.1 with a local IP or a local IP alias of your machine), the java web start “Console Redirection” (the KVM – Keyboard, Video and Mouse) and you can mount Virtual Media from your computer to your server’s virtual CD/DVD device. Unfortunately to use properly the Virtual CD/DVD you must tunnel the UDP on port 623 (not only TCP 623), which is a little bit tricky. To tunnel the UDP packets

socat – Multipurpose relay (SOcket CAT)

program must be used.

STEP 3) Tunnel local selected ports using ssh to the server from STEP 1) and UDP port using socat

[root@srv0 ~]# socat -T15 udp4-recvfrom:623,reuseaddr,fork tcp:localhost:8000
[root@srv0 ~]# ssh -L8000:localhost:8000 -L 127.0.0.1:80:192.168.7.150:80 -L 127.0.0.1:443:192.168.7.150:443 -L 127.0.0.1:5900:192.168.7.150:5900 -L 127.0.0.1:623:192.168.7.150:623 root@example-server.com socat tcp4-listen:8000,reuseaddr,fork UDP:192.168.7.150:623

This will start a UDP listening socket on localhost port 8000. Every packet will be relayed using TCP to localhost 8000, which will be tunneled using ssh command to the remote server, where there is a started another socat listening TCP socket on port 8000, which will relay every packet to the UDP port 623 of IP 192.168.7.150. Replace the IP 192.168.7.150 with your IPMI/KVM IP.

* Here are the required ports for SUPERMICRO IPMI functionality in X9 and X10 motherboards

  • X9-motherboards, the ports are

    TCP Ports
    HTTP: 80
    HTTPS: 443
    SSH: 22
    WSMAN: 5985
    Video: 5901
    KVM: 5900
    CD/USB: 5120
    Floppy: 5123
    Virtual Media: 623
    SNMP: 161

    UDP ports:
    IPMI: 623

  • For X10-motherboards, the ports are

    TCP Ports
    HTTP: 80
    HTTPS: 443
    SSH: 22
    WSMAN: 5985
    Video: 5901
    KVM: 5900 , 3520
    CD/USB: 5120
    Floppy: 5123
    Virtual Media: 623
    SNMP: 161

    UDP ports:
    IPMI: 623

You could add the required port to the ssh command above if you need it!

Virtual Device mounted successfully

Successful mount in Console Redirection with Virtual Media:

main menu
Virtual Storage

if you are logged in the server and mount an ISO with the Virtual Device you’ll probably have this in “dmesg”:

[46683751.661063] usb 2-1.3.2: new high-speed USB device number 8 using ehci-pci
[46683751.795048] usb 2-1.3.2: New USB device found, idVendor=0ea0, idProduct=1111
[46683751.795051] usb 2-1.3.2: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[46683751.795365] usb-storage 2-1.3.2:1.0: USB Mass Storage device detected
[46683751.795553] scsi6 : usb-storage 2-1.3.2:1.0
[46683752.795730] scsi 6:0:0:0: CD-ROM            ATEN     Virtual CDROM    YS0J PQ: 0 ANSI: 0 CCS
[46683752.806839] sr0: scsi3-mmc drive: 40x/40x cd/rw xa/form2 cdda tray
[46683752.806842] cdrom: Uniform CD-ROM driver Revision: 3.20
[46683752.806933] sr 6:0:0:0: Attached scsi CD-ROM sr0
[46683752.806971] sr 6:0:0:0: Attached scsi generic sg1 type 5

SUPERMICRO IPMI to use one of the LAN interfaces or dedicated LAN port

If you happen to have a Supermicro server and you want to change the default behavior of the IPMI LAN interface, which is

Failover – on boot check whether the dedicated LAN port is connected if so use the it, otherwise use the shared LAN1

So if change it there are some magic commands to change this default behavior:

  • Always use dedicated LAN:
    within the server under console:

    ipmitool -I open raw 0x30 0x70 0x0c 1 0
    

    from remote using the network:

    ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN raw 0x30 0x70 0x0c 1 0
    

    Sometimes the output of the last command (that using the lanplus) will output:

    Unable to send RAW command (channel=0x0 netfn=0x30 lun=0x0 cmd=0x70)
    

    But it sets the value despite the error output “Unable to send”. You could check it with the read command (the last example).

  • Always use shared LAN1:
    within the server under console:

    ipmitool -I open raw 0x30 0x70 0xc 1 1 
    

    from remote using the network:

    ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN raw 0x30 0x70 0x0c 1 1
    

    Sometimes the output of the last command (that using the lanplus) will output:

    Unable to send RAW command (channel=0x0 netfn=0x30 lun=0x0 cmd=0x70)
    

    But it sets the value despite the error output “Unable to send”. You could check it with the read command (the last example).

  • Always use failover (factory default):
    within the server under console:

    ipmitool -I open raw 0x30 0x70 0xc 1 2
    

    from remote using the network:

    ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN raw 0x30 0x70 0x0c 1 2
    
  • Sometimes the output of the last command (that using the lanplus) will output:

    Unable to send RAW command (channel=0x0 netfn=0x30 lun=0x0 cmd=0x70)
    

    But it sets the value despite the error output “Unable to send”. You could check it with the read command (the last example).

Get the current value with:

[root@srv0 ~]# ipmitool -I open raw 0x30 0x70 0x0c 0
 02
[root@srv0 ~]#

Default (failover): you will see 02
Onboard LAN: you will see 01
Dedicated LAN: you will see 00

The 192.168.7.157 is the IP of the IPMI KVM module and the -U ADMIN and -P ADMIN are username and the password login details to the module (ADMIN/ADMIN are just default settings for the Supermicro IPMI/KVM)

* Here you can set the LAN IP configuration – “Set IP to the IPMI/KVM server module with ipmitool