Basic review of the Eclipse Angular IDE with a hello world app

This is a basic review of the new Eclipse Angular IDE: what the basic functionality of the IDE is and how we can work with it. The main purpose is to show what creating an Angular project looks like.

STEP 1) Launch the Angular IDE and select the workspace you use for your projects.

Angular IDE

STEP 2) The Eclipse IDE started for the first time – no projects opened and no project history.

eclipse-workspace

STEP 3) Create a project by clicking the down arrow (the second icon from the left) and then “Angular Project”.

Angular Project

STEP 4) A new window will pop up asking for the name of the project and, as you can see, you can tune the specific versions of Angular CLI, Node.js and NPM. Click “Next”.

New Angular Project

STEP 5) Here you see the commands that will be executed by the Eclipse IDE to create the new Angular project. Click “Finish”.

New Angular Project Finish

STEP 6) The commands from the previous step are executed and the needed packages and their dependencies are installed. There is a progress status with percentages at the bottom right of the IDE. When it reaches 100%, the view for the Angular project will be opened.

Creating the Angular Project

STEP 7) The Angular project is opened and the “main.ts” file is shown in the TypeScript editor, with proper highlighting of the TypeScript language.

ReviewIDEApp

STEP 8) We open the “app.component.ts” file in the TypeScript editor and add a “body” variable to test the IDE.

ReviewIDEApp – app.component.ts
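
For reference, here is a minimal sketch of what “app.component.ts” could look like after this step – the selector and class names are the Angular CLI defaults of that era, and the “Hello World” value of “body” is our choice for this demo:

import { Component } from '@angular/core';

@Component({
  selector: 'app-root',
  templateUrl: './app.component.html',
  styleUrls: ['./app.component.css']
})
export class AppComponent {
  title = 'app';
  // the variable we add to test the IDE; rendered in the template in the next step
  body = 'Hello World';
}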

STEP 9) The HTML file of the AppComponent – the editor offers proposals (autocomplete) even inside the Angular string interpolation {{}}. So we use the proposal for our AppComponent variable “body”.

ReviewIDEApp – app.component.html
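
A minimal sketch of the template change, assuming we replace the default header of “app.component.html” so it prints our variable via string interpolation:

<!-- the editor proposes "body" while typing inside the {{ }} below -->
<h1>
  {{ body }}
</h1>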

STEP 10) There is a “Servers” tab at the bottom left with all the Angular CLI applications. We have only one, ReviewIDEApp; mark it and then click on the “play” button and the project will be built and started on localhost port 4200.

ReviewIDEApp – Servers

STEP 11) In the terminal you can see all the commands and their output. Our Angular application is being built.

ReviewIDEApp – Terminal

STEP 12) Again in the “Servers” tab we see our application is running on http://localhost:4200/. You can also see our modification of the default code.

ReviewIDEApp – running – build OK

STEP 13) Open your browser and load http://localhost:4200/ – you’ll see something similar, with our variable “body” (“Hello World”) shown in the header.

ReviewIDEApp – browser http://localhost:4200/

STEP 14) Creating a new Angular component with the IDE is simple: just mark your application in the “Project Explorer” -> right mouse click -> “New” -> “Component”.

New Component

STEP 15) Set the “Element Name”; you could also click on “Advanced” to see more options for the component creation.

New Angular CLI Component

STEP 16) For example, you can uncheck “Create component with Unit Test (--spec)” and then click “Next”. It will not generate the spec file.

New Angular CLI Component – Advanced

STEP 17) You see the commands to be executed in the terminal; click “Finish” to execute them.

New Angular CLI Component – Generated Command
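
Under the hood this is just the Angular CLI generator; a sketch of the equivalent command, assuming the element name “myfirstcom” used in the next steps (the exact flag syntax depends on the CLI version):

ng generate component myfirstcom --spec false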

STEP 18) The files for the component are generated – ts (TypeScript file), html (the template file) and css (the style file). The three files are placed in a separate directory with the name of the component. The TypeScript file has the skeleton of your Angular component.

myfirstcom.component.ts
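
For reference, a sketch of the skeleton the CLI generates for such a component – the names are derived from the element name “myfirstcom” and the exact content may differ slightly between CLI versions:

import { Component, OnInit } from '@angular/core';

@Component({
  selector: 'app-myfirstcom',
  templateUrl: './myfirstcom.component.html',
  styleUrls: ['./myfirstcom.component.css']
})
export class MyfirstcomComponent implements OnInit {

  constructor() { }

  // lifecycle hook, called once after the component is initialized
  ngOnInit() {
  }

}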

STEP 19) As you can see, the CLI included the component we created in the previous step in our global app.module.ts file.

app.module.ts
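
A sketch of the relevant part of “app.module.ts” after the generation, assuming only the default AppComponent existed before:

import { BrowserModule } from '@angular/platform-browser';
import { NgModule } from '@angular/core';

import { AppComponent } from './app.component';
// this import and the declarations entry are added by the generator
import { MyfirstcomComponent } from './myfirstcom/myfirstcom.component';

@NgModule({
  declarations: [
    AppComponent,
    MyfirstcomComponent
  ],
  imports: [
    BrowserModule
  ],
  providers: [],
  bootstrap: [AppComponent]
})
export class AppModule { }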

STEP 20) Creating a new Angular pipe with the IDE is simple: just mark your application in the “Project Explorer” -> right mouse click -> “New” -> “Pipe”.

New Pipe

STEP 21) Set the “Element Name”; you could also click on “Advanced” to see more options for the pipe creation.

New Angular CLI Pipe

STEP 22) For example, you can uncheck “Create component with Unit Test (--spec)” and then click “Next”. It will not generate the spec file.

New Angular CLI Pipe – Advanced

STEP 23) You see the commands to be executed in the terminal; click “Finish” to execute them.

New Angular CLI Pipe – Generated Command

STEP 24) One TypeScript file is generated. The mypipe.pipe.ts file has the skeleton of a pipe. It contains the definition of the pipe class implementing the mandatory “PipeTransform” interface and the transform method to override.

mypipe.pipe.ts
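
A sketch of the generated pipe skeleton – the names are derived from the element name “mypipe” and may vary slightly by CLI version:

import { Pipe, PipeTransform } from '@angular/core';

@Pipe({
  name: 'mypipe'
})
export class MypipePipe implements PipeTransform {

  // replace the body with the actual value conversion
  transform(value: any, args?: any): any {
    return null;
  }

}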

STEP 25) Creating a new Angular service with the IDE is simple: just mark your application in the “Project Explorer” -> right mouse click -> “New” -> “Service” and you’ll see the window for creating a service. Enter the “Element Name” and click “Advanced” to see more options for Angular services.

New Angular CLI Service

STEP 26) You see the commands to be executed in the terminal; click “Finish” to execute them.

New Angular CLI Service – Generated Command

STEP 27) The TypeScript file for our service is generated: an exported class with a blank constructor.

myserver.service.ts
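
A sketch of the generated service skeleton, assuming the element name “myserver” (CLI versions of that era decorated it with a bare @Injectable):

import { Injectable } from '@angular/core';

@Injectable()
export class MyserverService {

  constructor() { }

}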

STEP 28) There is autocomplete for the import statement.

import – autocomplete

STEP 29) There is autocomplete for the exported class of the imported file/library.

import class – autocomplete

STEP 30) There is autocomplete for the component names (their selectors) in the HTML templates.

html component autocomplete

STEP 31) Our component works!

Our app in the Browser

STEP 32) There is autocomplete for all HTML tags.

Html template file

STEP 33) There is autocomplete for all HTML tags.

Html template file

STEP 34) Information is available for all properties.

Property information

STEP 35) Autocomplete proposals for all the available classes of an import.

Autocomplete proposal for classes

STEP 36) Autocomplete proposals fired in a function body.

Autocomplete proposal of keywords

Bacula – show configuration, status and information with bconsole tool

The following list of commands can be used to get a brief or detailed view of a Bacula backup server from the management utility bconsole. These commands are extremely useful for getting information on the backup process and policy and for Bacula troubleshooting – they can be used for fast debugging of errors, problems or misconfiguration.
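
A side note: bconsole can also be fed commands non-interactively, which is handy for quick checks and scripts – a minimal sketch, assuming bconsole is already configured on the backup server:

# run a single command and exit
echo "list jobs" | bconsole

# run several commands in one session
printf "list jobtotals\nstatus dir\nquit\n" | bconsole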

The following commands give information about:

  • Jobs

    list jobtotals Lists stats for all jobs; it also shows all job names
    show jobs Lists all jobs with their full configuration – for each job it shows in detail what the job consists of. The detailed output includes the full configuration of a job: client, catalog, fileset, schedule, pool, messages. This shows all the relationships between the different components of the Bacula system – how and to which clients, storages, pools, schedules and filesets a job relates. You’ll get a thorough view of how, say, a server’s backup is made.
    show job=[job_name] shows the full configuration for a job; the name can be taken from the two commands above
    list jobs Lists the status of all jobs – ID, start time, type (backup?), level (full, incremental, differential?), files and bytes processed, and the status of the job (terminated normally, running, fatal error and so on)
    list files jobid=[ID] which files were included in the backup? Lists all paths and files included in the backup – not a configuration set, but the real physical paths and filenames.
    status dir shows the status of the Director process, including all jobs scheduled for the next day (or more if you add a parameter).
  • Storages

    status storage lists the storage devices and their status – you can see the physical path on the filesystem where the devices will put the backup files
  • Clients

    show client shows all client names and their backup policy
    status client=[client_name] shows the client status and what it is doing, checks the network connection between the Director and the client, and lists the last terminated jobs and their status.
  • Filesets

    show fileset shows all filesets; a fileset is a set of files and directories to include in or exclude from a backup.
  • Schedule

    show schedule shows all registered schedules and details for each one (Run Level=Full, Differential, Incremental), months, days, minutes.
    show schedule=[schedule_name] shows details for the schedule named schedule_name (Run Level=Full, Differential, Incremental), months, days, minutes. It’s like the backup schedule plan of a server
  • Director

    message shows the last messages of the backup process. If it is empty, all logs of the backup process can be found in “/var/log/bacula/bacula.log” (see the shell sketch right after this list)
    reload the Director will re-read all its configuration files. It should be used after adding or changing configuration files.
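
    A quick sketch for inspecting that log file from a shell – the path is the one from the “message” item above and may differ on your system:

    # follow the Bacula log in real time
    tail -f /var/log/bacula/bacula.log

    # or check the last 100 lines after a failed job
    tail -n 100 /var/log/bacula/bacula.log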

And here is the example output of the above commands with a little bit of explanation:

  • Jobs

    1) Get the stats and the job names; the names can be used in many other commands!

    srvbkp@local # bconsole 
    Connecting to Director localhost:9101
    1000 OK: 1 srvbkp-dir Version: 7.0.5 (28 July 2014)
    Enter a period to cancel a command.
    *list jobtotals
    Automatically selected Catalog: allbackup
    Using Catalog "allbackup"
    +------+-----------+-------------------+--------------------------------+
    | Jobs | Files     | Bytes             | Job                            |
    +------+-----------+-------------------+--------------------------------+
    |   90 |        90 |   123,665,584,337 | BackupCatalog                  |
    |    5 |         5 |   281,593,737,603 | RestoreFiles                   |
    |   13 | 1,232,316 |   118,480,634,434 | srv1-media                     |
    |   32 |        12 |             3,674 | srv1-dns                       |
    |   32 |        10 |             3,064 | srv2-dns                       |
    |   32 |        10 |             3,064 | srv3-dns                       |
    |   32 |        10 |             3,086 | srv4-dns                       |
    |   32 |        10 |             3,084 | srv5-dns                       |
    |   26 | 3,837,536 |   587,812,183,466 | srv1-images                    |
    +------+-----------+-------------------+--------------------------------+
    +-------+------------+-------------------+
    | Jobs  | Files      | Bytes             |
    +-------+------------+-------------------+
    | 1,474 | 14,925,321 | 5,475,024,028,957 |
    +-------+------------+-------------------+
    *
    

    2) Get the full configuration for all jobs; only two are included here for clarity. This is all the information needed for taking a backup of a server: you can see which files will be included (or excluded), where the backup will be stored, when it will happen in time, and which different types of backup will be done – full, incremental and differential. And all this information is shown for every client (server).

    *show jobs
    Job: name=srv1-dns JobType=66 level= Priority=10 Enabled=1
         MaxJobs=1 Resched=0 Times=0 Interval=1,800 Spool=0 WritePartAfterJob=1
         Accurate=0
      --> Client: name=srv1-dns address=192.168.0.100 FDport=9102 MaxJobs=1
          JobRetention=1 month  FileRetention=1 month  AutoPrune=1
      --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula
          db_driver=*None* db_user=bacula MutliDBConn=0
      --> FileSet: name=bind
          O MZ6
          N
          I /var/lib/named
          N
      --> Schedule: name=bind
      --> Run Level=Full
          hour=20 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=0 
          wom=0 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=0
      --> Run Level=Differential
          hour=20 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=0 
          wom=1 2 3 4 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=0
      --> Run Level=Incremental
          hour=20 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=1 2 3 4 5 6 
          wom=0 1 2 3 4 5 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=0
      --> Storage: name=bind address=192.168.0.10 SDport=9103 MaxJobs=10
          DeviceName=bind MediaType=File StorageId=17
      --> Pool: name=Default PoolType=Backup
          use_cat=1 use_once=0 cat_files=1
          max_vols=100 auto_prune=1 VolRetention=1 year 
          VolUse=0 secs recycle=1 LabelFormat=*None*
          CleaningPrefix=*None* LabelType=0
          RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0
          MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=53687091200
          MigTime=0 secs MigHiBytes=0 MigLoBytes=0
          JobRetention=0 secs FileRetention=0 secs
      --> Pool: name=bind-full PoolType=Backup
          use_cat=1 use_once=1 cat_files=1
          max_vols=0 auto_prune=1 VolRetention=2 months 
          VolUse=0 secs recycle=1 LabelFormat=bind-full
          CleaningPrefix=*None* LabelType=0
          RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0
          MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0
          MigTime=0 secs MigHiBytes=0 MigLoBytes=0
          JobRetention=0 secs FileRetention=0 secs
      --> Pool: name=bind-incr PoolType=Backup
          use_cat=1 use_once=0 cat_files=1
          max_vols=0 auto_prune=1 VolRetention=7 days 
          VolUse=23 hours  recycle=1 LabelFormat=bind-incr
          CleaningPrefix=*None* LabelType=0
          RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0
          MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0
          MigTime=0 secs MigHiBytes=0 MigLoBytes=0
          JobRetention=0 secs FileRetention=0 secs
      --> Pool: name=bind-diff PoolType=Backup
          use_cat=1 use_once=0 cat_files=1
          max_vols=0 auto_prune=1 VolRetention=1 month 1 day 
          VolUse=0 secs recycle=1 LabelFormat=bind-diff
          CleaningPrefix=*None* LabelType=0
          RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0
          MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0
          MigTime=0 secs MigHiBytes=0 MigLoBytes=0
          JobRetention=0 secs FileRetention=0 secs
      --> Messages: name=Standard
          mailcmd=/usr/sbin/bsmtp -h localhost -f "(Bacula) <%r>" -s "Bacula: %t %e of %c %l" %r
          opcmd=/usr/sbin/bsmtp -h localhost -f "(Bacula) <%r>" -s "Bacula: Intervention needed for %j" %r
    Job: name=srv2-dns JobType=66 level= Priority=10 Enabled=1
         MaxJobs=1 Resched=0 Times=0 Interval=1,800 Spool=0 WritePartAfterJob=1
         Accurate=0
      --> Client: name=srv2-dns address=192.168.0.101 FDport=9102 MaxJobs=1
          JobRetention=1 month  FileRetention=1 month  AutoPrune=1
      --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula
          db_driver=*None* db_user=bacula MutliDBConn=0
      --> FileSet: name=bind
          O MZ6
          N
          I /var/lib/named
          N
      --> Schedule: name=bind
      --> Run Level=Full
          hour=20 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=0 
          wom=0 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=0
      --> Run Level=Differential
          hour=20 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=0 
          wom=1 2 3 4 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=0
      --> Run Level=Incremental
          hour=20 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=1 2 3 4 5 6 
          wom=0 1 2 3 4 5 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=0
      --> Storage: name=bind address=192.168.0.10 SDport=9103 MaxJobs=10
          DeviceName=bind MediaType=File StorageId=17
      --> Pool: name=Default PoolType=Backup
          use_cat=1 use_once=0 cat_files=1
          max_vols=100 auto_prune=1 VolRetention=1 year 
          VolUse=0 secs recycle=1 LabelFormat=*None*
          CleaningPrefix=*None* LabelType=0
          RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0
          MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=53687091200
          MigTime=0 secs MigHiBytes=0 MigLoBytes=0
          JobRetention=0 secs FileRetention=0 secs
      --> Pool: name=bind-full PoolType=Backup
          use_cat=1 use_once=1 cat_files=1
          max_vols=0 auto_prune=1 VolRetention=2 months 
          VolUse=0 secs recycle=1 LabelFormat=bind-full
          CleaningPrefix=*None* LabelType=0
          RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0
          MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0
          MigTime=0 secs MigHiBytes=0 MigLoBytes=0
          JobRetention=0 secs FileRetention=0 secs
      --> Pool: name=bind-incr PoolType=Backup
          use_cat=1 use_once=0 cat_files=1
          max_vols=0 auto_prune=1 VolRetention=7 days 
          VolUse=23 hours  recycle=1 LabelFormat=bind-incr
          CleaningPrefix=*None* LabelType=0
          RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0
          MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0
          MigTime=0 secs MigHiBytes=0 MigLoBytes=0
          JobRetention=0 secs FileRetention=0 secs
      --> Pool: name=bind-diff PoolType=Backup
          use_cat=1 use_once=0 cat_files=1
          max_vols=0 auto_prune=1 VolRetention=1 month 1 day 
          VolUse=0 secs recycle=1 LabelFormat=bind-diff
          CleaningPrefix=*None* LabelType=0
          RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0
          MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0
          MigTime=0 secs MigHiBytes=0 MigLoBytes=0
          JobRetention=0 secs FileRetention=0 secs
      --> Messages: name=Standard
          mailcmd=/usr/sbin/bsmtp -h localhost -f "(Bacula) <%r>" -s "Bacula: %t %e of %c %l" %r
          opcmd=/usr/sbin/bsmtp -h localhost -f "(Bacula) <%r>" -s "Bacula: Intervention needed for %j" %r
    

    3) You can get the full configuration information of a single job (the information is the same as above, but for a given job name, which can be taken from the first command above – it is not necessary to output all the configurations every time):

    srvbkp@local # bconsole 
    Connecting to Director localhost:9101
    1000 OK: 1 srvbkp-dir Version: 7.0.5 (28 July 2014)
    Enter a period to cancel a command.
    *show jobs=srv2-dns
    Job: name=srv2-dns JobType=66 level= Priority=10 Enabled=1
         MaxJobs=1 Resched=0 Times=0 Interval=1,800 Spool=0 WritePartAfterJob=1
         Accurate=0
      --> Client: name=srv2-dns address=192.168.0.101 FDport=9102 MaxJobs=1
          JobRetention=1 month  FileRetention=1 month  AutoPrune=1
      --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula
          db_driver=*None* db_user=bacula MutliDBConn=0
      --> FileSet: name=bind
          O MZ6
          N
          I /var/lib/named
          N
      --> Schedule: name=bind
      --> Run Level=Full
          hour=20 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=0 
          wom=0 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=0
      --> Run Level=Differential
          hour=20 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=0 
          wom=1 2 3 4 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=0
      --> Run Level=Incremental
          hour=20 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=1 2 3 4 5 6 
          wom=0 1 2 3 4 5 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=0
      --> Storage: name=bind address=192.168.0.10 SDport=9103 MaxJobs=10
          DeviceName=bind MediaType=File StorageId=17
      --> Pool: name=Default PoolType=Backup
          use_cat=1 use_once=0 cat_files=1
          max_vols=100 auto_prune=1 VolRetention=1 year 
          VolUse=0 secs recycle=1 LabelFormat=*None*
          CleaningPrefix=*None* LabelType=0
          RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0
          MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=53687091200
          MigTime=0 secs MigHiBytes=0 MigLoBytes=0
          JobRetention=0 secs FileRetention=0 secs
      --> Pool: name=bind-full PoolType=Backup
          use_cat=1 use_once=1 cat_files=1
          max_vols=0 auto_prune=1 VolRetention=2 months 
          VolUse=0 secs recycle=1 LabelFormat=bind-full
          CleaningPrefix=*None* LabelType=0
          RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0
          MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0
          MigTime=0 secs MigHiBytes=0 MigLoBytes=0
          JobRetention=0 secs FileRetention=0 secs
      --> Pool: name=bind-incr PoolType=Backup
          use_cat=1 use_once=0 cat_files=1
          max_vols=0 auto_prune=1 VolRetention=7 days 
          VolUse=23 hours  recycle=1 LabelFormat=bind-incr
          CleaningPrefix=*None* LabelType=0
          RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0
          MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0
          MigTime=0 secs MigHiBytes=0 MigLoBytes=0
          JobRetention=0 secs FileRetention=0 secs
      --> Pool: name=bind-diff PoolType=Backup
          use_cat=1 use_once=0 cat_files=1
          max_vols=0 auto_prune=1 VolRetention=1 month 1 day 
          VolUse=0 secs recycle=1 LabelFormat=bind-diff
          CleaningPrefix=*None* LabelType=0
          RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0
          MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0
          MigTime=0 secs MigHiBytes=0 MigLoBytes=0
          JobRetention=0 secs FileRetention=0 secs
      --> Messages: name=Standard
          mailcmd=/usr/sbin/bsmtp -h localhost -f "(Bacula) <%r>" -s "Bacula: %t %e of %c %l" %r
          opcmd=/usr/sbin/bsmtp -h localhost -f "(Bacula) <%r>" -s "Bacula: Intervention needed for %j" %r
    

    4) List the status of all jobs – ID, start time, type (backup?), level (full, incremental, differential?), files and bytes processed, and the status of the job (terminated normally, running, fatal error and so on). Find out whether you have backups of a server or the backup process failed!

    srvbkp@local # bconsole 
    Connecting to Director localhost:9101
    1000 OK: 1 srvbkp-dir Version: 7.0.5 (28 July 2014)
    Enter a period to cancel a command.
    *list jobs
    +-------+--------------------------------+---------------------+------+-------+-----------+-----------------+-----------+
    | JobId | Name                           | StartTime           | Type | Level | JobFiles  | JobBytes        | JobStatus |
    +-------+--------------------------------+---------------------+------+-------+-----------+-----------------+-----------+
    |   128 | srv1-test                      | 2016-12-04 23:05:00 | B    | F     |    17,506 |      52,116,400 | T         |
    |   178 | srv1-test                      | 2016-12-09 23:05:01 | B    | I     |        13 |           1,509 | T         |
    |   188 | srv1-test                      | 2016-12-10 23:05:01 | B    | I     |        13 |           1,509 | T         |
    .........................................................................................................................
    | 8,927 | srv2-images                    | 2018-03-04 20:00:00 | B    | F     |         0 |               0 | f         |
    | 8,928 | srv1-media                     | 2018-03-04 20:00:00 | B    | F     |         3 |             978 | T         |
    | 8,930 | srv1-dns                       | 2018-03-04 20:00:01 | B    | F     |         6 |           1,843 | T         |
    | 8,932 | srv2-dns                       | 2018-03-04 20:00:01 | B    | F     |         6 |           1,837 | T         |
    | 8,931 | srv3-dns                       | 2018-03-04 20:00:03 | B    | F     |         5 |           1,542 | T         |
    | 8,933 | srv4-dns                       | 2018-03-04 20:00:04 | B    | F     |         4 |           1,258 | T         |
    | 8,934 | srv5-dns                       | 2018-03-04 20:00:04 | B    | F     |         4 |           1,258 | T         |
    +-------+--------------------------------+---------------------+------+-------+-----------+-----------------+-----------+
    

    5) Which files were included in a backup job? Lists all paths and files included in the backup job (the ID comes from the command above):

    *list files jobid=8934
    +----------+
    | Filename |
    +----------+
    | /var/lib/named/ |
    | /var/lib/named/root.cache |
    | /var/lib/named/sec |
    | /var/lib/named/sec/example.com.db |
    | /var/lib/named/sec/example2.net.db |
    | /var/lib/named/pri |
    +----------+
    +-------+----------+---------------------+------+-------+----------+----------+-----------+
    | JobId | Name     | StartTime           | Type | Level | JobFiles | JobBytes | JobStatus |
    +-------+----------+---------------------+------+-------+----------+----------+-----------+
    | 8,934 | srv5-dns | 2018-03-04 20:00:20 | B    | F     |        6 |    1,851 | T         |
    +-------+----------+---------------------+------+-------+----------+----------+-----------+
    

    6) Get the scheduled jobs. Which jobs will be executed and when:

    *status dir
    Scheduled Jobs:
    Level          Type     Pri  Scheduled          Job Name           Volume
    ===================================================================================
    Incremental    Backup    10  06-Mar-18 20:00    srv1-dns           *unknown*
    Incremental    Backup    10  06-Mar-18 20:00    srv2-dns           *unknown*
    Incremental    Backup    10  06-Mar-18 20:00    srv3-dns           *unknown*
    Incremental    Backup    10  06-Mar-18 20:00    srv4-dns           *unknown*
    Incremental    Backup    10  06-Mar-18 20:00    srv5-dns           *unknown*
    ...................................................................................
    Incremental    Backup    10  06-Mar-18 23:05    srv1-media         *unknown*
    Full           Backup    10  06-Mar-18 23:05    srv1-images        *unknown*
    ====
    Running Jobs:
    Console connected at 06-Mar-18 13:36
    No Jobs running.
    ====
    Terminated Jobs:
     JobId  Level    Files      Bytes   Status   Finished        Name
    ====================================================================
      9006  Incr        472    296.8 M  OK       05-Mar-18 23:05 srv1-dns
      9007  Incr     10,547    194.8 M  Error    05-Mar-18 23:05 srv2-dns
      9002  Incr         37    133.0 M  OK       05-Mar-18 23:05 srv3-dns
      8995  Incr         57    372.2 M  OK       05-Mar-18 23:06 srv4-dns
      9000  Incr        391    1.195 G  OK       05-Mar-18 23:07 srv5-dns
      9008  Full        832    7.139 G  OK       05-Mar-18 23:49 srv1-images
      9009  Full          1    1.493 G  OK       05-Mar-18 23:50 srv1-media
      9011  Full    315,027    121.6 G  OK       06-Mar-18 03:44 srv2-images
      9012  Full    314,804    93.85 G  OK       06-Mar-18 04:18 srv2-media
    ====
    *
    
  • Storages

    Where are the backup files on your system? Trace the Bacula media devices to the real path of your backup files:

    *status storage
    Automatically selected Storage: bind
    Connecting to Storage daemon bind at 192.168.0.10:9103
    srvbkp-sd Version: 7.0.5 (28 July 2014) x86_64-pc-linux-gnu ubuntu 16.04
    Daemon started 06-Nov-17 17:25. Jobs: run=5025, running=0.
     Heap: heap=135,168 smbytes=2,231,640 max_bytes=5,264,027 bufs=439 max_bufs=2,152
     Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8 mode=0,0
    Running Jobs:
    No Jobs running.
    ====
    Jobs waiting to reserve a drive:
    ====
    Terminated Jobs:
     JobId  Level    Files      Bytes   Status   Finished        Name
    ===================================================================
      9071  Incr          0         0   OK       06-Mar-18 20:00 srv1-dns
      9074  Incr          0         0   OK       06-Mar-18 20:00 srv2-dns
      9073  Incr          0         0   OK       06-Mar-18 20:00 srv3-dns
      9075  Incr          5    2.043 K  OK       06-Mar-18 20:00 srv4-dns
      9078  Incr          5    2.042 K  OK       06-Mar-18 20:00 srv5-dns
      9077  Incr          0         0   OK       06-Mar-18 20:00 srv1-media
      9079  Incr          0         0   OK       06-Mar-18 20:00 srv1-images
      9076  Incr          0         0   OK       06-Mar-18 20:00 srv2-images
      9057  Full          0         0   Other    06-Mar-18 20:31 srv2-media
    ====
    Device status:
    Device "localstorage" (/mnt/storage1/bacula-storage/local) is not open.
    ==
    Device "media" (/mnt/storage1/bacula-storage/media) is not open.
    ==
    Device "bind" (/mnt/storage1/bacula-storage/bind) is not open.
    ==
    Device "image" (/mnt/storage1/bacula-storage/image) is not open.
    ==
    ====
    Used Volume status:
    ====
    Attr spooling: 0 active jobs, 34,753 bytes; 120 total jobs, 34,753 max bytes.
    ====
    *
    
  • Clients

    1) Show all client names in the Bacula system; this can be useful to link a client name with a server and then to use the name in the next command (below in 2):

    *show client
    Client: name=srvbkp-fd address=192.168.0.5 FDport=9102 MaxJobs=1
          JobRetention=3 months  FileRetention=2 months  AutoPrune=1
      --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula
          db_driver=*None* db_user=bacula MutliDBConn=0
    Client: name=srv1-dns address=192.168.0.100 FDport=9102 MaxJobs=1
          JobRetention=1 month  FileRetention=1 month  AutoPrune=1
      --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula
          db_driver=*None* db_user=bacula MutliDBConn=0
    Client: name=srv2-dns address=192.168.0.101 FDport=9102 MaxJobs=1
          JobRetention=1 month  FileRetention=1 month  AutoPrune=1
      --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula
          db_driver=*None* db_user=bacula MutliDBConn=0
    Client: name=srv3-dns address=192.168.0.103 FDport=9102 MaxJobs=1
          JobRetention=1 month  FileRetention=1 month  AutoPrune=1
      --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula
          db_driver=*None* db_user=bacula MutliDBConn=0
    Client: name=srv4-dns address=192.168.0.104 FDport=9102 MaxJobs=1
          JobRetention=1 month  FileRetention=1 month  AutoPrune=1
      --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula
          db_driver=*None* db_user=bacula MutliDBConn=0
    Client: name=srv5-dns address=192.168.0.105 FDport=9102 MaxJobs=1
          JobRetention=1 month  FileRetention=1 month  AutoPrune=1
      --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula
          db_driver=*None* db_user=bacula MutliDBConn=0
    Client: name=srv1-media address=192.168.0.106 FDport=9102 MaxJobs=1
          JobRetention=1 month  FileRetention=1 month  AutoPrune=1
      --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula
          db_driver=*None* db_user=bacula MutliDBConn=0
    Client: name=srv1-images address=192.168.0.107 FDport=9102 MaxJobs=1
          JobRetention=1 month  FileRetention=1 month  AutoPrune=1
      --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula
          db_driver=*None* db_user=bacula MutliDBConn=0
    *
    

    2) Show the status of a client. We can use this command to check what is going on with the client at the moment of issuing the command, plus the last terminated backup jobs and their status. In addition, we can check the connection between the Director daemon and the client daemon, because the Director connects at the moment we issue the command, so it is useful for debugging purposes:

    *status client=srv1-dns
    Connecting to Client srv1-dns at 192.168.0.100:9102
    
    srv1-dns-fd Version: 7.0.5 (28 July 2014)  x86_64-pc-linux-gnu ubuntu 16.04
    Daemon started 23-Feb-18 00:43. Jobs: run=8 running=0.
     Heap: heap=98,304 smbytes=188,701 max_bytes=571,361 bufs=64 max_bufs=97
     Sizes: boffset_t=8 size_t=8 debug=0 trace=0 mode=0,0 bwlimit=0kB/s
     Plugin: bpipe-fd.so 
    
    Running Jobs:
    Director connected at: 06-Mar-18 22:51
    No Jobs running.
    ====
    
    Terminated Jobs:
     JobId  Level    Files      Bytes   Status   Finished        Name 
    ===================================================================
      1832  Full          4    1.333 K  OK       01-Mar-18 23:48 srv1-dns
      1836  Incr          0         0   OK       02-Mar-18 00:01 srv1-dns
      1864  Incr          0         0   OK       02-Mar-18 20:30 srv1-dns
      1907  Incr          0         0   OK       03-Mar-18 20:00 srv1-dns
      1950  Full          4    1.333 K  OK       04-Mar-18 20:01 srv1-dns
      1994  Incr          0         0   OK       05-Mar-18 20:00 srv1-dns
      1037  Incr          0         0   OK       06-Mar-18 20:00 srv1-dns
    ====
    *
    
  • Filesets

    Which files will be included in or excluded from the backup process. The lines starting with “I” mean “include”, the lines starting with “E” mean “exclude”.

     *show fileset
    FileSet: name=Full Set
     O M
     N
     I /usr/sbin
     N
     E /var/lib/bacula
     E /proc
     E /tmp
     E /sys
     E /.journal
     E /.fsck
     N
    FileSet: name=Catalog
     O M
     N
     I /var/lib/bacula/bacula.sql
     N
    FileSet: name=images
     O MfZ6
     N
     I /
     N
     E /proc
     E /tmp
     E /run
     E /dev
     E /sys
     N
    FileSet: name=bind
     O MZ6
     N
     I /var/lib/named
     N
    FileSet: name=media
     O MfZ6
     N
     I /
     N
     E /proc
     E /tmp
     E /run
     E /dev
     E /sys
     N
    
  • Schedule

    1) All the timeline plans for taking backups:

    *show schedule
    Schedule: name=WeeklyCycle
      --> Run Level=Full
          hour=23 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=0 
          wom=0 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=5
      --> Run Level=Differential
          hour=23 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=0 
          wom=1 2 3 4 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=5
      --> Run Level=Incremental
          hour=23 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=1 2 3 4 5 6 
          wom=0 1 2 3 4 5 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=5
    Schedule: name=WeeklyCycleAfterBackup
      --> Run Level=Full
          hour=23 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=0 1 2 3 4 5 6 
          wom=0 1 2 3 4 5 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=10
    Schedule: name=images
      --> Run Level=Full
          hour=10 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=0 
          wom=0 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=0
      --> Run Level=Differential
          hour=10 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=0 
          wom=1 2 3 4 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=0
      --> Run Level=Incremental
          hour=20 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=1 2 3 4 5 6 
          wom=0 1 2 3 4 5 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=0
    

    2) The schedule plan for one client or groups of clients (server/s):

    *show schedule=bind
    Schedule: name=bind
      --> Run Level=Full
          hour=20 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=0 
          wom=0 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=0
      --> Run Level=Differential
          hour=20 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=0 
          wom=1 2 3 4 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=0
      --> Run Level=Incremental
          hour=20 
          mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
          month=0 1 2 3 4 5 6 7 8 9 10 11 
          wday=1 2 3 4 5 6 
          wom=0 1 2 3 4 5 
          woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 
          mins=0
    
  • Director

    1) Show the last messages of the backup processes. As you can see, there is an error in one of the jobs: the client could not connect to the Storage daemon (probably the same server as the Director). The problem was that the firewall did not allow connections from this client (IP):

    *message
    06-Mar 20:00 srvbkp-dir JobId 1075: Start Backup JobId 1075, Job=srv1-bind.2018-03-06_20.00.01_05
    06-Mar 20:00 srvbkp-dir JobId 1075: Using Device "bind" to write.
    06-Mar 20:00 srvbkp-sd JobId 1066: Elapsed time=00:00:10, Transfer rate=0  Bytes/second
    06-Mar 20:00 srvbkp-dir JobId 1066: Bacula srvbkp-dir 7.0.5 (28Jul14):
      Build OS:               x86_64-pc-linux-gnu ubuntu 16.04
      JobId:                  1066
      Job:                    srv1-media.2018-03-06_20.00.00_56
      Backup Level:           Incremental, since=2018-03-06 20:00:14
      Client:                 "srv1-media" 7.0.5 (28Jul14) x86_64-pc-linux-gnu,ubuntu,16.04
      FileSet:                "bind" 2017-11-07 17:19:45
      Pool:                   "bind-incr" (From Job IncPool override)
      Catalog:                "allbackup" (From Client resource)
      Storage:                "bind" (From Job resource)
      Scheduled time:         06-Mar-2018 20:00:00
      Start time:             06-Mar-2018 20:00:15
      End time:               06-Mar-2018 20:00:26
      Elapsed time:           11 secs
      Priority:               10
      FD Files Written:       0
      SD Files Written:       0
      FD Bytes Written:       0 (0 B)
      SD Bytes Written:       0 (0 B)
      Rate:                   0.0 KB/s
      Software Compression:   None
      VSS:                    no
      Encryption:             no
      Accurate:               no
      Volume name(s):        
      Volume Session Id:      1025
      Volume Session Time:    1509989534
      Last Volume Bytes:      4,618 (4.618 KB)
      Non-fatal FD errors:    0
      SD Errors:              0
      FD termination status:  OK
      SD termination status:  OK
      Termination:            Backup OK
    06-Mar 20:01 srvbkp-dir JobId 1057: Using Device "bind" to write.
    06-Mar 20:05 srv2-dns-fd JobId 1057: Warning: bsock.c:112 Could not connect to Storage daemon on 192.168.0.5:9103. ERR=Connection timed out
    Retrying ...
    06-Mar 20:31 srv2-dns-fd JobId 1057: Fatal error: bsock.c:118 Unable to connect to Storage daemon on 192.168.0.5:9103. ERR=Interrupted system call
    06-Mar 20:31 srv2-dns-fd JobId 1057: Fatal error: job.c:1893 Failed to connect to Storage daemon: 192.168.0.5:9103
    06-Mar 20:31 srvbkp-dir JobId 1057: Fatal error: Bad response to Storage command: wanted 2000 OK storage
    , got 2902 Bad storage
    06-Mar 20:31 srvbkp-dir JobId 1057: Error: Bacula srvbkp-dir 7.0.5 (28Jul14):
      Build OS:               x86_64-pc-linux-gnu ubuntu 16.04
      JobId:                  1057
      Job:                    srv2-dns.2018-03-06_20.00.00_47
      Backup Level:           Full (upgraded from Incremental)
      Client:                 "srv2-dns" 7.0.5 (28Jul14) x86_64-pc-linux-gnu,ubuntu,16.04
      FileSet:                "bind" 2017-11-07 17:19:45
      Pool:                   "bind-full" (From Job FullPool override)
      Catalog:                "allbackup" (From Client resource)
      Storage:                "bind" (From Job resource)
      Scheduled time:         06-Mar-2018 20:00:00
      Start time:             06-Mar-2018 20:00:00
      End time:               06-Mar-2018 20:31:22
      Elapsed time:           31 mins 22 secs
      Priority:               10
      FD Files Written:       0
      SD Files Written:       0
      FD Bytes Written:       0 (0 B)
      SD Bytes Written:       0 (0 B)
      Rate:                   0.0 KB/s
      Software Compression:   None
      VSS:                    no
      Encryption:             no
      Accurate:               no
      Volume name(s):        
      Volume Session Id:      1201
      Volume Session Time:    1509989534
      Last Volume Bytes:      1 (1 B)
      Non-fatal FD errors:    2
      SD Errors:              0
      FD termination status:  Error
      SD termination status:  Waiting on FD
      Termination:            *** Backup Error ***
    

    2) reload – reload the configuration files of the Bacula system. The Director will re-read all configuration files in “/etc/bacula”. Unfortunately, there is no output:

    *reload
    *
    

Install netdata monitoring in CentOS 7

netdata became a great tool for admins to monitor their servers in real time!
At first it was just an additional, not mandatory tool to check what’s going on with the servers for the last hour or so, but it evolved into a really handy and informative monitoring server, tracking every second what is going on with the server and the server’s most used services like database, web and application services.
Today, in version 1.9 (this installation howto is for netdata 1.9), it can track the activity of at least these services:

apache          hddtemp          postgres
beanstalk       haproxy          rabbitmq
ceph            isc_dhcpd        retroshare
bind_rndc       ipfs             redis
couchdb         memcached        sensors
chrony          mdstat           samba
cpufreq         mongodb          squid
dns_query_time  nginx            springboot
dnsdist         mysql            smartd_log
elasticsearch   nsd              tomcat
dovecot         nginx_plus       web_log
exim            ovpn_status_log  varnish
example         ntpd             fronius
freeradius      postfix          named
fail2ban        phpfpm           snmp
go_expvar       powerdns         stiebeleltron

And some of these plugins support multiple programs and services; for example, web_log supports the access/error logs of the major web servers at the moment.

The installation is really simple: netdata includes a script to facilitate the installation process.
Here are the minimal steps to install this great software:

STEP 1) Install the dependencies; because we will pull netdata from the official repository, we also need the git command

yum install -y git gcc make autoconf automake pkgconfig zlib-devel libuuid-devel curl nodejs freeipmi freeipmi-devel elfutils-libelf cmake openssl-devel libuv-devel

As you can see, there is a nodejs package, which depends on an additional repository (you could skip this; just the modules that depend on nodejs won’t work – as of now only the plugins located in “/etc/netdata/node.d/” use nodejs, and they are not so many).

yum -y install epel-release
yum -y install nodejs

STEP 2) Clone the netdata repository

cd
git clone https://github.com/firehol/netdata

STEP 3) Install netdata

cd netdata
CFLAGS="-march=native -O2 -msse3 -fomit-frame-pointer -pipe" ./netdata-installer.sh --install /usr/local/netdata

Install the netdata software in a separate directory so that, if you want to clean the system, you can just delete this directory. The example above uses

/usr/local/netdata

and all files will be installed there.
As you can see, the installation outputs the paths of your files:

   - the daemon     at /usr/local/netdata/netdata/usr/sbin/netdata
   - config files   in /usr/local/netdata/netdata/etc/netdata
   - web files      in /usr/local/netdata/netdata/usr/share/netdata
   - plugins        in /usr/local/netdata/netdata/usr/libexec/netdata
   - cache files    in /usr/local/netdata/netdata/var/cache/netdata
   - db files       in /usr/local/netdata/netdata/var/lib/netdata
   - log files      in /usr/local/netdata/netdata/var/log/netdata
   - pid file       at /usr/local/netdata/netdata/var/run/netdata.pid
   - logrotate file at /etc/logrotate.d/netdata

STEP 4) Use the firewall and open port 19999 of your server to be able to load the monitoring page

firewall-cmd --permanent --add-rich-rule="rule family="ipv4" source address="<YOURIP>" port protocol="tcp" port="19999" accept"
firewall-cmd --add-rich-rule="rule family="ipv4" source address="<YOURIP>" port protocol="tcp" port="19999" accept"

Because firewalld is the default firewall under CentOS 7, we used it to show you how to let your IP access the netdata web interface – replace <YOURIP> with your current trusted IP.
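
To verify the rule is active and netdata answers, a quick sketch (assuming the netdata service is already running):

# list the currently active rich rules
firewall-cmd --list-rich-rules

# fetch the start of the dashboard page locally
curl -s http://localhost:19999/ | head -n 3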

* The installation process creates start/stop unit files for systemd and tells you how to update netdata (you can even run the update automatically as a cron job)

To stop netdata run:
  systemctl stop netdata
To start netdata run:
  systemctl start netdata
 
Uninstall script generated: ./netdata-uninstaller.sh
Update script generated   : ./netdata-updater.sh
 
netdata-updater.sh can work from cron. It will trigger an email from cron
only if it fails (it does not print anything when it can update netdata).
Run this to automatically check and install netdata updates once per day:
 
sudo ln -s /root/netdata/netdata-updater.sh /etc/cron.daily/netdata-updater
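
If you also want netdata to start on boot, a sketch using the generated systemd unit (the service name is the one referenced by the installer output above):

# start netdata now and enable it on every boot
systemctl start netdata
systemctl enable netdata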

* Here is the output of the installation help menu – it also hints at the dependencies it may need:

[root@srv.local netdata]# ./netdata-installer.sh --help

  ^
  |.-.   .-.   .-.   .-.   .-.   .  netdata                          .-.   .-
  |   '-'   '-'   '-'   '-'   '-'   installer command line options  '   '-'  
  +----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+--->


./netdata-installer.sh <installer options>

Valid <installer options> are:

   --install /PATH/TO/INSTALL

        If you give: --install /opt
        netdata will be installed in /opt/netdata

   --dont-start-it

        Do not (re)start netdata.
        Just install it.

   --dont-wait

        Do not wait for the user to press ENTER.
        Start immediately building it.

   --auto-update | -u

        Install netdata-updater to cron,
        to update netdata automatically once per day
        (can only be done for installations from git)

   --enable-plugin-freeipmi
   --disable-plugin-freeipmi

        Enable/disable the FreeIPMI plugin.
        Default: enable it when libipmimonitoring is available.

   --enable-plugin-nfacct
   --disable-plugin-nfacct

        Enable/disable the nfacct plugin.
        Default: enable it when libmnl and libnetfilter_acct are available.

   --enable-lto
   --disable-lto

        Enable/disable Link-Time-Optimization
        Default: enabled

   --zlib-is-really-here
   --libs-are-really-here

        If you get errors about missing zlib,
        or libuuid but you know it is available,
        you have a broken pkg-config.
        Use this option to allow it continue
        without checking pkg-config.

Netdata will by default be compiled with gcc optimization -O2
If you need to pass different CFLAGS, use something like this:

  CFLAGS="<gcc options>" ./netdata-installer.sh <installer options>

For the installer to complete successfully, you will need
these packages installed:

   gcc make autoconf automake pkg-config zlib1g-dev (or zlib-devel)
   uuid-dev (or libuuid-devel)

For the plugins, you will at least need:

   curl, bash v4+, python v2 or v3, node.js

* netdata in action

All real-time monitoring plugins, System Overview opened

Memory details

PHP-FPM local details

* And here is the output of an installation process:

[root@lsrv3 netdata]# CFLAGS="-march=native -O2 -msse3 -fomit-frame-pointer -pipe" ./netdata-installer.sh --install /usr/local/netdata

  ^
  |.-.   .-.   .-.   .-.   .  netdata                                        
  |   '-'   '-'   '-'   '-'   real-time performance monitoring, done right!  
  +----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+--->


  You are about to build and install netdata to your system.

  It will be installed at these locations:

   - the daemon     at /usr/local/netdata/netdata/usr/sbin/netdata
   - config files   in /usr/local/netdata/netdata/etc/netdata
   - web files      in /usr/local/netdata/netdata/usr/share/netdata
   - plugins        in /usr/local/netdata/netdata/usr/libexec/netdata
   - cache files    in /usr/local/netdata/netdata/var/cache/netdata
   - db files       in /usr/local/netdata/netdata/var/lib/netdata
   - log files      in /usr/local/netdata/netdata/var/log/netdata
   - pid file       at /usr/local/netdata/netdata/var/run/netdata.pid
   - logrotate file at /etc/logrotate.d/netdata

  This installer allows you to change the installation path.
  Press Control-C and run the same command with --help for help.

Press ENTER to build and install netdata to '/usr/local/netdata/netdata' > 

 --- Run autotools to configure the build environment --- 
[/root/netdata]# ./autogen.sh 
autoreconf: Entering directory `.'
autoreconf: configure.ac: not using Gettext
autoreconf: running: aclocal --force -I m4
autoreconf: configure.ac: tracing
autoreconf: configure.ac: not using Libtool
autoreconf: running: /usr/bin/autoconf --force
autoreconf: running: /usr/bin/autoheader --force
autoreconf: running: automake --add-missing --copy --force-missing
autoreconf: Leaving directory `.'
 OK   

[/root/netdata]# ./configure --prefix=/usr/local/netdata/netdata/usr --sysconfdir=/usr/local/netdata/netdata/etc --localstatedir=/usr/local/netdata/netdata/var --with-zlib --with-math --with-user=netdata CFLAGS=-march=native\ -O2\ -msse3\ -fomit-frame-pointer\ -pipe 
checking whether to enable maintainer-specific portions of Makefiles... no
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /usr/bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking how to create a pax tar archive... gnutar
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking for style of include used by make... GNU
checking dependency style of gcc... gcc3
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking minix/config.h usability... no
checking minix/config.h presence... no
checking for minix/config.h... no
checking whether it is safe to define __EXTENSIONS__... yes
checking for __attribute__((returns_nonnull))... no
checking for __attribute__((malloc))... yes
checking for __attribute__((noreturn))... yes
checking for __attribute__((noinline))... yes
checking for __attribute__((format))... yes
checking for __attribute__((warn_unused_result))... yes
checking for struct timespec... yes
checking for clockid_t... yes
checking for library containing clock_gettime... none required
checking for clock_gettime... yes
checking for sched_setscheduler... yes
checking for sched_get_priority_min... yes
checking for sched_get_priority_max... yes
checking for nice... yes
checking for recvmmsg... yes
checking for int8_t... yes
checking for int16_t... yes
checking for int32_t... yes
checking for int64_t... yes
checking for uint8_t... yes
checking for uint16_t... yes
checking for uint32_t... yes
checking for uint64_t... yes
checking for inline... inline
checking whether strerror_r is declared... yes
checking for strerror_r... yes
checking whether strerror_r returns char *... yes
checking for _Generic... no
checking for __atomic... yes
checking size of void *... 8
checking whether sys/types.h defines makedev... yes
checking for sys/types.h... (cached) yes
checking for netinet/in.h... yes
checking for arpa/nameser.h... yes
checking for netdb.h... yes
checking for resolv.h... yes
checking for sys/prctl.h... yes
checking for linux/netfilter/nfnetlink_conntrack.h... yes
checking for accept4... yes
checking operating system... linux
checking if compiler needs -Werror to reject unknown flags... no
checking for the pthreads library -lpthreads... no
checking whether pthreads work without any flags... no
checking whether pthreads work with -Kthread... no
checking whether pthreads work with -kthread... no
checking for the pthreads library -llthread... no
checking whether pthreads work with -pthread... yes
checking for joinable pthread attribute... PTHREAD_CREATE_JOINABLE
checking if more special flags are required for pthreads... no
checking for PTHREAD_PRIO_INHERIT... yes
checking for sin in -lm... yes
checking if libm should be used... yes
checking for ZLIB... yes
checking if zlib should be used... yes
checking for UUID... yes
checking for memory allocator... system
checking for mallopt... yes
checking for mallinfo... yes
checking for LIBCAP... no
checking if libcap should be used... no
checking if apps.plugin should be enabled... yes
checking for IPMIMONITORING... yes
checking for
        ipmi_monitoring_sensor_readings_by_record_id,
        ipmi_monitoring_sensor_readings_by_sensor_type,
        ipmi_monitoring_sensor_read_sensor_number,
        ipmi_monitoring_sensor_read_sensor_name,
        ipmi_monitoring_sensor_read_sensor_state,
        ipmi_monitoring_sensor_read_sensor_units,
        ipmi_monitoring_sensor_iterator_next,
        ipmi_monitoring_ctx_sensor_config_file,
        ipmi_monitoring_ctx_sdr_cache_directory,
        ipmi_monitoring_ctx_errormsg,
        ipmi_monitoring_ctx_create
     in -lipmimonitoring... yes
checking ipmi_monitoring.h usability... yes
checking ipmi_monitoring.h presence... yes
checking for ipmi_monitoring.h... yes
checking ipmi_monitoring_bitmasks.h usability... yes
checking ipmi_monitoring_bitmasks.h presence... yes
checking for ipmi_monitoring_bitmasks.h... yes
checking if freeipmi.plugin should be enabled... yes
checking for NFACCT... no
checking for LIBMNL... no
checking if nfacct.plugin should be enabled... no
checking for setns... yes
checking if cgroup-network can be enabled... yes
checking whether C compiler accepts -flto... yes
checking if -flto builds executables... yes
checking if LTO should be enabled... yes
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating charts.d/Makefile
config.status: creating conf.d/Makefile
config.status: creating netdata.spec
config.status: creating python.d/Makefile
config.status: creating node.d/Makefile
config.status: creating plugins.d/Makefile
config.status: creating src/Makefile
config.status: creating system/Makefile
config.status: creating web/Makefile
config.status: creating diagrams/Makefile
config.status: creating makeself/Makefile
config.status: creating contrib/Makefile
config.status: creating tests/Makefile
config.status: creating config.h
config.status: config.h is unchanged
config.status: executing depfiles commands
 OK   

 --- Cleanup compilation directory --- 
 --- Compile netdata --- 
[/root/netdata]# make -j8 
make  all-recursive
make[1]: Entering directory `/root/netdata'
Making all in charts.d
make[2]: Entering directory `/root/netdata/charts.d'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/root/netdata/charts.d'
Making all in conf.d
make[2]: Entering directory `/root/netdata/conf.d'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/root/netdata/conf.d'
Making all in diagrams
make[2]: Entering directory `/root/netdata/diagrams'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/root/netdata/diagrams'
Making all in makeself
make[2]: Entering directory `/root/netdata/makeself'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/root/netdata/makeself'
Making all in node.d
make[2]: Entering directory `/root/netdata/node.d'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/root/netdata/node.d'
Making all in plugins.d
make[2]: Entering directory `/root/netdata/plugins.d'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/root/netdata/plugins.d'
Making all in python.d
make[2]: Entering directory `/root/netdata/python.d'
if sed \
        -e 's#[@]localstatedir_POST@#/usr/local/netdata/netdata/var#g' \
        -e 's#[@]sbindir_POST@#/usr/local/netdata/netdata/usr/sbin#g' \
        -e 's#[@]sysconfdir_POST@#/usr/local/netdata/netdata/etc#g' \
        -e 's#[@]pythondir_POST@#/usr/local/netdata/netdata/usr/libexec/netdata/python.d#g' \
        python-modules-installer.sh.in > python-modules-installer.sh.tmp; then \
        mv "python-modules-installer.sh.tmp" "python-modules-installer.sh"; \
else \
        rm -f "python-modules-installer.sh.tmp"; \
        false; \
fi
make[2]: Leaving directory `/root/netdata/python.d'
Making all in src
make[2]: Entering directory `/root/netdata/src'
gcc -DHAVE_CONFIG_H -I. -I..  -DVARLIB_DIR="\"/usr/local/netdata/netdata/var/lib/netdata\"" -DCACHE_DIR="\"/usr/local/netdata/netdata/var/cache/netdata\"" -DCONFIG_DIR="\"/usr/local/netdata/netdata/etc/netdata\"" -DLOG_DIR="\"/usr/local/netdata/netdata/var/log/netdata\"" -DPLUGINS_DIR="\"/usr/local/netdata/netdata/usr/libexec/netdata/plugins.d\"" -DRUN_DIR="\"/usr/local/netdata/netdata/var/run/netdata\"" -DWEB_DIR="\"/usr/local/netdata/netdata/usr/share/netdata/web\""          -march=native -O2 -msse3 -fomit-frame-pointer -pipe -pthread -flto -MT apps_plugin.o -MD -MP -MF .deps/apps_plugin.Tpo -c -o apps_plugin.o apps_plugin.c
make[2]: Leaving directory `/root/netdata'
make[1]: Leaving directory `/root/netdata'
 OK   

 --- Restore user edited netdata configuration files --- 
 --- Fix generated files permissions --- 
[/root/netdata]# find ./system/ -type f -a \! -name \*.in -a \! -name Makefile\* -a \! -name \*.conf -a \! -name \*.service -a \! -name \*.logrotate -exec chmod 755 \{\} \; 
 OK   

 --- Add user netdata to required user groups --- 
Adding netdata user group ...
[/root/netdata]# groupadd -r netdata 
 OK   

Adding netdata user account with home /usr/local/netdata/netdata ...
[/root/netdata]# useradd -r -g netdata -c netdata -s /usr/sbin/nologin --no-create-home -d /usr/local/netdata/netdata netdata 
 OK   

Group 'docker' does not exist.
Adding netdata user to the nginx group ...
[/root/netdata]# usermod -a -G nginx netdata 
 OK   

Group 'varnish' does not exist.
Adding netdata user to the haproxy group ...
[/root/netdata]# usermod -a -G haproxy netdata 
 OK   

Adding netdata user to the adm group ...
[/root/netdata]# usermod -a -G adm netdata 
 OK   

Group 'nsd' does not exist.
Group 'proxy' does not exist.
Group 'squid' does not exist.
Group 'ceph' does not exist.
 --- Install logrotate configuration for netdata --- 
[/root/netdata]# cp system/netdata.logrotate /etc/logrotate.d/netdata 
 OK   

[/root/netdata]# chmod 644 /etc/logrotate.d/netdata 
 OK   

 --- Read installation options from netdata.conf --- 

    Permissions
    - netdata user     : netdata
    - netdata group    : netdata
    - web files user   : netdata
    - web files group  : netdata
    - root user        : root

    Directories
    - netdata conf dir : /usr/local/netdata/netdata/etc/netdata
    - netdata log dir  : /usr/local/netdata/netdata/var/log/netdata
    - netdata run dir  : /usr/local/netdata/netdata/var/run
    - netdata lib dir  : /usr/local/netdata/netdata/var/lib/netdata
    - netdata web dir  : /usr/local/netdata/netdata/usr/share/netdata/web
    - netdata cache dir: /usr/local/netdata/netdata/var/cache/netdata

    Other
    - netdata port     : 19999

 --- Fix permissions of netdata directories (using user 'netdata') --- 
[/root/netdata]# mkdir -p /usr/local/netdata/netdata/var/run 
 OK   

[/root/netdata]# chown -R root:netdata /usr/local/netdata/netdata/etc/netdata 
 OK   

[/root/netdata]# find /usr/local/netdata/netdata/etc/netdata -type f -exec chmod 0640 \{\} \; 
 OK   

[/root/netdata]# find /usr/local/netdata/netdata/etc/netdata -type d -exec chmod 0755 \{\} \; 
 OK   

[/root/netdata]# chown -R netdata:netdata /usr/local/netdata/netdata/usr/share/netdata/web 
 OK   

[/root/netdata]# find /usr/local/netdata/netdata/usr/share/netdata/web -type f -exec chmod 0664 \{\} \; 
 OK   

[/root/netdata]# find /usr/local/netdata/netdata/usr/share/netdata/web -type d -exec chmod 0775 \{\} \; 
 OK   

[/root/netdata]# chown -R netdata:netdata /usr/local/netdata/netdata/var/lib/netdata 
 OK   

[/root/netdata]# chown -R netdata:netdata /usr/local/netdata/netdata/var/cache/netdata 
 OK   

[/root/netdata]# chown -R netdata:netdata /usr/local/netdata/netdata/var/log/netdata 
 OK   

[/root/netdata]# chmod 755 /usr/local/netdata/netdata/var/log/netdata 
 OK   

[/root/netdata]# chown netdata:root /usr/local/netdata/netdata/var/log/netdata 
 OK   

[/root/netdata]# chown -R root /usr/local/netdata/netdata/usr/libexec/netdata 
 OK   

[/root/netdata]# find /usr/local/netdata/netdata/usr/libexec/netdata -type d -exec chmod 0755 \{\} \; 
 OK   

[/root/netdata]# find /usr/local/netdata/netdata/usr/libexec/netdata -type f -exec chmod 0644 \{\} \; 
 OK   

[/root/netdata]# find /usr/local/netdata/netdata/usr/libexec/netdata -type f -a -name \*.plugin -exec chmod 0755 \{\} \; 
 OK   

[/root/netdata]# find /usr/local/netdata/netdata/usr/libexec/netdata -type f -a -name \*.sh -exec chmod 0755 \{\} \; 
 OK   

[/root/netdata]# chown root:netdata /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/apps.plugin 
 OK   

[/root/netdata]# chmod 0750 /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/apps.plugin 
 OK   

[/root/netdata]# setcap cap_dac_read_search\,cap_sys_ptrace+ep /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/apps.plugin 
 OK   

[/root/netdata]# chown root:netdata /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/freeipmi.plugin 
 OK   

[/root/netdata]# chmod 4750 /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/freeipmi.plugin 
 OK   

[/root/netdata]# chown root:netdata /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/cgroup-network 
 OK   

[/root/netdata]# chmod 4750 /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/cgroup-network 
 OK   

[/root/netdata]# chown root /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/cgroup-network-helper.sh 
 OK   

[/root/netdata]# chmod 0550 /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/cgroup-network-helper.sh 
 OK   

[/root/netdata]# chmod a+rX /usr/local/netdata/netdata/usr/libexec 
 OK   

[/root/netdata]# chmod a+rX /usr/local/netdata/netdata/usr/share/netdata 
 OK   

 --- Install netdata at system init --- 
Installing systemd service...
[/root/netdata]# cp system/netdata.service /etc/systemd/system/netdata.service 
 OK   

[/root/netdata]# systemctl daemon-reload 
 OK   

[/root/netdata]# systemctl enable netdata 
Created symlink from /etc/systemd/system/multi-user.target.wants/netdata.service to /etc/systemd/system/netdata.service.
 OK   

 --- Start netdata --- 
[/root/netdata]# /usr/bin/systemctl stop netdata 
 OK   

[/root/netdata]# /usr/bin/systemctl restart netdata 
 OK   

OK. NetData Started!


-------------------------------------------------------------------------------

Downloading default configuration from netdata...
[/root/netdata]# curl -s -o /usr/local/netdata/netdata/etc/netdata/netdata.conf.new http://localhost:19999/netdata.conf 
 OK   

[/root/netdata]# mv /usr/local/netdata/netdata/etc/netdata/netdata.conf.new /usr/local/netdata/netdata/etc/netdata/netdata.conf 
 OK   

 OK  New configuration saved for you to edit at /usr/local/netdata/netdata/etc/netdata/netdata.conf 

[/root/netdata]# chown netdata /usr/local/netdata/netdata/etc/netdata/netdata.conf 
 OK   

[/root/netdata]# chmod 0664 /usr/local/netdata/netdata/etc/netdata/netdata.conf 
 OK   

 --- Check KSM (kernel memory deduper) --- 
 --- Check version.txt --- 
 --- Check apps.plugin --- 
 --- Generate netdata-uninstaller.sh --- 
 --- Basic netdata instructions --- 

netdata by default listens on all IPs on port 19999,
so you can access it with:

  http://this.machine.ip:19999/

To stop netdata run:

  systemctl stop netdata

To start netdata run:

  systemctl start netdata


Uninstall script generated: ./netdata-uninstaller.sh
Update script generated   : ./netdata-updater.sh

netdata-updater.sh can work from cron. It will trigger an email from cron
only if it fails (it does not print anything when it can update netdata).
Run this to automatically check and install netdata updates once per day:

sudo ln -s /root/netdata/netdata-updater.sh /etc/cron.daily/netdata-updater

 --- We are done! --- 

  ^
  |.-.   .-.   .-.   .-.   .-.   .  netdata                          .-.   .-
  |   '-'   '-'   '-'   '-'   '-'   is installed and running now!  -'   '-'  
  +----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+--->

  enjoy real-time performance and health monitoring...

mysql slave reset and fixing relay log read failure

Suddenly your slave server was reset without a clean shutdown, and when it came up again you saw an error of this kind:

2016-02-26 10:41:50 876 [ERROR] Slave SQL: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Error_code: 1594
2016-02-26 10:41:50 876 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.005014' position 152146793

So here is what we know: the master server is OK and the slave server was reset by an unknown issue, so the problem is only in our slave's logs. The MySQL server shows the replication status with:

mysql> SHOW SLAVE STATUS\G

There are multiple lines of information, but the most important ones in our situation are these two lines:

Relay_Master_Log_File: mysql-bin.005014
  Exec_Master_Log_Pos: 152146793

This is the place where the slave server stopped (as you can see from the logs above, newer versions of MySQL print these two values in the error log, but older versions do not, so check them with the above command!).
The slave server stopped at file mysql-bin.005014, position 152146793, and could not continue because its relay log files are corrupted. We can reset the position by issuing a CHANGE MASTER command, which will clean up the relay logs, and the slave will resume the replication from this position – no data will be lost. Before issuing the following commands, save the relay log files; they can be useful if you hit errors later (a minimal way to save them is sketched below). Here is the command:

STOP SLAVE;              
CHANGE MASTER TO
         MASTER_HOST='1.1.1.1',
         MASTER_USER='replusr',
         MASTER_LOG_FILE='mysql-bin.005014',
         MASTER_LOG_POS=152146793;
START SLAVE;

The three commands above:

  • Stop the replication on the slave, because it is still running and the slave is still logging the binary log received from the master
  • Issue the CHANGE MASTER command to reset the relay logs to the right position
  • Start the replication on the slave

The replication must continue without errors!
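
As noted above, it is a good idea to save the relay log files before issuing the CHANGE MASTER command, because CHANGE MASTER deletes them. A minimal sketch, assuming the default data directory /var/lib/mysql and the default relay log names (the backup path is just an example, adjust to your setup):

# back up the relay logs (and the relay log info file, if your setup uses one) before they are purged
mkdir -p /root/relay-logs-backup
cp -a /var/lib/mysql/mysqld-relay-bin.* /root/relay-logs-backup/
cp -a /var/lib/mysql/relay-log.info /root/relay-logs-backup/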

In some cases, after we issue the above commands and the replication starts, it immediately stops with a "Duplicate entry" error:

Last_SQL_Errno: 1062
Last_SQL_Error: Error 'Duplicate entry '3918722' for key 'PRIMARY'' on query. Default database: 'testdb'. Query: 'INSERT INTO `testtable` (`tabid`, `tabip`, `stat`, `ins`) VALUES ('83908', '2591777309', '1', NOW())'

So we did everything right, but our replication is broken again? The problem is that with such a reset it can happen that the auto-increment value of the table was reserved but not used, because the server was reset right in the middle of the insert operation; or the row was inserted properly, but the server was reset in the middle of updating the replication metadata! So you have two options:

  • Change the auto-increment value of the table if there is no record with the ID of the duplicate entry; check it with a select:
    SELECT * FROM testdb.testtable WHERE id=[ID_FROM_THE_ERROR];
    

    If there is no ID with such a value, change the auto-increment of the table with

    ALTER TABLE tbl AUTO_INCREMENT = [ID_FROM_THE_ERROR];
    
  • Skip the duplicate entry query with
    STOP SLAVE;
    SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1;
    START SLAVE;
    

    or, for parallel (multi-threaded) replication, use

    STOP SLAVE;
    START SLAVE UNTIL sql_after_mts_gaps;
    SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1;
    START SLAVE;
    

You can trace the problem by reading the relay logs at the position where the slave stopped.
Often there is an issue with the last recorded position, so you should examine why you have a duplicate entry. Check whether the entry was actually inserted and, if it was, just skip it! But if you then hit another duplicate entry or another error, you should reinitialize the slave by dumping the replicated databases from the master!
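
One way to inspect the relay log is with mysqlbinlog, which decodes a relay log just like a binary log; a minimal sketch, assuming the default data directory /var/lib/mysql and the relay log file name from the status output below (on a corrupted relay log, mysqlbinlog itself will stop with a parse error at the broken event):

mysqlbinlog --verbose /var/lib/mysql/mysqld-relay-bin.009911 | less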

Here is the full log of status command, when there is a problem with the corrupted mysql relay logs:

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 1.1.1.1
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.005014
          Read_Master_Log_Pos: 246696051
               Relay_Log_File: mysqld-relay-bin.009911
                Relay_Log_Pos: 152146956
        Relay_Master_Log_File: mysql-bin.005014
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1594
                   Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 152146793
              Relay_Log_Space: 246698113
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1594
               Last_SQL_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 2
                  Master_UUID: ce8a6c29-cf8e-11e5-9d39-000000000001
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: 
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 180226 11:54:51
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
1 row in set (0.00 sec)

Create a simple spamassassin rule to catch words

It is not often that we need to write custom rules to fight spam, but sometimes we do, because a spammer targets specifically our server or clients. If you use SpamAssassin, here is what you can do to create a simple rule that finds words and rates the message with a desired score, which will (probably) mark it as spam.
The template is as follows:

  • header search; the example template is for the Subject header, but you could use any other header name.
    header <RULENAME> Subject =~ /word1, word2, word3, ..., wordN/
    score <RULENAME> <score>
    describe <RULENAME> <description>
    
  • body search
    body <RULENAME> /word1, word2, word3, ..., wordN/
    score <RULENAME> <score>
    describe <RULENAME> <description>
    

Put these 3 lines (or all 6 above, for both the header and the body) in your configuration file, which is probably one of:

  • /etc/mail/spamassassin/local.cf – CentOS 7
  • /etc/spamassassin/ – Ubuntu 16/17, Gentoo
  • ~/.spamassassin/user_prefs – custom file per user

Here is an example of the rules:

header CONTAINS_VIG Subject =~ /apple, orange/
score CONTAINS_VIG 1.5
describe CONTAINS_VIG Bad Word fruits in the Subject
body CONTAINS_PEN /apple, orange/
score CONTAINS_PEN 1.5
describe CONTAINS_PEN Bad Word in the Body

These rules catch messages containing "apple, orange" in the Subject and in the body and add 1.5 to the score. For your purposes you may need to increase the score drastically; it depends on your required spam threshold (check for it in local.cf).
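
After editing the configuration you can verify that the rule syntax is valid with SpamAssassin's lint mode – any warnings (and a non-zero exit code) indicate a broken rule:

spamassassin --lint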

* Update

As Rob Morin proposed in the comments, it is a good idea to add "/i" to match both lower-case and capital letters ("ignore case"), like this:

header CONTAINS_VIG Subject =~ /apple, orange/i
score CONTAINS_VIG 1.5
describe CONTAINS_VIG Bad Word fruits in the Subject
body CONTAINS_PEN /apple, orange/i
score CONTAINS_PEN 1.5
describe CONTAINS_PEN Bad Word in the Body
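
Note that a Perl-style regular expression like /apple, orange/i matches the literal string "apple, orange". If the goal is to match any one of several words on its own, alternation is needed; a hedged sketch of the same rules rewritten that way:

header CONTAINS_VIG Subject =~ /\b(apple|orange)\b/i
score CONTAINS_VIG 1.5
describe CONTAINS_VIG Bad Word fruits in the Subject
body CONTAINS_PEN /\b(apple|orange)\b/i
score CONTAINS_PEN 1.5
describe CONTAINS_PEN Bad Word in the Body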

bacula fatal error – Unable to connect to Storage daemon

Bacula is an open source enterprise backup system! Check out the official site here.
It is complex but useful software, which can automate the whole backup process of all your servers.
Some errors are easy to track and some are not, so here is one error with a misleading message if you do not know or have forgotten the details of how the daemons work.

Here is the error extracted from the logs:

01-Sep 00:45 backup01-de-dir JobId 8789: No prior Full backup Job record found.
01-Sep 00:45 backup01-de-dir JobId 8789: No prior or suitable Full backup found in catalog. Doing FULL backup.
01-Sep 00:45 backup01-de-dir JobId 8789: Job srv123us.2017-09-01_00.45.28_34 waiting 103 seconds for scheduled start time.
01-Sep 00:47 backup01-de-dir JobId 8789: Start Backup JobId 8789, Job=srv123us.2017-09-01_00.45.28_34
01-Sep 00:47 backup01-de-dir JobId 8789: Using Device "web" to write.
01-Sep 00:51 srv123us-fd JobId 8789: Warning: bsock.c:112 Could not connect to Storage daemon on 1.1.1.1:9103. ERR=Connection timed out
01-Sep 01:17 srv123us-fd JobId 8789: Fatal error: bsock.c:118 Unable to connect to Storage daemon on 1.1.1.1:9103. ERR=Interrupted system call
01-Sep 01:17 srv123us-fd JobId 8789: Fatal error: job.c:1893 Failed to connect to Storage daemon: 1.1.1.1:9103
01-Sep 01:17 backup01-de-dir JobId 8789: Fatal error: Bad response to Storage command: wanted 2000 OK storage
01-Sep 01:17 backup01-de-dir JobId 8789: Error: Bacula backup01-de-dir 7.0.5 (28Jul14):
 Build OS:               x86_64-pc-linux-gnu ubuntu 16.04
  JobId:                  8789
  Job:                    srv123us.2017-09-01_00.45.28_34
  Backup Level:           Full (upgraded from Incremental)
  Client:                 "srv123us" 7.0.5 (28Jul14) x86_64-pc-linux-gnu,ubuntu,16.04
  FileSet:                "web" 2017-11-07 17:19:45
  Pool:                   "web-full" (From Job FullPool override)
  Catalog:                "ucdn" (From Client resource)
  Storage:                "web" (From Job resource)
  Scheduled time:         01-Sep-2018 00:47:11
  Start time:             01-Sep-2018 00:47:11
  End time:               01-Sep-2018 01:17:23
  Elapsed time:           30 mins 12 secs
  Priority:               10
  FD Files Written:       0
  SD Files Written:       0
  FD Bytes Written:       0 (0 B)
  SD Bytes Written:       0 (0 B)
  Rate:                   0.0 KB/s
  Software Compression:   None
  VSS:                    no
  Encryption:             no
  Accurate:               no
  Volume name(s):         
  Volume Session Id:      4719
  Volume Session Time:    1510075534
  Last Volume Bytes:      0 (0 B)
  Non-fatal FD errors:    2
  SD Errors:              0
  FD termination status:  Error
  SD termination status:  Waiting on FD
  Termination:            *** Backup Error ***

But when we check the status of the client from "bconsole" (Bacula's management console), everything seems OK: the backup server (Director daemon = bacula-dir) connects and gets the report from the client daemon (Bacula File service = bacula-fd) on the server. Even when you run a backup job the status report is OK and the backup is running on the client. Here is the output:

srv@local ~ # bconsole
Connecting to Director localhost:9101
1000 OK: 1 backup01-de-dir Version: 7.0.5 (28 July 2014)
Enter a period to cancel a command.
*status
Status available for:
     1: Director
     2: Storage
     3: Client
     4: Scheduled
     5: All
Select daemon type for status (1-5): 3
The defined Client resources are:
     1: srv1us
     2: srv2us
     3: srv123us
Select Client (File daemon) resource (1-3): 3
Connecting to Client srv123us at 108.61.250.36:9102
srv123us-fd Version: 7.0.5 (28 July 2014)  x86_64-pc-linux-gnu ubuntu 16.04
Daemon started 23-Feb-17 00:43. Jobs: run=1 running=0.
 Heap: heap=98,304 smbytes=571,344 max_bytes=571,361 bufs=97 max_bufs=97
 Sizes: boffset_t=8 size_t=8 debug=0 trace=0 mode=0,0 bwlimit=0kB/s
 Plugin: bpipe-fd.so 

Running Jobs:
JobId 8789 Job srv123us.2017-09-01_00.45.28_34 is running.
    Incremental Backup Job started: 01-Sep-17 00:45
    Files=0 Bytes=0 AveBytes/sec=0 LastBytes/sec=0 Errors=0
    Bwlimit=0
    Files: Examined=5 Backed up=0
    SDReadSeqNo=6 fd=5
Director connected at: 01-Sep-17 01:10
====

Terminated Jobs:
====

As you can see, everything seems OK in the status output: there was a running job on the client server and it seemed the backup process had been running without errors for more than 20 minutes, but then it suddenly got a fatal error (the first log):

01-Sep 00:51 srv123us-fd JobId 8789: Warning: bsock.c:112 Could not connect to Storage daemon on 1.1.1.1:9103. ERR=Connection timed out
01-Sep 01:17 srv123us-fd JobId 8789: Fatal error: bsock.c:118 Unable to connect to Storage daemon on 1.1.1.1:9103. ERR=Interrupted system call
01-Sep 01:17 srv123us-fd JobId 8789: Fatal error: job.c:1893 Failed to connect to Storage daemon: 1.1.1.1:9103
01-Sep 01:17 backup01-de-dir JobId 8789: Fatal error: Bad response to Storage command: wanted 2000 OK storage

And the problem is that the Director (backup server) can connect to the File service of the client (the daemon on the client), but the opposite connection is not possible! When the backup job starts, the client daemon (Bacula File service) connects to the Bacula Storage service (which could be on the same server as the Director, or on another server) to send the backup files, and here is the problem: the client could not connect to the storage! So always check the connections in both directions: backup server -> client server, port 9102, and client -> backup (or storage) server, port 9103.
In the world of bacula:

bacula-dir -> bacula-fd:9102

bacula-fd -> bacula-sd:9103

The error is misleading: at a casual look it seems like bacula-sd is returning an error to bacula-fd (which would mean that bacula-fd could connect to bacula-sd after all), but in reality bacula-dir received and logged the fact that bacula-fd did not connect to bacula-sd, resulting in the fatal error.

In our situation the firewall of the backup server was denying the connections from the client, but it could also be a DNS resolution issue or another network problem (firewall and DNS problems are the most common). The solution – just add an accept rule for the IP of the client to connect to port 9103 of the backup (storage) server.
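
To confirm the direction of the failure and open the port, you can test from the client and then adjust the firewall on the storage server; a minimal sketch, assuming iptables, the storage IP 1.1.1.1 from the logs above and a hypothetical client IP 2.2.2.2:

# from the client server: is the Storage daemon reachable on port 9103?
telnet 1.1.1.1 9103
# on the backup (storage) server: accept the client's connections to bacula-sd
iptables -I INPUT -s 2.2.2.2 -p tcp --dport 9103 -j ACCEPT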

SUPERMICRO IPMI/KVM module tips – reset the unit and the admin password

After the previous howto, "SUPERMICRO IPMI to use one of the one interfaces or dedicated LAN port" (which shows how to install the tool needed for managing the IPMI/KVM unit from the console), and its network configuration steps, here are a couple of interesting and important tips for working with the IPMI/KVM module:

  1. Reset the IPMI/KVM module – sometimes the keyboard or the mouse does not work when the Console Redirection is loaded. It is easy to reset the unit from the web interface, but there are cases when the web interface is not working – so SSH to your server and try one of the following commands:
    * warm reset – it's like a reboot; it informs the IPMI/KVM to reboot itself.

    ipmitool -I open bmc reset warm
    

    It does not work in all situations, so then try a cold reset.
    * cold reset – resets the IPMI/KVM; it's like unplugging and replugging the power to the unit.

    ipmitool -I open bmc reset cold
    
  2. Reset the configuration of an IPMI/KVM module to factory defaults. This is useful when something goes wrong while upgrading the firmware of the unit and the old configuration is not supported (or it says it is, but in the end the unit does not work properly). In rare cases it might help when the KVM part (Keyboard, Video, Mouse, aka Console Redirection) does not work.
    Here is the command for resetting to factory defaults:

    ipmitool -I open raw 0x3c 0x40
    
  3. Reset the admin password – reset the password for the administrator login of the IPMI/KVM unit. It is easy to lose the password, so with the help of the local console on the server you can reset it to a simple one and then change it from the web interface.
    ipmitool -I open user set password 2 ADMIN
    

    The number “2” is the ID of the user, check it with:

    [root@srv0 ~]# ipmitool -I open user list
    ID  Name             Callin  Link Auth  IPMI Msg   Channel Priv Limit
    1                    true    false      false      Unknown (0x00)
    2   ADMIN            true    false      false      Unknown (0x00)
    3                    true    false      false      Unknown (0x00)
    4                    true    false      false      Unknown (0x00)
    5                    true    false      false      Unknown (0x00)
    6                    true    false      false      Unknown (0x00)
    7                    true    false      false      Unknown (0x00)
    8                    true    false      false      Unknown (0x00)
    9                    true    false      false      Unknown (0x00)
    10                   true    false      false      Unknown (0x00)
    

    If a hacker got into your IPMI/KVM, you may be able to spot it in the user table printed by the above command. There was a serious bug (effectively a backdoor) in some of these units; the ID of the ADMIN user or even the username could have been changed, so you should use the list command to check the current user table.
    Use set name to set the username of the user.

    ipmitool -I open user set name 2 ADMIN
    
  4. Set a new network configuration. It's worth mentioning again the howto for this purpose – "SUPERMICRO IPMI to use one of the one interfaces or dedicated LAN port".

All the commands above, using the network option of ipmitool:

ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN bmc reset warm
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN bmc reset cold
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN raw 0x3c 0x40
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN user set password 2 ADMIN
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN user list

The IP 192.168.7.150 is the IP of the IPMI/KVM module you want to manage with the above commands.

Tunneling the IPMI/KVM ports over ssh (supermicro ipmi ports)

The best security for a remote management unit in your server, such as an IPMI/KVM module, is to give it a local IP. All the IPMI/KVM IPs should be switched to a separate switch, with a local sub-network used for the LAN settings. So, to be able to connect to the IPMI/KVM module, you need a VPN connection to gain access to the local sub-network used for your servers' management modules. However, sometimes the VPN cannot be used: it just happened that the VPN server is down, or you are at a place restricting unknown ports (or ports above 1024) that your VPN uses (that's why the VPN server should use only one of the most popular ports – 80 or 443 – but that's a topic for another howto…), and so on. So you end up with no ability to connect to the VPN server, or you think you do not need a VPN server at all, because you can always use

openssh

to do the trick of tunneling ports from your computer to the IPMI/KVM module of your server through a server that has access to the local sub-network of the IPMI/KVM modules.

So here is what you need to get to the remote management of your server just using ssh for tunneling:

STEP 1) A server that has access to the IP network of the IPMI/KVM modules.

Let's say you assigned all your servers' IPMI/KVM modules IPs from the network 192.168.7.0/24; then your relay server must have an IP from 192.168.7.0/24, for example 192.168.7.1. Add it as an alias or on a dedicated LAN interface connected to the switch into which all your IPMI/KVM modules are plugged. This server will be used as a transfer point to a selected IPMI/KVM IP.

STEP 2) Tunnel local selected ports using ssh to the server from STEP 1)

Use this command:

ssh -N -L 127.0.0.1:80:[IPMI-IP]:80 -L 127.0.0.1:443:[IPMI-IP]:443 -L 127.0.0.1:5900:[IPMI-IP]:5900 -L 127.0.0.1:623:[IPMI-IP]:623 root@[SERVER-IP]

For example using 192.168.7.150 for an IPMI/KVM IP:

[root@srv0 ~]# ssh -N -L 127.0.0.1:80:192.168.7.150:80 -L 127.0.0.1:443:192.168.7.150:443 -L 127.0.0.1:5900:192.168.7.150:5900 -L 127.0.0.1:623:192.168.7.150:623 root@example-server.com

With the above command you can use the web interface (https://127.0.0.1/ – you could replace 127.0.0.1 with a local IP or a local IP alias of your machine), the Java Web Start "Console Redirection" (the KVM – Keyboard, Video and Mouse), and you can mount Virtual Media from your computer into your server's virtual CD/DVD device. Unfortunately, to use the Virtual CD/DVD properly you must also tunnel UDP on port 623 (not only TCP 623), which is a little bit tricky. To tunnel the UDP packets, the

socat – Multipurpose relay (SOcket CAT)

program must be used.

STEP 3) Tunnel local selected ports using ssh to the server from STEP 1) and UDP port using socat

[root@srv0 ~]# socat -T15 udp4-recvfrom:623,reuseaddr,fork tcp:localhost:8000
[root@srv0 ~]# ssh -L8000:localhost:8000 -L 127.0.0.1:80:192.168.7.150:80 -L 127.0.0.1:443:192.168.7.150:443 -L 127.0.0.1:5900:192.168.7.150:5900 -L 127.0.0.1:623:192.168.7.150:623 root@example-server.com socat tcp4-listen:8000,reuseaddr,fork UDP:192.168.7.150:623

The first socat command starts a UDP listening socket on local port 623. Every packet received there is relayed over TCP to localhost port 8000, which is tunneled by the ssh command to the remote server, where another socat listens on TCP port 8000 and relays every packet to UDP port 623 of IP 192.168.7.150. Replace the IP 192.168.7.150 with your IPMI/KVM IP.
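
Once both commands are running, you can verify the UDP path, because the lanplus interface of ipmitool talks to UDP port 623; a minimal sketch, assuming ipmitool is installed on your computer and the unit still uses the ADMIN/ADMIN credentials:

# goes through the local UDP 623 socket, the ssh tunnel and the remote socat to the module
ipmitool -I lanplus -H 127.0.0.1 -U ADMIN -P ADMIN chassis status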

* Here are the required ports for SUPERMICRO IPMI functionality in X9 and X10 motherboards

  • For X9 motherboards, the ports are

    TCP Ports
    HTTP: 80
    HTTPS: 443
    SSH: 22
    WSMAN: 5985
    Video: 5901
    KVM: 5900
    CD/USB: 5120
    Floppy: 5123
    Virtual Media: 623
    SNMP: 161

    UDP ports:
    IPMI: 623

  • For X10 motherboards, the ports are

    TCP Ports
    HTTP: 80
    HTTPS: 443
    SSH: 22
    WSMAN: 5985
    Video: 5901
    KVM: 5900 , 3520
    CD/USB: 5120
    Floppy: 5123
    Virtual Media: 623
    SNMP: 161

    UDP ports:
    IPMI: 623

You could add the required port to the ssh command above if you need it!
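
For example, to also forward the CD/USB port 5120 and the floppy port 5123 from the tables above, append these options to the ssh command from STEP 2:

-L 127.0.0.1:5120:[IPMI-IP]:5120 -L 127.0.0.1:5123:[IPMI-IP]:5123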

Virtual Device mounted successfully

Successful mount in Console Redirection with Virtual Media:

main menu
Virtual Storage

If you are logged in to the server and mount an ISO with the Virtual Device, you'll probably see this in "dmesg":

[46683751.661063] usb 2-1.3.2: new high-speed USB device number 8 using ehci-pci
[46683751.795048] usb 2-1.3.2: New USB device found, idVendor=0ea0, idProduct=1111
[46683751.795051] usb 2-1.3.2: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[46683751.795365] usb-storage 2-1.3.2:1.0: USB Mass Storage device detected
[46683751.795553] scsi6 : usb-storage 2-1.3.2:1.0
[46683752.795730] scsi 6:0:0:0: CD-ROM            ATEN     Virtual CDROM    YS0J PQ: 0 ANSI: 0 CCS
[46683752.806839] sr0: scsi3-mmc drive: 40x/40x cd/rw xa/form2 cdda tray
[46683752.806842] cdrom: Uniform CD-ROM driver Revision: 3.20
[46683752.806933] sr 6:0:0:0: Attached scsi CD-ROM sr0
[46683752.806971] sr 6:0:0:0: Attached scsi generic sg1 type 5

Set IP to the IPMI/KVM server module with ipmitool

The IPMI/KVM module is a pretty useful add-on for every server. In fact, every server should have an IPMI module installed for fast management in critical cases!
Here are the commands to set a static IP to the IPMI/KVM module with ipmitool using a console to the server:

ipmitool -I open lan set 1 ipsrc static
ipmitool -I open lan set 1 ipaddr [IPADDR]
ipmitool -I open lan set 1 netmask [NETMASK]
ipmitool -I open lan set 1 defgw ipaddr [GW IPADDR]
ipmitool -I open lan set 1 access on
  • [IPADDR] – the IP address of the IPMI/KVM
  • [NETMASK] – the netmask of the network
  • [GW IPADDR] – the gateway of the network

Here is a real-world example of properly setting the LAN configuration of the IPMI module:

[root@srv0 ~]# ipmitool -I open lan set 1 ipsrc static
[root@srv0 ~]# ipmitool -I open lan set 1 ipaddr 192.168.6.45
Setting LAN IP Address to 192.168.6.45
[root@srv0 ~]# ipmitool -I open lan set 1 netmask 255.255.255.0
Setting LAN Subnet Mask to 255.255.255.0
[root@srv0 ~]# ipmitool -I open lan set 1 defgw ipaddr 192.168.6.1
Setting LAN Default Gateway IP to 192.168.6.1
[root@srv0 ~]# ipmitool -I open lan set 1 access on
Set Channel Access for channel 1 was successful.
[root@srv0 ~]#

To see the current settings use:

[root@srv0 ~]# ipmitool -I open lan print
Set in Progress         : Set Complete
Auth Type Support       : NONE MD2 MD5 PASSWORD 
Auth Type Enable        : Callback : MD2 MD5 PASSWORD 
                        : User     : MD2 MD5 PASSWORD 
                        : Operator : MD2 MD5 PASSWORD 
                        : Admin    : MD2 MD5 PASSWORD 
                        : OEM      : MD2 MD5 PASSWORD 
IP Address Source       : Static Address
IP Address              : 192.168.6.45
Subnet Mask             : 255.255.255.0
MAC Address             : 00:25:90:18:8b:c9
SNMP Community String   : public
IP Header               : TTL=0x00 Flags=0x00 Precedence=0x00 TOS=0x00
BMC ARP Control         : ARP Responses Enabled, Gratuitous ARP Disabled
Default Gateway IP      : 192.168.6.1
Default Gateway MAC     : 00:00:00:00:00:00
Backup Gateway IP       : 0.0.0.0
Backup Gateway MAC      : 00:00:00:00:00:00
802.1q VLAN ID          : Disabled
802.1q VLAN Priority    : 0
RMCP+ Cipher Suites     : 1,2,3,6,7,8,11,12
Cipher Suite Priv Max   : aaaaXXaaaXXaaXX
                        :     X=Cipher Suite Unused
                        :     c=CALLBACK
                        :     u=USER
                        :     o=OPERATOR
                        :     a=ADMIN
                        :     O=OEM
Bad Password Threshold  : Not Available

* Dependencies

Installation of ipmitool:

  • CentOS 7
    yum -y install ipmitool
    
  • Ubuntu 16+
    apt-get install ipmitool
    
  • Gentoo
    emerge -vu sys-apps/ipmitool
    

* Troubleshooting

If you receive errors when you execute ipmitool:

[root@srv0 ~]# ipmitool -I open lan set 1 ipaddr 192.168.6.45
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
[root@srv0 ~]# ipmitool -I open lan set 1 netmask 255.255.255.0
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
[root@srv0 ~]# ipmitool -I open lan set 1 defgw ipaddr 192.168.6.1
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory

The kernel module for the IPMI/KVM is not loaded by the system, so just execute:

[root@srv0 ~]# modprobe ipmi_si
[root@srv0 ~]# modprobe ipmi_devintf

And then you can use the ipmitool commands above to set the network configuration of the IPMI/KVM add-on module.
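
Note that modprobe loads the modules only until the next reboot. On a systemd-based distribution (such as CentOS 7) you could make them load at boot with a modules-load.d file; a minimal sketch, the file name is an assumption:

# load the IPMI kernel modules automatically at boot
cat > /etc/modules-load.d/ipmi.conf <<EOF
ipmi_si
ipmi_devintf
EOF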

megacli – restart a rebuild with a disk in failed state

Sometimes we need to start a rebuild with a disk in failed state when using an LSI hardware controller, but if we just set the failed disk back to good state, it will return immediately to the array and our filesystem will be broken for sure! In addition, it can happen that when we replace a disk, the new disk is in failed state, too.

So here are simple and tested steps for properly resetting a disk's failed state to good state and starting a rebuild. In the example below the disk in failed state is [32:1]; replace it with the proper [enclosure_id:slot_id] in your case.

  1. Turn the "Failed State" into "Unconfigured(BAD)"
    megacli -pdmarkmissing -physdrv[32:1] -aAll
    
  2. Prepare for removal (this command could fail, not a critical one)
    megacli -pdprprmv -physdrv[32:1] -a0
    
  3. Make the state of the disk “Unconfigured(Good), Spun Up”
    megacli -PDMakeGood -PhysDrv[32:1] -a0
    
  4. Start the rebuild (this command could fail) – if it fails, continue with the next step; if not, the rebuild has been restarted successfully.
    megacli -PDRbld -Start -PhysDrv[32:1] -a0
    

    Or

    megacli -pdlocate -start -physdrv[32:1] -a0
    

    One of the two commands will probably start the rebuild, but if both fail, continue to the next step.

  5. Start the rebuild: first clean the foreign configuration and then make the device a hot spare (only if the commands in step 4 above failed)
    megacli -CfgForeign -Clear -aALL
    #set global hotspare
    megacli -PDHSP -Set -PhysDrv [32:1] -a0
    

* If you need to unset/remove a global hotspare:

megacli -PDHSP -Rmv -PhysDrv [32:1] -aN
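
Once the rebuild is running, you can check its progress for the same drive with MegaCli's rebuild progress display; a minimal sketch with the [32:1] disk from the example above:

megacli -PDRbld -ShowProg -PhysDrv [32:1] -a0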