This is a basic review of the new Eclipse Angular IDE: what the basic functionality of the IDE is and how we can work with it. The main purpose is to show what it looks like to create an Angular project.
Bacula – show configuration, status and information with the bconsole tool
The following list of commands can be used to get a brief or detailed view of a Bacula backup server from the management utility
bconsole
. These commands are extremely useful for getting information on the backup process and policy, and for Bacula troubleshooting; they can be used for fast debugging of errors, problems or misconfiguration.
The following commands give information for:
Jobs
list jobtotals – lists statistics for all jobs; it also shows every job's name.
show jobs – lists all jobs with their full configurations. The detailed output includes the full configuration of each job: client, catalog, fileset, schedule, pool and messages. It shows all the relationships between the different components of the Bacula system – which clients, storages, pools, schedules and filesets relate to each other – so you get a thorough view of how, say, a given server is backed up.
show job=[job_name] – shows the full configuration of a single job; the name can be taken from the two commands above.
list jobs – lists every job's status: ID, StartTime, Type (backup?), Level (full, incremental or differential?), files and bytes processed, and the status of the job (terminated normally, running, fatal error and so on).
list files jobid=[ID] – which files were included in the backup? Lists all paths and files included in the backup – not a configuration set, but the real physical paths and file names.
status dir – shows the status of the Director process; use it to find all jobs scheduled for the next day (or more, if you add a parameter).
Storages
status storage – lists the storage devices and their status; you can see the physical path on the filesystem where the Devices will put the backup files.
Clients
show client – shows all clients' names and their backup policies.
status client=[client_name] – shows a client's status and what it is doing, checks the network connection between the Director and the client, and lists the last terminated jobs and their status.
Filesets
show fileset – shows all filesets; a fileset is a set of files and directories to include in or exclude from a backup.
Schedule
show schedule – shows all registered schedules and the details for each one (run level: Full, Differential or Incremental; months, days, minutes).
show schedule=[schedule_name] – shows the details for the schedule named schedule_name. It is effectively the backup plan of a server.
Director
message – shows the last messages of the backup processes. If it is empty, all logs of the backup processes can be found in “/var/log/bacula/bacula.log”.
reload – the Director re-reads all of its configuration files. Use it after adding or changing configuration files.
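All of these commands can also be fed to bconsole non-interactively, which is handy in scripts and cron jobs. A minimal sketch, assuming bconsole is installed and configured on the Director host (bconsole_batch is a hypothetical helper name, not part of Bacula):

```shell
# Build the input for a non-interactive bconsole run: the command itself
# followed by "quit" so bconsole exits instead of waiting at the prompt.
bconsole_batch() {
    printf '%s\nquit\n' "$1"
}

# Typical use (assumes bconsole is on PATH and can reach the Director):
#   bconsole_batch "list jobtotals" | bconsole
#   bconsole_batch "status dir" | bconsole
```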
And here is example output of the above commands, with a little explanation:
Jobs
1) Get the statistics and the job names; the names can be used in many other commands:
srvbkp@local # bconsole
Connecting to Director localhost:9101
1000 OK: 1 srvbkp-dir Version: 7.0.5 (28 July 2014)
Enter a period to cancel a command.
*list jobtotals
Automatically selected Catalog: allbackup
Using Catalog "allbackup"
+------+-----------+-------------------+--------------------------------+
| Jobs | Files     | Bytes             | Job                            |
+------+-----------+-------------------+--------------------------------+
|   90 |        90 |   123,665,584,337 | BackupCatalog                  |
|    5 |         5 |   281,593,737,603 | RestoreFiles                   |
|   13 | 1,232,316 |   118,480,634,434 | srv1-media                     |
|   32 |        12 |             3,674 | srv1-dns                       |
|   32 |        10 |             3,064 | srv2-dns                       |
|   32 |        10 |             3,064 | srv3-dns                       |
|   32 |        10 |             3,086 | srv4-dns                       |
|   32 |        10 |             3,084 | srv5-dns                       |
|   26 | 3,837,536 |   587,812,183,466 | srv1-images                    |
+------+-----------+-------------------+--------------------------------+
+-------+------------+-------------------+
| Jobs  | Files      | Bytes             |
+-------+------------+-------------------+
| 1,474 | 14,925,321 | 5,475,024,028,957 |
+-------+------------+-------------------+
*
2) Get the configuration of every job; only two are included here for clarity. This is all the information needed to back up a server: you can see which files will be included (or excluded), where the backup will be stored, when it will run, and which types of backup will be made – full, incremental and differential. All of this information is shown for every client (server).
*show jobs Job: name=srv1-dns JobType=66 level= Priority=10 Enabled=1 MaxJobs=1 Resched=0 Times=0 Interval=1,800 Spool=0 WritePartAfterJob=1 Accurate=0 --> Client: name=srv1-dns address=192.168.0.100 FDport=9102 MaxJobs=1 JobRetention=1 month FileRetention=1 month AutoPrune=1 --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula db_driver=*None* db_user=bacula MutliDBConn=0 --> FileSet: name=bind O MZ6 N I /var/lib/named N --> Schedule: name=bind --> Run Level=Full hour=20 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=0 wom=0 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=0 --> Run Level=Differential hour=20 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=0 wom=1 2 3 4 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=0 --> Run Level=Incremental hour=20 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=1 2 3 4 5 6 wom=0 1 2 3 4 5 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=0 --> Storage: name=bind address=192.168.0.10 SDport=9103 MaxJobs=10 DeviceName=bind MediaType=File StorageId=17 --> Pool: name=Default PoolType=Backup use_cat=1 use_once=0 cat_files=1 max_vols=100 auto_prune=1 VolRetention=1 year VolUse=0 secs recycle=1 LabelFormat=*None* CleaningPrefix=*None* LabelType=0 RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0 MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=53687091200 MigTime=0 secs MigHiBytes=0 MigLoBytes=0 JobRetention=0 secs FileRetention=0 secs --> Pool: name=bind-full 
PoolType=Backup use_cat=1 use_once=1 cat_files=1 max_vols=0 auto_prune=1 VolRetention=2 months VolUse=0 secs recycle=1 LabelFormat=bind-full CleaningPrefix=*None* LabelType=0 RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0 MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0 MigTime=0 secs MigHiBytes=0 MigLoBytes=0 JobRetention=0 secs FileRetention=0 secs --> Pool: name=bind-incr PoolType=Backup use_cat=1 use_once=0 cat_files=1 max_vols=0 auto_prune=1 VolRetention=7 days VolUse=23 hours recycle=1 LabelFormat=bind-incr CleaningPrefix=*None* LabelType=0 RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0 MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0 MigTime=0 secs MigHiBytes=0 MigLoBytes=0 JobRetention=0 secs FileRetention=0 secs --> Pool: name=bind-diff PoolType=Backup use_cat=1 use_once=0 cat_files=1 max_vols=0 auto_prune=1 VolRetention=1 month 1 day VolUse=0 secs recycle=1 LabelFormat=bind-diff CleaningPrefix=*None* LabelType=0 RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0 MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0 MigTime=0 secs MigHiBytes=0 MigLoBytes=0 JobRetention=0 secs FileRetention=0 secs --> Messages: name=Standard mailcmd=/usr/sbin/bsmtp -h localhost -f "(Bacula) <%r>" -s "Bacula: %t %e of %c %l" %r opcmd=/usr/sbin/bsmtp -h localhost -f "(Bacula) <%r>" -s "Bacula: Intervention needed for %j" %r Job: name=srv2-dns JobType=66 level= Priority=10 Enabled=1 MaxJobs=1 Resched=0 Times=0 Interval=1,800 Spool=0 WritePartAfterJob=1 Accurate=0 --> Client: name=srv2-dns address=192.168.0.101 FDport=9102 MaxJobs=1 JobRetention=1 month FileRetention=1 month AutoPrune=1 --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula db_driver=*None* db_user=bacula MutliDBConn=0 --> FileSet: name=bind O MZ6 N I /var/lib/named N --> Schedule: name=bind --> Run Level=Full hour=20 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=0 wom=0 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=0 --> Run Level=Differential hour=20 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=0 wom=1 2 3 4 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=0 --> Run Level=Incremental hour=20 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=1 2 3 4 5 6 wom=0 1 2 3 4 5 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=0 --> Storage: name=bind address=192.168.0.10 SDport=9103 MaxJobs=10 DeviceName=bind MediaType=File StorageId=17 --> Pool: name=Default PoolType=Backup use_cat=1 use_once=0 cat_files=1 max_vols=100 auto_prune=1 VolRetention=1 year VolUse=0 secs recycle=1 LabelFormat=*None* CleaningPrefix=*None* LabelType=0 RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0 MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=53687091200 MigTime=0 secs MigHiBytes=0 MigLoBytes=0 JobRetention=0 secs FileRetention=0 secs --> Pool: name=bind-full PoolType=Backup use_cat=1 use_once=1 cat_files=1 max_vols=0 auto_prune=1 VolRetention=2 months VolUse=0 secs recycle=1 LabelFormat=bind-full CleaningPrefix=*None* LabelType=0 RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0 MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0 MigTime=0 secs MigHiBytes=0 MigLoBytes=0 JobRetention=0 secs FileRetention=0 secs --> Pool: name=bind-incr PoolType=Backup use_cat=1 use_once=0 cat_files=1 max_vols=0 auto_prune=1 VolRetention=7 days VolUse=23 hours recycle=1 LabelFormat=bind-incr CleaningPrefix=*None* LabelType=0 RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0 MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0 MigTime=0 secs MigHiBytes=0 MigLoBytes=0 JobRetention=0 secs FileRetention=0 secs --> Pool: 
name=bind-diff PoolType=Backup use_cat=1 use_once=0 cat_files=1 max_vols=0 auto_prune=1 VolRetention=1 month 1 day VolUse=0 secs recycle=1 LabelFormat=bind-diff CleaningPrefix=*None* LabelType=0 RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0 MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0 MigTime=0 secs MigHiBytes=0 MigLoBytes=0 JobRetention=0 secs FileRetention=0 secs --> Messages: name=Standard mailcmd=/usr/sbin/bsmtp -h localhost -f "(Bacula) <%r>" -s "Bacula: %t %e of %c %l" %r opcmd=/usr/sbin/bsmtp -h localhost -f "(Bacula) <%r>" -s "Bacula: Intervention needed for %j" %r
3) You can get the full configuration of a single job (the information is the same as above, but for a given job name, which can be taken from the first command above; it is not necessary to output all the configurations every time):
srvbkp@local # bconsole Connecting to Director localhost:9101 1000 OK: 1 srvbkp-dir Version: 7.0.5 (28 July 2014) Enter a period to cancel a command. *show jobs=srv2-dns Job: name=srv2-dns JobType=66 level= Priority=10 Enabled=1 MaxJobs=1 Resched=0 Times=0 Interval=1,800 Spool=0 WritePartAfterJob=1 Accurate=0 --> Client: name=srv2-dns address=192.168.0.101 FDport=9102 MaxJobs=1 JobRetention=1 month FileRetention=1 month AutoPrune=1 --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula db_driver=*None* db_user=bacula MutliDBConn=0 --> FileSet: name=bind O MZ6 N I /var/lib/named N --> Schedule: name=bind --> Run Level=Full hour=20 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=0 wom=0 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=0 --> Run Level=Differential hour=20 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=0 wom=1 2 3 4 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=0 --> Run Level=Incremental hour=20 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=1 2 3 4 5 6 wom=0 1 2 3 4 5 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=0 --> Storage: name=bind address=192.168.0.10 SDport=9103 MaxJobs=10 DeviceName=bind MediaType=File StorageId=17 --> Pool: name=Default PoolType=Backup use_cat=1 use_once=0 cat_files=1 max_vols=100 auto_prune=1 VolRetention=1 year VolUse=0 secs recycle=1 LabelFormat=*None* CleaningPrefix=*None* LabelType=0 RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0 MaxVolJobs=0 
MaxVolFiles=0 MaxVolBytes=53687091200 MigTime=0 secs MigHiBytes=0 MigLoBytes=0 JobRetention=0 secs FileRetention=0 secs --> Pool: name=bind-full PoolType=Backup use_cat=1 use_once=1 cat_files=1 max_vols=0 auto_prune=1 VolRetention=2 months VolUse=0 secs recycle=1 LabelFormat=bind-full CleaningPrefix=*None* LabelType=0 RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0 MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0 MigTime=0 secs MigHiBytes=0 MigLoBytes=0 JobRetention=0 secs FileRetention=0 secs --> Pool: name=bind-incr PoolType=Backup use_cat=1 use_once=0 cat_files=1 max_vols=0 auto_prune=1 VolRetention=7 days VolUse=23 hours recycle=1 LabelFormat=bind-incr CleaningPrefix=*None* LabelType=0 RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0 MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0 MigTime=0 secs MigHiBytes=0 MigLoBytes=0 JobRetention=0 secs FileRetention=0 secs --> Pool: name=bind-diff PoolType=Backup use_cat=1 use_once=0 cat_files=1 max_vols=0 auto_prune=1 VolRetention=1 month 1 day VolUse=0 secs recycle=1 LabelFormat=bind-diff CleaningPrefix=*None* LabelType=0 RecyleOldest=0 PurgeOldest=0 ActionOnPurge=0 MaxVolJobs=0 MaxVolFiles=0 MaxVolBytes=0 MigTime=0 secs MigHiBytes=0 MigLoBytes=0 JobRetention=0 secs FileRetention=0 secs --> Messages: name=Standard mailcmd=/usr/sbin/bsmtp -h localhost -f "(Bacula) <%r>" -s "Bacula: %t %e of %c %l" %r opcmd=/usr/sbin/bsmtp -h localhost -f "(Bacula) <%r>" -s "Bacula: Intervention needed for %j" %r
4) List every job's status – ID, StartTime, Type (backup?), Level (full, incremental or differential?), files and bytes processed, and the status of the job (terminated normally, running, fatal error and so on). Find out whether you have backups of a server or whether the backup process failed!
srvbkp@local # bconsole
Connecting to Director localhost:9101
1000 OK: 1 srvbkp-dir Version: 7.0.5 (28 July 2014)
Enter a period to cancel a command.
*list jobs
+-------+-------------+---------------------+------+-------+----------+-------------+-----------+
| JobId | Name        | StartTime           | Type | Level | JobFiles | JobBytes    | JobStatus |
+-------+-------------+---------------------+------+-------+----------+-------------+-----------+
|   128 | srv1-test   | 2016-12-04 23:05:00 | B    | F     |   17,506 |  52,116,400 | T         |
|   178 | srv1-test   | 2016-12-09 23:05:01 | B    | I     |       13 |       1,509 | T         |
|   188 | srv1-test   | 2016-12-10 23:05:01 | B    | I     |       13 |       1,509 | T         |
.....................................................................................................
| 8,927 | srv2-images | 2018-03-04 20:00:00 | B    | F     |        0 |           0 | f         |
| 8,928 | srv1-media  | 2018-03-04 20:00:00 | B    | F     |        3 |         978 | T         |
| 8,930 | srv1-dns    | 2018-03-04 20:00:01 | B    | F     |        6 |       1,843 | T         |
| 8,932 | srv2-dns    | 2018-03-04 20:00:01 | B    | F     |        6 |       1,837 | T         |
| 8,931 | srv3-dns    | 2018-03-04 20:00:03 | B    | F     |        5 |       1,542 | T         |
| 8,933 | srv4-dns    | 2018-03-04 20:00:04 | B    | F     |        4 |       1,258 | T         |
| 8,934 | srv5-dns    | 2018-03-04 20:00:04 | B    | F     |        4 |       1,258 | T         |
+-------+-------------+---------------------+------+-------+----------+-------------+-----------+
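The JobStatus column makes this output easy to filter from a script, for example to alert on failed backups. A small sketch assuming the column order shown above (check_failed_jobs is a hypothetical helper name): it prints the JobId, Name and JobStatus of every job that did not terminate normally (status other than T, e.g. f for fatal error):

```shell
# Filter a bconsole "list jobs" table (read on stdin) for jobs whose
# JobStatus is not T ("terminated normally"). With '|' as the field
# separator, column 2 is JobId, 3 is Name and 9 is JobStatus; header and
# border rows are skipped because their second field has no digit.
check_failed_jobs() {
    awk -F'|' '$2 ~ /[0-9]/ && $9 !~ /T/ { print $2, $3, $9 }'
}

# Typical use (assumes bconsole is on PATH and can reach the Director):
#   echo "list jobs" | bconsole | check_failed_jobs
```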
5) Which files were included in a backup job? This lists all paths and files included in the backup job (the ID comes from the command above):
*list files jobid=8934
+------------------------------------+
| Filename                           |
+------------------------------------+
| /var/lib/named/                    |
| /var/lib/named/root.cache          |
| /var/lib/named/sec                 |
| /var/lib/named/sec/example.com.db  |
| /var/lib/named/sec/example2.net.db |
| /var/lib/named/pri                 |
+------------------------------------+
+-------+----------+---------------------+------+-------+----------+----------+-----------+
| JobId | Name     | StartTime           | Type | Level | JobFiles | JobBytes | JobStatus |
+-------+----------+---------------------+------+-------+----------+----------+-----------+
| 8,934 | srv5-dns | 2018-03-04 20:00:20 | B    | F     |        6 |    1,851 | T         |
+-------+----------+---------------------+------+-------+----------+----------+-----------+
6) Get the scheduled jobs – which jobs will be executed, and when:
*status dir
Scheduled Jobs:
Level          Type     Pri  Scheduled          Job Name           Volume
===================================================================================
Incremental    Backup    10  06-Mar-18 20:00    srv1-dns           *unknown*
Incremental    Backup    10  06-Mar-18 20:00    srv2-dns           *unknown*
Incremental    Backup    10  06-Mar-18 20:00    srv3-dns           *unknown*
Incremental    Backup    10  06-Mar-18 20:00    srv4-dns           *unknown*
Incremental    Backup    10  06-Mar-18 20:00    srv5-dns           *unknown*
...................................................................................
Incremental    Backup    10  06-Mar-18 23:05    srv1-media         *unknown*
Full           Backup    10  06-Mar-18 23:05    srv1-images        *unknown*
====
Running Jobs:
Console connected at 06-Mar-18 13:36
No Jobs running.
====
Terminated Jobs:
 JobId  Level     Files      Bytes  Status    Finished         Name
====================================================================
  9006  Incr        472    296.8 M  OK        05-Mar-18 23:05  srv1-dns
  9007  Incr     10,547    194.8 M  Error     05-Mar-18 23:05  srv2-dns
  9002  Incr         37    133.0 M  OK        05-Mar-18 23:05  srv3-dns
  8995  Incr         57    372.2 M  OK        05-Mar-18 23:06  srv4-dns
  9000  Incr        391    1.195 G  OK        05-Mar-18 23:07  srv5-dns
  9008  Full        832    7.139 G  OK        05-Mar-18 23:49  srv1-images
  9009  Full          1    1.493 G  OK        05-Mar-18 23:50  srv1-media
  9011  Full    315,027    121.6 G  OK        06-Mar-18 03:44  srv2-images
  9012  Full    314,804    93.85 G  OK        06-Mar-18 04:18  srv2-media
====
*
Storages
Where are the backup files on your system? Trace the Bacula media devices to the real paths of your backup files.
*status storage
Automatically selected Storage: bind
Connecting to Storage daemon bind at 192.168.0.10:9103
srvbkp-sd Version: 7.0.5 (28 July 2014) x86_64-pc-linux-gnu ubuntu 16.04
Daemon started 06-Nov-17 17:25. Jobs: run=5025, running=0.
Heap: heap=135,168 smbytes=2,231,640 max_bytes=5,264,027 bufs=439 max_bufs=2,152
Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8 mode=0,0
Running Jobs:
No Jobs running.
====
Jobs waiting to reserve a drive:
====
Terminated Jobs:
 JobId  Level     Files      Bytes  Status    Finished         Name
===================================================================
  9071  Incr          0          0  OK        06-Mar-18 20:00  srv1-dns
  9074  Incr          0          0  OK        06-Mar-18 20:00  srv2-dns
  9073  Incr          0          0  OK        06-Mar-18 20:00  srv3-dns
  9075  Incr          5    2.043 K  OK        06-Mar-18 20:00  srv4-dns
  9078  Incr          5    2.042 K  OK        06-Mar-18 20:00  srv5-dns
  9077  Incr          0          0  OK        06-Mar-18 20:00  srv1-media
  9079  Incr          0          0  OK        06-Mar-18 20:00  srv1-images
  9076  Incr          0          0  OK        06-Mar-18 20:00  srv2-images
  9057  Full          0          0  Other     06-Mar-18 20:31  srv2-media
====
Device status:
Device "localstorage" (/mnt/storage1/bacula-storage/local) is not open.
==
Device "media" (/mnt/storage1/bacula-storage/media) is not open.
==
Device "bind" (/mnt/storage1/bacula-storage/bind) is not open.
==
Device "image" (/mnt/storage1/bacula-storage/image) is not open.
==
====
Used Volume status:
====
Attr spooling: 0 active jobs, 34,753 bytes; 120 total jobs, 34,753 max bytes.
====
*
Clients
1) Show all client names in the Bacula system. This is useful to link a client name to a server and then use that name in the next command (2 below):
*show client Client: name=srvbkp-fd address=192.168.0.5 FDport=9102 MaxJobs=1 JobRetention=3 months FileRetention=2 months AutoPrune=1 --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula db_driver=*None* db_user=bacula MutliDBConn=0 Client: name=srv1-dns address=192.168.0.100 FDport=9102 MaxJobs=1 JobRetention=1 month FileRetention=1 month AutoPrune=1 --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula db_driver=*None* db_user=bacula MutliDBConn=0 Client: name=srv2-dns address=192.168.0.101 FDport=9102 MaxJobs=1 JobRetention=1 month FileRetention=1 month AutoPrune=1 --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula db_driver=*None* db_user=bacula MutliDBConn=0 Client: name=srv3-dns address=192.168.0.103 FDport=9102 MaxJobs=1 JobRetention=1 month FileRetention=1 month AutoPrune=1 --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula db_driver=*None* db_user=bacula MutliDBConn=0 Client: name=srv4-dns address=192.168.0.104 FDport=9102 MaxJobs=1 JobRetention=1 month FileRetention=1 month AutoPrune=1 --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula db_driver=*None* db_user=bacula MutliDBConn=0 Client: name=srv5-dns address=192.168.0.105 FDport=9102 MaxJobs=1 JobRetention=1 month FileRetention=1 month AutoPrune=1 --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula db_driver=*None* db_user=bacula MutliDBConn=0 Client: name=srv1-media address=192.168.0.106 FDport=9102 MaxJobs=1 JobRetention=1 month FileRetention=1 month AutoPrune=1 --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula db_driver=*None* db_user=bacula MutliDBConn=0 Client: name=srv1-images address=192.168.0.107 FDport=9102 MaxJobs=1 JobRetention=1 month FileRetention=1 month AutoPrune=1 --> Catalog: name=allbackup address=localhost DBport=0 db_name=bacula db_driver=*None* db_user=bacula MutliDBConn=0 *
2) Show the status of a client. We can use this command to check what the client is doing at the moment the command is issued, as well as the last terminated backup jobs and their status. In addition, we can check the connection between the Director daemon and the client daemon, because the Director connects at the moment we issue the command, so it is useful for debugging purposes:
*status client=srv1-dns
Connecting to Client srv1-dns at 192.168.0.100:9102
srv1-dns-fd Version: 7.0.5 (28 July 2014) x86_64-pc-linux-gnu ubuntu 16.04
Daemon started 23-Feb-18 00:43. Jobs: run=8 running=0.
Heap: heap=98,304 smbytes=188,701 max_bytes=571,361 bufs=64 max_bufs=97
Sizes: boffset_t=8 size_t=8 debug=0 trace=0 mode=0,0 bwlimit=0kB/s
Plugin: bpipe-fd.so
Running Jobs:
Director connected at: 06-Mar-18 22:51
No Jobs running.
====
Terminated Jobs:
 JobId  Level     Files      Bytes  Status    Finished         Name
===================================================================
  1832  Full          4    1.333 K  OK        01-Mar-18 23:48  srv1-dns
  1836  Incr          0          0  OK        02-Mar-18 00:01  srv1-dns
  1864  Incr          0          0  OK        02-Mar-18 20:30  srv1-dns
  1907  Incr          0          0  OK        03-Mar-18 20:00  srv1-dns
  1950  Full          4    1.333 K  OK        04-Mar-18 20:01  srv1-dns
  1994  Incr          0          0  OK        05-Mar-18 20:00  srv1-dns
  1037  Incr          0          0  OK        06-Mar-18 20:00  srv1-dns
====
*
Filesets
Which files will be included in or excluded from the backup process? Lines starting with “I” mean include; lines starting with “E” mean exclude.
*show fileset
FileSet: name=Full Set
      O M
      N
      I /usr/sbin
      N
      E /var/lib/bacula
      E /proc
      E /tmp
      E /sys
      E /.journal
      E /.fsck
      N
FileSet: name=Catalog
      O M
      N
      I /var/lib/bacula/bacula.sql
      N
FileSet: name=images
      O MfZ6
      N
      I /
      N
      E /proc
      E /tmp
      E /run
      E /dev
      E /sys
      N
FileSet: name=bind
      O MZ6
      N
      I /var/lib/named
      N
FileSet: name=media
      O MfZ6
      N
      I /
      N
      E /proc
      E /tmp
      E /run
      E /dev
      E /sys
      N
Schedule
1) All the schedule plans for taking backups:
*show schedule Schedule: name=WeeklyCycle --> Run Level=Full hour=23 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=0 wom=0 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=5 --> Run Level=Differential hour=23 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=0 wom=1 2 3 4 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=5 --> Run Level=Incremental hour=23 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=1 2 3 4 5 6 wom=0 1 2 3 4 5 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=5 Schedule: name=WeeklyCycleAfterBackup --> Run Level=Full hour=23 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=0 1 2 3 4 5 6 wom=0 1 2 3 4 5 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=10 Schedule: name=images --> Run Level=Full hour=10 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=0 wom=0 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=0 --> Run Level=Differential hour=10 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=0 wom=1 2 3 4 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=0 --> Run Level=Incremental hour=20 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=1 2 3 4 5 6 wom=0 1 2 3 4 5 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=0
2) The schedule plan for one client or a group of clients (servers):
*show schedule=bind Schedule: name=bind --> Run Level=Full hour=20 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=0 wom=0 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=0 --> Run Level=Differential hour=20 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=0 wom=1 2 3 4 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=0 --> Run Level=Incremental hour=20 mday=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 month=0 1 2 3 4 5 6 7 8 9 10 11 wday=1 2 3 4 5 6 wom=0 1 2 3 4 5 woy=0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 mins=0
Director
1) Show the last messages of the backup processes. As you can see, there is an error in one of the jobs. The error means that the client could not connect to the Storage daemon (probably running on the same server as the Director); the problem was that the firewall did not allow connections from this client's IP:
*message 06-Mar 20:00 srvbkp-dir JobId 1075: Start Backup JobId 1075, Job=srv1-bind.2018-03-06_20.00.01_05 06-Mar 20:00 srvbkp-dir JobId 1075: Using Device "bind" to write. 06-Mar 20:00 srvbkp-sd JobId 1066: Elapsed time=00:00:10, Transfer rate=0 Bytes/second 06-Mar 20:00 srvbkp-dir JobId 1066: Bacula srvbkp-dir 7.0.5 (28Jul14): Build OS: x86_64-pc-linux-gnu ubuntu 16.04 JobId: 1066 Job: srv1-media.2018-03-06_20.00.00_56 Backup Level: Incremental, since=2018-03-06 20:00:14 Client: "srv1-media" 7.0.5 (28Jul14) x86_64-pc-linux-gnu,ubuntu,16.04 FileSet: "bind" 2017-11-07 17:19:45 Pool: "bind-incr" (From Job IncPool override) Catalog: "allbackup" (From Client resource) Storage: "bind" (From Job resource) Scheduled time: 06-Mar-2018 20:00:00 Start time: 06-Mar-2018 20:00:15 End time: 06-Mar-2018 20:00:26 Elapsed time: 11 secs Priority: 10 FD Files Written: 0 SD Files Written: 0 FD Bytes Written: 0 (0 B) SD Bytes Written: 0 (0 B) Rate: 0.0 KB/s Software Compression: None VSS: no Encryption: no Accurate: no Volume name(s): Volume Session Id: 1025 Volume Session Time: 1509989534 Last Volume Bytes: 4,618 (4.618 KB) Non-fatal FD errors: 0 SD Errors: 0 FD termination status: OK SD termination status: OK Termination: Backup OK 06-Mar 20:01 srvbkp-dir JobId 1057: Using Device "bind" to write. 06-Mar 20:05 srv2-dns-fd JobId 1057: Warning: bsock.c:112 Could not connect to Storage daemon on 192.168.0.5:9103. ERR=Connection timed out Retrying ... 06-Mar 20:31 srv2-dns-fd JobId 1057: Fatal error: bsock.c:118 Unable to connect to Storage daemon on 192.168.0.5:9103. 
ERR=Interrupted system call 06-Mar 20:31 srv2-dns-fd JobId 1057: Fatal error: job.c:1893 Failed to connect to Storage daemon: 192.168.0.5:9103 06-Mar 20:31 srvbkp-dir JobId 1057: Fatal error: Bad response to Storage command: wanted 2000 OK storage , got 2902 Bad storage 06-Mar 20:31 srvbkp-dir JobId 1057: Error: Bacula srvbkp-dir 7.0.5 (28Jul14): Build OS: x86_64-pc-linux-gnu ubuntu 16.04 JobId: 1057 Job: srv2-dns.2018-03-06_20.00.00_47 Backup Level: Full (upgraded from Incremental) Client: "srv2-dns" 7.0.5 (28Jul14) x86_64-pc-linux-gnu,ubuntu,16.04 FileSet: "bind" 2017-11-07 17:19:45 Pool: "bind-full" (From Job FullPool override) Catalog: "allbackup" (From Client resource) Storage: "bind" (From Job resource) Scheduled time: 06-Mar-2018 20:00:00 Start time: 06-Mar-2018 20:00:00 End time: 06-Mar-2018 20:31:22 Elapsed time: 31 mins 22 secs Priority: 10 FD Files Written: 0 SD Files Written: 0 FD Bytes Written: 0 (0 B) SD Bytes Written: 0 (0 B) Rate: 0.0 KB/s Software Compression: None VSS: no Encryption: no Accurate: no Volume name(s): Volume Session Id: 1201 Volume Session Time: 1509989534 Last Volume Bytes: 1 (1 B) Non-fatal FD errors: 2 SD Errors: 0 FD termination status: Error SD termination status: Waiting on FD Termination: *** Backup Error ***
2) reload – reloads the configuration files of the Bacula system. The daemons will re-read all configuration files in “/etc/bacula”. Unfortunately, there is no output:
*reload
*
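Because reload prints nothing, it is easy to reload a broken configuration by accident. A defensive sketch (safe_reload is a hypothetical helper, not part of Bacula; it assumes the Director binary is bacula-dir and the configuration lives in /etc/bacula/bacula-dir.conf): test the configuration first with the Director's -t switch and only reload when the test passes.

```shell
# Validate the Director configuration before telling it to reload.
# "bacula-dir -t" only parses the configuration and exits non-zero on
# errors, so a broken file never reaches the running Director.
safe_reload() {
    # Binary names can be overridden via environment variables (useful
    # when the binaries live outside PATH, and for testing).
    if "${BACULA_DIR:-bacula-dir}" -t -c /etc/bacula/bacula-dir.conf; then
        printf 'reload\nquit\n' | "${BCONSOLE:-bconsole}"
    else
        echo "configuration errors found, reload skipped" >&2
        return 1
    fi
}
```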
Install netdata monitoring in CentOS 7
netdata has become a great tool for admins to monitor their servers in real time!
At first it was just an additional, optional tool to check what had been going on with a server for the last hour or so, but it has evolved into a really handy and informative monitoring server, tracking second by second what is happening on the machine and its most used services, such as database, web and application servers.
Today, in version 1.9 (this installation howto is for netdata 1.9), it can track the activity of at least these services:
apache, hddtemp, postgres, beanstalk, haproxy, rabbitmq, ceph, isc_dhcpd, retroshare, bind_rndc, ipfs, redis, couchdb, memcached, sensors, chrony, mdstat, samba, cpufreq, mongodb, squid, dns_query_time, nginx, springboot, dnsdist, mysql, smartd_log, elasticsearch, nsd, tomcat, dovecot, nginx_plus, web_log, exim, ovpn_status_log, varnish, example, ntpd, fronius, freeradius, postfix, named, fail2ban, phpfpm, snmp, go_expvar, powerdns, stiebeleltron
Some of these plugins support multiple programs and services; for example, web_log supports the access/error logs of the major web servers at the moment.
The installation is really simple: netdata includes a script to facilitate the installation process.
Here are the minimal steps to install this great software:
STEP 1) Install the dependencies; because we will pull netdata from the official repository, we also need the git command
yum install -y git gcc make autoconf automake pkgconfig zlib-devel libuuid-devel curl nodejs freeipmi freeipmi-devel elfutils-libelf cmake openssl-devel libuv-devel
As you can see, there is a nodejs package, which depends on an additional repository (you could skip it; only the modules that depend on nodejs won't work, and as of now the plugins using nodejs are located in “/etc/netdata/node.d/” and there are not many of them).
yum -y install epel-release
yum -y install nodejs
STEP 2) Clone the netdata repository
cd
git clone https://github.com/firehol/netdata
STEP 3) Install netdata
cd netdata
CFLAGS="-march=native -O2 -msse3 -fomit-frame-pointer -pipe" ./netdata-installer.sh --install /usr/local/netdata
Installing the netdata software in a separate directory means that if you want to clean the system, you can just delete that directory. The example above uses
/usr/local/netdata
and all files will be installed there.
As you can see, the installation outputs the paths of your files:
- the daemon at /usr/local/netdata/netdata/usr/sbin/netdata
- config files in /usr/local/netdata/netdata/etc/netdata
- web files in /usr/local/netdata/netdata/usr/share/netdata
- plugins in /usr/local/netdata/netdata/usr/libexec/netdata
- cache files in /usr/local/netdata/netdata/var/cache/netdata
- db files in /usr/local/netdata/netdata/var/lib/netdata
- log files in /usr/local/netdata/netdata/var/log/netdata
- pid file at /usr/local/netdata/netdata/var/run/netdata.pid
- logrotate file at /etc/logrotate.d/netdata
STEP 4) Use the firewall and open port 19999 of your server to be able to load the monitoring page
firewall-cmd --permanent --add-rich-rule="rule family="ipv4" source address="<YOURIP>" port protocol="tcp" port="19999" accept"
firewall-cmd --add-rich-rule="rule family="ipv4" source address="<YOURIP>" port protocol="tcp" port="19999" accept"
Because firewalld is the default firewall under CentOS 7, we used it to show you how to let your IP access the netdata web interface – replace <YOURIP> in the commands above with your real IP address.
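The rich rule is just a string, so if you have several admin IPs it can be generated with a tiny helper; a minimal sketch (the function name and the example IP are made up):

```shell
#!/bin/sh
# Hypothetical helper: build the firewalld rich rule for one source IP.
netdata_rich_rule() {
    printf 'rule family="ipv4" source address="%s" port protocol="tcp" port="19999" accept' "$1"
}

# Usage (commented out because it needs a running firewalld):
# firewall-cmd --permanent --add-rich-rule="$(netdata_rich_rule 203.0.113.7)"
# firewall-cmd --add-rich-rule="$(netdata_rich_rule 203.0.113.7)"
netdata_rich_rule 203.0.113.7; echo
```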
* The installation process creates start/stop unit files for systemd and tells you how to update netdata (you can even run the updater automatically from a cron job)
To stop netdata run: systemctl stop netdata
To start netdata run: systemctl start netdata
Uninstall script generated: ./netdata-uninstaller.sh
Update script generated : ./netdata-updater.sh
netdata-updater.sh can work from cron. It will trigger an email from cron only if it fails (it does not print anything when it can update netdata).
Run this to automatically check and install netdata updates once per day:
sudo ln -s /root/netdata/netdata-updater.sh /etc/cron.daily/netdata-updater
* Here is the output of the installer's help menu – it also hints at the dependencies it may need:
[root@srv.local netdata]# ./netdata-installer.sh --help ^ |.-. .-. .-. .-. .-. . netdata .-. .- | '-' '-' '-' '-' '-' installer command line options ' '-' +----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+---> ./netdata-installer.sh <installer options> Valid <installer options> are: --install /PATH/TO/INSTALL If you give: --install /opt netdata will be installed in /opt/netdata --dont-start-it Do not (re)start netdata. Just install it. --dont-wait Do not wait for the user to press ENTER. Start immediately building it. --auto-update | -u Install netdata-updater to cron, to update netdata automatically once per day (can only be done for installations from git) --enable-plugin-freeipmi --disable-plugin-freeipmi Enable/disable the FreeIPMI plugin. Default: enable it when libipmimonitoring is available. --enable-plugin-nfacct --disable-plugin-nfacct Enable/disable the nfacct plugin. Default: enable it when libmnl and libnetfilter_acct are available. --enable-lto --disable-lto Enable/disable Link-Time-Optimization Default: enabled --zlib-is-really-here --libs-are-really-here If you get errors about missing zlib, or libuuid but you know it is available, you have a broken pkg-config. Use this option to allow it continue without checking pkg-config. Netdata will by default be compiled with gcc optimization -O2 If you need to pass different CFLAGS, use something like this: CFLAGS="<gcc options>" ./netdata-installer.sh <installer options> For the installer to complete successfully, you will need these packages installed: gcc make autoconf automake pkg-config zlib1g-dev (or zlib-devel) uuid-dev (or libuuid-devel) For the plugins, you will at least need: curl, bash v4+, python v2 or v3, node.js
* netdata in action
* And here is the output of an installation process:
[root@lsrv3 netdata]# CFLAGS="-march=native -O2 -msse3 -fomit-frame-pointer -pipe" ./netdata-installer.sh --install /usr/local/netdata ^ |.-. .-. .-. .-. . netdata | '-' '-' '-' '-' real-time performance monitoring, done right! +----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+---> You are about to build and install netdata to your system. It will be installed at these locations: - the daemon at /usr/local/netdata/netdata/usr/sbin/netdata - config files in /usr/local/netdata/netdata/etc/netdata - web files in /usr/local/netdata/netdata/usr/share/netdata - plugins in /usr/local/netdata/netdata/usr/libexec/netdata - cache files in /usr/local/netdata/netdata/var/cache/netdata - db files in /usr/local/netdata/netdata/var/lib/netdata - log files in /usr/local/netdata/netdata/var/log/netdata - pid file at /usr/local/netdata/netdata/var/run/netdata.pid - logrotate file at /etc/logrotate.d/netdata This installer allows you to change the installation path. Press Control-C and run the same command with --help for help. Press ENTER to build and install netdata to '/usr/local/netdata/netdata' > --- Run autotools to configure the build environment --- [/root/netdata]# ./autogen.sh autoreconf: Entering directory `.' autoreconf: configure.ac: not using Gettext autoreconf: running: aclocal --force -I m4 autoreconf: configure.ac: tracing autoreconf: configure.ac: not using Libtool autoreconf: running: /usr/bin/autoconf --force autoreconf: running: /usr/bin/autoheader --force autoreconf: running: automake --add-missing --copy --force-missing autoreconf: Leaving directory `.' OK [/root/netdata]# ./configure --prefix=/usr/local/netdata/netdata/usr --sysconfdir=/usr/local/netdata/netdata/etc --localstatedir=/usr/local/netdata/netdata/var --with-zlib --with-math --with-user=netdata CFLAGS=-march=native\ -O2\ -msse3\ -fomit-frame-pointer\ -pipe checking whether to enable maintainer-specific portions of Makefiles... no checking for a BSD-compatible install... 
/usr/bin/install -c checking whether build environment is sane... yes checking for a thread-safe mkdir -p... /usr/bin/mkdir -p checking for gawk... gawk checking whether make sets $(MAKE)... yes checking whether make supports nested variables... yes checking how to create a pax tar archive... gnutar checking build system type... x86_64-unknown-linux-gnu checking host system type... x86_64-unknown-linux-gnu checking for gcc... gcc checking whether the C compiler works... yes checking for C compiler default output file name... a.out checking for suffix of executables... checking whether we are cross compiling... no checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for gcc option to accept ISO C89... none needed checking for style of include used by make... GNU checking dependency style of gcc... gcc3 checking for pkg-config... /usr/bin/pkg-config checking pkg-config is at least version 0.9.0... yes checking how to run the C preprocessor... gcc -E checking for grep that handles long lines and -e... /usr/bin/grep checking for egrep... /usr/bin/grep -E checking for ANSI C header files... yes checking for sys/types.h... yes checking for sys/stat.h... yes checking for stdlib.h... yes checking for string.h... yes checking for memory.h... yes checking for strings.h... yes checking for inttypes.h... yes checking for stdint.h... yes checking for unistd.h... yes checking minix/config.h usability... no checking minix/config.h presence... no checking for minix/config.h... no checking whether it is safe to define __EXTENSIONS__... yes checking for __attribute__((returns_nonnull))... no checking for __attribute__((malloc))... yes checking for __attribute__((noreturn))... yes checking for __attribute__((noinline))... yes checking for __attribute__((format))... yes checking for __attribute__((warn_unused_result))... yes checking for struct timespec... yes checking for clockid_t... 
yes checking for library containing clock_gettime... none required checking for clock_gettime... yes checking for sched_setscheduler... yes checking for sched_get_priority_min... yes checking for sched_get_priority_max... yes checking for nice... yes checking for recvmmsg... yes checking for int8_t... yes checking for int16_t... yes checking for int32_t... yes checking for int64_t... yes checking for uint8_t... yes checking for uint16_t... yes checking for uint32_t... yes checking for uint64_t... yes checking for inline... inline checking whether strerror_r is declared... yes checking for strerror_r... yes checking whether strerror_r returns char *... yes checking for _Generic... no checking for __atomic... yes checking size of void *... 8 checking whether sys/types.h defines makedev... yes checking for sys/types.h... (cached) yes checking for netinet/in.h... yes checking for arpa/nameser.h... yes checking for netdb.h... yes checking for resolv.h... yes checking for sys/prctl.h... yes checking for linux/netfilter/nfnetlink_conntrack.h... yes checking for accept4... yes checking operating system... linux checking if compiler needs -Werror to reject unknown flags... no checking for the pthreads library -lpthreads... no checking whether pthreads work without any flags... no checking whether pthreads work with -Kthread... no checking whether pthreads work with -kthread... no checking for the pthreads library -llthread... no checking whether pthreads work with -pthread... yes checking for joinable pthread attribute... PTHREAD_CREATE_JOINABLE checking if more special flags are required for pthreads... no checking for PTHREAD_PRIO_INHERIT... yes checking for sin in -lm... yes checking if libm should be used... yes checking for ZLIB... yes checking if zlib should be used... yes checking for UUID... yes checking for memory allocator... system checking for mallopt... yes checking for mallinfo... yes checking for LIBCAP... no checking if libcap should be used... 
no checking if apps.plugin should be enabled... yes checking for IPMIMONITORING... yes checking for ipmi_monitoring_sensor_readings_by_record_id, ipmi_monitoring_sensor_readings_by_sensor_type, ipmi_monitoring_sensor_read_sensor_number, ipmi_monitoring_sensor_read_sensor_name, ipmi_monitoring_sensor_read_sensor_state, ipmi_monitoring_sensor_read_sensor_units, ipmi_monitoring_sensor_iterator_next, ipmi_monitoring_ctx_sensor_config_file, ipmi_monitoring_ctx_sdr_cache_directory, ipmi_monitoring_ctx_errormsg, ipmi_monitoring_ctx_create in -lipmimonitoring... yes checking ipmi_monitoring.h usability... yes checking ipmi_monitoring.h presence... yes checking for ipmi_monitoring.h... yes checking ipmi_monitoring_bitmasks.h usability... yes checking ipmi_monitoring_bitmasks.h presence... yes checking for ipmi_monitoring_bitmasks.h... yes checking if freeipmi.plugin should be enabled... yes checking for NFACCT... no checking for LIBMNL... no checking if nfacct.plugin should be enabled... no checking for setns... yes checking if cgroup-network can be enabled... yes checking whether C compiler accepts -flto... yes checking if -flto builds executables... yes checking if LTO should be enabled... yes checking that generated files are newer than configure... 
done configure: creating ./config.status config.status: creating Makefile config.status: creating charts.d/Makefile config.status: creating conf.d/Makefile config.status: creating netdata.spec config.status: creating python.d/Makefile config.status: creating node.d/Makefile config.status: creating plugins.d/Makefile config.status: creating src/Makefile config.status: creating system/Makefile config.status: creating web/Makefile config.status: creating diagrams/Makefile config.status: creating makeself/Makefile config.status: creating contrib/Makefile config.status: creating tests/Makefile config.status: creating config.h config.status: config.h is unchanged config.status: executing depfiles commands OK --- Cleanup compilation directory --- --- Compile netdata --- [/root/netdata]# make -j8 make all-recursive make[1]: Entering directory `/root/netdata' Making all in charts.d make[2]: Entering directory `/root/netdata/charts.d' make[2]: Nothing to be done for `all'. make[2]: Leaving directory `/root/netdata/charts.d' Making all in conf.d make[2]: Entering directory `/root/netdata/conf.d' make[2]: Nothing to be done for `all'. make[2]: Leaving directory `/root/netdata/conf.d' Making all in diagrams make[2]: Entering directory `/root/netdata/diagrams' make[2]: Nothing to be done for `all'. make[2]: Leaving directory `/root/netdata/diagrams' Making all in makeself make[2]: Entering directory `/root/netdata/makeself' make[2]: Nothing to be done for `all'. make[2]: Leaving directory `/root/netdata/makeself' Making all in node.d make[2]: Entering directory `/root/netdata/node.d' make[2]: Nothing to be done for `all'. make[2]: Leaving directory `/root/netdata/node.d' Making all in plugins.d make[2]: Entering directory `/root/netdata/plugins.d' make[2]: Nothing to be done for `all'. 
make[2]: Leaving directory `/root/netdata/plugins.d' Making all in python.d make[2]: Entering directory `/root/netdata/python.d' if sed \ -e 's#[@]localstatedir_POST@#/usr/local/netdata/netdata/var#g' \ -e 's#[@]sbindir_POST@#/usr/local/netdata/netdata/usr/sbin#g' \ -e 's#[@]sysconfdir_POST@#/usr/local/netdata/netdata/etc#g' \ -e 's#[@]pythondir_POST@#/usr/local/netdata/netdata/usr/libexec/netdata/python.d#g' \ python-modules-installer.sh.in > python-modules-installer.sh.tmp; then \ mv "python-modules-installer.sh.tmp" "python-modules-installer.sh"; \ else \ rm -f "python-modules-installer.sh.tmp"; \ false; \ fi make[2]: Leaving directory `/root/netdata/python.d' Making all in src make[2]: Entering directory `/root/netdata/src' gcc -DHAVE_CONFIG_H -I. -I.. -DVARLIB_DIR="\"/usr/local/netdata/netdata/var/lib/netdata\"" -DCACHE_DIR="\"/usr/local/netdata/netdata/var/cache/netdata\"" -DCONFIG_DIR="\"/usr/local/netdata/netdata/etc/netdata\"" -DLOG_DIR="\"/usr/local/netdata/netdata/var/log/netdata\"" -DPLUGINS_DIR="\"/usr/local/netdata/netdata/usr/libexec/netdata/plugins.d\"" -DRUN_DIR="\"/usr/local/netdata/netdata/var/run/netdata\"" -DWEB_DIR="\"/usr/local/netdata/netdata/usr/share/netdata/web\"" -march=native -O2 -msse3 -fomit-frame-pointer -pipe -pthread -flto -MT apps_plugin.o -MD -MP -MF .deps/apps_plugin.Tpo -c -o apps_plugin.o apps_plugin.c make[2]: Leaving directory `/root/netdata' make[1]: Leaving directory `/root/netdata' OK --- Restore user edited netdata configuration files --- --- Fix generated files permissions --- [/root/netdata]# find ./system/ -type f -a \! -name \*.in -a \! -name Makefile\* -a \! -name \*.conf -a \! -name \*.service -a \! -name \*.logrotate -exec chmod 755 \{\} \; OK --- Add user netdata to required user groups --- Adding netdata user group ... [/root/netdata]# groupadd -r netdata OK Adding netdata user account with home /usr/local/netdata/netdata ... 
[/root/netdata]# useradd -r -g netdata -c netdata -s /usr/sbin/nologin --no-create-home -d /usr/local/netdata/netdata netdata OK Group 'docker' does not exist. Adding netdata user to the nginx group ... [/root/netdata]# usermod -a -G nginx netdata OK Group 'varnish' does not exist. Adding netdata user to the haproxy group ... [/root/netdata]# usermod -a -G haproxy netdata OK Adding netdata user to the adm group ... [/root/netdata]# usermod -a -G adm netdata OK Group 'nsd' does not exist. Group 'proxy' does not exist. Group 'squid' does not exist. Group 'ceph' does not exist. --- Install logrotate configuration for netdata --- [/root/netdata]# cp system/netdata.logrotate /etc/logrotate.d/netdata OK [/root/netdata]# chmod 644 /etc/logrotate.d/netdata OK --- Read installation options from netdata.conf --- Permissions - netdata user : netdata - netdata group : netdata - web files user : netdata - web files group : netdata - root user : root Directories - netdata conf dir : /usr/local/netdata/netdata/etc/netdata - netdata log dir : /usr/local/netdata/netdata/var/log/netdata - netdata run dir : /usr/local/netdata/netdata/var/run - netdata lib dir : /usr/local/netdata/netdata/var/lib/netdata - netdata web dir : /usr/local/netdata/netdata/usr/share/netdata/web - netdata cache dir: /usr/local/netdata/netdata/var/cache/netdata Other - netdata port : 19999 --- Fix permissions of netdata directories (using user 'netdata') --- [/root/netdata]# mkdir -p /usr/local/netdata/netdata/var/run OK [/root/netdata]# chown -R root:netdata /usr/local/netdata/netdata/etc/netdata OK [/root/netdata]# find /usr/local/netdata/netdata/etc/netdata -type f -exec chmod 0640 \{\} \; OK [/root/netdata]# find /usr/local/netdata/netdata/etc/netdata -type d -exec chmod 0755 \{\} \; OK [/root/netdata]# chown -R netdata:netdata /usr/local/netdata/netdata/usr/share/netdata/web OK [/root/netdata]# find /usr/local/netdata/netdata/usr/share/netdata/web -type f -exec chmod 0664 \{\} \; OK [/root/netdata]# find 
/usr/local/netdata/netdata/usr/share/netdata/web -type d -exec chmod 0775 \{\} \; OK [/root/netdata]# chown -R netdata:netdata /usr/local/netdata/netdata/var/lib/netdata OK [/root/netdata]# chown -R netdata:netdata /usr/local/netdata/netdata/var/cache/netdata OK [/root/netdata]# chown -R netdata:netdata /usr/local/netdata/netdata/var/log/netdata OK [/root/netdata]# chmod 755 /usr/local/netdata/netdata/var/log/netdata OK [/root/netdata]# chown netdata:root /usr/local/netdata/netdata/var/log/netdata OK [/root/netdata]# chown -R root /usr/local/netdata/netdata/usr/libexec/netdata OK [/root/netdata]# find /usr/local/netdata/netdata/usr/libexec/netdata -type d -exec chmod 0755 \{\} \; OK [/root/netdata]# find /usr/local/netdata/netdata/usr/libexec/netdata -type f -exec chmod 0644 \{\} \; OK [/root/netdata]# find /usr/local/netdata/netdata/usr/libexec/netdata -type f -a -name \*.plugin -exec chmod 0755 \{\} \; OK [/root/netdata]# find /usr/local/netdata/netdata/usr/libexec/netdata -type f -a -name \*.sh -exec chmod 0755 \{\} \; OK [/root/netdata]# chown root:netdata /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/apps.plugin OK [/root/netdata]# chmod 0750 /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/apps.plugin OK [/root/netdata]# setcap cap_dac_read_search\,cap_sys_ptrace+ep /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/apps.plugin OK [/root/netdata]# chown root:netdata /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/freeipmi.plugin OK [/root/netdata]# chmod 4750 /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/freeipmi.plugin OK [/root/netdata]# chown root:netdata /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/cgroup-network OK [/root/netdata]# chmod 4750 /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/cgroup-network OK [/root/netdata]# chown root /usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/cgroup-network-helper.sh OK [/root/netdata]# chmod 0550 
/usr/local/netdata/netdata/usr/libexec/netdata/plugins.d/cgroup-network-helper.sh OK [/root/netdata]# chmod a+rX /usr/local/netdata/netdata/usr/libexec OK [/root/netdata]# chmod a+rX /usr/local/netdata/netdata/usr/share/netdata OK --- Install netdata at system init --- Installing systemd service... [/root/netdata]# cp system/netdata.service /etc/systemd/system/netdata.service OK [/root/netdata]# systemctl daemon-reload OK [/root/netdata]# systemctl enable netdata Created symlink from /etc/systemd/system/multi-user.target.wants/netdata.service to /etc/systemd/system/netdata.service. OK --- Start netdata --- [/root/netdata]# /usr/bin/systemctl stop netdata OK [/root/netdata]# /usr/bin/systemctl restart netdata OK OK. NetData Started! ------------------------------------------------------------------------------- Downloading default configuration from netdata... [/root/netdata]# curl -s -o /usr/local/netdata/netdata/etc/netdata/netdata.conf.new http://localhost:19999/netdata.conf OK [/root/netdata]# mv /usr/local/netdata/netdata/etc/netdata/netdata.conf.new /usr/local/netdata/netdata/etc/netdata/netdata.conf OK OK New configuration saved for you to edit at /usr/local/netdata/netdata/etc/netdata/netdata.conf [/root/netdata]# chown netdata /usr/local/netdata/netdata/etc/netdata/netdata.conf OK [/root/netdata]# chmod 0664 /usr/local/netdata/netdata/etc/netdata/netdata.conf OK --- Check KSM (kernel memory deduper) --- --- Check version.txt --- --- Check apps.plugin --- --- Generate netdata-uninstaller.sh --- --- Basic netdata instructions --- netdata by default listens on all IPs on port 19999, so you can access it with: http://this.machine.ip:19999/ To stop netdata run: systemctl stop netdata To start netdata run: systemctl start netdata Uninstall script generated: ./netdata-uninstaller.sh Update script generated : ./netdata-updater.sh netdata-updater.sh can work from cron. 
It will trigger an email from cron only if it fails (it does not print anything when it can update netdata). Run this to automatically check and install netdata updates once per day: sudo ln -s /root/netdata/netdata-updater.sh /etc/cron.daily/netdata-updater --- We are done! --- ^ |.-. .-. .-. .-. .-. . netdata .-. .- | '-' '-' '-' '-' '-' is installed and running now! -' '-' +----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+---> enjoy real-time performance and health monitoring...
mysql slave reset and fixing relay log read failure
Suddenly your slave server reset without a clean shutdown, and when it came up again you saw an error of this kind:
2016-02-26 10:41:50 876 [ERROR] Slave SQL: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Error_code: 1594
2016-02-26 10:41:50 876 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.005014' position 152146793
So here is what we know: our master server is OK, and our slave server was reset by an unknown issue, so the problem is only in our slave logs. The MySQL server shows the status with:
mysql> SHOW SLAVE STATUS\G;
There are multiple lines of information, but the most important in our situation are these two lines:
Relay_Master_Log_File: mysql-bin.005014
Exec_Master_Log_Pos: 152146793
This is the place where the slave server stopped (as you can see from the logs above, newer versions of MySQL print these two values in the log, but older versions do not, so check them with the above command!).
The slave server stopped at file mysql-bin.005014 and position 152146793 and could not continue, because its relay log files are corrupted. We can reset the position by issuing a CHANGE MASTER command, which will clean up the relay logs, and the slave will resume the replication from this position – no data will be lost. Before issuing the following commands, save the relay log files; they can be useful if you hit errors later. Here is the command:
STOP SLAVE;
CHANGE MASTER TO MASTER_HOST='1.1.1.1', MASTER_USER='replusr', MASTER_LOG_FILE='mysql-bin.005014', MASTER_LOG_POS=152146793;
START SLAVE;
The three commands above:
- stop the replication in the slave, because the replication is still running and the slave is logging the binary log received from the master
- the CHANGE MASTER command resets the logs to the right position
- start the replication in the slave
The replication must continue without errors!
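The two coordinates fed into CHANGE MASTER can be pulled out of the status output with standard tools, which helps to avoid typos. A minimal sketch; the status text is hard-coded here, but in practice it would come from something like mysql -e 'SHOW SLAVE STATUS\G':

```shell
#!/bin/sh
# Sketch: extract the restart coordinates from "SHOW SLAVE STATUS\G" output.
# Hard-coded sample; normally: status="$(mysql -e 'SHOW SLAVE STATUS\G')"
status='          Relay_Master_Log_File: mysql-bin.005014
          Exec_Master_Log_Pos: 152146793'

# Split each "Key: value" line on ": " and keep the value.
log_file=$(printf '%s\n' "$status" | awk -F': ' '/Relay_Master_Log_File/ {print $2}')
log_pos=$(printf '%s\n' "$status" | awk -F': ' '/Exec_Master_Log_Pos/ {print $2}')

echo "CHANGE MASTER TO MASTER_LOG_FILE='$log_file', MASTER_LOG_POS=$log_pos;"
```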
In some cases, after we issue the above commands and the replication starts, it immediately stops with an error of
Duplicate entry
.
Last_SQL_Errno: 1062
Last_SQL_Error: Error 'Duplicate entry '3918722' for key 'PRIMARY'' on query. Default database: 'testdb'. Query: 'INSERT INTO `testtable` (`tabid`, `tabip`, `stat`, `ins`) VALUES ('83908', '2591777309', '1', NOW())'
So we did everything right, but our replication is broken again? The problem is that with such a reset it can happen that an auto-increment value of the table was reserved but not used, because the server was reset just in the middle of the insert operation; or the row could have been inserted properly, but the server was reset in the middle of updating the replication metadata! So you have two options:
- Change the auto-increment value of the table if there is no record with the ID of the duplicate entry; first, select it:
SELECT * FROM testtable WHERE tabid=[ID_FROM_THE_ERROR];
If there is no row with such an ID, change the auto-increment of the table with
ALTER TABLE tbl AUTO_INCREMENT = [ID_FROM_THE_ERROR];
- Skip the duplicate entry query with
STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1;
START SLAVE;
or for parallel replication use
STOP SLAVE;
START SLAVE UNTIL sql_after_mts_gaps;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1;
START SLAVE;
You can trace the problem by reading the relay logs at the position where it stopped.
Often there is an issue with the last recorded position, so you should examine why you have a duplicate entry. Check if the entry was inserted, and if it was, just skip it! But if you then hit another duplicate entry or another error, you should reinitialize the slave by dumping the replicated databases from the master!
Here is the full log of status command, when there is a problem with the corrupted mysql relay logs:
mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 1.1.1.1 Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.005014 Read_Master_Log_Pos: 246696051 Relay_Log_File: mysqld-relay-bin.009911 Relay_Log_Pos: 152146956 Relay_Master_Log_File: mysql-bin.005014 Slave_IO_Running: Yes Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 1594 Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Skip_Counter: 0 Exec_Master_Log_Pos: 152146793 Relay_Log_Space: 246698113 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 1594 Last_SQL_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. 
Replicate_Ignore_Server_Ids: Master_Server_Id: 2 Master_UUID: ce8a6c29-cf8e-11e5-9d39-000000000001 Master_Info_File: /var/lib/mysql/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: 180226 11:54:51 Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 0 1 row in set (0.00 sec)
Create a simple spamassassin rule to catch words
We do not often need to write custom rules to fight spam, but sometimes we do, because a spammer may specifically target our server or clients. If you use SpamAssassin, here is what you can do to create a simple rule that finds words and rates the message with a desired score, which will (probably) mark it as spam.
The template is as follows:
- headers search; the example template is for the Subject header, but you could use any other header name.
header <RULENAME> Subject =~ /word1, word2, word3, ..., wordN/
score <RULENAME> <score>
describe <RULENAME> <description>
- body search
body <RULENAME> /word1, word2, word3, ..., wordN/
score <RULENAME> <score>
describe <RULENAME> <description>
Put these 3 lines (or all 6 above for both headers and body) in your SpamAssassin configuration file, which is probably here:
- /etc/mail/spamassassin/local.cf – CentOS 7
- /etc/spamassassin/ – Ubuntu 16/17, Gentoo
- ~/.spamassassin/user_prefs.cf – custom file per user
Here is an example of the rules:
header CONTAINS_VIG Subject =~ /apple, orange/
score CONTAINS_VIG 1.5
describe CONTAINS_VIG Bad Word fruits in the Subject
body CONTAINS_PEN /apple, orange/
score CONTAINS_PEN 1.5
describe CONTAINS_PEN Bad Word in the Body
These rules catch messages containing apple and orange in the Subject and body and add 1.5 to the scoring system; for your purposes you may need to increase the score drastically, depending on your required spam score (check for it in local.cf).
* Update
As Rob Morin proposed in the comments, it is a good idea to add “/i” to catch both lower-case and capital letters (“ignore case”), like this:
header CONTAINS_VIG Subject =~ /apple, orange/i
score CONTAINS_VIG 1.5
describe CONTAINS_VIG Bad Word fruits in the Subject
body CONTAINS_PEN /apple, orange/i
score CONTAINS_PEN 1.5
describe CONTAINS_PEN Bad Word in the Body
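SpamAssassin body/header patterns are ordinary Perl-style regular expressions, so the effect of the /i modifier can be seen with a quick grep test (the subject line below is a made-up example):

```shell
#!/bin/sh
# Made-up subject to test the pattern against.
subject='Fresh Apple, Orange delivery'

# Without /i (case-sensitive grep) the capitalized words do not match:
printf '%s\n' "$subject" | grep -q 'apple, orange' && echo 'sensitive: match' || echo 'sensitive: no match'

# With /i (grep -i) the same pattern matches regardless of case:
printf '%s\n' "$subject" | grep -qi 'apple, orange' && echo 'insensitive: match' || echo 'insensitive: no match'
```

After editing the rules, running spamassassin --lint is a quick way to check the configuration for syntax errors before reloading the daemon.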
bacula fatal error – Unable to connect to Storage daemon
Bacula is an open source enterprise backup system! Check out the official site here
It is complex but useful software, which can automate the whole backup process of all your servers.
Some errors are easy to track, some are not, so here is one error with a misleading error message if you do not know or have forgotten the details of how the daemons work.
Here is the error extracted from the logs:
01-Sep 00:45 backup01-de-dir JobId 8789: No prior Full backup Job record found. 01-Sep 00:45 backup01-de-dir JobId 8789: No prior or suitable Full backup found in catalog. Doing FULL backup. 01-Sep 00:45 backup01-de-dir JobId 8789: Job srv123us.2017-09-01_00.45.28_34 waiting 103 seconds for scheduled start time. 01-Sep 00:47 backup01-de-dir JobId 8789: Start Backup JobId 8789, Job=srv123us.2017-09-01_00.45.28_34 01-Sep 00:47 backup01-de-dir JobId 8789: Using Device "web" to write. 01-Sep 00:51 srv123us-fd JobId 8789: Warning: bsock.c:112 Could not connect to Storage daemon on 1.1.1.1:9103. ERR=Connection timed out 01-Sep 01:17 srv123us-fd JobId 8789: Fatal error: bsock.c:118 Unable to connect to Storage daemon on 1.1.1.1:9103. ERR=Interrupted system call 01-Sep 01:17 srv123us-fd JobId 8789: Fatal error: job.c:1893 Failed to connect to Storage daemon: 1.1.1.1:9103 01-Sep 01:17 backup01-de-dir JobId 8789: Fatal error: Bad response to Storage command: wanted 2000 OK storage 01-Sep 01:17 backup01-de-dir JobId 8789: Error: Bacula backup01-de-dir 7.0.5 (28Jul14): Build OS: x86_64-pc-linux-gnu ubuntu 16.04 JobId: 8789 Job: srv123us.2017-09-01_00.45.28_34 Backup Level: Full (upgraded from Incremental) Client: "srv123us" 7.0.5 (28Jul14) x86_64-pc-linux-gnu,ubuntu,16.04 FileSet: "web" 2017-11-07 17:19:45 Pool: "web-full" (From Job FullPool override) Catalog: "ucdn" (From Client resource) Storage: "web" (From Job resource) Scheduled time: 01-Sep-2018 00:47:11 Start time: 01-Sep-2018 00:47:11 End time: 01-Sep-2018 01:17:23 Elapsed time: 30 mins 12 secs Priority: 10 FD Files Written: 0 SD Files Written: 0 FD Bytes Written: 0 (0 B) SD Bytes Written: 0 (0 B) Rate: 0.0 KB/s Software Compression: None VSS: no Encryption: no Accurate: no Volume name(s): Volume Session Id: 4719 Volume Session Time: 1510075534 Last Volume Bytes: 0 (0 B) Non-fatal FD errors: 2 SD Errors: 0 FD termination status: Error SD termination status: Waiting on FD Termination: *** Backup Error ***
But when we check the status of the client from "bconsole" (Bacula's management console), everything seems OK: the backup server (the Director daemon, bacula-dir) connects to the client daemon (the Bacula File service, bacula-fd) on the server and gets its report. Even when a backup job is running, the status report is OK and the backup is running on the client. Here is the output:
srv@local ~ # bconsole
Connecting to Director localhost:9101
1000 OK: 1 backup01-de-dir Version: 7.0.5 (28 July 2014)
Enter a period to cancel a command.
*status
Status available for:
     1: Director
     2: Storage
     3: Client
     4: Scheduled
     5: All
Select daemon type for status (1-5): 3
The defined Client resources are:
     1: srv1us
     2: srv2us
     3: srv123us
Select Client (File daemon) resource (1-3): 3
Connecting to Client srv123us at 108.61.250.36:9102
srv123us-fd Version: 7.0.5 (28 July 2014)  x86_64-pc-linux-gnu ubuntu 16.04
Daemon started 23-Feb-17 00:43. Jobs: run=1 running=0.
 Heap: heap=98,304 smbytes=571,344 max_bytes=571,361 bufs=97 max_bufs=97
 Sizes: boffset_t=8 size_t=8 debug=0 trace=0 mode=0,0 bwlimit=0kB/s
 Plugin: bpipe-fd.so

Running Jobs:
JobId 8789 Job srv123us.2017-09-01_00.45.28_34 is running.
    Incremental Backup Job started: 01-Sep-17 00:45
    Files=0 Bytes=0 AveBytes/sec=0 LastBytes/sec=0 Errors=0
    Bwlimit=0
    Files: Examined=5 Backed up=0
    SDReadSeqNo=6 fd=5
Director connected at: 01-Sep-17 01:10
====

Terminated Jobs:
====
As you can see, everything seems OK in the status output: there was a running job on the client server and the backup process had apparently been running without errors for more than 20 minutes, but then it suddenly hit a fatal error (the first log):
01-Sep 00:51 srv123us-fd JobId 8789: Warning: bsock.c:112 Could not connect to Storage daemon on 1.1.1.1:9103. ERR=Connection timed out
01-Sep 01:17 srv123us-fd JobId 8789: Fatal error: bsock.c:118 Unable to connect to Storage daemon on 1.1.1.1:9103. ERR=Interrupted system call
01-Sep 01:17 srv123us-fd JobId 8789: Fatal error: job.c:1893 Failed to connect to Storage daemon: 1.1.1.1:9103
01-Sep 01:17 backup01-de-dir JobId 8789: Fatal error: Bad response to Storage command: wanted 2000 OK storage
And the problem is this: the Director (the backup server) can connect to the File service on the client (the daemon on the client), but the opposite connection is not possible! When the backup is ready to transfer data, the client daemon (Bacula File service) connects to the Bacula Storage service (which could be on the same server as the Director, or on another server) to send the backup files, and here is where it fails: the client could not connect to the storage! So always check the connections in both directions: backup server -> client server, port 9102, and client -> backup (or storage) server, port 9103.
In the world of bacula:
bacula-dir -> bacula-fd:9102
bacula-fd -> bacula-sd:9103
The error is misleading: at a casual look it seems bacula-sd is returning an error to bacula-fd (which would mean bacula-fd could connect to bacula-sd after all), but in reality bacula-dir received and logged the fact that bacula-fd could not connect to bacula-sd, resulting in the fatal error.
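The two directions above can be checked quickly from a shell using bash's built-in /dev/tcp redirection. A minimal sketch; the hostnames are placeholders, so replace them with your own addresses and run the first check on the backup server and the second on the client:

```shell
#!/bin/bash
# Hypothetical hosts - replace with your own.
CLIENT=client.example.com     # server running bacula-fd
STORAGE=storage.example.com   # server running bacula-sd

# Direction 1, run on the Director: is the File daemon port open?
if timeout 5 bash -c "</dev/tcp/$CLIENT/9102" 2>/dev/null; then
    echo "bacula-fd on $CLIENT:9102 is reachable"
else
    echo "cannot reach bacula-fd on $CLIENT:9102"
fi

# Direction 2, run on the client: is the Storage daemon port open?
if timeout 5 bash -c "</dev/tcp/$STORAGE/9103" 2>/dev/null; then
    echo "bacula-sd on $STORAGE:9103 is reachable"
else
    echo "cannot reach bacula-sd on $STORAGE:9103"
fi
```

If the first check succeeds but the second fails, you have exactly the situation from the logs above.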
In our situation the firewall on the backup server was denying connections from the client, but it could also be a DNS resolution issue or another network problem – the most common causes are firewall rules and DNS resolution. The solution: just add an accept rule for the IP of the client to connect to port 9103 of the backup (storage) server.
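With iptables such an accept rule might look like the fragment below. This is only a sketch: the client IP 2.2.2.2 is made up for the example, and the rule position must match your own firewall layout:

```shell
# allow the Bacula client (hypothetical IP 2.2.2.2) to reach the Storage daemon port
iptables -I INPUT -p tcp -s 2.2.2.2 --dport 9103 -j ACCEPT
# verify the rule is in place
iptables -L INPUT -n | grep 9103
```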
SUPERMICRO IPMI/KVM module tips – reset the unit and the admin password
Following the previous howto "SUPERMICRO IPMI to use one of the one interfaces or dedicated LAN port" (which shows how to install the needed tool for managing the IPMI/KVM unit from the console) on setting the network configuration, there are a couple of interesting and important tips for working with the IPMI/KVM module. Here they are:
- Reset the IPMI/KVM module – sometimes the keyboard or mouse does not work when the Console Redirection is loaded. It is easy to reset the unit from the web interface, but there are cases when the web interface is not working – so ssh to your server and try one of the following commands:
* warm reset – like a reboot, it tells the IPMI/KVM to reboot itself:
ipmitool -I open bmc reset warm
It does not work in all situations! So try a cold reset:
* cold reset – resets the IPMI/KVM as if the power to the unit were unplugged and plugged back in:
ipmitool -I open bmc reset cold
- Reset the configuration of an IPMI/KVM module to factory defaults. It is useful when something goes wrong while upgrading the firmware of the unit and the old configuration is not supported (or it says it is, but in the end the unit does not work properly). In rare cases it might also help when the KVM part (Keyboard, Video, Mouse, aka Console Redirection) does not work.
Here is the command for resetting to factory defaults:
ipmitool -I open raw 0x3c 0x40
- Reset the admin password – reset the password for the administrator login of the IPMI/KVM unit. It is easy to lose the password, so with the help of the local console on the server you can reset it to a simple one and then change it from the web interface.
ipmitool -I open user set password 2 ADMIN
The number “2” is the ID of the user, check it with:
[root@srv0 ~]# ipmitool -I open user list
ID  Name       Callin  Link Auth  IPMI Msg  Channel Priv Limit
1              true    false      false     Unknown (0x00)
2   ADMIN      true    false      false     Unknown (0x00)
3              true    false      false     Unknown (0x00)
4              true    false      false     Unknown (0x00)
5              true    false      false     Unknown (0x00)
6              true    false      false     Unknown (0x00)
7              true    false      false     Unknown (0x00)
8              true    false      false     Unknown (0x00)
9              true    false      false     Unknown (0x00)
10             true    false      false     Unknown (0x00)
Sometimes, if a hacker has got into your IPMI/KVM, you could see it in the user table printed by the above command. There was a serious bug (effectively a backdoor) in some of these units: the ID of the ADMIN user or even the username could be changed, so you should use the list command to review the current user table.
Use set name to set the username of a user:
ipmitool -I open user set name 2 ADMIN
- Set a new network configuration. It’s worth mentioning again the howto for this purpose – “SUPERMICRO IPMI to use one of the one interfaces or dedicated LAN port”
All of the above commands can also be executed over the network using the lanplus interface of ipmitool:
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN bmc reset warm
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN bmc reset cold
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN raw 0x3c 0x40
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN user set password 2 ADMIN
ipmitool -I lanplus -H 192.168.7.150 -U ADMIN -P ADMIN user list
The IP 192.168.7.150 is the IP of the IPMI/KVM module you want to manage with the above commands.
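These calls can be wrapped in a small shell helper so the host and credentials live in one place. This is only a sketch: the `ipmi` function name and the IPMI_HOST/IPMI_USER/IPMI_PASS variables are made up for the example, and with DRY_RUN set the helper only prints the command instead of executing it:

```shell
#!/bin/bash
# Hypothetical wrapper around ipmitool's lanplus interface.
ipmi() {
    local cmd=(ipmitool -I lanplus \
        -H "${IPMI_HOST:-192.168.7.150}" \
        -U "${IPMI_USER:-ADMIN}" \
        -P "${IPMI_PASS:-ADMIN}" "$@")
    if [ -n "$DRY_RUN" ]; then
        echo "${cmd[*]}"      # just print what would run
    else
        "${cmd[@]}"           # really execute ipmitool
    fi
}

# show the commands without touching any hardware
DRY_RUN=1 ipmi bmc reset warm
DRY_RUN=1 ipmi user list
```

Unset DRY_RUN (and export the three variables for your module) to execute the real ipmitool commands.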
Tunneling the IPMI/KVM ports over ssh (supermicro ipmi ports)
The best security for a remote management unit such as IPMI/KVM is to give it a local IP. All IPMI/KVM modules should be plugged into a separate switch, with a local sub-network used for their LAN settings. So to connect to an IPMI/KVM module you need a VPN connection to gain access to the local sub-network used for your servers' management modules. However, sometimes the VPN cannot be used: the VPN server happens to be down, or you are at a place restricting unknown ports (or ports above 1024), which your VPN uses (that's why the VPN server should use only one of the most popular ports – 80 or 443, but that's a topic for another howto…), and so on. So you end up unable to connect to the VPN server – or you may decide you do not need a VPN server at all, because you can always use
openssh
to do the trick of tunneling ports from your computer to the IPMI/KVM module of your server through a server which has access to the local sub-network of the IPMI/KVM modules.
So here is what you need to get to the remote management of your server just using ssh for tunneling:
STEP 1) A server, which has access to the IP network of the IPMI/KVM modules.
Let's say you assigned all your servers' IPMI/KVM modules IPs from the network 192.168.7.0/24. Your server must then have an IP from 192.168.7.0/24, for example 192.168.7.1 – add it as an alias or on a dedicated LAN interface connected to the switch into which all your IPMI/KVM modules are plugged. This server will be used as a relay point to the selected IPMI/KVM IP.
STEP 2) Tunnel local selected ports using ssh to the server from STEP 1)
Use this command:
ssh -N -L 127.0.0.1:80:[IPMI-IP]:80 -L 127.0.0.1:443:[IPMI-IP]:443 -L 127.0.0.1:5900:[IPMI-IP]:5900 -L 127.0.0.1:623:[IPMI-IP]:623 root@[SERVER-IP]
For example using 192.168.7.150 for an IPMI/KVM IP:
[root@srv0 ~]# ssh -N -L 127.0.0.1:80:192.168.7.150:80 -L 127.0.0.1:443:192.168.7.150:443 -L 127.0.0.1:5900:192.168.7.150:5900 -L 127.0.0.1:623:192.168.7.150:623 root@example-server.com
With the above command you can use the web interface (https://127.0.0.1/ – you could replace 127.0.0.1 with a local IP or a local IP alias of your machine), the Java Web Start "Console Redirection" (the KVM – Keyboard, Video and Mouse), and you can mount Virtual Media from your computer into your server's virtual CD/DVD device. Unfortunately, to use the Virtual CD/DVD properly you must also tunnel UDP on port 623 (not only TCP 623), which is a little bit tricky. To tunnel the UDP packets the
socat – Multipurpose relay (SOcket CAT)
program must be used.
STEP 3) Tunnel local selected ports using ssh to the server from STEP 1) and UDP port using socat
[root@srv0 ~]# socat -T15 udp4-recvfrom:623,reuseaddr,fork tcp:localhost:8000
[root@srv0 ~]# ssh -L8000:localhost:8000 -L 127.0.0.1:80:192.168.7.150:80 -L 127.0.0.1:443:192.168.7.150:443 -L 127.0.0.1:5900:192.168.7.150:5900 -L 127.0.0.1:623:192.168.7.150:623 root@example-server.com socat tcp4-listen:8000,reuseaddr,fork UDP:192.168.7.150:623
The first socat starts a UDP listening socket on localhost port 623. Every packet received there is relayed over TCP to localhost port 8000, which the ssh command tunnels to the remote server, where another socat listens on TCP port 8000 and relays every packet to UDP port 623 of IP 192.168.7.150. Replace the IP 192.168.7.150 with your IPMI/KVM IP.
* Here are the required ports for SUPERMICRO IPMI functionality in X9 and X10 motherboards
-
For X9 motherboards, the ports are
TCP Ports
HTTP: 80
HTTPS: 443
SSH: 22
WSMAN: 5985
Video: 5901
KVM: 5900
CD/USB: 5120
Floppy: 5123
Virtual Media: 623
SNMP: 161
UDP ports:
IPMI: 623
-
For X10 motherboards, the ports are
TCP Ports
HTTP: 80
HTTPS: 443
SSH: 22
WSMAN: 5985
Video: 5901
KVM: 5900, 3520
CD/USB: 5120
Floppy: 5123
Virtual Media: 623
SNMP: 161
UDP ports:
IPMI: 623
You could add the required port to the ssh command above if you need it!
Virtual Device mounted successfully
Successful mount in Console Redirection with Virtual Media:
If you are logged in to the server and mount an ISO with the Virtual Device, you'll probably see this in "dmesg":
[46683751.661063] usb 2-1.3.2: new high-speed USB device number 8 using ehci-pci
[46683751.795048] usb 2-1.3.2: New USB device found, idVendor=0ea0, idProduct=1111
[46683751.795051] usb 2-1.3.2: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[46683751.795365] usb-storage 2-1.3.2:1.0: USB Mass Storage device detected
[46683751.795553] scsi6 : usb-storage 2-1.3.2:1.0
[46683752.795730] scsi 6:0:0:0: CD-ROM ATEN Virtual CDROM YS0J PQ: 0 ANSI: 0 CCS
[46683752.806839] sr0: scsi3-mmc drive: 40x/40x cd/rw xa/form2 cdda tray
[46683752.806842] cdrom: Uniform CD-ROM driver Revision: 3.20
[46683752.806933] sr 6:0:0:0: Attached scsi CD-ROM sr0
[46683752.806971] sr 6:0:0:0: Attached scsi generic sg1 type 5
Set IP to the IPMI/KVM server module with ipmitool
The IPMI/KVM module is a pretty useful add-on to every server. In fact, every server should have an IPMI module installed for fast management in critical cases!
Here are the commands to set a static IP to the IPMI/KVM module with ipmitool using a console to the server:
ipmitool -I open lan set 1 ipsrc static
ipmitool -I open lan set 1 ipaddr [IPADDR]
ipmitool -I open lan set 1 netmask [NETMASK]
ipmitool -I open lan set 1 defgw ipaddr [GW IPADDR]
ipmitool -I open lan set 1 access on
- [IPADDR] – the IP address of the IPMI/KVM
- [NETMASK] – the netmask of the network
- [GW IPADDR] – the gateway of the network
Here is a real-world example of properly setting the LAN settings of the IPMI module.
[root@srv0 ~]# ipmitool -I open lan set 1 ipsrc static
[root@srv0 ~]# ipmitool -I open lan set 1 ipaddr 192.168.6.45
Setting LAN IP Address to 192.168.6.45
[root@srv0 ~]# ipmitool -I open lan set 1 netmask 255.255.255.0
Setting LAN Subnet Mask to 255.255.255.0
[root@srv0 ~]# ipmitool -I open lan set 1 defgw ipaddr 192.168.6.1
Setting LAN Default Gateway IP to 192.168.6.1
[root@srv0 ~]# ipmitool -I open lan set 1 access on
Set Channel Access for channel 1 was successful.
[root@srv0 ~]#
To see the current settings use:
[root@srv0 ~]# ipmitool -I open lan print
Set in Progress         : Set Complete
Auth Type Support       : NONE MD2 MD5 PASSWORD
Auth Type Enable        : Callback : MD2 MD5 PASSWORD
                        : User     : MD2 MD5 PASSWORD
                        : Operator : MD2 MD5 PASSWORD
                        : Admin    : MD2 MD5 PASSWORD
                        : OEM      : MD2 MD5 PASSWORD
IP Address Source       : Static Address
IP Address              : 192.168.6.45
Subnet Mask             : 255.255.255.0
MAC Address             : 00:25:90:18:8b:c9
SNMP Community String   : public
IP Header               : TTL=0x00 Flags=0x00 Precedence=0x00 TOS=0x00
BMC ARP Control         : ARP Responses Enabled, Gratuitous ARP Disabled
Default Gateway IP      : 192.168.6.1
Default Gateway MAC     : 00:00:00:00:00:00
Backup Gateway IP       : 0.0.0.0
Backup Gateway MAC      : 00:00:00:00:00:00
802.1q VLAN ID          : Disabled
802.1q VLAN Priority    : 0
RMCP+ Cipher Suites     : 1,2,3,6,7,8,11,12
Cipher Suite Priv Max   : aaaaXXaaaXXaaXX
                        :     X=Cipher Suite Unused
                        :     c=CALLBACK
                        :     u=USER
                        :     o=OPERATOR
                        :     a=ADMIN
                        :     O=OEM
Bad Password Threshold  : Not Available
*Dependencies
Installation of ipmitool:
- CentOS 7
yum -y install ipmitool
- Ubuntu 16+
apt-get install ipmitool
- Gentoo
emerge -vu sys-apps/ipmitool
*Troubleshooting
If you receive errors when you execute ipmitool:
[root@srv0 ~]# ipmitool -I open lan set 1 ipaddr 192.168.6.45
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
[root@srv0 ~]# ipmitool -I open lan set 1 netmask 255.255.255.0
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
[root@srv0 ~]# ipmitool -I open lan set 1 defgw ipaddr 192.168.6.1
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
The kernel module for the IPMI/KVM is not loaded by the system, so just execute:
[root@srv0 ~]# modprobe ipmi_si
[root@srv0 ~]# modprobe ipmi_devintf
And then you could use ipmitool commands above to set the network configuration of the IPMI/KVM add-on module.
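To avoid loading the modules by hand after every reboot, they can be registered for automatic loading at boot. A minimal sketch, assuming a systemd-based distribution that reads /etc/modules-load.d:

```shell
#!/bin/bash
# Persist the IPMI kernel modules across reboots
# (systemd reads one module name per line from /etc/modules-load.d/*.conf)
mkdir -p /etc/modules-load.d
printf 'ipmi_si\nipmi_devintf\n' > /etc/modules-load.d/ipmi.conf
cat /etc/modules-load.d/ipmi.conf
```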
megacli – restart a rebuild with a disk in failed state
Sometimes we need to restart a rebuild with a disk in failed state when using an LSI hardware controller, but if we simply flip the failed disk back to the good state, it will return to the array immediately and our filesystem will surely be broken! In addition, it sometimes happens that when we replace a disk, the new disk comes up in failed state, too.
So here are simple and tested steps for properly resetting the failed state of a disk to a good state and starting a rebuild. In the example below the disk in failed state is [32:1]; replace it with the proper [enclosure_id:slot_id] in your case.
- Turn the "Failed State" into "Unconfigured(Bad)"
megacli -pdmarkmissing -physdrv[32:1] -aAll
- Prepare for removal (this command could fail; it is not a critical one)
megacli -pdprprmv -physdrv[32:1] -a0
- Make the state of the disk “Unconfigured(Good), Spun Up”
megacli -PDMakeGood -PhysDrv[32:1] -a0
- Start the rebuild (this command could fail) – if the command fails, continue with the next step; if not, the rebuild has been restarted successfully.
megacli -PDRbld -Start -PhysDrv[32:1] -a0
Or
megacli -pdlocate -start -physdrv[32:1] -a0
One of the two commands will probably start the rebuild, but if both fail, continue to the next step.
- Start the rebuild by first clearing the foreign configuration and then making the device a hot spare (only if both commands in step 4 above failed)
megacli -CfgForeign -Clear -aALL
#set global hotspare
megacli -PDHSP -Set -PhysDrv [32:1] -a0
* If you need to unset/remove a global hotspare:
megacli -PDHSP -Rmv -PhysDrv [32:1] -aN
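The whole sequence above can be scripted with the fallbacks chained together. This is only a sketch: the `megacli` shell function below is a stub that merely prints each call, so the flow can be followed without touching a real controller – remove the stub (keeping the real megacli binary in PATH) to run it for real; [32:1] and adapter 0 are the example values used throughout this howto:

```shell
#!/bin/bash
# Stub for illustration: prints the megacli calls instead of executing them.
# Delete this function to use the real megacli binary.
megacli() { echo "megacli $*"; }

DISK="[32:1]"   # [enclosure_id:slot_id] of the failed disk
ADP=0           # adapter number

megacli -pdmarkmissing -physdrv$DISK -aAll
megacli -pdprprmv -physdrv$DISK -a$ADP || true   # may fail, not critical
megacli -PDMakeGood -PhysDrv$DISK -a$ADP
# try to start the rebuild; fall back to clearing the foreign
# configuration and setting the disk as a global hot spare
megacli -PDRbld -Start -PhysDrv$DISK -a$ADP \
  || megacli -pdlocate -start -physdrv$DISK -a$ADP \
  || { megacli -CfgForeign -Clear -aALL; megacli -PDHSP -Set -PhysDrv $DISK -a$ADP; }
```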