Database (DB) Repair/Restore
Support
Please contact support@verops.com if you would like an interactive walkthrough of DB repair/restore/recovery.
Intro
On some filesystems (or if you are running Simscope under an incorrectly configured Docker/Kubernetes instance), the database can be damaged.
If this happens, you may need to either:
- Restore the database from backup.
- Repair the database.
Backup / Snapshots
Note: you must maintain a Backup (or Snapshot) of your Simscope instance.
For production, Simscope needs a regularly updated backup or snapshot, in case of hardware failure.
The following backup options are available. Snapshots are the preferred mechanism because they are fast.
Mode | Description | Preferred |
---|---|---|
Directory snapshots | under NFS/NetApp | ✅ |
Volume snapshots | under Docker/Kubernetes | ✅ |
Manual rsync backup | Nightly or weekly. Note that this is quite slow. | |
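As a rough illustration of the manual rsync option, the sketch below builds an rsync command that copies db/ into a timestamped directory laid out as TIMESTAMP/db/. The function names, the paths, and the timestamp format are assumptions made for this example, not Simscope conventions.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def backup_command(db_dir, backup_root, when):
    """Return the rsync argv that copies db_dir into backup_root/TIMESTAMP/db/."""
    # Assumed timestamp format; any sortable format would do.
    stamp = when.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / stamp / "db"
    # Trailing slashes make rsync copy the directory contents,
    # rather than nesting db/ inside the destination.
    return ["rsync", "-a", f"{Path(db_dir)}/", f"{dest}/"]


def run_backup(db_dir, backup_root):
    """Create the destination directory and run the rsync copy."""
    argv = backup_command(db_dir, backup_root, datetime.now())
    Path(argv[-1]).mkdir(parents=True, exist_ok=True)
    subprocess.run(argv, check=True)
    return argv[-1]
```

A scheduler such as cron could call `run_backup("/path/to/simscope/db", "/path/to/backups")` nightly or weekly.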
Restore from Backup
To restore Simscope from backup:
- Stop the Simscope server (if running).
- Rename the current Simscope db/ directory to a backup name (i.e. db.bkp/).
- Copy the snapshot/TIMESTAMP/db/ directory to the location of the db/ directory.
  - It's best to use:
    rsync -a /path/to/snapshot/TIMESTAMP/db/ /path/to/simscope/db/
- Temporarily turn off RabbitMQ in your simscope.config, to ensure Simscope does not attempt to start processing new data.
- Start the Simscope server with the --readonly mode set.
If the database has issues, you can stop the server, and restore from a different backup.
If everything is okay, you can turn writes back on:
- Edit your simscope.config and re-enable RabbitMQ.
- Start Simscope again without the --readonly option.
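The rename-and-copy steps of the restore procedure can be sketched as follows. This is a minimal illustration rather than an official tool: the function name `restore_db` and the `shutil.copytree` fallback (used only when rsync is not installed) are assumptions, and the server must already be stopped before it runs.

```python
import shutil
import subprocess
from pathlib import Path


def restore_db(simscope_dir, snapshot_db, backup_name="db.bkp"):
    """Move the current db/ aside, then copy the snapshot's db/ into place."""
    simscope_dir = Path(simscope_dir)
    snapshot_db = Path(snapshot_db)
    db = simscope_dir / "db"
    aside = simscope_dir / backup_name

    if not snapshot_db.is_dir():
        raise FileNotFoundError(f"snapshot not found: {snapshot_db}")
    if aside.exists():
        raise FileExistsError(f"refusing to overwrite {aside}")

    # Keep the possibly damaged database under the backup name.
    if db.exists():
        db.rename(aside)

    if shutil.which("rsync"):
        # Same command as recommended in the restore steps.
        subprocess.run(["rsync", "-a", f"{snapshot_db}/", f"{db}/"], check=True)
    else:
        # Assumed fallback: a plain recursive copy (does not preserve ownership).
        shutil.copytree(snapshot_db, db, symlinks=True)
    return db
```

For example, `restore_db("/path/to/simscope", "/path/to/snapshot/TIMESTAMP/db")` leaves the old data in /path/to/simscope/db.bkp/, so a different snapshot can still be tried if the restored database has issues.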
Kubernetes/Docker
It is possible to misconfigure Kubernetes and Docker, such that they incorrectly allow multiple Simscope server container instances (or pods) writing to the database in parallel.
- If this occurs, the database may be corrupted.
You must ensure atomic access to the database volume (i.e. only one container can serve a Simscope volume at a time).
ReadWriteOnce flag
If hosting Simscope within Kubernetes volumes, you must ensure your Persistent Volume is configured with atomic access.
Set the volume accessModes to: ReadWriteOnce
- (do not use ReadWriteMany)
Example PersistentVolume and PersistentVolumeClaim with ReadWriteOnce ✅ (note this is using a manual storageClassName):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: simscope-db
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/simscope_db"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: simscope-db-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      # The request must fit within the PersistentVolume capacity (20Gi) for the claim to bind.
      storage: 20Gi
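To confirm that a deployed claim uses the expected access mode, one option is to export it with `kubectl get pvc simscope-db-claim -o json` and inspect `spec.accessModes`. The helper below is a hypothetical sketch of that check. It looks only at the access modes and does not verify how many pods mount the volume.

```python
import json


def access_mode_problems(pvc):
    """Return a list of access-mode problems found in a PersistentVolumeClaim dict."""
    modes = pvc.get("spec", {}).get("accessModes", [])
    problems = []
    if "ReadWriteOnce" not in modes:
        problems.append("ReadWriteOnce is not set")
    if "ReadWriteMany" in modes:
        problems.append("ReadWriteMany allows parallel writers")
    return problems


def load_pvc(text):
    """Parse the JSON text produced by `kubectl get pvc NAME -o json`."""
    return json.loads(text)
```

An empty list means the claim requests ReadWriteOnce only.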
Debug: Readonly Server Mode
To run Simscope in read-only mode, add the command-line argument --readonly.
- This turns off database writes and RabbitMQ reads, so you can inspect the server without changing the data (or processing new RabbitMQ data).
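For scripted startups, the read-only invocation can be assembled as below. The `serve` subcommand and the `--config`, `--db`, `--readonly`, and `--repair` flags are the ones shown in this page's repair example. The helper names and the default binary path are assumptions for illustration.

```python
import subprocess


def simscope_argv(config, db, readonly=False, repair=False, binary="bin/simscope"):
    """Build the argument list for `simscope serve` with optional mode flags."""
    argv = [binary, "serve", "--config", str(config), "--db", str(db)]
    if readonly:
        argv.append("--readonly")
    if repair:
        argv.append("--repair")
    return argv


def run_readonly(config, db):
    """Start the server in read-only mode and block until it exits."""
    return subprocess.run(simscope_argv(config, db, readonly=True), check=True)
```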
Repair Procedure
To repair Simscope DB tables, start the server with the --repair option.
- This will walk the tables and repair records (if they are recoverable).
- This can take a long time, depending on your database size.
Example:
$ bin/simscope serve --config /path/to/simscope.config --db /path/to/simscope/db/ --readonly --repair
You will see lines similar to this in the log:
[INFO ] Repairing db table=/path/to/simscope/db/tagged
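Because a repair can take a long time, it may help to track which tables have been reached so far. The sketch below extracts table paths from log lines, assuming they all follow the single sample format shown above. The function name is hypothetical and the pattern may need adjusting for real logs.

```python
import re

# Matches lines such as: [INFO ] Repairing db table=/path/to/simscope/db/tagged
REPAIR_LINE = re.compile(r"\[INFO\s*\]\s+Repairing db table=(\S+)")


def repaired_tables(log_lines):
    """Return the table paths mentioned in repair log lines, in first-seen order."""
    tables = []
    for line in log_lines:
        match = REPAIR_LINE.search(line)
        if match and match.group(1) not in tables:
            tables.append(match.group(1))
    return tables
```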
DB Consistency Check
To run a consistency/integrity check on the DB tables:
> PRAGMA integrity_check;
ok
For a quick check:
> PRAGMA quick_check;
ok
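The PRAGMA statements above use SQLite syntax. Assuming each table under db/ is an individual SQLite file (an inference from the repair log path, not something this page confirms), the checks could be scripted with Python's standard sqlite3 module. The function names are hypothetical, and each file is opened read-only so the check cannot modify it.

```python
import sqlite3
from pathlib import Path


def check_table(path, quick=False):
    """Run PRAGMA integrity_check (or quick_check) on one file; return the result rows."""
    pragma = "quick_check" if quick else "integrity_check"
    # mode=ro opens the database read-only.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(f"PRAGMA {pragma}").fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]


def check_db_dir(db_dir, quick=False):
    """Check every regular file in db_dir; map file name to its result rows."""
    results = {}
    for path in sorted(Path(db_dir).iterdir()):
        if path.is_file():
            try:
                results[path.name] = check_table(path, quick)
            except sqlite3.DatabaseError as exc:
                # Raised, for example, when the file is not a SQLite database.
                results[path.name] = [f"error: {exc}"]
    return results
```

A healthy table reports a single `ok` row. Any other rows describe the problems found.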