Backup. Stress down.

Here’s a simple backup script to drop onto your Linux-based server. It backs up a file and rotates the file names, once a day.

I wanted something simple but powerful. It requires only the ubiquitous date command, present on most systems, yet keeps enough backups to cover most situations: one per month on a twelve-month rotation, one per week on a five-week rotation, and one per day on a seven-day rotation.

The code simply exploits the fact that day and month names repeat in a predictable way. Monday, for instance, comes around every seven days, so a file called Monday will be overwritten seven days later when we save another file called Monday. The same goes for every other day name: newer files simply replace older ones.
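To make that concrete, here are the three date format strings the rotation relies on; each one repeats on a different cycle, which is what gives us the three tiers of backups:

```shell
# The three date(1) format strings the rotation relies on:
DAY=$(date +%A)      # weekday name, e.g. "Monday"  - repeats every 7 days
DATE=$(date +%d)     # day of month, "01".."31"     - repeats every month
MONTH=$(date +%B)    # month name, e.g. "January"   - repeats every year

echo "Daily slot: ${DAY}; weekly slot: ${DATE}; monthly slot: ${MONTH}"
```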

It’s on my Bitbucket here:
https://bitbucket.org/akademy/backup-script/src/master/
which you can check out for more details. But also see the code below.

backup_rotate_store () {

	local usage
	usage="Usage: 'backup_rotate_store <directory> <original_filename>'.
The original file should be in <directory>. Pass in the current name of the file."

	# Check for parameter $1
	if [ -z "$1" ]
	then
		echo "First parameter not set to a directory"
		echo "${usage}"
		return 1
	fi

	# Check for parameter $2
	if [ -z "$2" ]
	then
		echo "Second parameter not set to original filename"
		echo "${usage}"
		return 1
	fi

	local directory
	directory=$1
	local original_filename
	original_filename=$2

	local DAY
	DAY=$(date +%A)

	# Daily backups (Sunday, Monday, etc.)
	mv "${directory}/${original_filename}" "${directory}/${DAY}.${original_filename}"

	local DATE
	DATE=$(( 10#$(date +%d) ))	# force base 10 so "08" and "09" aren't read as octal
	if (( DATE == 1 || DATE == 8 || DATE == 15 || DATE == 22 || DATE == 28 )); then

		local EXTENSION
		EXTENSION='th'
		if (( DATE == 1 )); then
			EXTENSION='st'
		fi
		if (( DATE == 22 )); then
			EXTENSION='nd'
		fi

		# Weekly backups (1st, 8th, etc.)
		cp --archive "${directory}/${DAY}.${original_filename}" "${directory}/${DATE}${EXTENSION}.${original_filename}"

		if (( DATE == 28 )); then
			local MONTH
			MONTH=$(date +%B)

			# Monthly backups (January, February, etc.)
			cp --archive "${directory}/${DAY}.${original_filename}" "${directory}/${MONTH}.${original_filename}"
		fi

	fi
}

To use it, just move the file you want to back up to where it should be stored, then call the function, e.g.:

echo "My backup testfile" > testfile

DEST=/data/backups
FILENAME=testfile.txt

mv testfile "${DEST}/${FILENAME}"

backup_rotate_store "${DEST}" "${FILENAME}"
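For the once-a-day part, schedule a wrapper script from cron; a sketch, assuming you've saved the steps above as /usr/local/bin/backup.sh (the path and time here are illustrative):

```
# crontab entry: run the backup at 02:00 every day
0 2 * * * /usr/local/bin/backup.sh
```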

Enjoy, and may it bring you luck.

Containerise and streamline

A modern container architecture – like one based on Docker – has various advantages:
  • Fast to set up
  • Self-documenting
  • Easy to reuse containers
  • Easy to update containers
  • Improved security

Some definitions

  • A container is essentially a complete machine only containing the parts necessary for a particular piece of software to run. A container can be based on another container – called a base container – to inherit its capabilities and be adapted further.
  • A container exists on a host machine. The host machine runs the container application which can run one or more containers.
  • A container system involves many containers working together.
  • A container hub is a repository of already configured containers each for a particular piece of software. (e.g. Docker Hub)

Advantages

Fast setup

Docker means agile: faster setup. Once the container application is installed, a system of containers can be set up with a single command. This involves retrieving a base system and building each specific container, and it means any system can be quickly rebuilt. For developers, multiple systems can run on a single machine for easy testing.
It is also possible to retrieve already-configured containers with a single command. Servers such as MySQL, Solr or WordPress can be retrieved and running within seconds.
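As a sketch of what a "system of containers" looks like in practice, here is a minimal, hypothetical Docker Compose file for a WordPress site backed by MySQL (the image names are real; everything else is illustrative). A single `docker-compose up` retrieves and starts both containers:

```yaml
# docker-compose.yml - hypothetical two-container system
version: "3"
services:
  db:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: example    # illustrative only
  web:
    image: wordpress
    ports:
      - "80:80"
    depends_on:
      - db
```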

Container definition

A container is specified in a single file, through which all the parts that are needed are installed. This text file can be added to a normal version control system.
This file holds the full specification for setting up a particular piece of software, and can therefore be used to rebuild the software even outside of a container architecture.
A further text file can be used to set up the full container system; again, compatible with version control.
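As an illustration, a minimal Dockerfile sketch for a hypothetical Python web application (the base image is real; app.py is a placeholder):

```dockerfile
# A container specified in a single, version-controllable file
FROM python:3                  # the base container we build upon

COPY app.py /srv/app.py        # hypothetical application file
EXPOSE 8000
CMD ["python3", "/srv/app.py"]
```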

Container separation

Each container runs as a separate entity. This means it is simple to reuse each part. It also means a container system can have one or more of its containers replaced without affecting the others, and each container can be built and tested separately.

Updates

Updates can happen without downtime by simply switching from an old container to a new one. New container versions (or base systems) can also be retrieved from a hub, and individual containers rebuilt using the newer versions.

Own hubs

It is possible and desirable to have a private hub holding our own containers. This enables reuse and standard practice across servers and projects and speeds up future development work.
It can also hold a base system which we can use to build all our systems.

Improved security

Root privileges obtained through compromised software can only affect the container they occur in; other containers and the host system itself run separately and are not affected.

Separation of work files

Containers can have virtual folders which connect the host file system with the containers. Configuration files, database storage, logs, etc. can then be accessed from the host, for example for easy backup. Virtual folders can also point at folders mounted on the host, and therefore live on separate drive storage.

Simpler monitoring

As all the software necessary to run a server is controlled under a single container application, it should be clearer what needs to be running. The container application also monitors its containers and can restart those that fail.
Reboot scripts are also simplified, as essentially they just involve starting the one container application.

WebViews – Seeing all your website.

So what’s a WebView? It’s just a small window showing a webpage. Here’s an image of a page with four views on it; click it to go to the page:

As you can see, WebViews provide a way to see many of your webpages at once. There’s no need to load multiple pages or to click through; you just open this one page. Furthermore, it’ll do some error checking for you too.

Continue reading “WebViews – Seeing all your website.”

SSH – A Brief Software Engineer’s Masterclass

SSH is a secure protocol for communicating between computers. There are many useful tools built on top of this protocol, and they should be a part of every software engineer’s toolkit. This blog will detail how to connect to remote computers quickly and more securely, several ways to transfer files between computers (and edit them), and how to forward ports so that services (such as databases) appear to be running locally.

Image of console using SSH

These commands are most easily run on Linux systems, but they have equivalents on other platforms too. In particular, the rise in popularity of mini computers like the Raspberry Pi, which often run without screens, means these techniques are more widely needed than ever.
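As a taste of what’s to come, here is a hypothetical ~/.ssh/config entry (the host name, address, key file and ports are all illustrative). With it in place, plain `ssh pi` replaces the full user/host/key incantation, and the LocalForward line makes the remote MySQL service appear on a local port:

```
# ~/.ssh/config
Host pi
    HostName 192.168.1.50
    User pi
    IdentityFile ~/.ssh/id_ed25519
    LocalForward 3306 localhost:3306   # remote MySQL appears locally
```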

Continue reading “SSH – A Brief Software Engineer’s Masterclass”

Gaia star data with D3 – part 2 – prettier, faster

Welcome back! (If that greeting isn’t appropriate, you’ll want to check out part 1 first: Gaia star data with D3 – part 1.) We are going to make everything look much better, and then do a couple of optimizations.

Let’s make the sky look more natural, i.e. black. The current SVG specification has no way to set a background colour, so to work around that we’ll place a rectangle across the whole image.

Continue reading “Gaia star data with D3 – part 2 – prettier, faster”

Gaia star data with D3 – part 1

TL;DR: I’m going to show you how to take a fairly large dataset – in this case the galactic stars – and make some nice visualizations with a JavaScript library called D3. They look great.

Basic view of data.

Did you see the recent news from the ESA Gaia mission? It will eventually give the precise position and motion of one billion stars in our Galaxy (which is actually “only” one percent of the stellar population!). An initial release of data happened on 13th September 2016 and included the position, motion and distance of two million stars.

I wanted to take this data and play around with it, and I like to play around with data with the help of D3. This is a web-based visualisation tool, which unfortunately is restricted by the capabilities of today’s browsers. This means I’ve had to reduce the number of data points; however, I suspect we can still get some nice results.

Continue reading “Gaia star data with D3 – part 1”

Generating Webpages

Check out my post about the work I’m doing at the University of Oxford’s eResearch Centre. It’s about taking semantically linked data and generating a useful, website-like view of it.

http://www.semanticaudio.ac.uk/blog/generating-webpages-for-digital-objects/

I’m using Node.js and Dust with Jena’s Fuseki SPARQL database to select and display the data. More to come.

Full text follows.

Continue reading “Generating Webpages”

Creating Excel files from CSV with Python

When you need to extract data from a system, you can bet that someone will claim they have to have it in a Microsoft Excel format rather than the simple CSV format you had in mind. I’m unlikely to need to tell you this – as you’re already here – but there are huge and annoying differences between these two formats. However, I’m not going to list them here; instead I want to introduce you to a nice Python library: openpyxl ( https://pypi.python.org/pypi/openpyxl ).

This beautiful Excel library is a great example of how a good library should be written – it takes a complex problem and hides it away behind a nicely designed interface, and having written an Excel parser myself I can assure you that was unlikely to be an easy task! Basically, this library lets you read and write Excel files and requires next to no knowledge of the Excel format.

My little piece of code takes a list of CSV files, each with a name, and inserts them all into a new Excel file; you’ll get one sheet per CSV. It also improves the display by making the first line bold (assuming these are column titles) and adjusting the width of each column depending on the size of the strings within it.

You can find the full code here: https://bitbucket.org/akademy/csvtoexcel/src/e92654d86ba35e221332c4365c78662b48f274c0/excel_writer.py?at=master&fileviewer=file-view-default

And you can use it like this:

	ew = ExcelWriter()
	ew.convert( {
		# Specify the csv files, whether it has column titles (default: yes) and the names
		"sheets" : [
			{
				"filelocation" : "test/work.csv",
				"sheetname" : "Works"
			},
			{
				"filelocation" : "test/person.csv",
				"sheetname" : "People",
				"has_titles" : False
			}
		],
		# Specify the output name
		"outputname" : "test/test_output_file.xlsx"
	} )

openpyxl can do much more – pretty much anything you can do in Excel – check out the docs for more info: https://openpyxl.readthedocs.io

So install the library and run the code!

Node.js – the programmer’s “Save the World” tutorial

What this tutorial is

This tutorial was designed to help you understand how to create your own node.js program – not how to copy and paste someone else’s code! (However… you might have some luck finding the complete code at https://bitbucket.org/akademy/save-the-world/src).

It is assumed you are a programmer with some experience of JavaScript and other programming languages such as Python, Ruby or Java. It’s really written for other programmers and tries to get to the point quickly (except for the odd alien invasion)!

It may be useful to imagine: Earth has only minutes to live and you are our only hope – your teacher is barking orders at you in a manic attempt to teach you the skills you’ll need to defeat the invading armada. You need to set up a node.js server before the laser bolts start burning! Quick!

Continue reading “Node.js – the programmer’s “Save the World” tutorial”