Adding an RTC to your Raspberry Pi

I use a RPi 3 as a secondary DNS and DHCP server, and time synchronization is important for that.  Due to some technicalities with how my network is set up, I need a real-time clock on the RPi so that it has at least some idea of the correct time when it powers up, instead of being absolutely dependent on NTP for that.

Enter the DS3231 RTC (available on eBay for a few bucks).  The Pi Hut has an excellent tutorial on setting this up on an RPi, which I’m going to summarize here.

Configure I2C on the RPi

From a root shell (I’m assuming you’re using Raspbian like me);

apt-get install python-smbus
apt-get install i2c-tools

Then, edit your /boot/config.txt and add the following at the bottom;

dtparam=i2c_arm=on
dtoverlay=i2c-rtc,ds3231

Edit your /etc/modules and add the following line;

i2c-dev

Now reboot.  Once it’s back up, run i2cdetect -y 1 and you should see the DS3231 listed at address 0x68.  If you do, great.
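
For reference, the check looks like this (the comments describe what to expect rather than literal output);

# Probe I2C bus 1 (bus 1 is correct for any recent Pi) for attached devices
i2cdetect -y 1
# The DS3231 should show up at address 0x68.  If the kernel RTC driver has
# already claimed it, that address is shown as "UU" instead, which is also fine.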

Configure Raspbian to use the RTC

After rebooting, the new device should be up, but you won’t be using it yet.  Remove the fake hardware clock with;

apt-get --purge remove fake-hwclock

Now you should be able to do hwclock -r to read the clock, and hwclock -w to write the current system time to it.

And lastly, to make it pull time from the RTC on boot, put the following into /etc/rc.local before the exit 0;

hwclock -s

You can then add a cron job in /etc/cron.weekly to run hwclock -w once a week.
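
A minimal weekly job might look like this (the filename is arbitrary);

#!/bin/sh
# /etc/cron.weekly/sync-hwclock (hypothetical filename)
# Write the NTP-disciplined system time back to the RTC once a week
hwclock -w

Make it executable (chmod +x) and cron will take care of the rest.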

Done!

Backing up KVM Virtual Machines with Duplicity + Backblaze

As part of my home DR strategy, I’ve started pushing images of all my virtual machines (as well as my other data) across to Backblaze using Duplicity.  If you want to do the same, here’s how you can do it.

First up, you will need a GnuPG keypair, since we’re going to be writing encrypted backups.  Store copies of the keys somewhere safe and offsite; you will absolutely need them to do a restore.
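
If you don’t have a keypair yet, generating and exporting one looks something like this (the email address is a placeholder for whatever UID you use);

# Generate a keypair for backup encryption (interactive)
gpg --gen-key
# Export both halves so they can be stored offsite
gpg --export --armor backups@example.com > backup-public.asc
gpg --export-secret-keys --armor backups@example.com > backup-private.asc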

Secondly, you’ll need a Backblaze account.  Get one, then generate an API key.  This consists of an account ID and an application key.  You will then need to create a bucket to store your backups in.  Make the bucket private.

With that done, I’m assuming that /var/lib/libvirt (where your VM images live) is on its own LV.  If it isn’t, make it so.  This lets you take an LV snapshot of the volume (for consistency) and then replicate that snapshot to Backblaze.
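
If you need to create such an LV, it looks roughly like this (VG name, size and filesystem are placeholders; adjust to your environment and migrate the existing contents across);

# Carve out a dedicated LV for libvirt storage and put a filesystem on it
lvcreate --size 100G --name libvirt YOURVG
mkfs.xfs /dev/YOURVG/libvirt
mount /dev/YOURVG/libvirt /var/lib/libvirt   # plus a matching /etc/fstab entry

With that in place, here’s the backup script;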

#!/bin/bash

# Parameters used by the below, customize this
BUCKET="b2://ACCOUNTID:APPLICATIONKEY@BUCKETNAME"
TARGET="$BUCKET/YOURFOLDERNAME"
GPGKEYID="YOURGPGKEYIDHERE"
LVNAME=YOURLV
VGPATH=/dev/YOURVG

# Some other parameters
SUFFIX=`date +%s`
SNAPNAME=libvirtbackup-$SUFFIX
SOURCE=/mnt/$SNAPNAME

# Prep and create the LV snap
umount $SOURCE > /dev/null 2>&1
lvremove -f $VGPATH/$SNAPNAME > /dev/null 2>&1
lvcreate --size 10G --snapshot --name $SNAPNAME $VGPATH/$LVNAME || exit 1

# Prep and mount the snap
mkdir $SOURCE || exit 1
mount -o ro,nouuid $VGPATH/$SNAPNAME $SOURCE || exit 1

# Replicate via Duplicity
duplicity \
 --full-if-older-than 3M \
 --encrypt-key $GPGKEYID \
 --allow-source-mismatch \
 $SOURCE $TARGET

# Unmount and remove the LV snap
umount $SOURCE
lvremove -f $VGPATH/$SNAPNAME
rmdir $SOURCE

# Configure incremental/full counts (--force is needed to actually delete old sets)
duplicity remove-all-but-n-full 4 --force $TARGET
duplicity remove-all-inc-of-but-n-full 1 --force $TARGET

Configure the parameters above to suit your environment.  You can use gpg --list-keys to get the 8-digit hexadecimal key ID of the key you’re going to encrypt with.  The folder name you use in your bucket is arbitrary, but you should only use one folder per Duplicity target.  The 10G LV snap size can be adjusted to suit your environment, but it must be large enough to hold all changes made to the origin LV while the backup is running.  I picked 10GB because that seems OK in my environment.

Obviously this means I need to have 10GB free in the VG that the libvirt LV lives in.

Retention here will run an incremental each time the script runs, do a full every 3 months, ditch the incrementals for all fulls except the latest one, and keep up to 4 fulls.  With a weekly backup, this amounts to roughly a 12 month recovery window, with weekly resolution for the most recent 3 months and 3-monthly resolution beyond that.  Tune to suit.  Drop the script into /etc/cron.daily or /etc/cron.weekly to run as required.
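
For example (the script name is hypothetical);

# Drop the backup script into the weekly cron directory
install -m 0755 libvirt-backup.sh /etc/cron.weekly/libvirt-backup

Note the installed name has no extension; run-parts on Debian-family systems will typically skip files with dots in their names.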

Important.  Make sure you can do a restore.  Look at the documentation for duplicity restore for help.
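
As a rough example (same placeholder bucket details as above, and the file path is hypothetical), a restore looks something like this;

# Restore the most recent backup chain into a scratch directory
duplicity restore b2://ACCOUNTID:APPLICATIONKEY@BUCKETNAME/YOURFOLDERNAME /tmp/restore

# Or pull back a single file or directory from the backup
duplicity restore --file-to-restore images/somevm.qcow2 \
  b2://ACCOUNTID:APPLICATIONKEY@BUCKETNAME/YOURFOLDERNAME /tmp/restore-one

You’ll need the GPG private key in your keyring (and its passphrase) for either of those to work.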

Splunkd High CPU after leap second addition?

Had my alerting system yell at me about high CPU load on my Splunk Free VM.

A bit of examination revealed that the box was indeed at an abnormally high load average (around 10), although nothing appeared to be actually wrong.  Then a quick look at dmesg made the penny drop;

Jan 1 10:29:59 splunk kernel: Clock: inserting leap second 23:59:60 UTC

Err.  The high CPU load average started at 10:30am, right when the leap second was added.

A restart of all the services resolved the issue.  Load average is back down to its normal levels.
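
For the record, restarting Splunk itself is just (path assumes the default install location);

# Bounce Splunk to clear the spinning threads
/opt/splunk/bin/splunk restart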

Powershell Remoting for Non-Domain Test Machines

NOTE – This isn’t particularly secure, but it works.  It’s a bit better than configuring WinRM in unencrypted mode though.

Got some non-domain joined Windows machines and you want to get WinRM running in a hurry so you can do some stuff remotely?  Do this.

On the server (the thing you are remoting to);

Invoke-WebRequest -Uri https://raw.githubusercontent.com/ansible/ansible/devel/examples/scripts/ConfigureRemotingForAnsible.ps1 -OutFile ConfigureRemotingForAnsible.ps1
.\ConfigureRemotingForAnsible.ps1
winrm quickconfig

That script is taken from Ansible, and configures the host with a self-signed SSL certificate for use with WinRM.  The final line then configures the WinRM listeners and firewall rules.

Then, on the client (the thing you’re remoting from);

# enter local admin creds here
$creds = Get-Credential

$so = New-PSSessionOption -SkipCACheck -SkipCNCheck
Invoke-Command -ComputerName YOURSERVERHERE -UseSSL -SessionOption $so -Credential $creds -ScriptBlock { Get-ChildItem env: }

You should see a dump of the local environment variables on the target machine, indicating that the invoke worked.  You can now do whatever Powershell remoting stuff you want to do.

Note, this doesn’t actually validate the certificate the server presents, so you can be MITM’ed and have your credentials captured.  For better security you should use a properly signed certificate on the server and trust it on the client correctly, but this will work fine for a home setup where you’re in control of all the layers (network, client and server).

Good luck.

Netflow Collector on Splunk – Interesting Bug

The Splunk Add-on for Netflow appears to have a bug.  If you run through the configure.sh script and accept all the defaults, it refuses to ingest any Netflow data.

This is because its cleanup script deletes all ASCII netflow data that’s older than -1 days, which matches everything.

You can easily fix this either by rerunning configure.sh and typing in every value, or by editing /opt/splunk/etc/apps/Splunk_TA_flowfix/bin/flowfix.sh and changing the following line;

# Cleanup files older than -1
find /opt/splunk/etc/apps/Splunk_TA_flowfix/nfdump-ascii -type f -mtime +-1 -exec rm -f {} \;

Change the +-1 to +1.  This tells the script to clean up ASCII netflow data older than 1 day, instead of everything older than some point in the future (i.e., everything).
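
Or, as a one-liner (path assumes the default add-on location);

# Swap the bogus "+-1" for "+1" in the cleanup job
sed -i 's/-mtime +-1/-mtime +1/' /opt/splunk/etc/apps/Splunk_TA_flowfix/bin/flowfix.sh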

Splunk integration with Docker

I’ve changed my log aggregation system over from Elastic Stack to Splunk Free over the past few days.  The primary driver for this is that I use Splunk at work, and since Splunk Free allows 500MB/day of ingestion, that’s plenty for all my home stuff.  Using Splunk at home also means I gain experience I can use professionally.

What we’ll be talking about here is how you integrate your Docker logging into Splunk.

Configure an HTTP Event Collector

Firstly, you’ll need to enable the Splunk HTTP Event Collector.  In the Splunk UI, click Settings -> Data Inputs -> HTTP Event Collector -> Global Settings.

Click Enabled alongside ‘All Tokens’, and enable SSL.  This will enable the HTTP Event Collector on port 8088 (the default), using the Splunk default certificate.  This isn’t enormously secure (you should use your own cert), but this’ll do for now.

Now, in the HTTP Event Collector window, click New Token and add a token.  Give it whatever details you like, and set the source type to json_no_timestamp.  I’d suggest you send the results to a new index, for now.

Continue the wizard, and you’ll get an access token.  Keep that, you’ll need it.
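
Before going further, you can sanity-check the token from any box with curl (hostname and token are placeholders, and -k is needed because of the default Splunk certificate);

# Send a test event to the collector
curl -k https://PUTYOURSPLUNKHOSTHERE:8088/services/collector/event \
  -H "Authorization: Splunk PUTYOURTOKENHERE" \
  -d '{"event": "hello from curl"}'
# A healthy collector replies with something like {"text":"Success","code":0}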

Configure Docker Default Log Driver

You now need to configure the default logging method used by Docker.  NOTE – Doing this will break the docker logs command, but you can find everything in Splunk anyway.  More on that soon.

You will need to override the startup command for dockerd to include some additional options.  On CentOS 7 you can do this by creating /etc/systemd/system/docker.service.d/docker-settings.conf with the following contents;

[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --log-driver=splunk --log-opt splunk-token=PUTYOURTOKENHERE --log-opt splunk-url=https://PUTYOURSPLUNKHOSTHERE:8088 --log-opt tag={{.ImageName}}/{{.Name}}/{{.ID}} --log-opt splunk-insecureskipverify=1

The options should be fairly evident.  The tag= option configures the tag attached to the JSON objects emitted by Docker, so each event carries the image name, container name, and unique ID of the container.  By default it’s just the unique ID, which frankly isn’t very useful post-mortem.  The last option skips certificate verification so the default self-signed Splunk certificate can be used; get rid of it when you switch to a proper certificate.

Getting the driver in place

Now that’s done, restart the Docker daemon (or the whole host), then reprovision all the containers so they pick up the new logging options.  In my case, that’s a simple docker-compose down followed by docker-compose up after a reboot.
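
If you’d rather not reboot, the same can be done by hand;

# Pick up the systemd drop-in and restart the daemon
systemctl daemon-reload
systemctl restart docker

# Recreate the containers so they use the new default log driver
docker-compose down
docker-compose up -d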

The docker logs command will be broken now, but you can instead use Splunk to replicate the functionality, like this;

index=docker host=dockerhost | spath tag | search tag="*mycontainer*" | table _time,line

That will pull out the logs for the container mycontainer running on the host dockerhost, over whatever time range you’ve selected (the last 60 minutes, say).

You can then start doing wizardry like this;

index=docker | spath tag | search tag="nginx*" 
| rex field=line "^(?<remote_addr>\S+) - (?<remote_user>\S+) \[(?<time_local>.+)\] \"(?<request>.+)\" (?<status>\d+) (?<body_bytes>\d+) \"(?<http_referer>.+)\" \"(?<http_user_agent>.+)\" \"(?<http_x_forwarded_for>.+)\"$"
| rex field=request "^(?<request_method>\S+) (?<request_url>\S+) (?<request_protocol>\S+)$"
| table _time,tag,remote_addr,request_url

This dynamically parses NGINX container logs emitted by Docker, splits out the fields, and then lists them by time, tag, remote IP, and the URL requested.

I’m sure there are better ways of doing this (such as parsing the logs at index time instead of search time), but this works pretty well and should function as a decent starting point.

NGINX Rate Limiting for Unsecured Apps

Some applications don’t properly support IP blackholing after failed login attempts.  There are a few ways to handle that, but one nice way is to use nginx in front of the application to apply rate limiting.

Setting up nginx as a reverse proxy for your application is out of scope for this article.  It’s a good idea to get used to using it to front your applications and control access to them.

Rate Limiting in NGINX

We’ll be making use of the ngx_http_limit_req module.  Simply put, you create a zone using limit_req_zone, then define allowed locations that will use the zone using limit_req.

The mental abstraction you can use for the zone is a bucket.  The zone definition describes a data table which will hold IP addresses (in this case) and how many requests each has made.  The requests (which are the water in this analogy) flow out of a ‘hole’ in the bucket at a fixed rate.  Therefore, if requests come in faster than that rate, they will ‘fill’ the bucket.

The ‘size’ of the bucket is determined by the burst size you set on limit_req.  A large burst size allows a lot of requests to be made in a short period, exceeding the drain rate, but the bucket will fill up eventually.  It then slowly drains back down at the configured rate.

IMPORTANT – If you do not use the nodelay option in limit_req, what happens is that nginx delays incoming requests to force them to match the rate – irrespective of bursts.  In this article, we’ll use nodelay, because we want to flat out return errors when the burst size is exceeded.

Configuring Rate Limiting

In the http context of your nginx.conf, insert a zone definition like this;

limit_req_zone $binary_remote_addr zone=myzone:10m rate=1r/m;

This defines a new zone named myzone, 10MB in size, keyed on the binary form of the client’s remote address.  That’s enough space for a large number of addresses, so it should be fine.  The zone recharges limits at a rate of one request per minute (which is very slow, but this is intentional, as you’ll see).

Then, let’s assume your app has a login page that you know is at /app/login, and the rest of the app is under /.  You could write some locations like this;

location = /app/login {
    limit_req zone=myzone burst=10 nodelay;

    # whatever you do to get nginx to forward to your app here
}

location / {
    # whatever you do to get nginx to forward to your app here
}

That way, calls to /app/login will be rate limited, but the rest of your app will not.

In the above example, calls to /app/login from a single IP will be rate limited such that they can make a burst of 10 calls without limits, but then are limited to an average rate of one per minute.

For something that’s a login page, this should be sufficient to allow legitimate logins (and likely with a mistyped password or two), but it’ll put a big tarpit on dictionary attacks and the like.
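
You can watch the limiter kick in from a shell (hostname is a placeholder): the first ten requests get through and the rest come back as HTTP 503 until the bucket drains;

# Hammer the login page and print just the HTTP response codes
for i in $(seq 1 15); do
  curl -s -o /dev/null -w '%{http_code}\n' https://yourapp.example.com/app/login
done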

Netflow with ELK Stack and OpenWRT

Now we’re getting into some pretty serious magic.  This post will outline how to put together OpenWRT and ELK Stack to collect network utilization statistics with Netflow.  From there, we can use Kibana to generate visualizations of traffic data and flows and whatever else you want to leverage with the power of Elasticsearch.

I’m using a virtualized router instance running OpenWRT 15.05.1 (Chaos Calmer) on KVM with the Generic x86 build.  Using a hardware router is still doable, but you’ll need to be careful about CPU utilization of the Netflow exporter.  Setting this up will require a number of components, which we’ll go through now.

You will need an OpenWRT box of some description, and an ELK Stack already configured and running.

OpenWRT Setup

You’ll need to install softflowd, which is as easy as;

opkg update
opkg install softflowd

Then edit /etc/config/softflowd and set the destination for flows to something like;

option host_port 'netflow.localdomain:9995'

Start up the Softflow exporter with /etc/init.d/softflowd start and it should be working.

Note that the default config uses Netflow version 5.  Let that stand for now.  Also, leave the default interface on br-lan – that way it’ll catch flows for all traffic reaching the router.
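
A quick way to confirm flows are actually leaving the router is to watch for them arriving on the collector (assuming tcpdump is available there);

# On the ELK host, watch for incoming flow packets from the router
tcpdump -ni any udp port 9995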

Logstash Configuration

If you’re using the ELK Stack Docker project like me, you’ll need to set up the Docker container to also listen on port 9995 UDP.  At any rate, you need to edit your logstash.conf so that you have the following input receiver;

# Netflow receiver
input {
  udp {
    port => 9995
    type => netflow
    codec => netflow
  }
}

This is an extremely simple receiver which takes in Netflow data on port 9995, sets the type to netflow and then processes it with the built-in Netflow codec.

In your output transmitter, you’ll then want something like this example;

output {
  if ( [type] == "netflow" ) {
    elasticsearch {
      hosts => "elasticsearch:9200"
      index => "logstash-netflow-%{host}-%{+YYYY.MM.dd}"
    }
  } else {
    elasticsearch {
      hosts => "elasticsearch:9200"
      index => "logstash-%{type}-%{+YYYY.MM.dd}"
    }
  }
}

What this does is pretty straightforward.  Everything gets sent to the Elasticsearch engine at elasticsearch:9200, but messages with a type of netflow get pushed into an index named after the host the flow was collected from (which will probably be your router).

Restart Logstash and you should start getting flows in within a few minutes.
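
You can confirm the indices are being created with a quick query against Elasticsearch (adjust the hostname if it isn’t reachable as elasticsearch from wherever you run this);

# List the per-day netflow indices and their document counts
curl -s 'http://elasticsearch:9200/_cat/indices/logstash-netflow-*?v'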

Kibana Setup

From there, just go into Kibana and add a new index pattern for logstash-netflow-*.  You can then visualize / search all your Netflow data to your heart’s content.

Nice!

Customizing OwnCloud using Docker

I’m messing around with OwnCloud at the moment, a solution that provides cloud-like access to files and folders through a webapp, using your own local storage.  As is my wont, I’m doing it in Docker.

There’s a minor catch though – the official OwnCloud Docker image does not include smbclient, which is required to provide access to Samba shares.

Here’s how to take care of that.

FROM owncloud:latest
RUN set -x; \
 apt-get update \
 && apt-get install -y smbclient \
 && rm -rf /var/lib/apt/lists/* \
 && rm -rf /var/cache/apt/archives/*

The above Dockerfile uses the current owncloud:latest image from Docker Hub and installs smbclient into it.  You want to do the update, install and cleanup in a single RUN step so it gets saved as only one layer in the Docker filesystem, saving space.

You can then put that together with the official MySQL Docker Image and a few volumes to have a fully working OwnCloud setup with docker-compose.

version: '2'

services:
  mysql:
    image: mysql:latest
    restart: unless-stopped
    environment:
      - MYSQL_ROOT_PASSWORD=passwordgoeshere
    volumes:
      - ./data/mysql:/var/lib/mysql:rw,Z

  owncloud:
    hostname: owncloud.localdomain
    build: owncloud/
    restart: unless-stopped
    environment:
      - MYSQL_ROOT_PASSWORD=passwordgoeshere
    ports:
      - 8300:80
    volumes:
      - ./data/data:/var/www/html/data:rw,Z
      - ./data/config:/var/www/html/config:rw,Z
      - ./data/apps:/var/www/html/apps:rw,Z
    depends_on:
      - mysql

Create the directories that are mounted there, set the password to something sensible, and docker-compose up!
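
Concretely, assuming the Dockerfile above lives at ./owncloud/Dockerfile alongside the docker-compose.yml, that amounts to;

# Create the bind-mounted directories referenced in the compose file
mkdir -p owncloud data/mysql data/data data/config data/apps

# Build the customized image and bring the stack up in the background
docker-compose build
docker-compose up -d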

One thing though.  OwnCloud doesn’t have any built-in account lockout policy, so I wouldn’t go putting this as it is on the ‘Net just yet.  You’ll want something in front of it for security, like nginx.  You’ll also want HTTPS if you’re doing that.

More on that later.

How to convert an MP4 to a DVD and burn it on Linux

If you’re using Vagrant with VirtualBox on Windows, create a new directory, throw the source mp4 in it, then create a Vagrantfile like this;

Vagrant.configure("2") do |config|
  config.vm.box = "bento/ubuntu-16.04"

  config.vm.provider "virtualbox" do |vb|
    vb.customize ["storageattach", :id, "--storagectl", "IDE Controller", "--port", 0, "--device", 0, "--type", "dvddrive", "--passthrough", "on", "--medium", "host:X:"]
  end
end

Edit the host:X: to be the drive letter of your physical DVD drive.

Then bring up the VM with;

vagrant up
vagrant ssh
sudo -s -H

Now that’s done, do this (you can start from here if you’re already on Linux or have some other means of getting a VM ready).  I assume you want to make a PAL DVD, and that your DVD burner is /dev/sg0 (check with wodim --devices);

apt-get install dvdauthor mkisofs ffmpeg wodim
ffmpeg -i input.mp4 -target pal-dvd video.mpg
export VIDEO_FORMAT=PAL
dvdauthor -o dvd/ -t video.mpg
dvdauthor -o dvd/ -T
mkisofs -dvd-video -o dvd.iso dvd/
wodim -v dev=/dev/sg0 speed=8 -eject dvd.iso

All done.  Assuming everything went well, you have a freshly burned DVD, all using open source Linux software, with no horrible adware that tends to come with Windows DVD burning software.

You can then get rid of the VM with vagrant destroy.