Setting up a WordPress development environment with Vagrant

Now and then I need to set up a simple WordPress website for friends or relatives.
Until now, my setup was a virtual machine I had configured solely for the development of those sites. The idea was that I could easily move the virtual machine to another laptop when needed.

However, I failed to make good backups of the virtual machine, so I had been searching for a better alternative for a while.

One possible solution was to create a development environment in the cloud with Azure, but that seemed like overkill.

A while ago I already gave Vagrant a try, but it didn’t work out very well (lots of error messages, etc.).
However, I decided to try it again and this time I didn’t run into any blocking issues.

The hard part was configuring a bootstrap file to set up a WordPress environment with the initial database configuration.
The goal of this post is to outline the steps to create a virtual machine with a basic WordPress installation using Vagrant.

If you have never used Vagrant before, I recommend following the “Getting started” page on their website.
Once you’re able to run a basic virtual machine as described on that page (this is where it went wrong on my first Vagrant attempt), come back and continue with this post.
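If you just want a quick refresher, the basics from that page boil down to a handful of standard Vagrant commands (shown here with the same Ubuntu box used later in this post; any box will do):

vagrant init ubuntu/trusty64   # creates a Vagrantfile in the current folder
vagrant up                     # downloads the box and boots the VM
vagrant ssh                    # opens a shell inside the VM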

Vagrant file

If you followed along with the Getting started page, the Vagrant file should already be familiar.
I made four changes to the file, two of which are optional.

Enable provisioning with a shell script (mandatory to follow along)
config.vm.provision :shell, path: "bootstrap.sh"
Set the permissions on the shared folder (mandatory)
config.vm.synced_folder ".", "/vagrant", :mount_options => ["dmode=777","fmode=666"]

This is necessary to make sure WordPress has the correct permissions to upload files. As you may notice, the settings above are insecure (the 777 mode), but for development this shouldn’t be an issue.

I used an Ubuntu box as the base (optional)
config.vm.box = "ubuntu/trusty64"
Create a private network (optional)
config.vm.network "private_network", ip: "192.168.100.2"

I then created an entry in my hosts file to map this IP to a specific hostname (dev.caffeinetocode.be, for example).
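Putting those four changes together, the relevant part of my Vagrantfile looks like this (the box, IP and hostname are the examples from above):

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"
  config.vm.network "private_network", ip: "192.168.100.2"
  config.vm.synced_folder ".", "/vagrant", :mount_options => ["dmode=777","fmode=666"]
  config.vm.provision :shell, path: "bootstrap.sh"
end

And the matching entry in the hosts file on the host machine:

192.168.100.2    dev.caffeinetocode.be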

Bootstrap file

This file is the shell script we configured in the Vagrant file when we enabled provisioning.

First, let me show you the complete file.
Almost all of the configuration below is copied from the WordPressWithVagrant repository on GitHub, but I needed to make some additional tweaks to get it working.

#!/usr/bin/env bash

sudo debconf-set-selections <<< 'mysql-server-5.5 mysql-server/root_password password averycomplexpassword'
sudo debconf-set-selections <<< 'mysql-server-5.5 mysql-server/root_password_again password averycomplexpassword'
sudo apt-get update
sudo apt-get -y install mysql-server-5.5 php5-mysql apache2 php5

if ! [ -L /var/www ]; then
  rm -rf /var/www
  ln -fs /vagrant/public /var/www

  a2enmod rewrite

  sed -i 's:<Directory /var/www/>:<Directory /vagrant/public/>:' /etc/apache2/apache2.conf
  sed -i 's:/var/www/html:/vagrant/public:' /etc/apache2/sites-enabled/000-default.conf
  service apache2 restart
fi

if [ ! -f /var/log/databasesetup ];
then
    echo "CREATE USER 'wordpressuser'@'localhost' IDENTIFIED BY 'wordpresspass'" | mysql -uroot -paverycomplexpassword
    echo "CREATE DATABASE test" | mysql -uroot -paverycomplexpassword
    echo "GRANT ALL ON test.* TO 'wordpressuser'@'localhost'" | mysql -uroot -paverycomplexpassword
    echo "flush privileges" | mysql -uroot -paverycomplexpassword

    # Uncomment once the initial database dump exists (see the end of this post)
    # mysql test -u root -paverycomplexpassword < /vagrant/wp_setup/database/initial_database_setup.sql

    touch /var/log/databasesetup
fi

Let’s review each step.

sudo debconf-set-selections <<< 'mysql-server-5.5 mysql-server/root_password password averycomplexpassword'
sudo debconf-set-selections <<< 'mysql-server-5.5 mysql-server/root_password_again password averycomplexpassword'

During the MySQL installation you are normally asked a few questions to complete the setup.
Because we don’t want to answer these questions every time the virtual machine is provisioned, we use debconf-set-selections to pre-seed the answers, which allows for an unattended installation.

So here we tell the installer to use the password averycomplexpassword for the MySQL root user.

sudo apt-get update
sudo apt-get -y install mysql-server-5.5 php5-mysql apache2 php5

Next we update the package lists and install the components required to run a WordPress site: MySQL, Apache and PHP (with the MySQL extension).
No further explanation needed, I guess.

if ! [ -L /var/www ]; then
  rm -rf /var/www
  ln -fs /vagrant/public /var/www

  a2enmod rewrite

  sed -i 's:<Directory /var/www/>:<Directory /vagrant/public/>:' /etc/apache2/apache2.conf
  sed -i 's:/var/www/html:/vagrant/public:' /etc/apache2/sites-enabled/000-default.conf
  service apache2 restart
fi

By default, Vagrant provides a shared folder (/vagrant) which is accessible from both the host and the guest.
I’ve created an extra folder, public, inside that shared folder where all my WordPress files will be stored.

This part of the script creates a symbolic link so that /var/www points to the public folder inside the shared folder.
It also changes two Apache configuration files to use the shared Vagrant folder as the web root.

It then restarts the Apache service.

if [ ! -f /var/log/databasesetup ];
then
    echo "CREATE USER 'wordpressuser'@'localhost' IDENTIFIED BY 'wordpresspass'" | mysql -uroot -paverycomplexpassword
    echo "CREATE DATABASE test" | mysql -uroot -paverycomplexpassword
    echo "GRANT ALL ON test.* TO 'wordpressuser'@'localhost'" | mysql -uroot -paverycomplexpassword
    echo "flush privileges" | mysql -uroot -paverycomplexpassword

    # Uncomment once the initial database dump exists (see the end of this post)
    # mysql test -u root -paverycomplexpassword < /vagrant/wp_setup/database/initial_database_setup.sql

    touch /var/log/databasesetup
fi

The next and final step is setting up the database.
Once you have gone through the WordPress installation for the first time (and created the dump described at the end of this post), you will need to uncomment the following line:

mysql test -u root -paverycomplexpassword < /vagrant/wp_setup/database/initial_database_setup.sql

This SQL file is created after you’ve run through the WordPress installation.
This part of the script also checks for a marker file (/var/log/databasesetup) which is created on the first run. We do this check because we don’t want to run the database setup again when resuming or reloading the virtual machine.

The script also creates a new MySQL user and grants it access to our database.

If everything went well, you should be able to access your WordPress site at the IP address specified in your Vagrant file (or at the URL, if you also configured it in your hosts file).

Creating an initial SQL script for WordPress

After completing the installation step of your WordPress site, access your VM with

vagrant ssh

Enter the following command at the prompt (test is the name of the database):

mysqldump -h localhost -u root -paverycomplexpassword test > initial_database_setup.sql

Now copy this file to a folder inside your Vagrant folder (I used wp_setup/database).
Uncomment the line in the bootstrap script as described earlier and destroy your VM to test that everything works.

vagrant destroy

Restart your VM with

vagrant up

And when everything is booted up, you should be able to access your WordPress site without any additional configuration required!

xbuild and “the PCL reference assemblies not installed” error message

I was trying to set up a build script with FAKE to build a Xamarin solution via a bash script, which I could then use in TeamCity.
In Xamarin Studio I could build my solution without any errors, but when I ran the script in a terminal, I received this error: “PCL reference assemblies not installed”.

I tried several solutions, until I eventually uninstalled Xamarin and Mono.
While reinstalling Xamarin, I suddenly noticed that it was installing Mono version 4.0.2.

The error message, however, showed version 4.0.1.
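A quick way to check which Mono binary and version the shell actually picks up (standard commands, handy as a sanity check):

which mono        # path of the Mono binary that is first on the PATH
mono --version    # version reported by that binary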

After noticing this difference, I remembered that I had installed Mono via Homebrew to run ASP.NET vNext on Mac OS X.
After uninstalling it with

brew uninstall mono

my script ran without any errors.

Sigh…

Centralize your logs with the ELK stack

At my current customer I recently had the opportunity to play around with logstash.
There are several log files spread across different servers which makes it difficult to easily identify the most critical errors.

logstash helps with this by centralizing all these log files in one place.
We use a combination of Elasticsearch, logstash and Kibana, better known as the ELK stack.

logstash collects all the log files of our servers, parses them and sends them to Elasticsearch in a uniform format; Kibana is then used to visualize these logs.
I just want to cover how we’ve set things up and give some useful tips along the way.

I’ll skip the part about installing logstash because it’s quite straightforward.
Just take a look at the logstash documentation.

Input configuration

logstash is all about configuration, which is done through plugins, several of which are included by default.
The file plugin is the first one we need; it defines where logstash should look for our log files.

input {
    file {
        path => "/Logs/**/general.log"
        start_position => "beginning"
    }
}

start_position => "beginning" instructs logstash to read from the beginning of the file. The default behavior is to read only the new log lines written since logstash was started.
However, it’s important to know that logstash keeps track of which lines have already been processed. If logstash is restarted, it won’t start again from the beginning of the file.

In other words, start_position only affects files logstash hasn’t seen before.
This is of course useful in production environments, but while testing your setup it can be quite annoying.
By adding sincedb_path => "/dev/null" to the file plugin configuration you force logstash to read the complete file again, as shown below.
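For testing, the input block might look like this (same path as above, with the sincedb_path override added):

input {
    file {
        path => "/Logs/**/general.log"
        start_position => "beginning"
        sincedb_path => "/dev/null" #don't remember what was already read; only use this while testing
    }
}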

File locks

The documentation isn’t very clear about how often files are read, but based on the source code on GitHub you need to take two parameters into account.

input {
    file {
        stat_interval => 1
        discover_interval => 5
    }
}

The values in the example above are the defaults and can be omitted.
stat_interval: once all files (matched by the file plugin) have been processed, how many seconds does logstash wait before processing them all again?
discover_interval: how many seconds logstash waits before processing the next file in the list.

If the wildcard in the path property matches three files, logstash will process the first file, wait five seconds, process the second file, wait another five seconds, and so on.
Once the three files have been processed, it waits one second and processes the list again.

Filters configuration

First plugin covered, on to the next… filters!
Filters are used to define how you want to parse the log files.

filter {
    multiline {
        pattern => "^%{TIMESTAMP_ISO8601}"
        negate => true
        what => previous
    }

    grok {
        match => ["message", "(?m)%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log-level} %{GREEDYDATA:information}"]
        tag_on_failure => ["error_message_not_parsed"]
        remove_field => ["message"]
        break_on_match => false
    }

    grok {
        match => [ "path", "/Logs/(?<server>[^/]+)/(.*).*" ]
        tag_on_failure => ["path_not_parsed"]
    }
}

The first filter is the multiline filter.
By default logstash will create a new record for each line in the log file.
In our case (and most cases I presume) stacktraces are included in the logs so we would rather group log lines together as one message.

The multiline filter is configured as follows: group all lines of the log together into one message until you reach another timestamp.

Now that we’ve grouped the necessary log lines together, we can start splitting up the message into different fields with Grok.

Grok is a plugin with a predefined set of regular expressions.
It’s possible that each log file has a different format, especially when your log files come from different sources (log4net error logs, IIS logs, Apache logs, …).

By using Grok patterns you can target each type of log file separately.

In the Grok filter for the message you will notice that the expression starts with (?m).

grok {
    match => ["message", "(?m)%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log-level} %{GREEDYDATA:information}"]
    tag_on_failure => ["error_message_not_parsed"]
    remove_field => ["message"]
    break_on_match => false
}

This is required because without it the multiline filter defined previously would effectively be undone: the Grok pattern GREEDYDATA stops at a newline by default (see https://logstash.jira.com/browse/LOGSTASH-509).

The other patterns are quite straightforward: the timestamp goes into a separate field, then the log level, and finally whatever is left of the message ends up in another field.

Let me show you an example.
This is the error message:

2014-12-12 03:21:49,285 ERROR http://www.mywebsite.be The request was aborted: The request was canceled.
 System.Net.WebException: The request was aborted: The request was canceled. ---> System.IO.IOException: Cannot close stream until all bytes are written.
 at System.Net.ConnectStream.CloseInternal(Boolean internalCall, Boolean aborting)
 --- End of inner exception stack trace ---
 at System.Net.ConnectStream.CloseInternal(Boolean internalCall, Boolean aborting)
 at System.Net.ConnectStream.System.Net.ICloseEx.CloseEx(CloseExState closeState)
 at System.Net.ConnectStream.Dispose(Boolean disposing)
 at System.IO.Stream.Close()

This is the resulting output:

 timestamp: 2014-12-12 03:21:49,285
 log-level: ERROR
 information: http://www.mywebsite.be The request was aborted: The request was canceled.
 System.Net.WebException: The request was aborted: The request was canceled. ---> System.IO.IOException: Cannot close stream until all bytes are written.
 at System.Net.ConnectStream.CloseInternal(Boolean internalCall, Boolean aborting)
 --- End of inner exception stack trace ---
 at System.Net.ConnectStream.CloseInternal(Boolean internalCall, Boolean aborting)
 at System.Net.ConnectStream.System.Net.ICloseEx.CloseEx(CloseExState closeState)
 at System.Net.ConnectStream.Dispose(Boolean disposing)
 at System.IO.Stream.Close()

The last line of the Grok filter states that it shouldn’t stop after a successful match (which it does by default) but that it should also execute the next Grok filter.
That filter will extract some information from the file path to better identify the original source of the log file.

grok {
    match => [ "path", "/Logs/(?<server>[^/]+)/(.*).*" ]
    tag_on_failure => ["path_not_parsed"]
}

On GitHub you can find an overview of the available Grok patterns.
It’s also possible to create your own Grok patterns (see the sketch below), and a helpful tool for debugging them is http://grokdebug.herokuapp.com
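For example, you could define your own pattern in a separate patterns file and point the grok filter at it with patterns_dir (the pattern name and file location below are made up, purely for illustration):

# contents of ./patterns/extra
THREADID \[\d+\]

filter {
    grok {
        patterns_dir => ["./patterns"]
        match => ["message", "%{TIMESTAMP_ISO8601:timestamp} %{THREADID:thread} %{GREEDYDATA:information}"]
    }
}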

The next step is to configure where we want to store all the information logstash just processed.
We chose Elasticsearch because we already used it for other purposes, it required hardly any extra setup and it’s blazing fast!

I won’t cover Elasticsearch here, there is a lot of documentation available on the web.

output {
    elasticsearch {
        host => "your-elasticsearch-server"
        index => "name-of-the-index-for-your-logs"
        protocol => node
        node_name => name_of_your_node
        cluster => elasticsearch_cluster_name
        template => "/etc/logstash/mapping/es-template-logstash-with-ttl.json"
        template_overwrite => true
    }
}

The host is the fully qualified domain name of your Elasticsearch server.
The protocol is set to node so logstash will create a non-data node which is responsible for the gathering of the logs but not for indexing the data.

When setting the protocol to node, you can also use the option node_name.
If you don’t configure this value, Elasticsearch will give the node a random name.

Another option for the protocol setting is the value http, which uses the HTTP API of Elasticsearch; a minimal variant of the output using it is sketched below.
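If you’d rather not have logstash join the cluster as a node, the same output over HTTP would look roughly like this (host and index are the same placeholders as above):

output {
    elasticsearch {
        host => "your-elasticsearch-server"
        index => "name-of-the-index-for-your-logs"
        protocol => "http"
    }
}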

logstash ships with a default Elasticsearch template file.
We’ve extended this template to also include a TTL (time-to-live) value for the Elasticsearch documents.

This allows us to keep the size of the Elasticsearch index under control.
Each log message in the index automagically gets cleaned up once it’s older than the defined TTL (30 days in our setup).

    "mappings": {
        "_default_": {
            "_all": {
                "enabled": true
            },
            "_ttl" : { "enabled" : true, "default": "30d" },
            [...]

If everything went according to plan, your logs are now available in one place and in one format.
If you still run into issues, you can start logstash from the command line with the option --verbose or --debug (see the example below).

The debug flag, however, generates so much noise that it’s not easy to identify the error.
In most cases the verbose flag will quickly point you in the right direction.
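The exact invocation depends on how you installed logstash; assuming the configuration lives at /etc/logstash/logstash.conf (an assumption, adjust to your setup), running it in the foreground with verbose output looks roughly like this:

bin/logstash agent -f /etc/logstash/logstash.conf --verbose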

Kibana

Now we can set up Kibana to easily read through the logs and do some visualisation magic!

To install Kibana, you just need a web server.
Drop the files from the Kibana package into the web server folder and you’re up and running!

The only file you need to edit is the config.js in the Kibana directory.
Search for elasticsearch in the configuration file and replace it with the FQDN of the Elasticsearch server.

You can also change the default_route property and point it to ‘/dashboard/file/logstash.json’, which provides a default dashboard for logstash.
Fire up the URL of your webserver and you should see a default dashboard.
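For reference, the two relevant lines in config.js end up looking roughly like this (Kibana 3 style; the server name is a placeholder):

elasticsearch: "http://your-elasticsearch-server:9200",
default_route: '/dashboard/file/logstash.json',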

These are some examples of dashboards we’ve made.

(Screenshots: a Kibana bar graph dashboard and a line graph dashboard.)

I will cover Kibana in more depth in another blog post.

To summarize everything, here is a sample configuration of all the settings I discussed:

input {
    file {
        path => "/Logs/**/general.log"
        start_position => "beginning" #read from the beginning of the files
    }
}

filter {
    multiline {
        pattern => "^%{TIMESTAMP_ISO8601}" #group all lines of the log together in one message until you reach a timestamp
        negate => true
        what => previous
    }

    grok {
        match => [ "message", "(?m)%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log-level} %{GREEDYDATA:information}" ] #add the timestamp, log level and rest of the message in separate fields
    }

    grok { 
        match => [ "path", "/Logs/(?<server>[^/]+)/(.*).*" ] #add parts of the file path in separate fields
    }
}

output { 
    elasticsearch {
        host => "FQDN" #the FQDN of the elasticsearch server
        protocol => node #create a non-data node which is responsible for the gathering of the logs but not for indexing the data
        cluster => "elasticsearch_cluster" #cluster name of your elasticsearch; aids in the discovery of the nodes
        index => "log-servers" #write the logs to this index in ES (if the index doesn't exist, it will be created automagically)
        template => "path to ES template"
        template_overwrite => true #make sure to use the template defined in this config and not to use one defined in elasticsearch
    }
}

If you have remarks or questions, just leave a message!

Soundcheck!

I’m changing tack with the blog for a bit, because lately I haven’t gotten around to writing posts at all, mainly because it was aimed solely at nerdy topics.

So, a fresh start: this blog will become more of a round-up of all kinds of different topics. That way I hope to write and share a bit more…
Let’s see where that takes us!

Let’s dive straight into music.
In December I switched from Spotify to Rdio. Why, and what I think of Rdio so far, I’ll explain in more detail another time.

Through Rdio’s stations I got to know Frightened Rabbit today.

(Screenshot: Rdio stations)

Definitely give the album “Pedestrian Verse” a listen; I was sold pretty quickly!

(Screenshot: Frightened Rabbit’s albums on Rdio)

I got to know them through Rdio’s suggestions based on The National, so if that’s up your alley, I’d definitely give them a listen!

Other related artists according to Rdio? The Shins, Band of Horses, Arcade Fire, …