5 Reasons Why You Should Use a Reverse Proxy Such as HAProxy

1. Security

Port Limitations

As a business owner or system administrator, security is your priority. Competitors or malicious hackers shouldn’t have the opportunity to disrupt your business processes and cost you money. A reverse proxy exposes only the ports you choose, e.g. 80 and 443, on your website’s domain name. Your web server is not exposed to the public, so it cannot be attacked over SSH, the MySQL port or any other service it might be running.

A word of warning: a hacker can still gain access to your system if your web application itself has security vulnerabilities, such as SQL injection.

IP Based ACLs

Almost all popular CMSs have a back office or management section for administrative purposes. The problem is that everyone can reach these back-office links, so bots can brute-force usernames and passwords to gain access to private information.

acl is_admin path_beg /wp-admin
acl is_local src
http-request deny if !is_local is_admin

In this example, if someone tries to access a WordPress admin panel from an IP outside the subnet listed in the `is_local` ACL, the request will be denied.
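To make the logic of the rule easier to follow, here is a minimal Python sketch of the same decision. It is only an illustration, not how HAProxy evaluates ACLs internally. The subnet `192.168.1.0/24` and the function name `is_denied` are assumptions made up for this example; the subnet of the original configuration is not shown above.

```python
import ipaddress

# Hypothetical subnet, for illustration only. Replace it with your own.
LOCAL_SUBNET = ipaddress.ip_network("192.168.1.0/24")


def is_denied(client_ip, path, local_subnet=LOCAL_SUBNET):
    """Mirror the rule `http-request deny if !is_local is_admin`."""
    # Same idea as: acl is_admin path_beg /wp-admin
    is_admin = path.startswith("/wp-admin")
    # Same idea as: acl is_local src <subnet>
    is_local = ipaddress.ip_address(client_ip) in local_subnet
    # Deny only when the request targets the admin area from a non-local IP.
    return is_admin and not is_local
```

With these assumptions, `is_denied("203.0.113.7", "/wp-admin/")` is true, while the same path requested from `192.168.1.20` goes through.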

2. Performance monitoring

HAProxy can be configured to send logs to a syslog server (which can be local too). You can then parse these logs and store them in a fast, searchable datastore such as Elasticsearch. You can use the Collectiva service to set up and analyze the logs for you.

You can visualize the data to display the 75th percentile response time. Using percentiles instead of the mean gives a better view of real-world performance because outliers are ironed out. For example, let’s say your website serves clients in under 100 ms 99% of the time, but there is one cron job that takes 1 minute to run every 5 minutes. Your average response time will be much higher, and it becomes difficult to know when the server is actually slowing down.
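The effect is easy to reproduce with a few lines of Python. This is a sketch on made-up numbers (999 requests at 80 ms plus a single 60-second request), using a simple nearest-rank percentile; real dashboards may use a slightly different percentile definition.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up sample: 999 fast requests and one slow cron request (in ms).
response_times_ms = [80] * 999 + [60000]

if __name__ == "__main__":
    print("mean:", mean(response_times_ms))
    print("75th percentile:", percentile(response_times_ms, 75))
```

A single slow request pulls the mean up to roughly 140 ms, while the 75th percentile stays at 80 ms, which is what most visitors actually experience.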

The graph below shows the 75th percentile response time from HAProxy logs.

3. SSL Certificate Management

Securing your website with SSL is no longer a luxury feature. It’s a must and it’s free.

SSL termination is what I mostly use because it’s a lot easier to maintain.

Source: digitalocean.com

Your web application might be simple PHP on Apache, a NodeJS daemon, a Python daemon, whatever. Each of these has its own way of implementing SSL certificates in its configuration files or panels. I don’t have time to learn all these platforms just to set up SSL on my websites. Now imagine that each web service has multiple backend servers. It’s not easy to keep all the certificates in sync when you’re renewing them.

Having a load balancer at the front that handles all the certificates for all domains is really convenient. As a system administrator, I don’t have to care what the developers are doing as long as they speak HTTP.

NOTE: Your application needs to take the `X-Forwarded-Proto` HTTP header into account for this to work. Otherwise you can just force the application to generate https links all the time.
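As a rough idea of what "taking the header into account" means on the application side, here is a small framework-independent Python sketch. The function names `request_scheme` and `absolute_url` are made up for this example, and it assumes the proxy in front is the only thing that can set the header; most frameworks ship their own setting for this, which is usually the better choice.

```python
def request_scheme(headers, default="http"):
    """Return the scheme the client used, based on X-Forwarded-Proto."""
    # HTTP header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}
    proto = normalised.get("x-forwarded-proto", default)
    # Assumption: if several proxies appended values, the first one is
    # the scheme seen by the outermost (client-facing) proxy.
    return proto.split(",")[0].strip().lower()


def absolute_url(headers, host, path):
    """Build a link that keeps the scheme the visitor is actually using."""
    return "%s://%s%s" % (request_scheme(headers), host, path)
```

For a request that arrived over SSL at the proxy, `absolute_url({"X-Forwarded-Proto": "https"}, "example.com", "/login")` gives an https link even though the backend itself only spoke plain HTTP.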

4. Redundancy

Downtime is bad for business. No matter how good your servers are or how highly qualified your system administrators are, your servers and services are going to fail at some point. The best you can do is plan for it. Having multiple servers for your website or web service is becoming the norm, especially with the rise of container technologies such as LXC and Docker.

The reverse proxy takes in all the requests, checks if there are servers capable of serving them then forwards the requests to the latter.

Let’s say you have 3 Apache servers serving the exact same content and 1 of them dies. The load balancer will route the remaining requests to the other 2. Neither the end users nor your boss will know something went wrong unless you tell them.
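The idea can be sketched in a few lines of Python. This is a toy model under simple assumptions (round-robin selection, health state set by hand); the class `RoundRobinPool` and the backend names are hypothetical, and HAProxy’s real health checks and balancing algorithms are far more complete.

```python
import itertools


class RoundRobinPool:
    """Toy pool that hands out backends in turn and skips dead ones."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._cycle = itertools.cycle(self.backends)

    def mark_down(self, backend):
        """Record that a health check failed for this backend."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        """Record that a known backend is serving requests again."""
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        """Return the next healthy backend, or fail if none is left."""
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backend available")
```

With three backends and one marked down, `pick()` simply alternates between the two survivors, which is why nobody notices the failure from the outside.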

5. Load Balancing

We were born small. Through the years our body size increased until we reached our designated height. It’s the same principle with startup businesses. At first they don’t have lots of clients and resources. Those who survive have more and more clients to serve.

Having a load balancer lets you start a website with just 1 tiny backend server. As more requests come in, just replicate the backend server until all the end users are served. And when times are bad, you can reduce the number of backend servers.

Run your business lean.

6. Freedom of Infrastructure (Bonus)

This is just a combination of all the points discussed above. I think the only constant in an IT infrastructure is the reverse proxies/load balancers, also known as front-end servers. The rest should be able to adapt quickly to new technologies, programming languages and paradigms.

What is your favorite load balancer and why? Tell us in the comments

Some sad news

HAProxy doesn’t support the HTTP/2 protocol yet at the time of writing. It was supposed to land in HAProxy 1.6, but it’s still not in version 1.7.

High Load on One Elasticsearch Node on Ceph

I had one Elasticsearch node that had a higher load than its colleagues.

Previously, they were all living in perfect harmony

It all started when I rebooted the hypervisor the elastic node was on.

I looked at the KVM virsh definitions to see if the node differed from the others. I noticed that only this node wasn’t using the `virtio` driver for network and disk. I changed from the `ide` and `e1000` drivers to `virtio` for disk and network respectively, then rebooted the node, but it still couldn’t match the performance of its counterparts.

This problem had to be solved because the performance of an Elasticsearch cluster is directly affected by its slowest node. If a node is slow, it’s better for it not to be in the cluster. The 75th percentile request time was more than 1.5 s, when it was usually around 400 ms in peak hours. My 99.9th percentile exceeded 50 seconds. It was really dangerous. The cluster receives 1 million documents per minute.

`iotop -a` showed the same processes running as on the other nodes, but with high IO on `[jbd2/vdb-8]`. That just confirmed the problem. Still no solution at that point.

Then I noticed on the network graph that the node never managed to send more than 600 MB per 5 minutes, when previously it could.

There had to be some kind of restriction on the network. My guess was that the network negotiation went wrong when the hypervisor rebooted. Comparing the values from 2 hypervisors confirmed the hypothesis:

root@hypervisor0:~# mii-tool eth0
eth0: negotiated 100baseTx-FD, link ok

root@hypervisor1:~# mii-tool eth0
eth0: negotiated 1000baseT-FD flow-control, link ok

The speed difference is major here: hypervisor0 negotiated 100 Mbit/s while hypervisor1 negotiated 1000 Mbit/s. The VM reported high disk IO because Ceph relies on the network to read and write data.
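A check like this is easy to automate. The Python sketch below only parses lines that look like the two `mii-tool` outputs above and flags links negotiated below an expected speed. The function names and the 1000 Mbit/s threshold are assumptions for this example, and other output forms (for instance an interface without link) are simply ignored.

```python
import re

# Matches lines such as "eth0: negotiated 100baseTx-FD, link ok".
NEGOTIATED = re.compile(r"^(?P<iface>\S+): negotiated (?P<speed>\d+)base")


def negotiated_speed(line):
    """Return (interface, speed in Mbit/s), or None if the line does not match."""
    match = NEGOTIATED.match(line.strip())
    if match is None:
        return None
    return match.group("iface"), int(match.group("speed"))


def slow_links(lines, expected_mbps=1000):
    """Return the (interface, speed) pairs negotiated below expected_mbps."""
    found = []
    for line in lines:
        parsed = negotiated_speed(line)
        if parsed is not None and parsed[1] < expected_mbps:
            found.append(parsed)
    return found
```

Fed with the output of `mii-tool eth0` collected from each hypervisor, `slow_links` would have pointed at hypervisor0 right after the reboot.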

Monitor your cluster farm with Collectiva (Beta), a product of nayarweb.com

Mauritians won’t be silent!

Mauritius is going through a very difficult phase. We often hear that just posting on Facebook won’t change a thing. Mauritians just talk, no action. Mauritians are cowards.
The prophet of Islam reportedly said:
“Whosoever of you sees an evil, let him change it with his hand; and if he is not able to do so, then [let him change it] with his tongue; and if he is not able to do so, then with his heart — and that is the weakest of faith.” [Muslim]
We cannot actually stop a minister/ti-copin from stealing. We don’t even have access to his office. Forget even the papers. We don’t know what’s happening there.
Our voice is what we common Mauritians have, at least for now. We can see bad things happening. We should talk about it.
I don’t have a fucking clue what ICAC et al. do. They are supposed to have powerful hands to stop corruption. Maybe they are helpless too. If you are helpless, ICAC, voice it out! We Mauritians are paying you big money.
If or when (I hope not) Mauritius becomes like North Korea, we’ll put our votes in the ballot box in favour of the Government even though the heart says otherwise, because it then becomes a question of survival.
I hope Mauritius never becomes like North Korea. Mauritius is a beautiful country with amazing people. We can’t allow politicians to ruin our reputation and divide us.

Reading Files Line by Line in Python

You write a script in Python that has to read a file and loop through each line to do some work. Seems easy, right?

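with open('loulou.txt', 'r') as f:
    lines = f.readlines()  # loads the entire file into memory at once
for line in lines:
    print(line, end='')  # each line already ends with a newline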

The problem is that when your file becomes big, let’s say 1500 MB, the code will still run on your development laptop but it will fail on a cloud server with 1 GB of RAM, because `readlines()` loads the whole file into memory. You’ll get an Out Of Memory error, which can lead to some very important process being killed if OOM priorities are not tuned.

You can use the Collectiva service by NAYARWEB.COM to get alerted near-realtime whenever Out of Memory occurs on one of your servers.

To fix the code above, you can read the file line by line from the disk:

with open('loulou.txt', 'r') as f:
    for line in f:  # only one line is held in memory at a time
        print(line, end='')

Is this always better? No. This version may run slightly slower than the first one, depending on your storage backend.
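If you want to see the memory difference for yourself, the two approaches above can be compared with the standard `tracemalloc` module. This is a rough sketch on a small temporary file, not a proper benchmark. The helper names are made up for this example, and `tracemalloc` only counts memory allocated by Python itself, so treat the numbers as an indication.

```python
import os
import tempfile
import tracemalloc


def peak_memory(func, path):
    """Run func(path) and return (result, peak bytes traced by tracemalloc)."""
    tracemalloc.start()
    try:
        result = func(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def count_with_readlines(path):
    """Count characters after loading every line into a list first."""
    with open(path, "r") as f:
        lines = f.readlines()
    return sum(len(line) for line in lines)


def count_line_by_line(path):
    """Count characters while holding only one line at a time."""
    total = 0
    with open(path, "r") as f:
        for line in f:
            total += len(line)
    return total


def make_sample_file(num_lines=20000):
    """Write a throwaway text file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        for i in range(num_lines):
            f.write("line %06d %s\n" % (i, "x" * 80))
    return path


if __name__ == "__main__":
    sample = make_sample_file()
    try:
        print("readlines peak bytes:", peak_memory(count_with_readlines, sample)[1])
        print("line by line peak bytes:", peak_memory(count_line_by_line, sample)[1])
    finally:
        os.remove(sample)
```

Both functions return the same total, but the peak for the `readlines()` version grows with the size of the file while the line-by-line version stays roughly flat.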

It’s amazing how little differences like these can have a huge impact on production servers.

Happy Monitoring to all System Admins out there 🙂

Car Won’t Start on Rainy Days

If I drove my VW Polo every day, it would start perfectly and run perfectly for months. But if I let the car sit for more than 24 hours, the next day it would have lots of trouble starting.

I’d have to wait for the sun to warm it a little, and then it would start as if nothing had ever been wrong with it.

The Cause:

It turns out that the rotor in the distributor had a huge hole in it.

So my dad cut a tiny piece of wire and put it inside the hole linking the 2 extremities of the hole. I think the inner stuff is graphite. He put it back in the car. And lo! It started despite the heavy rain! 😀

The Solution

But we decided to buy a new rotor in a car shop nearby for Rs 300. My car’s been great since. Noticed some performance improvement.

Take care of your car. Drive safe.