In this short write-up I’ll show how I went about building a highly scalable, geo-localized, distributed system using DNS.
What we are going to create is a simple but powerful DNS setup that answers queries for a domain with records chosen according to the user’s geographic location.
To accomplish this we have to choose a good open source DNS server. My choice was PowerDNS (http://wiki.powerdns.com/trac).
PowerDNS is a great piece of software. It’s a powerful DNS server daemon that can be configured to fit different DNS environments.
You can store domain zones in different backends (MySQL, Oracle, BIND zone files, LDAP, etc.), and you can have primary and secondary DNS servers with automatic zone replication. This is all you need to create a full-featured DNS system.
One of the PowerDNS backends handles the geo lookup task; it’s called “geobackend” (http://wiki.powerdns.com/trac/browser/trunk/pdns/modules/geobackend/README).
Our test environment will consist of a primary DNS server (PowerDNS as a master), a secondary DNS server (PowerDNS as a slave) and a geo lookup DNS server (PowerDNS as a master with geobackend enabled). We will enable automatic zone transfer between the primary and the slave server, so that a new record added on the master is automatically created on the slave.
So, we need 3 servers with PowerDNS installed. The installation process may differ from case to case, but if you are using Debian it can be as simple as running the following command as root:
apt-get install pdns-server
Now you need the backend where the zone data will be stored. You may want to choose MySQL for the master and BIND-format zone files for the slave DNS. The geo DNS server does not need a zone backend, because its single task is to take the caller’s IP address (in practice, the address of the resolver that sends the query), fetch its geographic location from a location file, look that location up in a map file and return to the caller the associated CNAME record found in the map file.
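The lookup flow just described can be sketched roughly as follows. This is only an illustration of the idea in Python, not how geobackend is implemented: the map contents, the target host names, the fallback entry and the resolve_cname helper are all hypothetical.

```python
# Hypothetical country -> CNAME target map (illustrative values only,
# not the real geobackend map file format).
COUNTRY_MAP = {
    "uk": "uk.www.yourdomain.com.",
    "it": "it.www.yourdomain.com.",
}

# Assumed fallback used when a country has no entry of its own.
DEFAULT_TARGET = "default.www.yourdomain.com."


def resolve_cname(country_code, country_map=COUNTRY_MAP, default=DEFAULT_TARGET):
    """Return the CNAME target for a country code, falling back to a default."""
    return country_map.get(country_code.lower(), default)


print(resolve_cname("UK"))  # a country present in the map
print(resolve_cname("jp"))  # a country missing from the map
```

The point is simply that the geo server holds no ordinary zone data: it turns a location into a name and lets the normal DNS servers answer for that name.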
A quick outline is given below.
My system (yourdomain.com) is laid out like this:
ns1.yourdomain.com (primary DNS server with MySQL backend)
ns2.yourdomain.com (secondary DNS server with automatic zone replication to BIND zone files)
ns1.geo.yourdomain.com (geo lookup DNS server with geobackend)
I executed the steps below:
On ns1.yourdomain.com you have to:
1) install PowerDNS with the gmysql backend
2) install MySQL server, create a database and grant a user access to that database
3) configure PowerDNS as master, with the gmysql backend connecting to MySQL
4) note that this server is authoritative for the “yourdomain.com” zone
5) delegate the “geo” zone to the geo DNS server with an NS record: “geo IN NS ns1.geo.yourdomain.com”
6) create the glue record for the geo DNS server: “ns1.geo IN A ip_geo_dns_server”
On ns2.yourdomain.com you have to:
1) install powerdns with the bind backend
2) configure powerdns to be a slave with bind backend and enable ns1.yourdomain.com as a supermaster
3) note that this server is authoritative for the “yourdomain.com” zone
On ns1.geo.yourdomain.com you have to:
1) install powerdns with the geo backend
2) configure powerdns as master with geobackend
3) note that this server is authoritative for the “geo.yourdomain.com” zone
4) create a map file to handle the association between a country location (e.g. uk) and the CNAME that the server will reply with
5) download the location database zone; for example I use zz.countries.nerd.dk (http://countries.nerd.dk/)
If you need to know how to do this in detail, please do not hesitate to write me an email. You will find my address on my contact page.
Ciao, Dino.
Hi people. In August 2011 it was discovered that the Apache httpd server is vulnerable to an easy-to-perform DoS attack. A simple Perl exploit called “apache killer” has been released; it makes a large number of parallel, crafted HTTP calls (HEAD method) with the “Range” header. This lets the attacker consume memory and CPU on the attacked server, bringing Apache and the system down in no time. The attacker does not need much bandwidth to perform the attack.
Anyone using Apache httpd in a production environment is encouraged to upgrade to the latest Apache version, which fixes the security problem.
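If you cannot upgrade to >= 2.2.20 you can use mod_rewrite to reject requests whose Range header carries more than five ranges, as in the rule below. Note the negation (!): the pattern describes the headers that are allowed (up to five ranges, or no Range header at all), and everything else gets a 403 Forbidden. This is what you need in your httpd.conf:
RewriteEngine On
RewriteCond %{HTTP:range} !(^bytes=[^,]+(,[^,]+){0,4}$|^$)
RewriteRule .* - [F]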
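If you want to check what a range-limiting pattern of this kind lets through before touching a production server, you can try the regular expression offline. The sketch below uses Python's re module, which behaves like Apache's regex engine for a pattern this simple. It assumes the intent is to allow an empty header or at most five ranges and to forbid everything else; the is_blocked helper is just a name made up for this example.

```python
import re

# Headers that are considered acceptable: one to five comma-separated
# ranges, or an empty value (no Range header sent at all).
ALLOWED_RANGE = re.compile(r"(^bytes=[^,]+(,[^,]+){0,4}$|^$)")


def is_blocked(range_header):
    """True when a negated RewriteCond on this pattern would forbid the request."""
    return ALLOWED_RANGE.search(range_header) is None


samples = [
    "",                                   # no Range header
    "bytes=0-499",                        # a single range
    "bytes=0-1,2-3,4-5,6-7,8-9",          # five ranges
    "bytes=0-1,2-3,4-5,6-7,8-9,10-11",    # six ranges
]
for header in samples:
    print(repr(header), "blocked" if is_blocked(header) else "allowed")
```

Only the last sample, the one with six ranges, ends up blocked.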
You can find the exploit script on Google. I will not post it here.
Ciao, Dino.
When you work as a horizontal support tech consultant for a very big company, you may have to deal with people who have basic Linux/open source knowledge and skills in commercial systems (Windows, Unix). You may even have to deal (… !?!?) with very well-prepared people (…).
Someone here takes care of the Apache installations, and Apache itself is very robust and stable, so we can all sleep soundly. The problem shows up when someone with weaker OS skills fires “kill -9” at the Apache processes to stop them. At the next start you get:
[Thu Aug 11 17:47:01 2011] [error] (28)No space left on device: Cannot create SSLMutex
Configuration Failed
And Apache no longer comes up.
The error can be misleading, because one might stop at “No space left on device” without reading the real error message: “Cannot create SSLMutex”.
But the problem is really easy to understand if you look at what is going on at the syscall level when Apache starts up. This can be done with the “strace” command on Linux (“truss” or “tusc” on other, expensive, Unix environments…).
The problem is caused by SysV IPC semaphores left behind on the system by the previous Apache kill: SIGKILL cannot be caught, so the Apache parent process never gets the chance to release them, and once the system-wide semaphore limit is reached no new one can be created.
The definitive solution here is to stop Apache with the “apachectl” command or by sending “kill -15” (and not kill -9!) to the Apache parent process. This way you are telling Apache to stop cleanly: the parent stops its children and cleans up semaphores and the like, the clean way.
To solve the problem at hand you have to clean up the leftover semaphores. You could reboot Linux, but that is avoided in any serious environment, so what is the magic command?
If Apache runs as the “apache” user you can run this command to clean up the semaphores created by the “apache” user:
ipcs -s | grep apache | perl -e 'while (<STDIN>) {@a=split(/\s+/); print `ipcrm -s $a[1]`}'
You could do “ipcs -s | grep apache” to see the semaphores first, and then call ipcrm on each to clean it up.
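If you prefer to review exactly what is going to be removed, the same parsing can be sketched in Python. The sample text below is a made-up excerpt in the usual “ipcs -s” layout on Linux (key, semid, owner, perms, nsems), and semaphore_ids is a helper name invented for this example; it only prints the ipcrm commands and does not run them.

```python
def semaphore_ids(ipcs_output, owner):
    """Extract the semaphore IDs owned by `owner` from `ipcs -s` text output."""
    ids = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows look like: key semid owner perms nsems
        if len(fields) >= 5 and fields[0].startswith("0x") and fields[2] == owner:
            ids.append(int(fields[1]))
    return ids


# Illustrative sample only, not output captured from a real system.
SAMPLE = """\
------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 98304      apache     600        1
0x00000000 131073     apache     600        1
0x0052e2c1 163842     postgres   600        17
"""

for semid in semaphore_ids(SAMPLE, "apache"):
    print("ipcrm -s", semid)
```

Matching on the owner column, rather than grepping the whole line, avoids touching semaphores of other users whose row merely contains the string “apache”.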
Try to start Apache now and the problem should be gone.
The other way is to change the Apache serialization mechanism from semaphores to pthread mutexes or fcntl. To do so you have to:
1) set “AcceptMutex fcntl” in httpd.conf
2) set “SSLMutex pthread” in httpd-ssl.conf
Hope this helps someone… 🙂
Ciao, Dino.
I recently discovered a wonderful DNSBL service that tells you which country a public Internet IP belongs to.
The service is countries.nerd.dk: http://countries.nerd.dk/more.html
You can, for example, block any mail arriving at your mail server from China or Russia, simply by integrating this DNSBL with your MTA.
You can even get the country of your IP with dig!
Warning: you need to reverse the IP octets. If for example the IP to check is 192.162.132.171, you have to query 171.132.162.192.zz.countries.nerd.dk.
root@nbvirtdns1:/# dig TXT 201.65.24.151.zz.countries.nerd.dk
…
201.65.24.151.zz.countries.nerd.dk. 1047 IN TXT "it"
…
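Reversing the octets by hand is easy to get wrong, so here is a small Python sketch that builds the name to query. The dnsbl_query_name function is a name chosen for this example; the zone is the one used above, and the actual DNS lookup is left to dig or to your resolver library of choice.

```python
import ipaddress


def dnsbl_query_name(ip, zone="zz.countries.nerd.dk"):
    """Build the DNSBL lookup name by reversing the IPv4 octets."""
    # IPv4Address validates the input and raises ValueError on garbage.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


print(dnsbl_query_name("192.162.132.171"))
# prints 171.132.162.192.zz.countries.nerd.dk
```

The resulting name is what you pass to “dig TXT”.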
Hi.
Today we’ll see how to use Huge Pages with Java on a Linux system.
While a Linux system generally splits memory into pages of 4 KB, Huge Pages are memory pages of 2 MB or more.
This has been shown to improve performance when the application uses a large amount of RAM, like Java with a large heap (2 GB or more).
It’s fair to say that this is not always the right configuration choice, because the memory reserved for Huge Pages cannot be used by the kernel (buffer cache) or by other applications, so that memory is subtracted from the pool available to the rest of the system. Since it’s very quick to give it a try and decide whether to use it or not, let’s play with it.
In this example we will set aside 2.5 GB of RAM to be used as Huge Pages. Your mileage may vary.
HPM (huge page memory) is expressed in GB.
First: set the maximum size (in bytes) of a shared memory segment
It is quickly found with this simple formula: ((HPM * 1024 * 1024 * 1024) - 1).
In our example: ((2.5 * 1024 * 1024 * 1024) - 1) = 2684354559
Set it up online with this command:
echo 2684354559 > /proc/sys/kernel/shmmax
If you want to make it permanent across reboots, append these two lines to your /etc/sysctl.conf file:
# Shared memory - max segment size: 2.5 GB (minus 1 byte)
kernel.shmmax = 2684354559
Second: set the number of reserved large memory pages
This is the number of reserved pages. Each page is 2 MB, so finding the number of pages to reserve is simple:
((HPM * 1024) / 2). In our example: ((2.5 * 1024) / 2) = 1280
Set it up online with this command:
echo 1280 > /proc/sys/vm/nr_hugepages
If you want to make it permanent across reboots, append these two lines to your /etc/sysctl.conf file:
# Reserve 2.5 GB worth of 2 MB huge pages
vm.nr_hugepages = 1280
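If you want to size things for a different amount of memory, the two formulas used so far are easy to script. This is just a convenience sketch in Python; hugepage_settings is a name made up here, and the 2 MB page size is the one assumed throughout this post (check the Hugepagesize line in /proc/meminfo on your system).

```python
def hugepage_settings(hpm_gb, page_size_mb=2):
    """Return (shmmax in bytes, nr_hugepages) for hpm_gb gigabytes of huge pages."""
    # Max shared memory segment size: the whole huge page area minus one byte.
    shmmax = int(hpm_gb * 1024 * 1024 * 1024) - 1
    # Number of pages: the area expressed in MB, divided by the page size in MB.
    nr_hugepages = int(hpm_gb * 1024 / page_size_mb)
    return shmmax, nr_hugepages


shmmax, pages = hugepage_settings(2.5)
print("kernel.shmmax =", shmmax)    # kernel.shmmax = 2684354559
print("vm.nr_hugepages =", pages)   # vm.nr_hugepages = 1280
```

The output can be pasted straight into /etc/sysctl.conf.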
Third: set the group ID that is allowed to use huge pages
Java programs usually should not be run as root. In my case, the group ID my program runs under is “1001”.
Set it up online with this command:
echo 1001 > /proc/sys/vm/hugetlb_shm_group
If you want to make it permanent across reboots, append these two lines to your /etc/sysctl.conf file:
# System group id that can use huge pages (hugepages gid: 1001)
vm.hugetlb_shm_group = 1001
Fourth: run the Java program with Huge Page support
In our example we are using the JVM distributed by Oracle. Other Java vendors may use different parameters to enable Huge Pages. They may even use a different name for Huge Pages.
The program can now be started with “-XX:+UseLargePages -XX:LargePageSizeInBytes=2m”
The complete parameters for my Java program are:
java -d64 -server -Xms1900m -Xmx1900m -Xss192k -XX:+UseLargePages -XX:LargePageSizeInBytes=2m -XX:+UseParNewGC
Ciao ciao.
Dino Ciuffetti.
Today I posted to the OrientDB mailing list about liborient, my very first OrientDB C library implementation.
We are looking for new developers to join. This is what I posted to the list.
Hi all.
I’m making an attempt to write a proof of concept, simple, LGPLv3
OrientDB C library for linux.
The library is written in best effort, so don’t kill me if you see bad
code for now…
As a starting point, there is already a very first implementation of
some simple binary protocol methods.
For those who are interested, this is the API that is currently (it
seems…) working with the latest OrientDB SVN version:
http://www.tuxweb.it/temp/apishot/liborient/liborient_8h.html#func-members
You can view development code here:
http://svn.tuxweb.it/cgi-bin/viewvc.cgi/liborient/trunk/main/liborient/src/
INSTALL:
1) Install the latest GNU autoconf, automake and libtool
2) svn co http://svn.tuxweb.it/SVN/projects/liborient/trunk/main/liborient
3) cd liborient
4) ./autogen.sh
5) ./configure --prefix=/tmp/liborient
6) make
7) make check
8) make install
Warning: this is a very first proof of concept implementation that I
started to study OrientDB. Do not use it in production environments.
Even if I think “the scalable way”, I’m a Linux SysAdmin and not a
full-time developer, so maybe the API is not well designed and the
code is ugly.
We need people that write code. If you are interested, please join in
and contribute.
This is a sample C program that links liborient… and works 🙂
http://svn.tuxweb.it/cgi-bin/viewvc.cgi/liborient/trunk/main/liborient/test/single_orient.c?view=markup
<snip>
orientdb *oh;
o_conh *och;
unsigned long cid;
// create a new liborient handler
oh = orient_new();
// setup library debug level to "ORIENT_DEBUG"
orient_debug_setlevel(oh, ORIENT_DEBUG);
// setup debug callback
orient_debug_sethook(oh, &your_debug_function);
// preparing to open a new binary connection handler for orientDB
och = orient_prepare_connection(oh, ORIENT_PROTO_BINARY, "localhost", "2424");
// setting admin credentials
orient_set_credentials(oh, och, ORIENT_ADMIN, "root", "pippo");
// setting user credentials
orient_set_credentials(oh, och, ORIENT_USER, "reader", "reader");
// create the real connection with orientdb server
cid = orient_connect(oh, och, timeout);
// open the database "demo"
orient_dbopen(oh, och, cid, "demo", timeout);
// get the DB size
dbsize = orient_db_size(oh, och, cid, timeout);
// get the total number of records
records = orient_db_countrecords(oh, och, cid, timeout);
// close the database
orient_dbclose(oh, och, cid, timeout);
// free library stuff
orient_free(oh);
</snip>
Any thoughts?
Ciao, Dino Ciuffetti.
OrientDB is a fast, scalable, open source object / graph database server written in Java.
After more than 20 years of RDBMS predominance it’s now time to look at non-relational database systems, especially where scalability and query response time are fundamental to a better user experience (web or not).
So, how to get up and running quickly with orientdb?
Here we will build on a linux system the latest development version from source in no time: the simple way ™.
First thing to do is to download JDK Java Standard Edition 6 from http://www.oracle.com/technetwork/java/javase/downloads/index.html.
Please note that you will need the JDK, not just the JRE.
After that you will need Apache Ant. Download it from here: http://ant.apache.org/bindownload.cgi.
# cd /opt
# tar jxf apache-ant-1.8.2-bin.tar.bz2
Installed? Good. Now install Subversion (svn). You can install it with your distribution’s package manager; for example, on Debian or Ubuntu you could use apt-get, like this:
# apt-get update; apt-get install subversion
Now create a directory wherever you like on the system and download the OrientDB development snapshot:
# mkdir /home/dino/orientdb-source
# cd /home/dino/orientdb-source
# svn checkout http://orient.googlecode.com/svn/trunk/ orient-read-only
When it has finished, cd to orient-read-only.
Now add your JDK and Ant bin directories to your PATH environment variable. You can do it this way:
# export PATH=/opt/apache-ant-1.8.2/bin:/opt/jdk1.6.0_25/bin:$PATH
You can now begin to compile orient source code.
# ant clean
# ant
# ant test
# ant install
OK. If it compiled successfully, you now have to start OrientDB for the first time.
# cd /home/dino/releases/1.*-SNAPSHOT/bin
# chmod 754 *.sh
# ./server.sh
Ok. Now stop it with CTRL+C and modify the configuration file as you like:
# cd ../config
# vi orientdb-server-config.xml
The first thing to configure, if you need to publish the service on your network/internet, is the bind address. For example, to bind on any IP on the system: <listener ip-address="0.0.0.0" port-range="2424-2430" protocol="distributed"/>
The second parameter to change is the root password: <user name="root" password="pippo" resources="*"/>
Now start orientdb again:
# cd /home/dino/releases/1.*-SNAPSHOT/bin
# nohup ./server.sh 1>/dev/null 2>/dev/null &
You should now be done.
Connect your browser to http://127.0.0.1:2480/ and begin to play with your brand new orientdb studio web console:
host: localhost
user: writer
password: writer
database: demo
You can find open and solved issues here: http://code.google.com/p/orient/issues/list
Subscribe to the OrientDB users mailing list: http://groups.google.com/group/orient-database, and enjoy!!
Hi.
If you have a VirtualBox guest VM whose system clock is out of sync with your host system, and every time you set the guest time manually it is automatically put out of sync again, you may want to run this command on your host system:
VBoxManage modifyvm <vm_name> --biossystemtimeoffset <time offset in ms>
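The offset is given in milliseconds, which is awkward to work out in your head. A tiny Python sketch for the conversion follows; offset_ms is a helper name invented for this example, and the sign convention assumed here (positive means the guest clock runs ahead of the host) should be checked against the VBoxManage documentation for your version.

```python
from datetime import timedelta


def offset_ms(**kwargs):
    """Convert a timedelta-style offset (hours=..., minutes=...) to milliseconds."""
    return int(timedelta(**kwargs).total_seconds() * 1000)


print(offset_ms(hours=-2))              # -7200000
print(offset_ms(hours=1, minutes=30))   # 5400000
```

The printed number is the value to pass as the time offset.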
Hope this gets you out of trouble quickly. It took me hours to debug!
I would not recommend hosting sites that way. You have to be sure that your ISP gives you public IP(s), and you have to set up your router to forward ports 80, 443, 53, and so on.
There are other problems too:
1) if you want to host more than one site with SSL you must have one public IP for each SSL site, or use a different SSL port for each site, because name-based virtual hosting with SSL is not possible unless both the server and the clients support SNI;
2) DSL lines are not designed to be stable. The connection can go down and make your site unreachable. This is a major problem if you make the mistake of running your own DNS server on it!! The public IP address assigned by the ISP can change more than once a day, and you have to sync the DNS zone each time.
3) DSL IPs are listed in DNS-based blacklist zones. You may not be reachable from various HTTP proxy servers around the world. For the same reason you cannot send mail, for example mail originated by your sites.
4) ADSL lines are asymmetric (unbalanced in favor of download). You only get a few kilobytes per second of upload, and upload is exactly what you need to publish web sites, so this becomes a problem as soon as you have more than 3 users.
5) you will probably have problems with High Availability and Load Balancing on domestic hardware, and you may have blackouts.
6) the DNS subsystem may need primary and secondary DNS servers.
The best way (IMHO) is to use a service like slicehost, where you get an HA virtual server slice running Linux, public IP addresses, free primary and secondary DNS hosting, plenty of public bandwidth, disk space… and, last but not least, your own root password, so you can maintain your own server yourself.
https://manage.slicehost.com/customers/new?referrer=af57db3020e04bb27352e271753a7a18