- Oct 16, 2010
Hi!
Today, I'll be sharing some background details about the infrastructure behind AS Project. I think some of you showed interest in that, and maybe a couple more will read it out of curiosity.
AS Project Overview:
AS Project is currently powered by Celato Security, a rather small cluster I assembled in late 2008, mainly for educational purposes. When I built the servers and created the first draft of the network behind ASP, I was merely a business student and a total idiot when it came to computers. Therefore, I needed to do a bit of research before AS Project could launch.
Originally, when I first planned Celato Security, it consisted of only three servers, each with the same specs: Core 2 Duo E8400s with 2GB RAM and 500GB of HDD space. They sat on 100Mbit/s ports and had a bandwidth cap of only 10TB. To avoid legal issues I decided to host outside of the US. Monthly costs came to $250.
Planning for ASP continued into early 2009. Before the project even left the planning stage, a major revision was made: I added a distribution server so we could increase the amount of content provided. It was a Quad Core Q6400 with 4GB RAM and two 500GB HDDs, on another 100Mbit/s port with a ridiculously small bandwidth cap of 3TB. Nothing fancy, but it would become instrumental in our continued success. This server increased monthly costs by $130.
During 2009 and early 2010, I gained plenty of crucial experience in designing effective systems and making them work together. I also learned which combinations of hardware were best for specific tasks.
Around that time we started hosting our images on our own private server. Soon hotlinking became a problem: Thousands of people hotlinked from our image repository because of the low loading times and permanent retention. This put immense stress on our wallet - around $350 every month went to paying for other people's greed and laziness. We quickly came up with an effective countermeasure: We started watermarking all of our boxshots and screenshots, which impeded the bandwidth theft somewhat and kept the project affordable.
Throughout this time the cluster underwent upgrades before finally arriving at its current stage.
Current Setup:
- Cake: A dual-processor Harpertown with 6GB RAM and two 1TB RE HDDs in RAID-1. Functions as webserver and has a dedicated 100Mbit/s line.
- Finnel: A single-processor Q8400 with 8GB RAM and four consumer 2TB HDDs in RAID-0. Functions as distribution server and has a dedicated 1Gbit/s line.
- Cordelia: A single-processor X3480 with 8GB RAM and two 1.5TB RE HDDs. Functions as ASL Project server and has a dedicated 100Mbit/s line.
- Yusa: A single-processor Q8400 with 8GB RAM, two 1.5TB RE HDDs, and a 120GB SSD. This server was donated for oreno.imouto.org - the world's favorite scan site. It has a dedicated 1Gbit/s line.
This setup is backed by two outsourced offloading services, which provide DDoS protection and also work as CDNs serving our static files. Cloudflare is one of those two services.
The whole network ran smoothly during 2010 and most of 2011. However, at the end of 2011 we experienced a major traffic surge that overloaded the whole setup. Traffic tripled in a short amount of time and reached twice the maximum we had designed the current setup for. Still, with extreme server optimizations we managed to serve 500 000 pageviews daily to over 2 000 000 monthly visitors. Given that standard usage for a webserver like this would put it at a mere 150 000 pageviews daily, this comes close to a miracle.
Apart from the servers mentioned above, we also use several VPS for a score of minor tasks, services, and projects.
Our original plan saw us upgrading our infrastructure somewhere between the second and third quarter of 2012. But with the still-ongoing traffic increase we'd be pretty much dead by then. As Project Manager of ASP, I decided to expedite the upgrade and get it done during the fourth quarter of 2011. The upgrade will be a major one - nearly the complete Celato Security cluster will be replaced with more powerful machines, bringing us effectively to Celato Security Mark II. Tentative estimates put completion of the upgrades at early December.
The only server kept is Cordelia since it was a recent acquisition and does not yet need replacement.
Admittedly, I have never in my life set up something as powerful as Celato Security Mark II.
The challenges already started with carrier selection. We not only want a premium bandwidth carrier providing high-quality Tier 1 backbones (Tinet, Level 3), but also one that is affordable. This is not an easy task, as we're continuing to host outside of the U.S.A., which means bandwidth is at least five times more expensive than a stateside solution would be.
The hardware was also chosen to serve us for a long time and on a major scale. After days of deliberation, the final setup was decided upon:
- All current servers will be replaced with two new ones each. The new servers will have the latest Xeons available to us.
- A global backup server will be added.
Future Setup:
Finnel's two replacement servers each have a Xeon E3 1230 with 8GB of ECC RAM and use a 1Gbit/s burstable line.
This is an obvious increase in processing power, but one may wonder about the bandwidth downgrade. The answer is simple and consists of two parts: On the old server, bandwidth wasn't the problem so much as disk I/O was. And both a single server with fast enough disks and two smaller servers with dedicated 1Gbit/s lines would have been a lot more costly.
Dedicated 1Gbit/s lines have always been very expensive, and two of them would've caused our operational expenses to skyrocket. Instead we opted for the much more affordable burstable option, which still gives us dedicated 100Mbit/s lines with a little extra from time to time.
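For those unfamiliar with burstable lines: carriers commonly bill them on the 95th-percentile model - usage is sampled every 5 minutes, the top 5% of samples are discarded, and you pay for the highest remaining sample. The exact interval and percentile convention vary by carrier, so treat this as a sketch of the general idea rather than our contract terms:

```python
def billable_rate(samples_mbits):
    """95th-percentile billing: sort the month's 5-minute usage
    samples, discard the top 5%, bill the highest remaining one."""
    ordered = sorted(samples_mbits)
    idx = max(int(len(ordered) * 0.95) - 1, 0)
    return ordered[idx]

# 30 days = 8640 five-minute samples: mostly ~100Mbit/s with
# bursts to 1Gbit/s for just under 5% of the time.
samples = [100] * 8300 + [1000] * 340
print(billable_rate(samples))  # 100 - the bursts are free
```

In other words, as long as the gigabit bursts stay below roughly 5% of the month, we only pay for the 100Mbit/s base rate.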
The switch actually occurred last week. So far, the huge workload has distributed nicely among the servers, and stress has been reduced to near nothing. Overall, our experience shows we have approximately the same amount of usable bandwidth (at those speeds the filehosts' servers actually start to slow you down anyway) coupled with serious performance gains. The ancient Q8400 just can't compete with the new Xeons, the disks used are faster, and the new bandwidth cap is still generous enough at 30TB each for upload and download, per server.
Given that our costs increased by a mere $10, the switch has nothing but upsides. This means we can now upload even more anime, eroge, and hentai for you!
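As a quick sanity check on that cap, 30TB per direction per month works out to a sustained average just under the 100Mbit/s committed rate, so the cap and the line are well matched:

```python
# Sanity check: what sustained rate does a 30TB monthly cap allow?
cap_bits = 30 * 10**12 * 8        # 30TB (decimal terabytes) in bits
month_seconds = 30 * 24 * 3600    # 2 592 000 seconds in a 30-day month
avg_mbits = cap_bits / month_seconds / 10**6
print(round(avg_mbits, 1))        # 92.6 Mbit/s average, per direction
```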
Now, Cake's replacement will also consist of two servers armed with Xeon E3 1230s. RAM will be a lot higher, though, at 16GB ECC for each server. One of them will function as webserver, the other as database server.
To my surprise, a lot of people asked why we'd go with two smaller servers instead of one big one, and why we didn't opt for a dual Xeon. This made me seriously reconsider my setup, but eventually I decided that staying with two smaller servers brings a lot of upsides over one big machine.
There are quite a few advantages:
- The server applications we use don't scale well to 16 logical cores, and two Xeons in one server would've meant just that: 16 logical cores. Cores beyond 12 are rarely used by the OS because of how its thread handling was designed, and research suggests 8 cores are currently pretty much the sweet spot for server performance - another argument for single-Xeon setups. (Note: This claim might not hold universally, but it was true for our setup.)
- Dual Xeon boards are only available with the old LGA1366 socket. This would mean using the aging 45nm Nehalem processors. While they cost approximately the same as newer processors, they have much lower clock speeds (meaning less raw power) and lack the optimizations Intel introduced in the 32nm Sandy Bridge generation used with LGA1155.
- Webservers exhibit different behavior than database servers and thus need different parts. Combining both into one server limits the hardware and software that can be used. It also drives up the cost (for the same performance) a lot: Jack-of-all-trades machines need more power than single-purpose ones, and the cost of faster components increases exponentially.
- Two servers offer higher redundancy. In the rare case of a server failure, the second server can temporarily take over. While this reduces performance significantly until repairs are done, it at least avoids a complete outage and the unhappy "server down" notices seen frequently on some less organized forums.
Of course, a dual-server setup also has drawbacks:
- More servers need more rack space, which increases monthly costs.
- Since two servers use more parts, the failure rate increases. This is mostly offset by the redundancy of having more than one server, though.
- Building two servers means more components, which translates to higher cost. This is somewhat offset by being able to use slower components; I didn't do a full cost comparison since the dual-server setup has too many other advantages for a minor price difference to change the decision.
- More parts also mean higher maintenance and replacement costs. And since server parts never fall below a certain price floor, the slower parts of a dual-server setup may eventually end up costing the same as the faster ones of a single-server setup.
Now, since I'm talking about hardware already, I may as well go over the HDDs used in the different servers:
The webserver comes with four Toshiba MBF2300RC drives in a RAID-5 configuration. These 10 000RPM disks give us an array providing 900GB of storage and fast access speeds. We considered some other disks as well (like the Seagate Momentus XT), but with the high amount of caching our CDNs provide to users all over the world, these disks are pretty much the best available choice.
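The 900GB figure follows directly from how RAID-5 works: one disk's worth of capacity goes to parity, so usable space is (n - 1) times the disk size:

```python
def raid5_usable(num_disks, disk_gb):
    # RAID-5 distributes one disk's worth of parity across the array,
    # so usable capacity is (n - 1) times the disk size.
    assert num_disks >= 3, "RAID-5 needs at least three disks"
    return (num_disks - 1) * disk_gb

print(raid5_usable(4, 300))  # 900 (GB), matching the array above
```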
And while 900GB of high performance storage should last us a long time, we can always use the new backup server to feed in more space. More about that server below.
But first, the database server. It uses two kinds of storage: half Intel 320 SSDs and half trusty 146GB Seagate Savvio 15K.2 drives.
This split is mostly down to the fact that SSDs are not designed for heavy write duty. The database log and buffer will therefore be put on the Seagate HDDs instead, since they are both write-intensive (bad for SSDs) and sequential (meaning the speed difference between SSDs and high-performance HDDs is negligible).
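In MariaDB/MySQL terms, that split might look like the following my.cnf fragment. The mount points are hypothetical placeholders, not our actual paths; `datadir` and `innodb_log_group_home_dir` are the real options that control where the data files and the InnoDB redo log live:

```ini
[mysqld]
# Data files (random I/O) on the Intel 320 SSD array
datadir = /mnt/ssd/mysql
# Redo log (sequential, write-heavy) on the 15K Savvio HDDs
innodb_log_group_home_dir = /mnt/sas/mysql-log
```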
Those two servers and the backup system will be connected by a 1Gbit private LAN.
The on-site backup server will have 40TB of usable space for backing up our precious goods (anime and eroge, hourly database backups, and webserver backups). The backup server will also run on a Xeon E3 with 16GB RAM and will use the ZFS filesystem. The drives will be low-performance green drives, since we don't expect high usage.
Besides simple storage, the backup server has another important role: Should we ever need more hosting space (for whatever reason), we can always link it via iSCSI to our webserver. Don't get us wrong about the low-performance disks, though: There'll be four 120GB SSDs used for caching, which provide great random read speeds, and with 24 HDDs in the array, write speed should still be a lot faster than what you'd see in a desktop PC - enough to max out a 1Gbit line, which is the bottleneck of the entire cluster.
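The 40TB figure also hints at the pool layout: 24 x 2TB is 48TB raw, and losing four disks' worth of capacity to parity - for example two 12-disk RAIDZ2 vdevs, which is my assumption here, not a confirmed layout - leaves exactly 40TB:

```python
def raidz2_usable(disks_per_vdev, num_vdevs, disk_tb):
    # Each RAIDZ2 vdev spends two disks' worth of capacity on parity.
    return (disks_per_vdev - 2) * num_vdevs * disk_tb

# 24 x 2TB drives as two 12-disk RAIDZ2 vdevs (assumed layout):
print(raidz2_usable(12, 2, 2))  # 40 (TB usable), matching the quoted figure
```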
To keep things stable, a powerful firewall using deep packet inspection and advanced QoS will sit between the Celato Security Mark II cluster and the rest of the world. As our SQL database we'll actually be using MariaDB, which should provide even better performance and stability than MySQL.
Altogether the cluster will cost around $8 000 to set up and a monthly $330 for bandwidth and colocation space. This upgrade should enable us to serve 10-15 million visitors every month. It will be quite a while before we have to go back to the drawing board, meaning we can provide stability and speed for a long, long time.
Hardware Overview:
Webserver:
Supermicro Chassis 1017C-TF Intel C202 1U
Xeon E3 1230
16GB ECC RAM
4 x Toshiba 10K 300GB MBF2300RC
Adaptec RAID 6805
Database Server:
Supermicro Chassis 1017C-TF Intel C202 1U
Xeon E3 1230
16GB ECC RAM
2 x Intel 320 Series 120GB SSD
2 x Seagate Savvio 15K.2 146GB
LSI 9261 Megaraid w/ FastPath & BBU
Backup Server:
Supermicro Chassis SC846E2-R900B 4U w/ X9SCL-F C202
Xeon E3 1230
16GB ECC RAM
24 x 2TB Seagate Barracuda Green
4 x Corsair Force 120GB SSD (cache)
Intel SASUC8I SAS Initiator
Projected Expenditures:
Initial Investment: $7 982
Progressive Upgrade: $2 004 by end of 2012
Equipment Maintenance: $50 per month
Operational Expenditure: $785 per month
Expenditure by the end of 2012:
Infrastructure: $9 986
Operational Expenditure: $9 420
Expenditure for 2012 - 2013:
Infrastructure: $600
Operational Expenditure: $9 420
Total expenditure from 2012 to fiscal year 2014: $38 846
Currency: USD
Error Margin: +/- 5%
I welcome any feedback, suggestions, or comments about our setup.