<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Talk Unafraid &#187; architecture</title>
	<atom:link href="http://www.talkunafraid.co.uk/tag/architecture/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.talkunafraid.co.uk</link>
	<description>EVE Online, Ruby on Rails and Security</description>
	<lastBuildDate>Wed, 01 Sep 2010 17:12:31 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Building Backchat, Part 2</title>
		<link>http://www.talkunafraid.co.uk/2010/05/building-backchat-part-2/</link>
		<comments>http://www.talkunafraid.co.uk/2010/05/building-backchat-part-2/#comments</comments>
		<pubDate>Sat, 08 May 2010 13:26:46 +0000</pubDate>
		<dc:creator>James Harrison</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[politics]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[sinatra]]></category>
		<category><![CDATA[stuffisawesome]]></category>

		<guid isPermaLink="false">http://www.talkunafraid.co.uk/?p=885</guid>
		<description><![CDATA[Or: How I learned to give up on projects. Okay, so, Backchat was hugely interesting as a project. Eventually, I produced a set of graphs using the classifier that showed sentiment over time. These graphs aren&#8217;t too accurate but are fairly good at showing how things were going. However, after this I pretty much dropped [...]]]></description>
			<content:encoded><![CDATA[<p>Or: How I learned to give up on projects.</p>
<p>Okay, so, Backchat was hugely interesting as a project. Eventually, I produced a set of graphs using the classifier that showed sentiment over time. <a href="http://assets.talkunafraid.co.uk/2010/05/frequency.png" rel="lightbox[885]"><img class="alignright size-medium wp-image-886" title="Tweets over Time for Debate #2" src="http://assets.talkunafraid.co.uk/2010/05/frequency-150x300.png" alt="" width="150" height="300" /></a>These graphs aren&#8217;t too accurate but are fairly good at showing how things were going. However, after this I pretty much dropped the project. This was mainly due to exams cropping up and stealing my time away, but also because of how difficult it was to approach a sensible level of accuracy.</p>
<p>In my &#8216;final&#8217; design I ended up using a bigram classifier. I added parsing of the tweets to pull out mentions of words, URLs and users, and then used this to generate my training sets, which improved things a lot. This gave me several thousand tweets for each training set, which worked okay. However, even with this classifier, which was doing a lot better than most others, my results weren&#8217;t very reliable on a tweet-by-tweet basis. Still, it wasn&#8217;t too shoddy, and the graphs on the right are fairly reliable I think in terms of general sentiment.</p>
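<p>To make the feature extraction concrete, here is a minimal Python sketch of the idea: swap URLs and user mentions for placeholder tokens, then pair up adjacent tokens. This is not the Backchat code; the function names, the <code>_url_</code> and <code>_user_</code> placeholders and the regular expressions are all assumptions made for illustration.</p>
<pre><code class="language-python">import re

# Hypothetical patterns and placeholders; the post does not say how
# mentions and URLs were actually represented.
URL_RE = re.compile(r"https?://\S+")
USER_RE = re.compile(r"@\w+")
WORD_RE = re.compile(r"[a-z0-9_']+")


def tokenize(tweet):
    """Lowercase a tweet and collapse URLs and @mentions into placeholder tokens."""
    text = tweet.lower()
    # URLs go first so that an "@" inside a link is not mistaken for a mention.
    text = URL_RE.sub(" _url_ ", text)
    text = USER_RE.sub(" _user_ ", text)
    return WORD_RE.findall(text)


def bigrams(tokens):
    """Return adjacent token pairs, the features a bigram classifier counts."""
    return list(zip(tokens, tokens[1:]))
</code></pre>
<p>For example, <code>bigrams(tokenize("@Nick great point http://t.co/abc"))</code> pairs up the tokens <code>_user_</code>, <code>great</code>, <code>point</code> and <code>_url_</code>, giving three bigrams.</p>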
<p>The AMQP-linked network of processors worked extremely well and gave good throughput. In the end I used two parsers, two classifiers and one classifier loader, but I was unable to achieve realtime performance due to network constraints: my ISP at home had decided that I&#8217;d used too much bandwidth and clamped me down to 128 kilobits a second. That said, thanks to the streaming API I did not lose any tweets as far as I know, apart from a few hundred dropped to rate limiting. I just received them out of order and then reconstructed the correct order using the timestamp on each tweet. The machine I was using for this also pretty much went flat out on disk I/O and CPU usage, but was able to keep up, which isn&#8217;t bad for a fairly old box: only a Pentium 4 with a couple of gigs of RAM.</p>
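<p>Restoring the order from timestamps can be sketched in a few lines of Python. This is an illustration only, not the code Backchat ran: it assumes each tweet is a dict with a <code>created_at</code> string in the style <code>Thu Apr 22 20:31:05 +0000 2010</code>, and the function names are made up.</p>
<pre><code class="language-python">from datetime import datetime


def parse_created_at(value):
    """Parse a timestamp such as 'Thu Apr 22 20:31:05 +0000 2010' (assumed format)."""
    return datetime.strptime(value, "%a %b %d %H:%M:%S %z %Y")


def restore_order(tweets):
    """Return the tweets sorted by creation time, whatever order they arrived in."""
    return sorted(tweets, key=lambda tweet: parse_created_at(tweet["created_at"]))
</code></pre>
<p>Sorting the whole batch is the simplest option and fits an after-the-fact analysis like this one; a live pipeline would more likely hold tweets in a small buffer and release the oldest first.</p>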
<p>In any case this was an interesting project and I&#8217;ll be open sourcing the data and source in the coming weeks if anyone wants to have a poke at it. While the debates are now gone and done, I&#8217;m sure people can come up with some great uses for sentiment analysis outside of UK politics.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.talkunafraid.co.uk/2010/05/building-backchat-part-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Architecture for the future</title>
		<link>http://www.talkunafraid.co.uk/2010/01/architecture-for-the-future/</link>
		<comments>http://www.talkunafraid.co.uk/2010/01/architecture-for-the-future/#comments</comments>
		<pubDate>Thu, 14 Jan 2010 23:49:11 +0000</pubDate>
		<dc:creator>James Harrison</dc:creator>
				<category><![CDATA[EVE Metrics]]></category>
		<category><![CDATA[Odds and Ends]]></category>
		<category><![CDATA[api]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[eve metrics]]></category>
		<category><![CDATA[hardware]]></category>
		<category><![CDATA[servers]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[stuffisawesome]]></category>
		<category><![CDATA[work in progress]]></category>

		<guid isPermaLink="false">http://www.talkunafraid.co.uk/?p=616</guid>
		<description><![CDATA[After that EVE-centric post on scalability (thanks to HighScalability.com for linking in, hope it was an interesting read), I figured it was time to return to EVE Metrics and other sites- accVIEW and ISKsense. In the next week we will be migrating to a new server. It&#8217;s in the same datacenter with the same host, [...]]]></description>
			<content:encoded><![CDATA[<p>After that EVE-centric post on scalability (thanks to HighScalability.com for linking in, hope it was an interesting read), I figured it was time to return to EVE Metrics and other sites- accVIEW and ISKsense.</p>
<p>In the next week we will be migrating to a new server. It&#8217;s in the same datacenter with the same host, is a slightly faster machine but has four times as much RAM (8GB) and an additional 10kRPM hard drive. As part of the migration to the new server we&#8217;ll be making some changes to the software architecture running the show.</p>
<p>The main difference is that we&#8217;re moving away from Passenger, also known as mod_rails. It has some advantages in low-memory conditions, but it&#8217;s been more trouble than it&#8217;s worth, so we&#8217;ll be moving back to running application servers manually as daemons. For this we&#8217;ll be using the excellent Thin application server. For the sites running PHP on the server (this blog, for example), we&#8217;ll be using PHP FPM as we are currently; we&#8217;ve had no issues with that. Both of those will sit behind nginx, which will act as a reverse proxy in front of them. Nginx has done very well as a web server and it&#8217;s very fast, as well as being easy to configure.</p>
<p>There is only one other major change; we&#8217;ll be sitting nginx itself behind Varnish, a high performance HTTP cache. This will let us more efficiently leverage HTTP caching in our applications and speed up requests dramatically. Right now we don&#8217;t use HTTP caching that much; we&#8217;d like to change this, particularly in EVE Metrics&#8217; API so we can let Varnish handle a good portion of the thousands of API calls we get asking for the price of trit or what have you. All in all it&#8217;ll mean reduced load on the application cluster, which means we can keep that smaller and lighter, which in turn means more room for the database in memory.</p>
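<p>For a cache like Varnish to do that work, the application has to say how long a response may be reused. Here is a minimal sketch of the idea as a Python WSGI app (the real sites are Rails and PHP, so this is purely illustrative). The endpoint, the item, the price and the 300 second lifetime are all invented for the example.</p>
<pre><code class="language-python">def price_app(environ, start_response):
    """Minimal WSGI app for a hypothetical price lookup endpoint."""
    # Made-up payload; a real app would look the price up in the database.
    body = b'{"item": "Tritanium", "price": 2.5}'
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
        # Tells shared caches in front of the app that this response
        # may be reused for 300 seconds without asking the app again.
        ("Cache-Control", "public, max-age=300"),
    ]
    start_response("200 OK", headers)
    return [body]
</code></pre>
<p>With a header like this, repeated requests for the same price inside the five minute window can be answered by the cache, and the application cluster only sees the first one.</p>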
<p>That translates to better performance on the more complex components in the site, i.e. market pages, your account page and corporate pages, and that better performance means we can build more. We&#8217;re waiting for the new capacity before we add asset support, one of the things we&#8217;re really looking forward to, since it will let us add a whole new level of functionality by giving lots more information to processes like our inferred trade detector and our planned fulfilled orders listings. Plus we&#8217;ll be adding asset valuation tools, of course.</p>
<p>The architecture I&#8217;ve described above will basically be &#8216;it&#8217; for now; we have more complication at the application and DB layer (We still use MySQL for a few legacy applications, so we have a tiny MySQL server running). The complication at the app layer mainly consists of things like background processing tools, and for EVE Metrics tasks that are actually executed on a VPS and the results uploaded back to the server (we now do all the major CSV dumps on Makurid&#8217;s VPS).</p>
<p>As the guy who ends up fixing all this when it goes wrong, simplicity is always my main priority, but the added complexity of Thin and Varnish should be well worth it in the long run.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.talkunafraid.co.uk/2010/01/architecture-for-the-future/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>EVE Scalability Explained</title>
		<link>http://www.talkunafraid.co.uk/2010/01/eve-scalability-explained/</link>
		<comments>http://www.talkunafraid.co.uk/2010/01/eve-scalability-explained/#comments</comments>
		<pubDate>Tue, 12 Jan 2010 02:28:09 +0000</pubDate>
		<dc:creator>James Harrison</dc:creator>
				<category><![CDATA[Odds and Ends]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[ccp]]></category>
		<category><![CDATA[commentary]]></category>
		<category><![CDATA[design]]></category>
		<category><![CDATA[EVE]]></category>
		<category><![CDATA[servers]]></category>

		<guid isPermaLink="false">http://www.talkunafraid.co.uk/?p=606</guid>
		<description><![CDATA[OK, I&#8217;ve seen a bunch of posts on the EVE blogosphere about this recently and it&#8217;s always been a tricky topic to understand. This post aims to demystify EVE&#8217;s architecture and explain in simple terms what EVE&#8217;s current issues with scaling for fleet fights are, and approaches for fixing them. So first a disclaimer: I [...]]]></description>
			<content:encoded><![CDATA[<p>OK, I&#8217;ve seen a bunch of posts on the EVE blogosphere about this recently and it&#8217;s always been a tricky topic to understand. This post aims to demystify EVE&#8217;s architecture and explain in simple terms what EVE&#8217;s current issues with scaling for fleet fights are, and approaches for fixing them. So first a disclaimer: I do not work for CCP, I don&#8217;t get behind the scenes information. This is a post compiled from several years working on EVE third party development and talking to people who do work at CCP, people who have worked at CCP, and the community at large. To the best of my knowledge this is mostly correct, but I make no promises. If you&#8217;re looking for an exact technical description, look elsewhere.</p>
<p>So, let&#8217;s start with the basics. This is the (somewhat simplified) hardware layout for Tranquility (click to enlarge).</p>
<p><a href="http://assets.talkunafraid.co.uk/2010/01/EVE_Architecture.png" rel="lightbox[606]"><img class="aligncenter size-medium wp-image-607" title="EVE Hardware Architecture" src="http://assets.talkunafraid.co.uk/2010/01/EVE_Architecture-300x150.png" alt="" width="300" height="150" /></a>To sum up in words: There are proxy servers that receive your data and route you to the appropriate sol server, which is running on a sol node or reinforced sol node. These servers communicate with a single, shared database server, which is also used for web services like the API and the MyEVE website (and, soon, Spacebook).</p>
<p>There&#8217;s an important distinction to be made here and one that is vital to understanding EVE&#8217;s architecture: <em>nodes and servers are not the same thing</em>. Nodes refer to the <strong>actual physical hardware</strong> (at time of writing, IBM Blade servers) that <strong>may run one or more sol servers</strong>. Each sol server is, as the name implies (Sol is the name for our sun), responsible for one solar system in EVE. It is a <strong>software server</strong> process, handling everything that goes on in a system: combat, mining, market, and so on.</p>
<p>EVE&#8217;s scalability issues stem from this design, but let&#8217;s look at what those issues are. Can EVE handle 56,000 players? Yep, easily. Tranquility will be able to handle many more than that without issue, and because of this design the capacity can be easily expanded by increasing the number of sol nodes for sol servers to run on, spreading the load efficiently and easily. Will you be able to fit 3000 people onto a gate? Nope. Why? Well, because EVE was designed so that the capacity of the <em>whole cluster</em> expanded well, not individual systems. This was a design decision made back in the early days of EVE and it has served EVE well, with the exception of fleet combat and Jita. So how to handle the edge cases?</p>
<p>Well, where does lag come from? Proxy servers have an easy job and they are not a bottleneck in the vast majority of circumstances. The main issues they cause are disconnects; when a proxy server fails, a good chunk of EVE&#8217;s inhabitants disappear till they reconnect. The lag is in combat and in high concurrency systems like Jita, where loads of people trade, talk in local, and fly around suicide ganking each other. This lag stems from intensive processes that have to be done; mathematical steps like calculating transversal velocities between objects, things that have complexity values (algorithmically speaking) of <img src='http://s.wordpress.com/latex.php?latex=O%28n%5E2%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='O(n^2)' title='O(n^2)' class='latex' /> or worse. If you didn&#8217;t understand that, it just means that the more ships you have, the more difficult things get, and the work grows with the square of the number of ships (or worse): double the ships and you get at least four times the work.</p>
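<p>To put numbers on that growth, here is a small Python sketch that counts ship pairs and does a naive all-pairs calculation. It has nothing to do with CCP&#8217;s actual code; the function names are made up, and plain distance stands in for whatever per-pair maths the server really does.</p>
<pre><code class="language-python">import math
from itertools import combinations


def pair_count(ships):
    """Number of distinct ship pairs: n * (n - 1) / 2."""
    return ships * (ships - 1) // 2


def pairwise_distances(positions):
    """Distance between every pair of ships, computed the naive O(n^2) way."""
    return [math.dist(a, b) for a, b in combinations(positions, 2)]
</code></pre>
<p>Ten ships give 45 pairs, 100 ships give 4,950, and 1,000 ships give 499,500. So ten times the ships means roughly a hundred times the work, every tick, on one machine.</p>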
<p>Obviously, there are optimisations that can be done, better algorithms, and CCP uses them, but the fact remains; this is a lot of work for a computer. Loads. Absolutely shedloads. <strong>And that&#8217;s all this challenge gets: one computer, at most.</strong> In bad cases, it won&#8217;t even get that: most sol nodes run multiple servers, which is why lag sometimes seems to cross between systems. It really can, and does. Reinforced nodes just have more firepower and a guarantee of exclusivity, but they&#8217;re still only one computer. And as Google has taught industry, lots of small computers are cheaper, easier to fix, and faster than a single big one.</p>
<p><strong>True scalability will come to EVE when a sol server can be distributed seamlessly (without rebooting or dropping clients) and near-instantly across multiple sol nodes.</strong> That will mean that fleet fights can take all the resources they need, and that CCP gets to maintain cheaper hardware, making scaling the hardware cheaper and easier. And you maintain the scalability of the cluster, assuming you keep some sol nodes spare for sol servers to grow onto in the event of a fight.</p>
<p><strong>What needs to be done to achieve this?</strong> Why haven&#8217;t we got this yet? Well, it&#8217;s a heck of a lot of work. It&#8217;s a huge technical challenge, leaving internet spaceships out of it. Then there are the hardware prerequisites; you need insanely fast low-latency networking (Infiniband, Fibre Channel, etc), and the extra nodes. It&#8217;s a huge investment for CCP, but one they&#8217;ll have to make eventually unless they find another way of solving the problem. Any other solution (grid sharding, etc) is likely to break immersion and cohesion in the game, and so is unlikely.</p>
<p>I hope that helps explain some of the thinking behind EVE&#8217;s architecture and <em>why </em>you lost that titan last night. And why it&#8217;s likely you&#8217;ll lose a few more before it&#8217;s fixed.</p>
<p><em>Minor second disclaimer: It&#8217;s 2:30 AM and I&#8217;m tired as hell, so this may contain errors. Feel free to point any out in the comments.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.talkunafraid.co.uk/2010/01/eve-scalability-explained/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
	</channel>
</rss>
