Looking to the future of EVE Metrics

EVE Metrics has become a disorganised, sprawling project. A year or two ago I set out with the idea of building something like EVE Central, but for as much data as possible. I’m a data junkie in a way- I’m the sort of person who gets a kick out of seeing complex relationships between seemingly unrelated data, and doing that sort of analysis on a large scale. In short: <3 databases. I set out to do this with very little Ruby/Rails experience, a fairly solid grounding in MySQL from my earlier projects doing high-volume event logging in Garry’s Mod for Half-Life 2 (providing an audit trail for an FPS-based roleplaying environment), and a modicum of webapp experience.

What I hadn’t really considered was the market. Read on for a short bit of background, and to find out where the site’s heading.

EVE Metrics originally started out as a statistics aggregator for everything. I figured I could pull in killboard data, market data, map data- everything the API offered- roll it into a huge database, and make pretty statistics. After sitting down and thinking it over, I decided that tackling one piece at a time would be a better plan- so I started with the market. It seemed logical: in theory the largest piece of the project, and interconnected with every other element of EVE at some level.

The market is huge. And it’s really interesting, if you’re into large-scale datasets, statistics and computational economics. After about 6 months of dabbling in the various APIs and getting Salvis working with me on the uploader, I started to harbour increasing envy towards Dr. Eyjólfur Guðmundsson – CCP’s resident EVE economist. Having a job working on the market has gotta be awesome, honestly.

And then with the project in a ‘hey, stuff sorta works!’ state, Hexxx from EBANK went ahead and plugged it. We recently hit the 1,000 user mark, which I’m hugely proud of. But there’s a catch, of course…

Right now the database holds:

  • 1,936,686 unique market orders
  • 11,082,158 historic price indexes
  • 10,330,669 historic movement records
  • 606,091 EVE Metrics API call records
  • 117,914 bits of uploaded data still waiting to be imported
  • 2,069,769 historic system jump histories
  • 1,968,425 historic system kill histories
  • 165,798 non-market price reports

It’s a lot of data. But the way it’s organised and structured- and, most importantly, the processes involved in getting the data in and analysing it- were not designed for this kind of volume. I figured I’d end up with a few thousand orders in the database, but the cache uploader changed that. We got some people in Jita and other trade hubs interested, and they’re churning out uploads- over half a million distinct uploads so far- which adds up to a lot of data.

Right now, EVE Metrics grabs the upload when it hits our API, stuffs it in a job queue, and forgets about it. Behind the scenes, workers run an upload filtering algorithm that marks any outliers in the uploaded data, mark old orders in the database as expired, and then add any new orders and update existing ones. The raw data is then thrown back into the queue to be sent out to users of the Webhook service, and finally discarded. When you go to a page, then, we get to do the following:

  • Retrieve all the market orders for that item that aren’t expired or outliers for the regions the user has set as favourites or default market regions
  • Split the dataset up into Jita/non-Jita and buy/sell arrays
  • Crunch numbers to generate averages for those arrays
  • Render all the market orders on the webpage, including loading in all the appropriate system/station/region data for each order
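The first three of those steps can be sketched roughly like this- a toy version in plain Ruby, with hashes standing in for ActiveRecord models. All the names here are illustrative rather than the real EM schema; only Jita’s system ID (30000142) is the genuine article:

```ruby
# Jita's actual solar system ID in the EVE static data.
JITA_SYSTEM_ID = 30000142

# Split live (non-expired, non-outlier) orders into the four buckets
# a market page needs: Jita/non-Jita crossed with buy/sell.
def bucket_orders(orders)
  live = orders.reject { |o| o[:expired] || o[:outlier] }
  {
    jita_buy:   live.select { |o| o[:system_id] == JITA_SYSTEM_ID && o[:bid] },
    jita_sell:  live.select { |o| o[:system_id] == JITA_SYSTEM_ID && !o[:bid] },
    other_buy:  live.select { |o| o[:system_id] != JITA_SYSTEM_ID && o[:bid] },
    other_sell: live.select { |o| o[:system_id] != JITA_SYSTEM_ID && !o[:bid] }
  }
end

# Volume-weighted average price for one bucket of orders.
def weighted_average(bucket)
  return 0.0 if bucket.empty?
  total_volume = bucket.sum { |o| o[:volume] }
  bucket.sum { |o| o[:price] * o[:volume] } / total_volume
end
```

Doing all of this from scratch on every page view is exactly the problem- none of it is cached or precomputed.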

The first three steps take a little while. For items with lots of orders, the mere step of retrieval from the database can take multiple seconds. And that’s no good- we want millisecond response times. EVE Metrics should be snappy, responsive. Upload processing can take a whole 20 seconds for a Jita upload of tritanium; the backlog of nearly 120,000 jobs speaks volumes about how ungood that is.
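For the curious, the upload pipeline described above- dequeue, flag outliers, expire stale orders, upsert the rest, rebroadcast to webhooks- boils down to something like the toy sketch below. Every class and method name here is hypothetical, and the 3x-median outlier rule is just a stand-in, since the real filtering algorithm is rather more involved:

```ruby
require 'json'

class UploadProcessor
  def initialize(queue)
    @queue = queue  # any array-like job queue stands in for the real one
  end

  # Pull one raw upload off the queue and run it through the pipeline:
  # flag outliers, expire stale orders, upsert fresh ones, then hand the
  # raw payload on to webhook subscribers before discarding it.
  def process_one(orders_db, webhooks)
    upload = @queue.shift or return
    orders = JSON.parse(upload)

    flagged = filter_outliers(orders)
    expire_old_orders(orders_db, flagged)
    upsert_orders(orders_db, flagged)

    webhooks.each { |hook| hook.call(upload) }  # rebroadcast raw data
    flagged
  end

  private

  # Placeholder outlier filter: flag orders priced at more than 3x the
  # upload's median price. The real algorithm is not this crude.
  def filter_outliers(orders)
    prices = orders.map { |o| o['price'] }.sort
    median = prices[prices.length / 2]
    orders.map { |o| o.merge('outlier' => o['price'] > median * 3) }
  end

  # Any order in the database that didn't appear in this upload is stale.
  def expire_old_orders(db, fresh)
    fresh_ids = fresh.map { |o| o['id'] }
    db.each { |id, row| row['expired'] = true unless fresh_ids.include?(id) }
  end

  def upsert_orders(db, fresh)
    fresh.each { |o| db[o['id']] = o.merge('expired' => false) }
  end
end
```

Even in this toy form you can see why a tritanium upload hurts: every order in the upload touches the database individually, and nothing is batched.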

And all this leads to unhappy servers, which lead to broken websites and unhappy users, and that’s no good.

So, I’m hitting the reset button.

The data will remain, your user accounts will be ported over, and I will endeavour to keep legacy APIs supported and old links redirecting to their new homes. But the codebase and website will be entirely new. The frontend will get a new minimalist, flexible skin that has proven very popular with accVIEW users over the past month.

I’ll be writing everything for speed, performance, and general scalability. I want the site to perform well under all circumstances, with a delay of under a minute between viewing an item in EVE and its new data appearing on the site. And I’m going to write tests for everything, to ensure the site’s behaviour is predictable and doesn’t randomly break.
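As a taste of what that looks like, here’s a minimal sketch using Minitest from Ruby’s standard distribution- the mid_price helper is purely illustrative, not actual EM2 code:

```ruby
require 'minitest/autorun'

# Midpoint between the best (highest) buy price and the best (lowest)
# sell price for an item- the sort of small helper every page uses,
# and exactly the sort of thing that should never silently break.
def mid_price(buy_prices, sell_prices)
  best_buy  = buy_prices.max
  best_sell = sell_prices.min
  (best_buy + best_sell) / 2.0
end

class MidPriceTest < Minitest::Test
  def test_midpoint_of_best_prices
    assert_equal 5.0, mid_price([3.0, 4.0], [6.0, 8.0])
  end
end
```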

On the downside, EVE Metrics will be hobbling along unmaintained for a month or two while I write EM2. It’s managing fine now, mind you, but in half a year’s time it may not be working quite as well…

I’ll be posting up a survey for EVE Metrics users to let them decide what they’d like to see in 2.0. API integration is an absolute certainty, but things like a JS sidebar-style market browser and competition filters are also on the table. I’d rather hear your opinions now than after I’ve gotten halfway through something that would make your idea really tricky to implement 🙂