Sep 1 10

Getting set up with Ruby and Rails

by James Harrison

I’ve had a lot of people asking for help setting up a Ruby on Rails environment recently so figured I’d put a post together detailing how I set up my boxes.

This won’t be a guide for everyone, but it’s a tried and tested setup that not only performs well, but is also well set up to work with most gems and the development tools you’ll want. This guide covers both development and production environment setups.

Sep 1 10

Nebulous

by James Harrison

A few days ago I said I’d blog about Nebula, a project that Makurid, others and I have been working on, on and off, for a few weeks now. So here we go:

After seeing Mynxee get some wheels in motion with CSM5, I felt a bit annoyed at my inability to contribute a lot to the workings of the CSM. At my age, I am not eligible to run for candidacy, as the drinking age is 21 in Iceland and CCP don’t want delegates they can’t take down to the pub (At least, I guess – the age of majority is 18, so it’s not a legal thing). Anyway, I decided to use my normal approach to fixing problems: There’s an app to be written for that…

So with the help of Selene from EOH Poker, we rallied a whole 3 CSM and ex-CSM people into a collaborative text editor, and wrote a spec. The tool in question was to manage proposals as they went through the CSM process – with the wiki and AH forums being identified as inadequate by both the CSM and CCP, but with no activity from CCP on providing a solution, I saw this as an opportunity to get the community involved and really push to get some CSM processes streamlined. We called it Nebula, and formed a new team of developers, with an open source codebase and instructions on how to contribute. We actively sought out new developers and tried to encourage other third party developers to get involved, and were met with mostly positive results. Team Excellence was born.

The CSM involvement we had was good. We ended up with a spec that was, as the title of this blog suggests, a little nebulous, but had most of the major concepts well sketched out and defined. We got to work.

It’s now been a fairly long time since we last had CSM contact, though those who have been trying to get involved in this (Mynxee and Trebor to name a couple) have done so well, and continue to provide input. The main issue we have is that out of the 8 (since Ankh was removed) active CSM members and 5 reserves (or 9 and 4, depending how you look at it; either way, a total of 13 players), we’ve gotten in touch with and received input from three at most. This is worrying for a number of reasons.

If the chairwoman of the CSM is having this much difficulty in managing to have some people meet and discuss matters for even a one-off meeting or event, then I start to wonder how well the communication works between meetings on other issues. For example; I posted a long while ago about the API in the assembly hall forum. The matter has been taken up by two CSM members, including Mynxee, and may be raised at a future meeting. I got an evemail from Dierdra Vaal a couple of weeks later, asking me if I thought anything was wrong with the API and if I’d support him raising an issue about it. This is the sort of thing that irks and worries me – if there’s no real communication between CSM members except during their meetings with CCP, then how the hell do they hope to present any sort of unified face to CCP or address issues that are concerns to the playerbase, and not individuals within the CSM? With no communication outside of those meetings, issues aren’t being discussed till it’s far too late to be discussing them, people are missing entire issues, and this all has a knock-on effect on how much the CSM actually achieves.

Frankly, I couldn’t imagine being a member of the CSM without there being a mandatory-usage CSM delegate IRC channel or some similar chat mechanism – heck, even a mailing list or forum – where the CSM can talk amongst itself, and get the ego trips out of the way before CCP gets involved. And I suspect that the fact that a lot of people on the CSM see no problem with the current state of affairs is a good indication of just where the priorities of those members lie – because it’s certainly not with the playerbase as a whole. The CSM is bigger than EVE’s petty squabbling of alliances and corporations; we’re talking about an elected council of people who can help steer the course of a company which is putting food on the table for hundreds, and innovating hugely. The CSM as an entity and as an idea does not deserve the majority of the people it has been saddled with so far.

Nebula as it stands is frozen, awaiting information from the CSM and the motivation of the developers to work on it. I know that Makurid and I, who thus far have written the vast majority of the code, are having more and more difficulty finding the motivation to work on any EVE Online projects, let alone projects that involve such a depressing facet of EVE, and indeed force us to try and interact with it. Certainly for now, I will be stepping aside as a developer of Nebula, and reducing the amount of time I spend on my other EVE projects. Aside from anything else, as fun as it used to be, EVE apps don’t put food on the table, and wherever my career may go, developing the apps I’ve built further may not be the smartest move – there are other projects, other opportunities. It’s just a shame that at the moment, EVE doesn’t seem to be working out for me and other third party devs.

Jul 13 10

PostgreSQL recovery tips

by James Harrison

Having run into some disk issues last night we’d not expected, I had some scary moments trying to find any resources for PostgreSQL recovery scenarios relating to disk failure. I chalk this up to most PostgreSQL users being sensible and using RAID1 or similar. We’re doing things on the mother of all shoestring budgets, though, so when disks start spewing things like:

[11180714.763689] ata2.00: exception Emask 0x0 SAct 0x1f SErr 0x0 action 0x0
[11180714.763726] ata2.00: irq_stat 0x40000008
[11180714.763760] ata2.00: cmd 60/08:20:17:6c:a4/00:00:31:00:00/40 tag 4 ncq 4096 in
[11180714.763761]          res 41/40:00:19:6c:a4/ff:00:31:00:00/40 Emask 0x409 (media error) <F>
[11180714.763864] ata2.00: status: { DRDY ERR }
[11180714.763893] ata2.00: error: { UNC }
[11180714.765974] ata2.00: configured for UDMA/133
[11180714.765989] sd 1:0:0:0: [sdb] Unhandled sense code
[11180714.765991] sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[11180714.765995] sd 1:0:0:0: [sdb] Sense Key : Medium Error [current] [descriptor]
[11180714.765999] Descriptor sense data with sense descriptors (in hex):
[11180714.766001]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[11180714.766010]         31 a4 6c 19
[11180714.766014] sd 1:0:0:0: [sdb] Add. Sense: Unrecovered read error - auto reallocate failed
[11180714.766019] end_request: I/O error, dev sdb, sector 832859161
[11180714.766062] ata2: EH complete

… you really, really panic.

sdb in this case is one of our large 500GB disks we use for archival, so I knew damage would be limited to some of the archive tables at worst. With some help from the #postgresql channel on Freenode I got to work.

First things first: Shut down PostgreSQL. service postgresql-8.4 stop in our case.

Next, we copy all the data we can off that disk. Turned out it was just throwing errors on one file, so I duplicated the tablespace except that file onto our other 500GB disk. That one file was pg_data/16394/461543 in our tablespace – 16394 being the evemetrics_production database OID, but I didn’t know what this file was.
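A minimal Ruby sketch of that "copy everything that can still be read" step follows. The method name and the directory paths are hypothetical, and this is not how the copy was actually done here (ordinary shell tools do the same job); it only illustrates walking a tablespace directory, copying each file, and recording the ones that fail with an I/O error instead of aborting the whole copy.

```ruby
require "fileutils"
require "find"
require "pathname"

# Copy every regular file under source_dir into dest_dir, keeping the
# relative layout. Files that raise a system-level error (for example
# Errno::EIO from a failing disk) are skipped and reported.
# Note: empty directories are not recreated by this sketch.
def copy_readable_files(source_dir, dest_dir)
  failed = []
  source_root = Pathname.new(source_dir)

  Find.find(source_dir) do |path|
    next if File.directory?(path)

    relative = Pathname.new(path).relative_path_from(source_root)
    target   = File.join(dest_dir, relative.to_s)
    FileUtils.mkdir_p(File.dirname(target))

    begin
      FileUtils.cp(path, target, :preserve => true)
    rescue SystemCallError => e
      failed << [path, e.class]
      FileUtils.rm_f(target) # do not leave a partial copy behind
    end
  end

  failed
end
```

Calling `copy_readable_files("/disk1/pg_data", "/disk2/disk1/pg_data")` would return a list of the paths that could not be read, which is the list you then need to identify before bringing the server back up.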

Once I’d moved all the data across to another disk I umounted the old one and got to work on bringing the server back up without the file.

At this point it’s worth noting one thing I did before all this cropped up: I’d taken a full backup with pg_dump, which had completed without errors. This led me to think that the file we were looking at was an index or some system catalog.

Next, I sorted out the tablespace symlink for our sdb tablespace:

# cd /var/lib/postgresql/8.4/main/pg_tblspc/
# ls -lar
lrwxrwxrwx  1 postgres postgres   14 2010-05-14 20:32 461544 -> /disk2/pg_data
lrwxrwxrwx  1 postgres postgres   20 2010-05-14 20:44 461543 -> /disk1/pg_data
# rm 461543
# ln -s /disk2/disk1/pg_data 461543

With our tablespace now pointing at the backup copy, I brought the server back online with service postgresql-8.4 start.

Before you do anything else, you now have to update your tablespace entry on the server or Bad Things happen.

UPDATE pg_tablespace SET spclocation = '/disk2/disk1/pg_data' WHERE spclocation = '/disk1/pg_data';

Now we could find out what that file was:

postgres@pandora:/disk2$ /usr/lib/postgresql/8.4/bin/oid2name -d evemetrics_production -f 461577
From database "evemetrics_production":
  Filenode               Table Name
-----------------------------------
    461577  index_on_api_request_id

And there you have it: an index! We got off seriously lucky here, and RAID1/10/5 would’ve saved us if we had the money for it. All we had to do was issue a REINDEX command on that table and we were good to bring everything back up. Moral of the story? Back up often, back up early, and use RAID. Also, reliable disks are _so_ worth the money.

Jul 10 10

EVE Metrics 3

by James Harrison

Okay, as someone who feels an obligation to post at least once a month I’m ashamed. 8th of May? Time to sort that out.

So here’s a post about EVE Metrics 3.

We’re getting close to having everything polished and ready for release. The main issue at hand thus far has been the homepage; I say this like it’s a small thing but it’s required me to learn some new and interesting things about CSS, Makurid’s done some excellent work to produce some feed scrapers and elements for the lower portion of the page… there’s a lot to it.

I thought that it’d be good to list a few of the changes we’ve made for version 3.

  • Complete redesign of the site thanks to Rettic
  • Market detail pages have been entirely renovated
  • Various pages which haven’t been improved in some time have now been tidied up
  • API key management has been improved
  • API key permissions management has been improved
  • Backend processors for API functions and upload processing have been improved and made more reliable
  • My Metrics has been entirely renovated, now with sparklines for wallets and a new layout
  • Orders and transactions have been moved from My Metrics into their own detail pages, with a summary on My Metrics
  • Journal information has been added and given its own detail page
  • Player Owned Structure integration has been added, though still in its infancy
  • Sensitive portions of the site now make use of SSL transport encryption (HTTPS) automatically
  • Wormhole pages have been updated
  • Improvements to the corporation pages through refactoring to share view code between character and corporation pages
  • Graph improvements
  • Complete test coverage of every line of code (Nah, just kidding, we’re still pretty thin on those test things for large chunks of UI code)
  • 0.2% more cowbell
  • 5% other features I’ve not listed above, plus 100% more polish overall

Excited? We are! There’s a lot of work in the lines above and I think you’ll like the results. We’re not sticking to any firm release schedule because we’re terrible at sticking to them; we’re students, not full-time developers (incidentally, if anyone’s got any jobs available for temporary/contract work, SE UK preferred (or work-at-home), 6 weeks max, let me know!). That said, we hope to have a release before the end of July.

We’ve also been rewriting our uploader! That’s right- Linux, Mac and Windows support all in one neat package. The GUI’s not anything special but it works, and we’ll be polishing it and getting it release-ready before long. Huge thanks should be directed to TTimo, who has been the driving force behind this with some welcome Python experience, and Makurid for assisting him in developing the new client.

Once we’ve gotten that polished, packaged and rolled out, we’ll be running a 5 billion ISK contest to promote it- the three winners (each receiving a portion of the 5 billion pool) will be selected from the most active uploaders for the week or two after the competition is announced. We’ll finalize all the details and have it posted up when we’re ready to go ahead with that, of course. If you’d like to make that 5 billion figure larger, you can contribute ISK to the character MMMetrics Agent ingame! So far, thanks go out to Rilcon, Chribba and Entity for contributing to the current pool. The new uploader will be released after the new site – we’d like to change one thing at a time so we can iron out all the kinks. Once we’ve gotten it out and tested it thoroughly we’ll roll out the upgrade- your existing uploader will prompt you to update when you start it.

One last thing – I’ve personally submitted a proposal to the CSM regarding development efforts from CCP surrounding the API and EVE Gate. If you’ve not done so already, I strongly urge you to read my proposal and support it if you feel, like I do, that CCP have made some serious mistakes lately in this regard. The thread can be found here. It has been picked up and supported by at least 3 CSM members so far, but the community support will help considerably to drive CCP to consider it seriously.

May 8 10

Building Backchat, Part 2

by James Harrison

Or: How I learned to give up on projects.

Okay, so, Backchat was hugely interesting as a project. Eventually, I produced a set of graphs using the classifier that showed sentiment over time. These graphs aren’t too accurate but are fairly good at showing how things were going. However, after this I pretty much dropped the project. This was mainly due to exams cropping up and stealing my time away, but also because of how difficult it was to approach a sensible level of accuracy.

In my ‘final’ design I ended up using a bigram classifier. I added parsing of the tweets to pull out mentions of words, URLs and users, and then used this to generate my training sets, which improved things a lot. This gave me several thousand tweets for each training set, which worked okay. However, even with this classifier, which was doing a lot better than most others, my results weren’t very reliable on a tweet-by-tweet basis. Still, it wasn’t too shoddy, and the graphs on the right are fairly reliable I think in terms of general sentiment.
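A minimal sketch of that kind of tweet parsing, assuming simple regular expressions are good enough. The method name and the patterns are illustrative only and are not the code Backchat actually ran.

```ruby
# Illustrative only: pull URLs, @mentions and #hashtags out of tweet text.
URL_PATTERN     = %r{https?://\S+}
MENTION_PATTERN = /@(\w+)/
HASHTAG_PATTERN = /#(\w+)/

def extract_entities(text)
  urls = text.scan(URL_PATTERN)
  # Remove URLs first so a fragment like "#anchor" inside a link
  # is not mistaken for a hashtag.
  stripped = text.gsub(URL_PATTERN, " ")
  {
    :urls     => urls,
    :mentions => stripped.scan(MENTION_PATTERN).flatten,
    :hashtags => stripped.scan(HASHTAG_PATTERN).flatten
  }
end

entities = extract_entities("@nick_clegg did well tonight http://example.com/a#b #leadersdebate")
```

The extracted mentions and hashtags can then be used to bucket tweets by subject when assembling training sets.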

The AMQP-linked network of processors worked extremely well, and resulted in good throughput- I used two parsers, two classifiers and one classifier loader in the end; I was unable to achieve realtime performance due to network constraints. Sadly my ISP at home had decided that I’d used too much bandwidth and clamped me down to 128 kilobits a second. That said, thanks to the streaming API I did not (as far as I know, except for a few hundred to ratelimiting) lose any tweets, I just received them out of order and then reconstructed the correct order using the timestamps for each tweet. The machine I was using for this also pretty much went flat out on disk I/O and CPU usage, but was able to keep up- it’s a fairly old box, only a Pentium 4 with a couple of gigs of RAM.
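Reconstructing the order is a plain sort on the timestamp. A small sketch, assuming each stored tweet carries an `id` and a parseable `created_at` string (the field names and sample data are assumptions for illustration):

```ruby
require "time"

# Return tweets in chronological order, using the id as a tie-breaker
# for tweets that share the same second.
def in_timestamp_order(tweets)
  tweets.sort_by { |tweet| [Time.parse(tweet[:created_at]), tweet[:id]] }
end

received = [
  { :id => 3, :created_at => "2010-04-15 20:31:07 UTC", :text => "third" },
  { :id => 1, :created_at => "2010-04-15 20:30:58 UTC", :text => "first" },
  { :id => 2, :created_at => "2010-04-15 20:31:07 UTC", :text => "second" }
]

ordered = in_timestamp_order(received)
```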

In any case this was an interesting project and I’ll be open sourcing the data and source in the coming weeks if anyone wants to have a poke at it. While the debates are now gone and done, I’m sure people can come up with some great uses for sentiment analysis outside of UK politics.

Apr 18 10

Experiments in CL, NLP: Building Backchat, Part 1

by James Harrison

Okay, so I may have something wrong with me. As soon as anything important (in my view) comes up, I have to build an app for it. Well, sometimes. Still, the impulse is strong, and so at 2:30 AM or thereabouts I registered a domain name and got to work.

The aim of the project is this: To build a tool to do real-time analysis of Tweets for any event in terms of the sentiment of those tweets towards the various subjects of an event.

I am fairly good at doing simple apps quickly. I had all but one component of this app done by the first Leader’s Debate here in the UK (allowing me to collect my data set for future development- around 185,000 tweets from 35,000 users). I’ve thrown in a handy diagram which details the data collection portion of the app as it stands. But here’s the quick overview:

  • Streamer – Uses the Twitter streaming API to receive new tweets and for each tweet, throws them onto the appropriate AMQP exchanges
  • Parser – Receives a tweet and loads it into the database. Doesn’t actually do any parsing as such yet, but could be extended to do so (extracting URIs and hashtags are the things I’m thinking of)
  • Classifier – Receives a tweet and does clever stuff on it to determine all subjects and associated sentiments, passing the results back to AMQP
  • ClassificationLoader – Receives the results from the Classifier and loads them into the database

Now, for starters this app isn’t done yet, so this is all strictly subject to change. For instance, I’d like to have the DB loader pass the tweet on to the classifier instead of the streamer since that’ll let the classifier store with reference to a DB object, and a few things like that. However, this distributed component structure means that I can run multiple copies of every component in parallel to cope with demand, across any number of computers. EC2 included, of course, but I can also use my compute cluster at home where network speed/latency isn’t a huge issue. Right now I don’t need that, but it’s nice to have and doesn’t involve a lot more work. It also lets me be language-agnostic between components, which leads me to…

CL/NLP. Short for computational linguistics/natural language processing, this is a seriously badass area of computer science. It’s still a developing field and a lot of great work is being done in it. As a result, the documentation barely exists, there are no tutorials, no how-to manuals, and what help you have assumes innate knowledge of the field. And I know nothing (Well, I know a fair bit now) about linguistics or computational linguistics or NLP. So, getting started was hard work. I ran into knowtheory, a chap in the Datamapper IRC channel of all places who happened to be a linguist interested in CL and who has helped out substantially with my methods here.

I’ve gone through about 5 distinct versions and methods for my classifier. The first three were written in Python using the NLTK toolkit, which is great for some stuff but hard to use, especially to get results. That, and using NLTK was giving me very good results but at the cost of speed- several seconds to determine the subjects of a tweet, let alone do sentiment analysis or work out grammatical polarity and all that. Now, getting perfect results at the cost of speed was one way to go, and for all I know it might still be the way to go, but I decided to try a different plan of attack for my fifth attempt. I started fresh in Ruby using the raingrams gem for n-gram analysis, and the classifier gem to perform latent semantic indexing on the tweets.

I boiled this down to a really, really simple proof of concept (It’s worth noting that I spent _days_ on the NLTK approach. Those of you who know me will know that days are very, very rarely used to describe the amount of time I’d spend on one component of an app to get it to a barely-working stage). I figured I could train two trigram models (using sets of three words) for positive and negative sentiment respectively, then use the total probabilistic chance of a given tweet’s words (split into trigrams) appearing in either model as a measure of distance. Positive tweets should have a higher probability in the positively trained model, and a lower probability in the negatively trained one. Neat thing is, this technique sort of worked. I trained LSI to pick up on party names etc, and added common words into an unknown category so that any positive categorization would be quite certain. This doesn’t take into account grammatical polarity or anything like that, but still. Then, using the classifications, I can work out over my initial dataset what the end result was; and here it is:

# Frequencies
Total: 183518 tweets
Labour: 30871, Tory: 35216, LibDem: 25124
# Average Sentiment
#  calculated by sum of (positive_prob - negative_prob)
#  divided by number of tweets for the party
Labour: -0.000217104050691102
Tory: -0.000247080522382047
LibDem: 0.000394512163310021
# Total time for data loading, training and computation
# I could speed this up with rb-gsl but didn't have it installed
real    13m5.759s
user    12m35.800s
sys     0m12.170s

So according to my algorithm, the Liberal Democrats did very well while Labour and especially the Tories didn’t do so well. Which, if you read the papers, actually fits pretty well. However, algorithmically speaking the individual results on some tweets can be pretty far out, and so there’s lots of room for improvement. And I think my final approach has to consider part-of-speech tagging and chunking, but I need to work out a way to do that faster to be able to integrate it into a realtime app.
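The two-model idea can be sketched in plain Ruby. This is a simplified stand-in, not the raingrams-based code: the class name, the add-one smoothing and the use of a mean (rather than a total) trigram probability are all assumptions made to keep the example short.

```ruby
# A tiny trigram model: counts word triples seen in training text.
class TrigramModel
  def initialize
    @counts = Hash.new(0)
    @total  = 0
  end

  # Split text into lowercase words, then into overlapping triples.
  def self.trigrams(text)
    text.downcase.scan(/[a-z']+/).each_cons(3).to_a
  end

  def train(text)
    self.class.trigrams(text).each do |gram|
      @counts[gram] += 1
      @total += 1
    end
  end

  # Add-one smoothed relative frequency, so unseen trigrams score low
  # rather than zero.
  def probability(gram)
    (@counts[gram] + 1.0) / (@total + @counts.size + 1)
  end

  # Mean smoothed probability of the text's trigrams under this model.
  def score(text)
    grams = self.class.trigrams(text)
    return 0.0 if grams.empty?
    grams.sum { |gram| probability(gram) } / grams.size
  end
end

# Positive when the text looks more like the positive training data.
def sentiment(text, positive, negative)
  positive.score(text) - negative.score(text)
end

# Sum of (positive_prob - negative_prob) divided by the number of tweets.
def average_sentiment(texts, positive, negative)
  return 0.0 if texts.empty?
  texts.sum { |text| sentiment(text, positive, negative) } / texts.size
end
```

Training one `TrigramModel` on positive tweets and another on negative tweets, then calling `average_sentiment` over each party's tweets, gives per-party figures of the same shape as the averages listed above.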

All in all, working on Backchat has so far been hugely rewarding and I’ve learned a lot. I’m looking further into CL/NLP and looking at the fields of neural network classifiers for potentially improved results, all of which is great fun to learn about and implement. And hopefully before next Thursday I’ll have a brand new app ready to go for the second Leader’s Debate!

Apr 13 10

More on PoliticsPosters

by James Harrison

Okay, so day one of that site is over, with some lessons learned and some serious improvements made!

About 11AM, just as I was trundling along the M25 on my way to Egham, TheyWorkForYou made a change to their API method, getMP, which PoliticsPosters uses to find your MP from your postcode (or at least did- more on that in a tick). Basically since our MPs aren’t technically our MPs, the method started to return nothing, a case I hadn’t considered. Fortunately they provided the always_return option, so when I got back from a successful day in Egham at around 7PM, I quickly fixed that.

Next bug: In any election, constituency boundaries are likely to change. In this one, we got plenty. My lookup had previously been this:

mp = PoliticsPosters::API.twfy.mp(:postcode=>params[:postcode], :always_return=>true)

Trouble is, this doesn’t work if your boundary has changed (and thus your last MP has changed). To do that, we need to use the getConstituency call with the new ‘future’ flag.

constituency = PoliticsPosters::API.twfy.constituency(:postcode=>params[:postcode], :future=>1)
mp = PoliticsPosters::API.twfy.mp(:constituency=>constituency.name, :always_return=>true)

And now we’re good! Easily fixed, overall, and it removed the two biggest problems. The other problems are a little trickier. One remains- handling of special characters in MP names. There’s a few MPs with circumflexes and the like in their name, which is fine if you have a language/framework/tools that support UTF8 encoding (which Ruby/Sinatra/Prawn do). But the font I’d chosen, Chunk, didn’t have any of the characters it needed to render. This remains an issue, and my current planned work-around is to degrade to another font for pages and posters that handle those names. I’ll sort that out tomorrow, though- this post is, after all, being put together at 3AM, and I have to sleep sometime.
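A sketch of that fallback, under the simplifying assumption that the primary font only covers plain ASCII. The fallback font name and the method name are placeholders, not what the site uses.

```ruby
# "Chunk" is the font named above; the fallback is a hypothetical choice.
PRIMARY_FONT  = "Chunk"
FALLBACK_FONT = "DejaVuSans"

# Pick the fallback whenever the name contains characters outside ASCII,
# such as a circumflex.
def font_for(name)
  name.ascii_only? ? PRIMARY_FONT : FALLBACK_FONT
end

font_for("David Cameron")
font_for("Siân James")
```

A real check would ask the font which glyphs it actually contains, but the ASCII test is enough to stop posters rendering with missing characters.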

The only other thing people wanted was more posters. I have been happy to oblige. I combined the PublicWhip policy feed provided through TheyWorkForYou, wrote a tiny scraper to get the titles off the PublicWhip site (since there’s no API for that I’m aware of), stuffed it together in PostgreSQL through DataMapper, and now I can do something like this:

PoliticsPosters::MP.first(:full_name=>'David Cameron').policies #=> [<#PoliticsPosters::MPPolicy policy_id=4 distance=0.05 etc>]

Which is very, very neat, and means that developing stuff is very easy. I dug around in the source code for TheyWorkForYou and pulled out their code for rendering PublicWhip data to get the same policy IDs and descriptions, wrote some methods to scale color and size, and stuffed it all together to produce the new and updated constituency page as well as new posters- one for all policies, and one for each individual policy. The individual policy posters are rendered on demand, whereas the common constituency ones are rendered on the first view of a constituency page.

The Linode VPS has been handling the load without even noticing. Despite the site being inoperative most of the day, it got 10,000 hits from nearly 4,000 unique users, who downloaded nearly 2,000 posters. Not bad for a first day.

Tomorrow I hope to sort out UTF8 character handling and maybe even get around to publishing the source code to the site if I have a spare moment.

Apr 12 10

Democracy by Posters

by James Harrison

So, a couple of days ago I sat in front of my computer, the EVE Metrics development environment still rebuilding itself on my local development box, and thought “Right, I need a project”. Inspired by election mashups like Debillitated and VoteThemOut, I figured I should do something for the election. I tweeted, and a short while later had a suggestion from Jim Killock of the Open Rights Group. I got that at about 9PM on Saturday, and I launched PoliticsPosters.co.uk about 24 hours later via Twitter.

The basic premise of the site is you can stick in your postcode, and it’ll provide you with a poster to stick up in your window to encourage candidates to come and talk to you about the Digital Economy Bill. The poster includes how your last MP voted, too. It’ll also give you links to share the site and a link over to Email Your Candidates so you can get in touch with your candidates directly.

It’s a neat little site people seem to like, and after some polishing it runs smoothly and fast as anything, too. I’m running it off one of my Linode 360 VPSes, which are fantastic little VPSes- small but mighty. Without any optimization the site can handle 250 requests a second on the faster portions of the site, accelerated to 1000 requests a second (testing with ab, 25 concurrent requests and 15000 total) with Rack::Cache and memcached. Even the very slow, involving-the-TWFY-API-and-database-and-PDF-renderer-sometimes portion of the site can manage 80 requests a second (And my apologies to TWFY for hammering your API by accident, though it seems not to have noticed).

The site’s pretty simple. It’s a Sinatra based app, using jQuery, Cufon (for font rendering) and font-kit (for more of the same) on the client side, and YAML, DataMapper, Prawn, libraries for TWFY, Haml, Rack, Rack::Cache, and Rack::Hoptoad on the server side. If you ignore the libraries, the whole site weighs in at around 300 lines of code; I may open source it if I get a spare moment.

Prawn- an excellent Ruby PDF library- is an absolute joy to use if you’re not using custom fonts. If you are, it gets difficult due to some bugs, but they’re getting fixed and they’re easy to work around. Rendering any poster I choose along a basic common framework is accomplished in less than 30 lines of code, which lets me render posters like so:

r = PoliticsPosters::Renderer.new
r.render("out.pdf", {
  :text=>[
    {:content=>"Candidates: Please call in!", :pad_top=>2, :pad_bottom=>2, :font_size=>72},
    {:content=>"David Cameron didn't turn up to vote.", :font_size=>48},
    {:content=>"Tell me what you did about the Digital Economy Act.", :pad_top=>1, :font_size=>42}
  ]
})

Mix a little YAML into the equation and I now just have a posters config file that defines all the posters for any constituency, with the appropriate names and phrases subbed in at runtime. Perfect!
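As a rough sketch of how such a config might work: the YAML layout, the `%{mp_name}` placeholder syntax and the `poster_lines` helper below are all assumptions for illustration, not the site's actual file format.

```ruby
require "yaml"

# Hypothetical posters config: each poster is a list of text lines,
# with %{...} placeholders filled in at runtime.
POSTERS_CONFIG = <<~CONFIG
  did_not_vote:
    - content: "Candidates: Please call in!"
      font_size: 72
    - content: "%{mp_name} didn't turn up to vote."
      font_size: 48
CONFIG

# Build the :text array for one poster, substituting placeholder values.
def poster_lines(config, poster_key, substitutions)
  config.fetch(poster_key).map do |line|
    content = line["content"].gsub(/%\{(\w+)\}/) { substitutions.fetch($1.to_sym) }
    { :content => content, :font_size => line["font_size"] }
  end
end

config = YAML.safe_load(POSTERS_CONFIG)
lines  = poster_lines(config, "did_not_vote", { :mp_name => "David Cameron" })
```

The resulting array has the same shape as the `:text` option passed to the renderer above, so a new poster only needs a new entry in the config file.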

DataMapper backed onto PostgreSQL running on another Linode 360 over Linode’s internal network handles persistence; this is used to store MPs, parties and constituencies. This lets me cache results of things like looking up which MPs voted which way on the Digital Economy Bill, and just makes things like that a lot easier.

The site itself is served up using Nginx 0.8.35 for static files, which routes dynamic requests via a Unix socket to Unicorn. Unicorn is a blazing fast, brilliantly easy to manage app server that will soon be replacing Thin on EVE Metrics after this; completely seamless app reboots are a matter of sending a kill -HUP, and the whole thing is speedy as hell to boot. Rack::Cache sits in front of Sinatra and uses memcached to reply with lightning speed on anything we can cache. The stack is a pleasure to maintain and works well while being fast, so I’m happy with that.

Sinatra is as ever a very nice way to write webapps, Haml I’ve used for years now and this project is no different; not a lot of it, but what there is is done with Haml exclusively. Just too damn nice to go back to Erb.

So there you have it, everything that goes into an app like that. I’ll be out most of tomorrow and hopefully it won’t need any babysitting while I’m out. Having said that, I’ve probably doomed myself to an odd bug early in the morning. Ah well. Off to sleep for me!

Apr 8 10

The Digital Economy Bill – A Cryptographer’s View

by James Harrison

I love cryptography. If you’ve ever received email from me you know I sign all my email messages with OpenPGP; many of you share keys with me and we exchange all our emails in encrypted form.

Yesterday, I went and bought an Ipredator subscription for 15 euros. Ipredator is a service that provides a PPTP Virtual Private Network endpoint in Sweden, anonymized and encrypted from your computer to the exit node.

Yesterday, the Digital Economy Bill was passed into law.

These two events were not unrelated. Why did I go and set up all of my internet traffic to be encrypted and exported from the UK before it gets released onto the internet? Because perfectly legal things that anyone on the internet does on a daily basis are being criminalized. Websites like YouTube and Google have the potential to be blocked at this point, based on rights holders (The BPI and pals) accusing sites of being likely to be used for copyright infringement. Awesome.

Not only that, but any connection in the UK which is accused of having copyright infringement associated with it (note there is no requirement for evidence, and this relates to a whole connection, not individuals) has to be disconnected by their ISP.

So, I’m turning to cryptography to cover my ass. Because if everything coming in and out of my connection is encrypted till it hits somewhere in Sweden at which point it has no actual traceable relation to me individually, then any accusations of filesharing _must_ be wrong because there’s no way they could know that (and for the record, I download the occasional TV show and Linux distributions using BitTorrent).

And I absolutely think that a little dash of cryptoanarchism in the UK would be a fantastic thing right now. Spread the word on cryptography and VPNs, get Tor and I2P in more mainstream use with your friends and family, you name it. Let’s face it- nearly everyone who can use BitTorrent or anything else to infringe copyright can use cryptography to hide that without much effort or expense, and there’s no reason why people who don’t infringe copyright shouldn’t use it. We all use cryptography every day of our lives for online banking, shopping, or just looking at some sites which default to using SSL (I use Github for example, which bundles SSL with their paid accounts). The government can’t regulate what it can’t see, and it can’t make bills to regulate cryptography out of existence. It’s a solution, though a tricky one.

I’ve already started thinking- what about cheap, mainstream-friendly VPN appliances? It looks like I’m not the only person to think of this- there’s a fair bit of discussion about it. It’d be great- imagine a £50-75ish bit of kit: you buy it, you get the hardware and some bundled months of VPN access, which you can add to whenever you want. If it were polished and easy to set up (plug into wall, network cables between your existing router/modem and your computer(s), turn on, make account, done), then it’d have a chance of becoming a way not only to deliver cryptography against government and ISP snoopers, but also to provide security for people at places like university halls of residence, shared homes, etc. Heck, I use Ipredator on my iPod to encrypt anything going over wifi while I’m out and about using public wifi spots and campus wifi. I’m not concerned about being snooped on, but why not? (A bit of discussion came to the conclusion that a custom firmware for a WRT54GL would be the way to go.)

The point is that those who do infringe copyright will always be one step ahead of the curve technologically. I absolutely predict VPN tunnels will be the next big thing for BitTorrent users and legitimate users alike. And what will the government do then? Cut off anyone using a VPN? There go business users working from home. Cut off anyone with lots of internet usage? (Not that ISPs aren’t trying to do that anyway. I’m looking at you, PlusNet!) You’d cut off half the UK, including anyone who used iPlayer. VPNs are the way to go, though the number of providers could do with increasing.

And there’s no chance the government can keep up- and why should it? At the end of the day, copyright needs to be reformed to take the internet into account. This is the only way that the problem will ever be solved from a legal standpoint- trying to win this war with technology won’t work for the government. Deep Packet Inspection hardware works till you slather everything with crypto. Disconnection notices work till you realise that all the pirates are using dynamic IPs and the ISPs don’t keep track of who has what IP at any given moment (and if they do, why? Do they have a legal onus to do so?). IP, of course, actually stands for Internet Protocol. And even if DPI or disconnection worked, you’re still not fixing the problem, and you’re still causing huge inconvenience for all the users of that connection.

I’d like to think that our ministers are vaguely understanding what all this means and that at least the guys in charge of all this know their technical stuff. Alas, this has been revealed to not be the case; The Rt Hon Stephen Timms MP, Minister for Digital Britain, revealed in this letter that he believes the term “IP Address” to mean “Intellectual Property Address”. I feel that it’s unlikely that he’s confused IP addresses with a URN scheme or anything, and that he really does think that’s what IP stands for. If you don’t know, it actually means Internet Protocol, because it’s the underlying framework most of our internet relies on for communication between computers. It’s the sort of thing you cover in the first lesson if you’ve ever been taught anything about networking.

With this absolute failure to have knowledge where it’s needed in the current UK political system, it’s even more important that the Digital Economy Bill be removed as soon as humanly possible (if that is possible; I’m not a lawyer) or at least be heavily amended. And in the meantime, we should be encouraging MPs to learn the basics, voting for those who will make the right decisions (looks like LibDem for me), and spreading the word about all this as fast as we can. And a little cryptoanarchism wouldn’t hurt, either.

And on that note, I’m going to go see what it’d take to set up a free VPN endpoint on one of my underused VPSes.

http://www.openrightsgroup.org/campaigns/disconnection/why-care
Feb 27 10

accVIEW Rejuvenated

by James Harrison

Well, it’s been way too long since I opened an editor and got to work on accVIEW’s source, and it really showed. In reality, accVIEW was something I slapped together in an afternoon for Vanguard Frontiers, home of myself, PyjamaSam (of Capsuleer fame) and some of the best pilots I’ve ever flown with. We needed a better way to do API checks and this was it.

I made it public and popularity grew. I added some features, added the premium option for those who wanted a bit more, and it’s been ticking along, occasionally throwing horrible errors and falling over, the background worker regularly falling over and dying, and running on a Quantum Rise datadump. And there was a major security glitch- we didn’t store API keys, making it impossible to validate people regularly, meaning people who left corporations could still view their old corp’s requests. And they couldn’t update their account to their new corporation.

No more.

accVIEW has gotten a fresh new facelift, skill distribution graphs, a fundamental API key change, some improved code throughout and an updated database dump. I’ve also added a ‘forgot password’ feature for those who don’t remember their logins too well, and fixed a few outstanding bugs.

If you’re an accVIEW user, next time you log in you will be prompted for your API key again. This is to be expected; the reason we’re doing this is so we have a copy we can re-validate regularly (once a day) to ensure that you are still in the corporation you were in last time we looked. If you change corporations, your main character will be dissociated and you’ll have to reenter your API keys next time you log in and choose a new main character.
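
The daily check described above could look roughly like the sketch below. It is in Python purely for illustration and is not accVIEW’s actual code: `Account`, `revalidate` and `lookup_corporation` are hypothetical names, and the real EVE API call is stubbed out as a function you pass in.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Account:
    # Hypothetical record; the field names are assumptions for illustration.
    api_key: str
    main_character: Optional[str]
    corporation: Optional[str]
    needs_api_key: bool = False


def revalidate(
    account: Account,
    lookup_corporation: Callable[[str, str], Optional[str]],
) -> Account:
    """Re-check a stored API key and dissociate the main character on a corp change.

    lookup_corporation(api_key, character) stands in for the real API query and
    returns the character's current corporation name (or None if unknown).
    """
    if account.main_character is None:
        # Nothing to validate until the user picks a main character again.
        return account

    current = lookup_corporation(account.api_key, account.main_character)
    if current != account.corporation:
        # Corporation changed (or could not be confirmed): drop the association
        # and ask for the API key again at next login.
        account.main_character = None
        account.corporation = None
        account.needs_api_key = True
    return account
```

Run once a day against every stored account, something like this means a pilot who leaves a corporation loses access to that corp’s requests within a day or so.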

Enjoy!