<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Talk Unafraid &#187; twitter</title>
	<atom:link href="http://www.talkunafraid.co.uk/tag/twitter/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.talkunafraid.co.uk</link>
	<description>The (occasionally coherent) ramblings of a geek</description>
	<lastBuildDate>Sat, 07 Jan 2012 22:24:46 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Facebook and why your organization should be ignoring it</title>
		<link>http://www.talkunafraid.co.uk/2011/09/facebook-and-why-your-organization-should-be-ignoring-it/</link>
		<comments>http://www.talkunafraid.co.uk/2011/09/facebook-and-why-your-organization-should-be-ignoring-it/#comments</comments>
		<pubDate>Tue, 27 Sep 2011 21:47:41 +0000</pubDate>
		<dc:creator>James Harrison</dc:creator>
				<category><![CDATA[Odds and Ends]]></category>
		<category><![CDATA[Politics and Organizations]]></category>
		<category><![CDATA[email]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[infrastructure]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[rhul]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[surhul]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://www.talkunafraid.co.uk/?p=1357</guid>
		<description><![CDATA[There&#8217;s a huge amount of talk out there about how best to use Facebook as an organization. How you can generate massive amounts of publicity and interest, capture new users and visitors, and maximize engagement. All those silken terms that sales and marketing people love to liberally spray all over their presentations. Well, this is [...]]]></description>
			<content:encoded><![CDATA[<p>There&#8217;s a huge amount of talk out there about how best to use Facebook as an organization. How you can generate massive amounts of publicity and interest, capture new users and visitors, and maximize engagement. All those silken terms that sales and marketing people love to liberally spray all over their presentations. Well, this is not a blog post about how you can do that. I don&#8217;t have much of an issue with people using Facebook as a PR tool and a marketing tool- after all, that <em>is</em> what it was designed to be. Marketing yourself, originally, and like all popular but free websites, the site rapidly became about marketing to users.</p>
<p>No, this is a post about why you should ignore Facebook. Turn a blind eye and let it pass. It will, in time, fade away, like Yahoo, MSN and others before those. It may have a huge number of users, but then so did MySpace. People will move on, and Facebook is already <a href="http://www.guardian.co.uk/technology/2011/jun/13/facebook-growth-slows-for-second-month" target="_blank">worrying about growth figures</a>. But that&#8217;s not <em>why</em> you should be ignoring it.<span id="more-1357"></span></p>
<p>You should be ignoring it because all of the things that you try and promote can be just as easily undone, and using Facebook as a solution to problems that your organization faces will only lead to more problems in the long run- not least of all when Facebook finally goes the way of MySpace and people move on. But that&#8217;s not all.</p>
<p>First off, some disclosure about myself. I don&#8217;t have a Facebook account. I nuked my account after my exams ended, before the holidays started, months ago. I&#8217;m an active member of the Royal Holloway Students&#8217; Union. I attend numerous events and work with various media outlets on campus. At no point in doing all of this have I actually needed to visit Facebook.com, much less log in. The last stored login details in my browser aren&#8217;t even mine- they&#8217;re a friend&#8217;s left over from when she borrowed my PC to check some messages. I&#8217;ve written apps for Facebook before, and have worked with the APIs and tools that Facebook gives developers to let them access what they describe as &#8216;the social graph&#8217;- that is, your data. And I spend a lot of time working on computer security and, specifically, web security and analysis. I wouldn&#8217;t call myself an expert or even a specialist, but I have half a clue about this stuff.</p>
<p>So- your organization is organizing an event. You&#8217;ve got a website, and you&#8217;ve got someone (or a company) maintaining it. What do you do- put the event information on your website, or on Facebook? Unfortunately, lots of people will answer &#8220;Facebook&#8221; to this question. The correct answer is your website first, Facebook to raise awareness, but the information and (if appropriate) booking/RSVP information should be on your website. The Student Radio Awards are the latest in a long line of offenders to reach my inbox, informing me gleefully of a page on their website which promptly takes you to Facebook so you can RSVP to their nomination parties. As a person who chooses not to use Facebook, I now cannot interact with these events, and if I want to RSVP I&#8217;d have to reactivate my account, which I&#8217;m not doing for an SRA party. I emailed back, and got the response &#8220;But you can still see the information without joining Facebook&#8221;. True enough- but here&#8217;s the rub.</p>
<ul>
<li>Your organization does not control access to that data any more. Facebook does.</li>
<li>Your organization is specifically endorsing Facebook at this stage. Do you really want to do that, with all the <a href="http://news.cnet.com/8301-13578_3-20006532-38.html" target="_blank">privacy</a> and <a href="http://nikcub.appspot.com/logging-out-of-facebook-is-not-enough" target="_blank">security</a> issues flaring up right now?</li>
<li>Your organization is specifically blocking people who do not want to use Facebook from using your services.</li>
<li>To view your data, users are forced to interact with Facebook servers. Users with privacy or security concerns can now not use your service.</li>
<li>Facebook is not your server. It&#8217;s unlikely that your server is blocked from any workplace networks- how about Facebook?</li>
</ul>
<p>Some of my other brethren in the webdevel community (SEO and marketing experts particularly) will be touting the positive aspects &#8211; reaching more people, lower operational costs for your own site and infrastructure, being seen to be &#8216;social&#8217; (which is a big deal for some organizations who have talked themselves into it thanks to overzealous marketing gurus wanting to be seen to keep up with emergent trends- come on, guys, it&#8217;s just a fancy way of sending email newsletters). Do they balance out the problems above? Actually, no. Admittedly, the enhanced reach that comes with getting on Facebook&#8217;s timeline/front page for people is a big deal. There&#8217;s no hiding from that, especially in a university environment where people just don&#8217;t seem to understand that Facebook is optional. There&#8217;s no harm in using Facebook to link to things on your site, of course- there&#8217;s no such thing as bad publicity. But the second you start putting information on Facebook that isn&#8217;t on your own infrastructure, you&#8217;re damaging yourself. In severe cases you can <a href="http://mediadecoder.blogs.nytimes.com/2011/09/26/angry-reaction-to-spotifys-new-facebook-id-requirement/" target="_blank">piss users off in droves</a>.</p>
<h3>Facebook is not infrastructure for your organization. Build your own infrastructure- everyone on the web will be better off for it, and your users will thank you.</h3>
<p>The privacy issues flaring up at Facebook are serious, and in some states in Germany, sites are being <em><a href="http://siliconfilter.com/germany-vs-facebook-like-button-declared-illegal-sites-threatened-with-fine/" target="_blank">fined for using the Like button</a> as it violates German law. </em>Do you really want to be supporting, encouraging and endorsing the sort of company that supports this degree of invasion of privacy?</p>
<p>If the pile of (in my mind very compelling) reasons listed above for why you shouldn&#8217;t be using Facebook as infrastructure doesn&#8217;t convince you, okay, well let&#8217;s look at this another way. Facebook is a company that exists to make money off your data. You, as a company, help them, as a company, get more money (through advertising) in return for some page views. But it wasn&#8217;t always this way. In the past, people built interesting and engaging websites, and people actually visited them of their own volition. You didn&#8217;t need social engagement to get page hits, because people would visit your site anyway. And you know what? They still do, Facebook or no. And you own it. What you make, you own, and you can control precisely as you desire. Facebook you can control only in whatever way they decide to allow from one week to the next- everything changes so often that forming solid policies or organizational structures on how to use Facebook is next to impossible. Meanwhile, the page views to your site, which are effectively the sole reason organizations started using Facebook, dwindle as you put more on Facebook and less on your site. In extreme cases you stop updating your site, and get confused as to why people ask why your site&#8217;s entirely empty and has no information on it- &#8220;oh, it&#8217;s on the Facebook page&#8221;. More damage to your organization&#8217;s reputation and web presence.</p>
<p>Organizations should be proud of their web presence, and shouldn&#8217;t just say &#8220;Okay, our website&#8217;s rubbish, let&#8217;s just use Facebook instead&#8221;. Fix your website. Build your own infrastructure. If you want to be social, at least use Twitter, which is simple, straightforward, open and easy. It&#8217;s hugely popular, has far fewer privacy and data retention concerns and issues inherent to its nature, and is massively more powerful in terms of engagement and potential page hits.</p>
<p>Twitter isn&#8217;t your infrastructure either, though, any more than Google Plus or Diaspora are. Look at NASA&#8217;s tweets- they all link to a NASA website. The Guardian doesn&#8217;t link to Facebook. You shouldn&#8217;t either.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.talkunafraid.co.uk/2011/09/facebook-and-why-your-organization-should-be-ignoring-it/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Experiments in CL, NLP: Building Backchat, Part 1</title>
		<link>http://www.talkunafraid.co.uk/2010/04/experiments-in-cl-nlp-building-backchat-part-1/</link>
		<comments>http://www.talkunafraid.co.uk/2010/04/experiments-in-cl-nlp-building-backchat-part-1/#comments</comments>
		<pubDate>Sun, 18 Apr 2010 02:23:21 +0000</pubDate>
		<dc:creator>James Harrison</dc:creator>
				<category><![CDATA[Awesome Stuff]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Projects]]></category>
		<category><![CDATA[amqp]]></category>
		<category><![CDATA[computational linguistics]]></category>
		<category><![CDATA[distributed computing]]></category>
		<category><![CDATA[natural language processing]]></category>
		<category><![CDATA[neural networks]]></category>
		<category><![CDATA[nltk]]></category>
		<category><![CDATA[politics]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[sentiment analysis]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://www.talkunafraid.co.uk/?p=862</guid>
		<description><![CDATA[Okay, so I may have something wrong with me. As soon as anything important (in my view) comes up, I have to build an app for it. Well, sometimes. Still, the impulse is strong, and so at 2:30 AM or thereabouts I registered a domain name and got to work. The aim of the project [...]]]></description>
			<content:encoded><![CDATA[<p>Okay, so I may have something wrong with me. As soon as anything important (in my view) comes up, I <em>have</em> to build an app for it. Well, sometimes. Still, the impulse is strong, and so at 2:30 AM or thereabouts I registered a domain name and got to work.</p>
<p>The aim of the project is this: to build a tool to do real-time analysis of tweets for any event in terms of the sentiment of those tweets towards the various subjects of the event.</p>
<p>I am fairly good at building simple apps quickly. I had all but one component of this app done by the first Leaders&#8217; Debate here in the UK (allowing me to collect my data set for future development- around 185,000 tweets from 35,000 users). <a href="http://assets.talkunafraid.co.uk/2010/04/backchat.png" rel="lightbox[862]"><img class="alignright size-thumbnail wp-image-863" title="Backchat in Components" src="http://assets.talkunafraid.co.uk/2010/04/backchat-150x150.png" alt="" width="150" height="150" /></a> I&#8217;ve thrown in a handy diagram which details the data collection portion of the app as it stands. But here&#8217;s the quick overview:</p>
<ul>
<li>Streamer &#8211; Uses the Twitter streaming API to receive new tweets and for each tweet, throws them onto the appropriate AMQP exchanges</li>
<li>Parser &#8211; Receives a tweet and loads it into the database. Doesn&#8217;t actually do any parsing as such yet, but could be extended to do so (extracting URIs and hashtags are the things I&#8217;m thinking of)</li>
<li>Classifier &#8211; Receives a tweet and does clever stuff on it to determine all subjects and associated sentiments, passing the results back to AMQP</li>
<li>ClassificationLoader &#8211; Receives the results from the Classifier and loads them into the database</li>
</ul>
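<p>As a rough illustration of how the four components above hand messages to one another, here is a minimal single-process sketch in Python. Everything in it is an assumption made for illustration: the in-memory queues stand in for the AMQP exchanges, the dictionary stands in for the database, the function names are hypothetical, and the keyword match in <code>classifier</code> is only a placeholder for the real classification.</p>

```python
import json
import queue

# Hypothetical stand-ins for the AMQP exchanges; a real deployment would use a broker.
tweets_for_parser = queue.Queue()
tweets_for_classifier = queue.Queue()
classifications = queue.Queue()

# Stand-in for the real database.
database = {"tweets": [], "classifications": []}


def streamer(incoming):
    """Publish each incoming tweet to both downstream queues."""
    for tweet in incoming:
        message = json.dumps(tweet)
        tweets_for_parser.put(message)
        tweets_for_classifier.put(message)


def parser():
    """Load queued tweets into the database without further parsing."""
    while not tweets_for_parser.empty():
        database["tweets"].append(json.loads(tweets_for_parser.get()))


def classifier():
    """Attach a placeholder classification to each tweet and pass it on."""
    parties = ("labour", "tory", "libdem")
    while not tweets_for_classifier.empty():
        tweet = json.loads(tweets_for_classifier.get())
        # Placeholder only: the real classifier works out subjects and sentiment.
        subjects = [name for name in parties if name in tweet["text"].lower()]
        result = {"tweet_id": tweet["id"], "subjects": subjects}
        classifications.put(json.dumps(result))


def classification_loader():
    """Load classifier results into the database."""
    while not classifications.empty():
        database["classifications"].append(json.loads(classifications.get()))


def run_pipeline(incoming):
    """Run each component once, in order, and return the database."""
    streamer(incoming)
    parser()
    classifier()
    classification_loader()
    return database
```

<p>Messages are serialized to JSON between steps, so each component only depends on the message format and not on the code of its neighbours.</p>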
<p>Now, for starters this app isn&#8217;t done yet, so this is all strictly subject to change. For instance, I&#8217;d like to have the DB loader pass the tweet on to the classifier instead of the streamer since that&#8217;ll let the classifier store with reference to a DB object, and a few things like that. However, this distributed component structure means that I can run multiple copies of every component in parallel to cope with demand, across any number of computers. EC2 included, of course, but I can also use my compute cluster at home where network speed/latency isn&#8217;t a huge issue. Right now I don&#8217;t need that, but it&#8217;s nice to have and doesn&#8217;t involve a lot more work. It also lets me be language-agnostic between components, which leads me to&#8230;</p>
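<p>The idea of running several copies of one component against the same queue can be sketched roughly as follows. This is an assumption-laden illustration, not Backchat&#8217;s code: threads and an in-memory queue stand in for separate processes consuming from an AMQP queue, and <code>run_workers</code> and <code>handler</code> are hypothetical names.</p>

```python
import queue
import threading


def run_workers(messages, handler, worker_count=3):
    """Process a list of messages with several identical workers sharing one queue."""
    work = queue.Queue()
    results = queue.Queue()
    for message in messages:
        work.put(message)

    def worker():
        while True:
            try:
                message = work.get_nowait()
            except queue.Empty:
                return  # queue drained, so this worker stops
            results.put(handler(message))

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # Completion order is not guaranteed, so return the results sorted.
    return sorted(results.get() for _ in range(len(messages)))
```

<p>Each message is taken by exactly one worker, so adding workers raises throughput without changing the handler, which is the property the component structure relies on.</p>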
<p>CL/NLP. Short for computational linguistics/natural language processing, this is a seriously badass area of computer science. It&#8217;s still a developing field and a lot of great work is being done in it. As a result, the documentation barely exists, there are no tutorials, no how-to manuals, and what help you have assumes innate knowledge of the field. And I know <em>nothing</em> (well, I know a fair bit now) about linguistics or computational linguistics or NLP. So, getting started was hard work. I ran into <a href="http://blog.knowtheory.net">knowtheory</a>, a chap in the Datamapper IRC channel of all places who happened to be a linguist interested in CL and who has helped out substantially with my methods here.</p>
<p>I&#8217;ve gone through about five distinct versions and methods for my classifier. The first three were written in Python using the <a href="http://www.nltk.org/">NLTK</a> toolkit, which is great for some stuff but hard to use, especially to get results. That, and using NLTK was giving me very good results but at the cost of speed- several seconds to determine the subjects of a tweet, let alone do sentiment analysis or work out grammatical polarity and all that. Now, getting perfect results at the cost of speed was one way to go, and for all I know it might still be the way to go, but I decided to try a different plan of attack for my fifth attempt. I started fresh in Ruby using the <a href="http://github.com/postmodern/raingrams">raingrams</a> gem for n-gram analysis, and the <a href="http://github.com/luisparravicini/classifier">classifier</a> gem to perform latent semantic indexing on the tweets.</p>
<p>I boiled this down to a really, really simple proof of concept (it&#8217;s worth noting that I spent <em>days</em> on the NLTK approach. Those of you who know me will know that days are very, very rarely used to describe the amount of time I&#8217;d spend on one component of an app to get it to a barely-working stage). I figured I could train two trigram models (using sets of three words) for positive and negative sentiment respectively, then use the total probability of a given tweet&#8217;s words (split into trigrams) appearing in either model as a measure of distance. Positive tweets should have a higher probability in the positively trained model, and a lower probability in the negatively trained one. Neat thing is, this technique sort of worked. I trained LSI to pick up on party names etc., and added common words into an unknown category so that any positive categorization would be quite certain. This doesn&#8217;t take into account grammatical polarity or anything like that, but still. Then, using the classifications, I can work out over my initial dataset what the end result was; and here it is:</p>
<pre># Frequencies
Total: 183518 tweets
Labour: 30871, Tory: 35216, LibDem: 25124
# Average Sentiment
#  calculated by sum of (positive_prob - negative_prob)
#  divided by number of tweets for the party
Labour: -0.000217104050691102
Tory: -0.000247080522382047
LibDem: 0.000394512163310021
# Total time for data loading, training and computation
# I could speed this up with rb-gsl but didn't have it installed
real    13m5.759s
user    12m35.800s
sys     0m12.170s
</pre>
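<p>For anyone wanting to see the shape of the calculation, here is a minimal Python sketch of the trigram scoring and the per-party average described above. It is not the Ruby code that produced these numbers and does not use the raingrams or classifier gems. The class and function names are hypothetical, and reading the &#8220;total&#8221; probability as a sum of unsmoothed relative trigram frequencies is an assumption.</p>

```python
from collections import Counter


def trigrams(text):
    """Split text into lowercase word trigrams."""
    words = text.lower().split()
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


class TrigramModel:
    """Relative-frequency trigram model (an assumption; no smoothing)."""

    def __init__(self, training_texts):
        self.counts = Counter()
        for text in training_texts:
            self.counts.update(trigrams(text))
        self.total = sum(self.counts.values())

    def probability(self, text):
        """Sum the relative frequencies of the text's trigrams in this model."""
        if self.total == 0:
            return 0.0
        return sum(self.counts[gram] / self.total for gram in trigrams(text))


def sentiment(text, positive_model, negative_model):
    """positive_prob - negative_prob for one tweet."""
    positive_prob = positive_model.probability(text)
    negative_prob = negative_model.probability(text)
    return positive_prob - negative_prob


def average_sentiment(tweets_by_party, positive_model, negative_model):
    """Sum of per-tweet sentiment divided by the number of tweets for each party."""
    averages = {}
    for party, tweets in tweets_by_party.items():
        if tweets:  # skip parties with no tweets to avoid dividing by zero
            total = sum(sentiment(t, positive_model, negative_model) for t in tweets)
            averages[party] = total / len(tweets)
    return averages
```

<p>With a positive model trained on one set of tweets and a negative model on another, a tweet scores above zero when its trigrams are more frequent in the positive model, which matches the sign convention in the results above.</p>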
<p>So according to my algorithm, the Liberal Democrats did very well while Labour and especially the Tories didn&#8217;t do so well. Which, if you read the papers, actually fits pretty well. However, algorithmically speaking the individual results on some tweets can be pretty far out, and so there&#8217;s lots of room for improvement. I think my final approach has to consider part-of-speech tagging and chunking, but I need to work out a way to do that faster to be able to integrate it into a realtime app.</p>
<p>All in all, working on Backchat has so far been hugely rewarding and I&#8217;ve learned a lot. I&#8217;m looking further into CL/NLP and looking at neural network classifiers for potentially improved results, all of which is great fun to learn about and implement. And hopefully before next Thursday I&#8217;ll have a brand new app ready to go for the second Leaders&#8217; Debate!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.talkunafraid.co.uk/2010/04/experiments-in-cl-nlp-building-backchat-part-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>EVE Fanfest(feed) 2009</title>
		<link>http://www.talkunafraid.co.uk/2009/10/eve-fanfestfeed-2009/</link>
		<comments>http://www.talkunafraid.co.uk/2009/10/eve-fanfestfeed-2009/#comments</comments>
		<pubDate>Sun, 04 Oct 2009 23:35:48 +0000</pubDate>
		<dc:creator>James Harrison</dc:creator>
				<category><![CDATA[EVE]]></category>
		<category><![CDATA[EVE Metrics]]></category>
		<category><![CDATA[MMMetrics]]></category>
		<category><![CDATA[fanfest]]></category>
		<category><![CDATA[flickr]]></category>
		<category><![CDATA[servers]]></category>
		<category><![CDATA[service]]></category>
		<category><![CDATA[stuffisawesome]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://www.talkunafraid.co.uk/?p=512</guid>
		<description><![CDATA[Well, that fateful time of year comes along again- thousands of EVE Online players meet for fanfest in Reykjavik, Iceland. And I can never make it. This year, my studies conspired against me; except they didn&#8217;t. While unknown until hours beforehand, I actually had no work and a lecture on basic packet switching keeping me [...]]]></description>
			<content:encoded><![CDATA[<p>Well, that fateful time of year comes along again- thousands of EVE Online players meet for fanfest in Reykjavik, Iceland. And I can <em>never make it</em>. This year, my studies conspired against me; except they didn&#8217;t. While unknown until hours beforehand, I actually had no work and a lecture on basic packet switching keeping me in England. Doh.</p>
<p>Anyway. We got a lot of fluff, this year. Aside from further elaboration on stuff already announced, there were actually no major announcements made at fanfest. We did have some interesting info about New Eden, CCP&#8217;s EVE-Online-Online website. And there was some evidence (gasp!) that CCP were listening to third party developer suggestions at the API roundtable.</p>
<p>There was almost enough minor stuff announced to make it worthwhile. We did get a release date for Dominion &#8211; 1st December 2009. But no New Eden with the launch. And knowing CCP we&#8217;ll probably not get API changes till a bit after that. What&#8217;s really awesome though is that we will be getting new APIs. I&#8217;m just hoping they&#8217;re useful APIs&#8230;</p>
<p>Anyway, while I was sitting at home being mostly bored, I decided I&#8217;d had enough of pressing F5 on the Twitter search page, and put together a website (ff.mmmetrics.co.uk &#8211; it&#8217;s down now) to grab EVE fanfest feeds from Twitter and Flickr. This became popular enough within a few hours that we had to rip it off the server and give it its own Amazon EC2 virtual server, as it was in danger of crashing ISKsense and EVE Metrics. Doh. A wild success, in any case, for a simple but handy website. What the website did make us realise is how little headroom we have on our current server. We kinda knew that already but it did make the point quite well.</p>
<p>EVE Metrics 2.1 has launched mostly well but we&#8217;re still having issues with the API processing code. Makurid has been working hard to pin down the cause of the problems and destroy it while I&#8217;ve been fixing up servers and moving sites around, and we&#8217;re getting a bit closer to having a complete fix. We&#8217;re not there yet, but we will be soon with any luck.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.talkunafraid.co.uk/2009/10/eve-fanfestfeed-2009/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>

