<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	
	xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>OMG. OMG! OMFG! Digital Meets Analog, by AV Flox &#187; Emil Kågström</title>
	<atom:link href="http://omgomgomfg.com/tag/emil-kagstrom/feed/" rel="self" type="application/rss+xml" />
	<link>http://omgomgomfg.com</link>
	<description></description>
	<lastBuildDate>Sat, 19 Jun 2010 00:02:38 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>GenderAnalyzer: Girls Who&#8217;re Boys Who Like Boys To Be Girls</title>
		<link>http://omgomgomfg.com/2008/11/22/genderanalyzer-girls-who-are-boys-who-like-boys-to-be-girls/</link>
		<comments>http://omgomgomfg.com/2008/11/22/genderanalyzer-girls-who-are-boys-who-like-boys-to-be-girls/#comments</comments>
		<pubDate>Sat, 22 Nov 2008 22:36:00 +0000</pubDate>
		<dc:creator>AV Flox</dc:creator>
				<category><![CDATA[intarwebz]]></category>
		<category><![CDATA[classifier technology]]></category>
		<category><![CDATA[Emil Kågström]]></category>
		<category><![CDATA[GenderAnalyzer]]></category>
		<category><![CDATA[Jon Kågström]]></category>
		<category><![CDATA[Roger Karlsson]]></category>
		<category><![CDATA[spam filters]]></category>
		<category><![CDATA[uClassify]]></category>

		<guid isPermaLink="false">http://omgomgomfg.com/?p=374</guid>
		<description><![CDATA[I had a thing with a guy on the internet when I was in college. We exchanged e-mails for a while; I knew he was into me. I knew it even though he constantly worried that I was a bored 40-something Midwestern dude toying with his emoticons. 
He’s never given me any reason as to [...]]]></description>
			<content:encoded><![CDATA[<p>I had a thing with a guy on the internet when I was in college. We exchanged e-mails for a while; I knew he was into me. I knew it even though he constantly worried that I was a bored 40-something Midwestern dude toying with his emoticons. </p>
<p>He’s never given me any reason as to why he thought I was a man. </p>
<p>Neither did <a href=http://genderanalyzer.com/>GenderAnalyzer</a> when I plugged in my blog to have it guess my gender.</p>
<p><center><img src=http://omgomgomfg.com/wp-content/uploads/2008/11/genderanalyzer.jpg></center></p>
<p>Curious to see how they’d arrived at this conclusion, I contacted the team at GenderAnalyzer and asked. Jon Kågström, the brains behind the operation, wrote me back to fill me in on the details of this wondrous machine:</p>
<blockquote><p>It all started back in 2004 when I was doing my master thesis on machine learning for spam filtering. I was fascinated by how well they worked on spam (often better than humans) and started to wonder what more than spam text classifiers could be used for. I did a big number of hobby research projects testing classifiers on different domains (e.g., sentiment, happy/sad, web page categorization). </p>
<p>Doing this while improving the classifier technology derived from my master thesis, I came up with the idea to let everyone have access to classifiers. So with help from two friends, Roger Karlsson and Emil Kågström, we built uclassify.com. The idea is to share advanced classifier technology that is easy to use (don&#8217;t even have to be a programmer) for free in a Web 2.0 format. </p>
<p>After we had finished with uclassify.com we decided to test if it&#8217;s possible to have a computer differentiate between males and females by looking at their text. We found the idea really interesting so we collected 2,000 blogs written by males and females and used the uClassify API to train a classifier. We decided to put it into test and created <a href=http://genderanalyzer.com/>GenderAnalyzer</a>. </p>
<p>The accuracy is lower than we expected and we believe a major reason for that is that the training data is biased (only collected from blogspot). We think we can get better accuracy by using the URLs that users test as training data (when they vote if it worked or not, we train it accordingly). In this way we would have a classifier that adapts to real world gender data. Just as machine learning spam filters do.</p></blockquote>
<p>Today on the <a href=http://blog.uclassify.com/genderanalyzer-thoughts/>uClassify blog</a>, Kågström elaborated on the Analyzer’s current low accuracy (53%):</p>
<blockquote><p>Our training data of 2,000 blogs is automatically collected from blogspot. Running internal tests (10 fold cross validation) on this data gives us an accuracy of 75%.  This effectively means “Given that the corpus is a perfect representation of real world data, the classifier is able to give any real world data the correct label by a chance of 75%”. So our training data is probably not very representative, as a matter of fact it’s very stereotypical.</p>
<p>When someone is testing a blog we are not crawling through posts on the blog to get a good amount of text. We are only hitting the given URL and using the text (and html) that appear there as test data. So a page with mostly images or frames will give bad test data.</p>
<p>We are trying to encode test data to utf-8 which is the format of the training data &#8211; it could be that we are missing some encodings.</p></blockquote>
<p>It’s a worthwhile experiment, and the technology to make this kind of gender analysis possible is there. Whatever issues <a href=http://genderanalyzer.com/>GenderAnalyzer</a> has now, a broader set of blog samples should raise its accuracy, and as more people use it and the machine gets more training, it will become better equipped to answer the question: “Man or woman: who is writing that blog?”</p>
<p>Kågström does raise a good question in the <a href=http://blog.uclassify.com/genderanalyzer-thoughts/>uClassify blog</a> post, though: maybe “the difference between male and female writing is not significant?”</p>
<p>Only one way to find out. Best of luck to the team at <a href=http://genderanalyzer.com/>GenderAnalyzer</a>.</p>
<p>As for the college e-fling, we still talk from time to time. We’re both married now, to other people. I think he finally believes that I’m a woman. It only took him five years. </p>
]]></content:encoded>
			<wfw:commentRss>http://omgomgomfg.com/2008/11/22/genderanalyzer-girls-who-are-boys-who-like-boys-to-be-girls/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
	
		<media:thumbnail url="http://omgomgomfg.com/wp-content/uploads/2008/11/genderanalyzer.jpg" />
		<media:content url="http://omgomgomfg.com/wp-content/uploads/2008/11/genderanalyzer.jpg" medium="image" />
	</item>
	</channel>
</rss>
