mostly research stuff
okay, remember a long while back when i was talking about cool stuff at microsoft research? i brought up a way to search usenet using msr’s netscan search interface, a very cool labs project…and of course, others have tried to make finding those usenet newsgroups easier, though few took off, and fewer have stood the test of time…but shit, i was blown away when i started toying with the guba overhaul - have you seen it? they now bill themselves as the ‘premier multimedia search site’…
guba stands for ‘gigantic usenet binaries archive’…what’s usenet? per wikipedia, “Usenet is a distributed Internet discussion system that evolved from a general purpose UUCP network of the same name. It was conceived by Duke University graduate students Tom Truscott and Jim Ellis in 1979. Users, sometimes called Usenetters, read and post email-like messages (called ‘articles’) to a number of distributed newsgroups, categories that resemble bulletin board systems in most respects. The medium is sustained among a large number of servers, which store and forward messages to one another. Usenet is of significant cultural importance in the networked world, having given rise to, or popularized, many widely recognized concepts and terms such as ‘FAQ’ and ‘spam’.”…and per guba, “With over 2 terabytes of new content posted daily…no company, organization, or government controls Usenet…Structurally, Usenet is comprised of thousands of discussion groups called ‘newsgroups,’ which have names like ‘alt.binaries.multimedia.Comedy.’”
…so why use guba? well, for starters, before guba (launched around 1998), there was no simple way to look at all of usenet because no single server looked at all of the content, but the new guba interface allows for very simple and structured meta-search…you can search all of the images, videos, files and text touched, described, posted and discussed across usenet, with real time updates…very, very cool stuff….from guba, “Search within the terabytes of continuously refreshed Usenet images and videos using advanced search utilities to help users find relevant content; Filter video search results by keyword, file size, video duration, and other criteria; Save, Manage, and Update searches automatically according to users’ preferences; Preview video as rolling thumbnails or in an extracted frame format so that users can download only the files they want; and Watch Video in Flash regardless of video format”
…am i missing something, i mean why haven’t the big search engines gone after this already? what the hell is elGoog doing with that dejanews acquisition anyways, besides allowing groups to post into usenet? …this is obviously a backdoor to a lot of pirated content (duh), but any good search engine can dig up this same stuff…hello? you listenin’ elgoog, or yahoo, or microsoft? time to integrate with the main page of your search engines, or - mark my words - watch sites like guba usurp you in a media-centric device-agnostic future of personal web services…(’specially since people are always looking for better ways to find pirated content ;)
okay, so here’s some free investigative insight from yours truly regarding thenamesdatabase, a social networking site that has evolved from a pimple on the arse of the internet to a networking site of brobdingnagian proportions in under 24 months….never heard of it? okay, let me tell you a quick story…i got an email from my friend shally the other week inviting me to join (it came from inside the site)…i clicked on through (shally generally sends only good stuff ;)….
…but before i could access this database of over 17 million people, i was required to give up 4 new friends’ email addresses - damn! it chilled my shit…was this some kind of digital subterfuge? a marketeer playing on the trust-based networks of active friendships and contacts to create the world’s largest email harvesting machine? i tried putting in fake emails, but it caught me, then i tried putting in my own other emails, and it caught me again…ouch…being the naturally curious guy i am, i took it upon myself to track down one of the elusive founders myself - and i spent some time talking to him by phone, and in the end i walked away considerably more comfortable - this site appears to be for real, designed for users, not a grandiose marketing ruse…
…now i promised this guy that i would not share his email or his home or mobile numbers (which i now have), so let’s just call him ‘dan’…about 2 years ago, dan and a friend came up with this idea for a networking site, a low-on-graphics, low-on-profile-data version of something between friendster and classmates…the ‘big idea’: most users around the globe simply cannot display the graphics and trick-laden navigation features of these us-centric networking sites (think linkedin) and most users could give two shits about the ridiculous feature sets available from sites like myspace (as in, too much to do, so nothing gets done)…their idea: just connect people, with a name, a valid email, some key areas of interest (e.g. military alum) and key notes regarding where they went to school and where they’re from - note: this site collects ip addresses and pinpoints locations (validity depends on where you are when you register)…
..but here’s what’s really fascinating: within 2 years, this site - operated by only 2 guys working part time - has…
okay, so i’m reading this quick bit on how email turned 34 (hey, it lived longer than jesus and alexander the great!)…but, uh, i got a story problem…this guy paul buchheit from elGoog is credited as the first gmail engineer, and from what i understand, there’s another important guy not mentioned in that article - georges harik, one of elgoog’s earliest (and probably richest) employees…
who is georges, and what am i talking about? well, according to webtalkradio, he did some critical early work on gmail…now he’s some kinda ‘distinguished engineer’ - a summary bio for all y’all: “During the mid-90s, Georges Harik visited the Illinois Genetic Algorithms Lab for an extended period as part of his PhD studies at the University of Michigan. He did a lovely thesis on the linkage learning GA, and did a number of other very creative things, including the gambler’s ruin problem population sizing, compact GA, and extended compact GA. Georges left the lab to make his mark in commerce, first working for a data-mining group at SGI and then taking a position as one of the early employees at Google in early 1999. Today, Georges is Director of Googlettes as part of Google.” - am i wrong about georges’ work with gmail? somebody? anybody? bueller…?
…my point? hey, elgoog, we all know that fights for authorship take great companies down, and now that you’ve grown up and finally hired some real gray hair, perhaps you oughta start citing all key contributors to your apps (the guy even says “we” a few times!)…just my two cents. not even sure why i blogged this bit to begin with, just hate when people get overlooked, even if i don’t know ‘em personally….
ever wonder what big consulting firms do to fill the idle time of consultants because utilization rates tend to be so low? the answer: do some kinda bullshit survey, ideally in conjunction with some other organization that also has too much free time…now don’t get all pissy with me, because i agree that some of these studies are good stuff - though often they’re just a load of shit tagged with logos, contacts and associated whitepapers (also authored by idle-time consultants spitting up erstwhile reflections on crap that means nothing to those of us who live in the here and now..)
oh, my point? well, i actually stumbled onto summary results of a big survey from informationweek done in conjunction with accenture that looked at how companies spy on their employees…errr, i mean ‘how companies monitor employee communications to ensure privacy and promote security’….the results should chill your shit and remind you that there’s always a good reason to own your own ‘other’ computer…here’s the skinny on the numbers for over 2,500 us-based businesses that responded:
15 percent track printing and photocopying (get your ass off of that machine! only i do that)
2 in 5 review phone logs (no more phone-singles calls…)
1/3 monitor opening of email attachments (and i’ll bet half of all cios keep the good nudey pics)
1/3 monitor instant messaging (that’s gotta be a boring-ass job, reviewing im logs)
30 percent track time in the office (hey, something to be said for ‘face time’ employers, welcome back!)
44 percent examine employee internet use (duh, thought that would be a higher number)
10 percent monitor home worker productivity (how? that one begs for detail…)
1 in 5 review fax transmissions (getting harder to run football pools all of the time, ouch)
2 percent monitor keystrokes per hour (that’s some fucked up shit, monitoring keystrokes, reeks of login/id theft)
*supposedly at elgoog - and i’m just imagining this - these numbers might be pushing 100 percent across the board, and they supposedly take doody samples out of the bathroom pipes and steal hair samples from high-back seats and hire former low-cost russian kgb dudes to follow new employees home…
…so remember kids: your cell phone - if it’s not on a corporate server via smart phone apps - is just about the last safe haven…or snail mail….or shit, you know what? why don’t you just quit your job and blog full time?
remember how great anybirthday.com was back in the day? …really useful for unearthing information on people’s b-days with users flocking to it from all over (including the nosy folks trying to just figure out if friends are lying about their ages)…but then it went away, like a lotta dotcom services (boo hoo)…but not to worry, there’s a strong replacement from the folks at bored.com (oh, and that bored site is really boring btw, so don’t even bother clicking on through)…
what they’ve pulled together is another database of circa 120 million birth records that they claim comes from the ‘official government’ (as in, “hi, we’re building a big website mister federal employee, would you mind just zipping that file of birth records and emailing it over? thanks!”)…it’s called the birth database…and don’t be fooled by how much the look and feel of the site completely sucks ass, because the database works well (look me up - david carpe, pearl harbor day)..
…but you know, there’s sorta a problem (there’s always a problem as far as i’m concerned, doesn’t matter what i’m looking at)…the site’s search capabilities kind of suck - it’s just last and first name and a box for ‘approximate age’ - but alas, a problem solver named stephen morse came up with a solution….stephen has put together his own wonderful collection of genealogical research resources - along with a custom interface to the birthdatabase website that makes searching much cleaner…
why’s it bettah? because his search lets you add in other variables like city, state or zip code and a much cleaner range for ‘approximate age’ which you control by entering earliest or latest possible date - topped only by the ability to limit results and totals or begin returns from a certain page of results…very clever, very useful…but you know what’s really cool? steve’s main site - it’s loaded to the damned brim with useful stuff (like this killer one-stop zip code search tool, or this wickedly nerdy ‘one step search tool‘ creation engine)… enjoy, ’cause i am…found out my mom’s friend completely lied about her age, very exciting stuff!
this blog is mostly safe for work, though i sometimes throw around a 'fuck' or two. you'll find a bunch of my articles from CI Magazine, SCIP online, other research pieces and some other crap. enjoy. there's lots of content here related to getting information about, around, from and through people and organizations...