Monday, October 25, 2010

The Chinese Language is the Deep Web

Reading Nicholas Kristof's post "Liu Xiaobo and Chinese Democracy", about Mr. Liu's recent Nobel Peace Prize, I found that one passage stood out, not only for its content but also for the offhand way in which it was presented:

Today, Liu presumably doesn’t know that he has won the prize, and the Chinese government is trying to censor the news. But China is changing and censorship no longer works so effectively. It can ban mobile phone users from texting the characters for his name, but young Chinese are smart enough to use substitute characters.

Assuming this actually is the case, it means that within the Chinese languages (and it's clear that they are separate languages, not dialects of one overarching, crazily heterogeneous Chinese language) lies a hidden world of possible ideogram-meaning combinations, connected by sound. Here's how that would work:

Every Chinese character represents a word. (Linguists: I know there are exceptions. Thanks.) For example, the word for "work" is 工, pronounced "gong" with a high, steady tone. The word for "attack" is 攻, also pronounced "gong" with a high, steady tone. The word for "supply" as in "power supply" is 供, also "gong" with a high, steady tone. The same goes for the words for "official business", "palace", and "bow" as in "bow and arrow".
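To make the homophone idea concrete, here is a minimal sketch in Python. The table is hand-written and tiny, the name `READINGS` and the "gong1" notation (syllable plus tone number) are my own assumptions for illustration, and the last three characters are the standard ones for the glosses mentioned above. It is a toy, not a real pronunciation dictionary.

```python
from collections import defaultdict

# Hand-written toy table: character -> (sound, gloss).
# "gong1" means the syllable "gong" in the high, steady (first) tone.
READINGS = {
    "工": ("gong1", "work"),
    "攻": ("gong1", "attack"),
    "供": ("gong1", "supply"),
    "公": ("gong1", "official business"),
    "宫": ("gong1", "palace"),
    "弓": ("gong1", "bow"),
    "人": ("ren2", "person"),  # a non-homophone, for contrast
}

def group_by_sound(readings):
    """Group characters that share the same sound."""
    groups = defaultdict(list)
    for char, (sound, _gloss) in readings.items():
        groups[sound].append(char)
    return dict(groups)

homophones = group_by_sound(READINGS)
print(homophones["gong1"])  # all six "gong" characters from the table
```

Every character in the "gong1" group is a candidate stand-in for any other, at least as far as the ear is concerned.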

Right now, the censors at Great Firewall HQ, actually called the Propaganda Department (I kid you not), are poring over blog posts and texts and other electronic content, finding subversive messages and stamping them out like bugs. Now, I imagine that a bit of this is done automatically, by keyword, and a great deal more is done by a large government department, full of the average office assortment of flunkies, middle managers, angry bosses, and the ennui that comes along with this setup.

Now imagine an undercurrent of blogs that don't seem to make sense at first glance. They bring up no poisonous keyword hits. They carry no familiar subversive slogans. But for those who would read them aloud, they transfer hopeful messages of democracy, commentary on the Chinese political situation, and perhaps even plans for meetups and other events.

This sound-meaning correspondence is much like what serious internet people call the "deep web". The deep web consists of all the data on the Internet that's not directly accessible to the average end user of a search engine. Deep web data is significantly more voluminous than surface web data. From the Wikipedia page:

Deep Web search reports cannot display URLs like traditional search reports. End users expect their search tools to not only find what they are looking for quickly, but to be intuitive and user-friendly. In order to be meaningful, the search reports have to offer some depth to the nature of content that underlie the sources or else the end-user will be lost in the sea of URLs that do not indicate what content lies underneath them.

By moving context outside of the scope of these messages of Chinese democracy, writers would easily circumvent any mechanical attempts at censorship. Certainly, it's not perfect, but even in a worst-case scenario, this practice could burden the Propaganda Department with the need for more human censors.
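
Here is a rough sketch of why a purely mechanical filter would miss this. Everything in it is an assumption made for illustration: the banned keyword is an arbitrary, harmless word ("worker"), the pronunciation table is a toy, and I have no idea how any real filter is actually built.

```python
# Toy pronunciation table (character -> sound); hypothetical and tiny.
SOUNDS = {"工": "gong1", "攻": "gong1", "供": "gong1", "人": "ren2"}

# Hypothetical banned keyword ("worker"), chosen only for illustration.
BANNED = {"工人"}

def keyword_filter(message, banned=BANNED):
    """Return True if the message contains any banned keyword verbatim."""
    return any(word in message for word in banned)

def read_aloud(text):
    """Map each character to its sound; unknown characters pass through."""
    return [SOUNDS.get(ch, ch) for ch in text]

def sounds_like_banned(message, banned=BANNED):
    """Return True if some stretch of the message sounds like a banned keyword."""
    spoken = read_aloud(message)
    for word in banned:
        target = read_aloud(word)
        n = len(target)
        if any(spoken[i:i + n] == target for i in range(len(spoken) - n + 1)):
            return True
    return False

original = "工人"
disguised = "攻人"  # same sounds, different first character

print(keyword_filter(original), keyword_filter(disguised))  # True False
print(sounds_like_banned(disguised))                        # True
```

The verbatim filter catches the original and waves the disguised version through. A sound-based check would catch it, but it would presumably also flag plenty of innocent text that merely sounds similar, which is exactly the kind of ambiguity that has to be handed to a human.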

Monday, October 11, 2010


The new Facebook Groups is a miserable failure.

Quietly, amidst the opening of that movie where Zuck is played by a guy who looks like Michael Cera only more serious, Facebook rolled out a new version of Groups. The new Groups combines the function of the old Groups and the function of Lists, which I (and apparently only I) like and use a great deal. It's been touted and announced on a ton of news outlets and techblogs. Read Farhad Manjoo's syrupy article in Slate. Zuck, I hope you already have a date to prom.

Personally, I think the new Groups is at best a major tactical error after a few quiet months. (As you'll recall, there were a number of concerns over the privacy control changes in May.) At worst, it signals a Microsoft-like disconnect with the user base, which will lead to a Microsoft-like end of relevance. Here are the problems:

First, it was rolled out without any real announcement. I heard about it on the tech blogs, and have still never seen any notification on the site. That's pretty major. I have been known to miss the forest for the trees, but I haven't even seen a sapling of notice.

Second, the FAQ doesn't really differentiate it from Lists, and doesn't address the question of why it was implemented (other than "We are continually looking for ways to enhance overall user experience"). It does give a basic overview of the new features, but they seem a lot like things that you already have access to if you use any Google products.

Third, any friend can add anyone, and Facebook gives the added user no notification, even when added to public groups. And there's the spam.

Some have compared the new Groups to Google Wave (apologies for the third TC link; they just did a good job this time), and while the functionality is somewhat similar, the comparison doesn't hold: new Groups appears to be the latest in a series of missteps, whereas Wave was simply a gigantic error in judgment, an aberrant faux pas. No, the better comparison here is Microsoft's gradual decline. In no way is new Groups Facebook's Vista, but it does show that to Team Facebook, features are more important than utility, which means irrelevance is inevitable.

Diaspora kids, take your chance soon—you won't get another.