
Old days

All of the WordPress files lived on the server, so whenever I wanted to make a change, I had to ssh into my remote machine.

I use a Linode VPS shared with some friends, and sometimes tragedies like "failed to start up after reboot" or "system needs to be re-installed" happened. I didn't want to lose any files, so I relied on git: after making changes directly on the server, a git pull on my MacBook Pro would back everything up locally.

To write posts, I used the WordPress editor at first, but it sucked: I like leaving one blank line between paragraphs, and it couldn't do that.

Since I use Emacs a lot, I tried org-mode with the "org2blog" package, and it was an amazing writing experience. But when a post got long and contained lots of code snippets, a problem occurred: Emacs occasionally ate up all the CPU during org2blog-post-buffer and made the system hang.

Read more »

Using Emacs is like an adventure.

I recently found two very powerful extensions on GitHub. Yes, I shared some powerful features/extensions last year in this post, but these two are not only powerful but also handy compared with align-regexp, occur, anything, etc.

mark-multiple.el

I’ve used Eclipse, which has good support for refactoring, but I want to complain: dialogs suck!

So, when I need to rename one variable or a function which occurs many times in a file, what am I supposed to do?

Read more »

What does org.apache.hadoop.ipc.Client do to help the real RPC client make a remote method invocation on the RPC server?

If you don’t know, or have forgotten, please read my last post.

First, it creates a Call instance and passes it the param, which contains the method name and parameters (class types and instances).

Then, it gets a client-server connection.

Third, it sends the Call instance to the server using the connection.sendParam method.
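The three steps above can be sketched with a toy model. This is not Hadoop's code: only the names Call and sendParam come from the text, while ToyIpcClient, getConnection, the wire buffer, and the byte layout are assumptions made purely for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ToyIpcClient {
    /** One pending remote call: an id plus the parameter to ship. */
    public static class Call {
        final int id;
        final String param; // stands in for the method name and its arguments

        Call(int id, String param) {
            this.id = id;
            this.param = param;
        }
    }

    /** Hypothetical connection that writes into a buffer instead of a socket. */
    public static class Connection {
        final ByteArrayOutputStream wire = new ByteArrayOutputStream();

        void sendParam(Call call) {
            try {
                DataOutputStream out = new DataOutputStream(wire);
                out.writeInt(call.id);    // 4 bytes: which call this is
                out.writeUTF(call.param); // 2-byte length prefix + payload
                out.flush();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    private int counter = 0;
    private final Connection connection = new Connection();

    /** Assumed helper: a real client would look up or open a socket here. */
    Connection getConnection() {
        return connection;
    }

    /** Runs the three steps and returns how many bytes are on the "wire". */
    public int call(String param) {
        Call call = new Call(counter++, param); // step 1: create the Call
        Connection conn = getConnection();      // step 2: get a connection
        conn.sendParam(call);                   // step 3: send the Call
        return conn.wire.size();
    }

    public static void main(String[] args) {
        ToyIpcClient client = new ToyIpcClient();
        System.out.println("bytes on the wire: " + client.call("sayHello"));
    }
}
```

The sketch stops once the bytes are written; waiting for the server's response is left out on purpose.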

Read more »

In my last post, I mentioned that all RPC client calls are dispatched to the Invoker.invoke() method by the dynamic proxy, and that most of the work of invoke is done by client.call.

//** org.apache.hadoop.ipc.RPC.java L218
public Object invoke(Object proxy, Method method, Object[] args)
    throws Throwable {
  final boolean logDebug = LOG.isDebugEnabled();
  long startTime = 0;
  if (logDebug) {
    startTime = System.currentTimeMillis();
  }
  //** Request comes to "invoke",
  //** then it calls "client.call",
  //** and get the result "value",
  //** then return it back by "value.get".
  //** The first parameter is an instance of "Invocation"
  ObjectWritable value = (ObjectWritable)
    client.call(new Invocation(method, args), remoteId);
  if (logDebug) {
    long callTime = System.currentTimeMillis() - startTime;
    LOG.debug("Call: " + method.getName() + " " + callTime);
  }
  return value.get();
}

The method client.call takes two parameters: the first is an instance of Invocation, and the second is remoteId, which, as is easy to guess, probably represents the connection between client and server.

Before stepping into client.call to see what is done there, it’s better to know what Invocation is.

Invocation implements the Writable interface, so let’s look at Writable first.
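As a rough preview, here is a minimal sketch of the Writable idea: an object knows how to write its fields to a DataOutput and read them back from a DataInput. The Writable interface below is a local mirror of that two-method contract, and SimpleInvocation, its string-only arguments, and roundTrip are hypothetical simplifications, not the real Hadoop Invocation class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class WritableSketch {
    /** Local mirror of the contract, declared here so the sketch is self-contained. */
    public interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    /** Hypothetical, simplified invocation: a method name plus string arguments. */
    public static class SimpleInvocation implements Writable {
        public String methodName = "";
        public String[] args = new String[0];

        public SimpleInvocation() {}

        public SimpleInvocation(String methodName, String... args) {
            this.methodName = methodName;
            this.args = args;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeUTF(methodName);
            out.writeInt(args.length);
            for (String arg : args) {
                out.writeUTF(arg);
            }
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            methodName = in.readUTF();
            args = new String[in.readInt()];
            for (int i = 0; i < args.length; i++) {
                args[i] = in.readUTF();
            }
        }
    }

    /** Serializes to bytes and rebuilds a copy, as a client and server pair would. */
    public static SimpleInvocation roundTrip(SimpleInvocation original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            SimpleInvocation copy = new SimpleInvocation();
            copy.readFields(new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SimpleInvocation copy = roundTrip(new SimpleInvocation("sayHello", "a", "b"));
        System.out.println(copy.methodName + " " + Arrays.toString(copy.args));
    }
}
```

The empty constructor is there because the receiving side has to create a blank object before calling readFields on it.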

Read more »

This is the first article in a series of posts I am about to write, all about what I learned from reading the Hadoop source code (cdh3u2).

Those posts will be full of code snippets. To avoid confusion, all comments added by me begin with //** instead of // or /*.

An Easy Example of Java Dynamic Proxy

import java.lang.reflect.*;

public class ProxyTester {
  public interface Hello {
    void sayHello();
  }

  public class RealHello implements Hello {
    public void sayHello() {
      System.out.println("Hello, World");
    }
  }

  public class FakeHello implements InvocationHandler {
    private Object agent;
    public FakeHello(Object obj) {
      this.agent = obj;
    }
    //** Hijacking real method invocation
    public Object invoke(Object proxy, Method method, Object[] args)
        throws Throwable {
      System.out.println("Hello Hadoop first");
      //** Real method invocation happens here
      Object result = method.invoke(agent, args);
      return result;
    }
  }

  public void test() {
    RealHello realHello = new RealHello();
    FakeHello fakeHello = new FakeHello(realHello);
    //** "hello" will be a proxy,
    //** whose method invocation will be dispatched to "fakeHello".
    Hello hello = (Hello) Proxy.newProxyInstance(
        Hello.class.getClassLoader(), new Class[] {Hello.class}, fakeHello);
    hello.sayHello();
  }

  public static void main(String[] args) {
    ProxyTester proxyTester = new ProxyTester();
    proxyTester.test();
  }
}

The results:

Read more »

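While installing Hadoop on a cluster, I ran into the following problem.

All the nodes came up and worked normally, except that the secondary namenode reported an error during doCheckpoint, and a weird one at that: an HTTP 403 error.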

// secondary namenode log
2011-10-24 17:09:12,255 INFO org.apache.hadoop.security.UserGroupInformation: Initiating re-login for hadoop/[email protected]
2011-10-24 17:09:22,917 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint:
2011-10-24 17:09:22,918 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.io.IOException: Server returned HTTP response code: 403 for URL: https://hz169-91.i.site.com:50475/getimage?getimage=1
...

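So I suspected a Kerberos authentication problem, but the secondary namenode had already passed Kerberos authentication;

then I suspected that the secondary namenode's request to the namenode was being rejected, but the namenode's log showed that authentication had already succeeded. (hadoop/[email protected] is the secondary namenode's Kerberos principal, and hadoop/[email protected] is the namenode's Kerberos principal.)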

Read more »

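It took me more than two days to finally get Hadoop running a few MapReduce programs stably. The installation was a real struggle, but I don't think it should have taken this long.

I have long known how important RTFM is, yet my first reaction is often still to Google rather than go to the official site for the documentation. Admittedly, some documentation really is terrible, but the blog posts Google turns up are just the same few articles reposted over and over, and many of them contain problems; after all, the requirements differ, and so does the care taken.

Most of this post follows the official documentation, and it also includes several of my own configuration mistakes along with their fixes.

This installation procedure applies to configuring Kerberos V5 1.8.4 for Cloudera CDH3 on Ubuntu 11.04.

Installing JDK 6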

Read more »

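Today I am unemployed and at home.

This is my first job change, and I thought about it for a long time. Perhaps because I had no experience, or because I lack decisiveness, I agonized over it for quite a while.

Interest matters a lot to me. What I read, learn, and do in my own time had nothing to do with my job, so I should have left long ago. What I couldn't leave was the working atmosphere there; it was really good, like my lab in graduate school.

I tried switching roles to become a Coach and met a group of lovely co-workers, but all too often things just don't turn out the way you hope.

The late Steve Jobs talked about connecting the dots. My new job came out of the project I worked on in graduate school. So although I am giving up three years of work experience and accumulated knowledge, who knows, it may suddenly become useful at some point in the future.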

Read more »

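In mid-July we went on an outing to Wuyi Mountain (武夷山). What was meant to be a leisurely trip turned into a three-day forced march. But travel is really about exploring; if you follow a tour guide, or only stick to the scheduled sights, you probably miss a lot of the fun.

For instance, while I was posting this blog entry, my bathroom door got locked from the inside. I spent more than ten minutes trying to open it with a card, to no avail. I had opened the identical lock on the room next door many times and had long since mastered opening it in under 10 seconds, but this time I had no choice but to call a locksmith.

It reminds me of two years ago: also at night, I went to take a shower, the lock broke, and I couldn't get out. I was stuck in the bathroom for more than three hours, during which I took 4 showers, sat on the toilet staring into space, and stared out the window, to the point that many rooms in the building opposite drew their curtains. Only when my roommate came back did he call a locksmith for me, who opened the door and replaced the lock with the very one that wouldn't open today.

These are the little episodes of life,

the kind of stories I could probably tell for many years.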

Read more »

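Regular expression engines (NFA/DFA)

An NFA is a nondeterministic finite automaton, meaning that during a state transition there may be several possible next states, whereas for a DFA, a deterministic finite automaton, there is only one possible next state.

It reminds me of a university course, "Computability Theory", which used automata to prove that programs can be written. I had two classic incomprehensible courses at university; this was one, and the other was "Relativity and Quantum Mechanics".

Simply put, an NFA corresponds to regex-directed matching, while a DFA corresponds to text-directed matching.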
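One way to see regex-directed behavior in practice is a small Java sketch. It assumes that java.util.regex is a backtracking, NFA-style engine that tries alternatives from left to right; the class name RegexDirectedDemo and the helper firstMatch are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexDirectedDemo {
    /** Returns the first substring of text matched by regex, or null if none. */
    public static String firstMatch(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        // The pattern drives the match: the first alternative that succeeds wins,
        // so the order of alternatives changes the result on the same text.
        System.out.println(firstMatch("a|ab", "ab")); // prints "a"
        System.out.println(firstMatch("ab|a", "ab")); // prints "ab"
    }
}
```

A text-directed engine that reports the leftmost-longest match would be expected to return "ab" for both patterns, because there the order in which the alternatives are written does not matter.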

For example, given the following text and regular expression,

Read more »