Hello World by Thrift using Java

I have an application that runs perfectly on a single machine. It was designed to be used that way, but things have changed: at least three copies are deployed now, and the number keeps growing.

The application's main job is to collect data and then analyze it. Being used by three products means deploying it to three machines.

It always runs in the background with three processes, so I need to keep an eye on every process, trace the logs, and apply nearly the same configuration to each copy on every machine, plus some product-specific settings. It’s chaos.

I plan to make it distributed, with one server doing the product-specific analysis and several clients collecting data. The server dispatches the clients, gives them jobs to do, sets the configuration for each client, and collects the running status of all clients.

I hope this takes some of the load off me. :)

I picked Thrift as my RPC framework and tried it out. It’s easy and fun, apart from the not-so-good documentation.

First, create a .thrift file.

    # Put the generated Java classes
    # in the me.zhengdong.thrift package
    namespace java me.zhengdong.thrift

    struct Item {
      1: i64 id,
      2: string content,
    }

    service CrawlingService {
        void write(1:list<Item> items),
    }

Types and services are defined in this file. I define a struct type, Item, and a service named CrawlingService with a single method, write, which will be called remotely.

Then compile the file with the Thrift compiler.

The compiler must be installed first. I use a Mac, which has the amazing Homebrew, so “brew install thrift” and done.

Run thrift -out . --gen java item.thrift to compile.

The -out . flag writes the output under the current directory instead of creating a gen-java directory. Without it, the Java files end up in gen-java/me/zhengdong/thrift, which is not what I need.

Two Java files are generated: CrawlingService.java and Item.java.
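To make the mapping from the .thrift file to Java a bit more concrete, here is a small hand-written approximation of the shape of those two files. It is only a sketch: the real generated code is much longer, depends on the Thrift runtime library, and its write method also declares throws TException. The class name GeneratedShapeSketch and the simplified constructor are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written stand-ins, NOT the real output of the Thrift compiler.
public class GeneratedShapeSketch {

    // Roughly what Item.java models: one field per entry in the struct.
    static class Item {
        private final long id;        // from "1: i64 id"
        private final String content; // from "2: string content"

        Item(long id, String content) {
            this.id = id;
            this.content = content;
        }

        long getId() { return id; }

        String getContent() { return content; }

        @Override
        public String toString() {
            return "Item(id:" + id + ", content:" + content + ")";
        }
    }

    // Roughly what CrawlingService.Iface models: one Java method per
    // service method. The real one additionally declares "throws TException".
    interface Iface {
        void write(List<Item> items);
    }

    public static void main(String[] args) {
        // A throwaway implementation that just prints what it receives.
        Iface printer = items -> {
            for (Item item : items) {
                System.out.println(item);
            }
        };

        List<Item> items = new ArrayList<>();
        items.add(new Item(1L, "hello"));
        items.add(new Item(2L, "world"));
        printer.write(items);
    }
}
```

The point is only that the struct becomes a plain Java class and the service becomes an interface, which is what the handler has to implement.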

Implement the service handler

CrawlingService has a write method, which we need to implement.

    // CrawlingHandler.java
    // Assumed to live in the same package as the generated classes.
    import java.util.List;
    import org.apache.thrift.TException;

    // Implements the interface generated inside CrawlingService
    public class CrawlingHandler implements CrawlingService.Iface {
        @Override
        public void write(List<Item> items) throws TException {
            for (Item item : items) {
                System.out.println(item);
            }
        }
    }

This is the method that gets called over RPC: a client calls write with the items it has collected, and the server prints them all out.

Finally, make the server and client

    // Server.java
    import org.apache.thrift.server.TServer;
    import org.apache.thrift.server.TThreadPoolServer;
    import org.apache.thrift.transport.TServerSocket;
    import org.apache.thrift.transport.TTransportException;

    public class Server {
        private void start() {
            try {
                // Listen on port 7911
                TServerSocket serverTransport = new TServerSocket(7911);
                // Wrap the CrawlingHandler we defined before in a
                // processor, which dispatches incoming RPC calls to it.
                // This processor serves exactly one service.
                CrawlingHandler handler = new CrawlingHandler();
                CrawlingService.Processor<CrawlingService.Iface> processor
                        = new CrawlingService.Processor<CrawlingService.Iface>(handler);

                // Each connection is handled by a thread from a pool
                TServer server = new TThreadPoolServer(
                        new TThreadPoolServer.Args(serverTransport).processor(processor));

                System.out.println("Starting server on port 7911 ...");
                // Blocks here and serves requests until the server is stopped
                server.serve();
            } catch (TTransportException e) {
                e.printStackTrace();
            }
        }

        public static void main(String[] args) {
            Server server = new Server();
            server.start();
        }
    }

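    // Client.java
    import java.util.List;
    import org.apache.thrift.TException;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.protocol.TProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;
    import org.apache.thrift.transport.TTransportException;

    public class Client {
        public void write(List<Item> items) {
            TTransport transport = null;
            try {
                // Connect to the server started above
                transport = new TSocket("localhost", 7911);
                transport.open();

                // TBinaryProtocol is also the server's default protocol
                TProtocol protocol = new TBinaryProtocol(transport);
                CrawlingService.Client client = new CrawlingService.Client(protocol);

                // The remote call: runs CrawlingHandler.write on the server
                client.write(items);
            } catch (TTransportException e) {
                e.printStackTrace();
            } catch (TException e) {
                e.printStackTrace();
            } finally {
                // Close the connection even if the call failed
                if (transport != null) {
                    transport.close();
                }
            }
        }
    }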

We can replace “localhost” with the IP of the machine the server runs on. Then every call to Client.write, on any machine, makes the server print the items.
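Rather than editing the source every time the server moves, the address could be passed in from outside. Below is a minimal sketch of that idea using only the standard library. The ServerAddress class and its parse method are hypothetical helpers invented for this example, not part of Thrift, and the sketch only handles plain host names and IPv4 addresses.

```java
// Hypothetical helper, not part of Thrift: turns "host" or "host:port"
// into the two values the TSocket constructor needs.
public class ServerAddress {
    // Same port the Server class listens on.
    static final int DEFAULT_PORT = 7911;

    final String host;
    final int port;

    ServerAddress(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Falls back to localhost:7911 when nothing is given.
    static ServerAddress parse(String spec) {
        if (spec == null || spec.trim().isEmpty()) {
            return new ServerAddress("localhost", DEFAULT_PORT);
        }
        String s = spec.trim();
        int colon = s.lastIndexOf(':');
        if (colon < 0) {
            // Only a host was given, keep the default port.
            return new ServerAddress(s, DEFAULT_PORT);
        }
        String host = s.substring(0, colon);
        // Throws NumberFormatException if the port is not a number.
        int port = Integer.parseInt(s.substring(colon + 1));
        if (host.isEmpty() || port < 1 || port > 65535) {
            throw new IllegalArgumentException("Bad server address: " + spec);
        }
        return new ServerAddress(host, port);
    }

    public static void main(String[] args) {
        ServerAddress address = parse(args.length > 0 ? args[0] : null);
        // In Client.write this would feed: new TSocket(address.host, address.port)
        System.out.println("Would connect to " + address.host + ":" + address.port);
    }
}
```

With something like this, each client machine only needs the server address as a command-line argument, and the Client class itself stays the same everywhere.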

You can see the tutorial here, though I find it funny to call it a “tutorial”.

If you want to learn more about Thrift, please read the white paper.

— updated

The tutorial link above no longer exists. I googled it and found that the official tutorial looks better now.