Kids Return: July 2010

Monday, July 26, 2010

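tbb: Notes on Intel® Threading Building Blocks

tbb (Threading Building Blocks) is a C++ template library developed by Intel, similar in spirit to the STL. It is designed for multi-core CPUs and aims to improve the concurrency of multithreaded programs. tbb provides several STL-like container classes that are built for highly concurrent use.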

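Installation

Ubuntu:

sudo apt-get install libtbb2 libtbb-dev libtbb-doc

Mac OS X (using MacPorts):

sudo port install tbb

Compiling

List the library after the source file so that the linker can resolve the tbb symbols:

g++ foo.cpp -ltbb

Usage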

concurrent_hash_map
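This container is the counterpart of the STL map. Like std::map, concurrent_hash_map maps keys of type Key to values of type T. To improve concurrency, an element is accessed through one of two accessor types, accessor or const_accessor. Both act as smart pointers. An accessor gives write access and locks the corresponding key until the access ends. A const_accessor gives read-only access, so several const_accessors can point at the same key at the same time. Separating the two kinds of access helps the program run more concurrently.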

#include <iostream>
#include <string>
#include <tbb/concurrent_hash_map.h>

using namespace tbb;
using namespace std;
typedef concurrent_hash_map<string,string> CacheTable;

int main() {
    CacheTable cache;

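    // insert an element into the map; the accessor holds a write lock on "key1"
    CacheTable::accessor a;
    cache.insert(a, "key1");
    a->second = "value1";
    a.release();  // release the lock now instead of waiting for a to go out of scope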
    
    // look for an element in the map
    CacheTable::const_accessor ca;
    if (cache.find(ca, "key1"))
         cout << "the value is " << ca->second << endl;
    else
         cout << "not found" << endl;

    // iterate over the map
    for(CacheTable::const_iterator itr=cache.begin(); itr!=cache.end(); ++itr)
         std::cout << itr->first << " - " << itr->second << std::endl;

}

concurrent_queue
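A container similar to the STL queue. It provides push(item) and the non-blocking try_pop(item); in TBB 2.2 and later, the blocking pop(item) belongs to concurrent_bounded_queue.
The following iterates over a concurrent_queue. The unsafe_ prefix is a reminder that iteration is not thread-safe, so use it (for debugging, say) only while no other thread touches the queue:

#include <iostream>
#include <tbb/concurrent_queue.h>

// int is an arbitrary element type chosen for illustration
typedef tbb::concurrent_queue<int>::const_iterator iter;

// print every element of q; call only while no other thread uses q
void dump(const tbb::concurrent_queue<int>& q) {
    for (iter i(q.unsafe_begin()); i != q.unsafe_end(); ++i)
        std::cout << *i << std::endl;
}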

atomic
Given atomic<T> x for some data type T, the following operations are atomic:

= x  read the value of x
x =  write the value of x, and return it
x.fetch_and_store(y)  do x=y and return the old value of x
x.fetch_and_add(y)  do x+=y and return the old value of x
x.compare_and_swap(y,z)  if x equals z, then do x=y. In either case, return old value of x
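
tbb::atomic was deprecated in later TBB releases and is no longer part of oneTBB, so the sketch below swaps in the C++11 std::atomic to show the same three read-modify-write operations. The mapping (exchange for fetch_and_store, fetch_add for fetch_and_add, compare_exchange_strong for compare_and_swap) is my own pairing for illustration, and the function name atomic_demo is hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Walks through the std::atomic counterparts of the tbb::atomic operations.
// Returns the final value of x.
int atomic_demo() {
    std::atomic<int> x(10);

    int old = x.exchange(20);   // like x.fetch_and_store(20): x = 20, old value returned
    assert(old == 10);

    old = x.fetch_add(5);       // like x.fetch_and_add(5): x = 25, old value returned
    assert(old == 20);

    // x.compare_and_swap(y, z) corresponds to compare_exchange_strong(expected, desired),
    // with expected playing the role of z and desired the role of y.
    int expected = 25;
    bool swapped = x.compare_exchange_strong(expected, 30);  // x == 25, so x becomes 30
    assert(swapped);

    expected = 99;              // x is 30, not 99, so nothing is stored
    swapped = x.compare_exchange_strong(expected, 0);
    assert(!swapped && expected == 30);  // expected now holds the value actually seen

    return x.load();
}
```

Note the difference in reporting: compare_and_swap returns the old value of x, whereas compare_exchange_strong returns a bool and writes the observed value back into expected.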

Wednesday, July 21, 2010

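git notes

git is the version control tool designed and implemented by Linus himself (reportedly he got git working in two weeks; all hail the master). git is more powerful than SVN, but its learning curve is also much steeper. My notes on SVN are here.

Documentation

There are plenty of git tutorials and manuals online, but many are poorly written. My top recommendations:
Git Workflow (the diagrams here are excellent)
How to version projects with Git (richly illustrated)
git quick reference

These are good too:
Git Magic (Chinese, English)
Pro Git (Chinese)

Others:
Understanding Git Conceptually
Git User Manual (rather hard to read)
Git - SVN Crash Course. I didn't find it very helpful, because git and svn really differ too much in usage
GIT cheat sheet (a good one)
Basic Branch Merging
Usage of stash, log, and gitconfig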

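Usage

Initializing a workspace
Clone a remote repository to your local machine:

$git clone git://path/to/your/git/repo

Note that git is a distributed version control system. If you clone a repo from server1, server1 becomes your origin. If server2 later clones your repo, you in turn become the origin for server2. I find this layout very handy for debugging: other machines can pull files straight from my local repo. Besides the git protocol, clone also works over ssh, for example:

$git clone ssh://hostname/path/to/your/git/repo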

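Workflow
After entering your working directory, you usually start by updating the local repository to the latest version on the origin (roughly svn update, although since git is distributed, the latest on the origin is not necessarily the latest globally). The following command fetches the master branch from the remote named origin and merges it into your current branch:

$git pull origin master

The branch you pull does not have to match the branch you are on. For example, while on master you can pull branch foo; you stay on master, and the commits from origin's foo are merged into it:

$git pull origin foo

If you always pull from the same remote and the same branch, you can set them as defaults in the .git/config file under the working directory:

[branch "master"]
    remote = origin
    merge = refs/heads/master

Then you can update without naming the remote and branch:

$git pull

After making changes, use status to inspect the working directory, e.g. which files were modified and which are not yet committed (roughly svn st):

$git status

To make git status ignore certain files, such as .o or .pyc files, add the two lines *.o and *.pyc to the .gitignore file.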

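After modifying local files, add them to the index with the add command. svn has no notion of an index, and in svn add means putting a file under version control. In git, every modification to a file has to be added to the index before it can be committed.

$git add file1 file2

Then commit the changes in the index to the local repository.

$git commit file1 file2 -m "go! commit file1, file2"

The two steps above (add, commit) can be merged into one with -a, which stages every modified file that git already tracks. Note that -a cannot be combined with explicit file names:

$git commit -a -m "go! commit all modified files"

To change the message of the last commit:

$git commit --amend -m "this is the right commit!"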

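At this point the changes are committed only locally. To publish them to the remote repository you also need to push (the counterpart of pull, which brings the origin's changes to you). A bare git push goes to origin; which branches it pushes depends on the push.default setting and the git version, so name the branch explicitly if in doubt.

$git push

You can also specify the remote repo and the branch:

$git push repo5 branch12

If you have several local branches but usually only want to push the current one to the branch it tracks, set this in a config file (e.g. .git/config). Newer git versions call this value upstream and keep tracking as a deprecated synonym.

[push]
    default = tracking

Or use the command:

$git config push.default tracking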

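Checking out a given version: checkout
Check out foo as of the latest commit (strictly speaking, git restores it from the index, which matches the latest commit unless you have staged changes to foo):

$git checkout foo

To get foo as it was two commits back:

$git checkout HEAD~2 foo

Check out the file foo from the most recent stash (stash@{0}):

$git checkout stash@{0} foo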

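Undoing changes: revert, checkout, reset
Say you have just modified a file foo and now want to undo the change. The right method depends on the state of foo:

Changed but not updated: if foo was deleted, or modified only in the local workspace and not yet added to the index with git add, simply check the file out again
```
$git checkout foo
```
If you are unlucky enough to also have a branch named foo
```
$git checkout -- foo
```
To restore every file under the current directory
```
$git checkout . 
```
Sometimes you want to clear every modification out of the current workspace. A simple way is stash, which shelves the changes rather than deleting them, so they can still be recovered later with git stash pop:
```
$git stash
```
Changes to be committed: the change has been added to the index but not yet committed to the local repo. Use reset to unstage foo, i.e. return its index entry to the state of the last commit, then use checkout to discard the modification in the workspace.
```
$git reset HEAD foo
$git checkout foo
```
To unstage every change in the index:
```
$git reset HEAD
```
If the change has already been committed to the local repo with git commit, you can roll the local repo back to the state before the last commit:
```
$git reset --hard HEAD~1
```
The --hard option discards all changes in both the index and the workspace, so use it with care.

If you only need to undo one particular commit, use git revert:
```
$git revert b38155cbf671d55ceb027687c39508de8cef2463
```
This actually records a new commit that cancels out the specified one.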

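Viewing history: log
Commit log of the current branch

$git log

Commit log of the current branch, with the diff of each commit

$git log -p

Commit log of the current branch, limited to the last 3 commits

$git log -3

Commit log of the current branch, listing only the names of the changed files

$git log --name-only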

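Handling conflicts
See the manual at http://www.kernel.org/pub/software/scm/git/docs/user-manual.html
and http://www.kernel.org/pub/software/scm/git/docs/git-push.html

A pull can fail when you have local modifications. If you want the remote commits merged with your local changes

$git checkout -m foo
Auto-merging foo

This can fail, in which case you have to edit the file by hand to finish the merge.

When you push your commits to the remote repo, you will sometimes run into the following problem:

$git push
To git@example.com:example.git
 ! [rejected]        master -> master (non-fast-forward)
error: failed to push some refs to 'git@example.com:example.git'
To prevent you from losing history, non-fast-forward updates were rejected
Merge the remote changes before pushing again.  See the 'Note about
fast-forwards' section of 'git push --help' for details.

This happens because your working copy is not up to date with the latest version. In other words, someone pushed new commits after your last pull, so if your push went through it could wipe out their commits. There are two ways to fix it:

$git pull
$git push

The first way brings your working directory up to date, a step that may produce conflicts, and you commit after resolving them. It leaves two commits in the history: one for your own change and one for the merge of your branch with the branch on the repo.
If you don't want two commits, use the second way:

$git pull --rebase
$git push

This replays your commits on top of the latest version and then pushes them.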

Branches: branch
A repo can have several branches. The branch command lists the local ones; * marks the branch currently in use

$git branch
* master
  testbranch

Use branch -r to list all remote branches

$git branch -r

Or branch -a to list all local and remote branches together

$git branch -a

Create a new branch named newbranch, starting from the HEAD of the current branch

$git branch newbranch

You can also give a starting point (e.g. testbranch) when creating newbranch

$git branch newbranch testbranch

Switch the current branch from master to testbranch

$git checkout testbranch
$git branch
  master
* testbranch

Delete a local branch

$git branch -d branch_to_delete

Delete a remote branch (this is done with push, not branch)

$git push origin --delete branch_to_delete

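Remotes: remote
List the remotes of the current repo

$git remote
origin

List the remotes of the current repo with more detail

$git remote -v
origin git@example.com:bar.git (fetch)
origin git@example.com:bar.git (push)

Show detailed information about the remote named origin

$git remote show origin

A repo can have several remotes. For example, add a new remote named remote_bar:

$git remote add remote_bar git@host-for-new-remote:repo-name.git

Make the local branch branch_foo track branch_foo on the new remote (fetch from remote_bar first so that remote_bar/branch_foo exists). Older git versions spelled this git branch --set-upstream branch_foo remote_bar/branch_foo, a form that newer versions no longer accept:

$git fetch remote_bar
$git branch --set-upstream-to=remote_bar/branch_foo branch_foo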

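Viewing past versions
Show a file as of a given version (some-sha1)

$git show some-sha1-number:some-file

Show all differences between a given version (some-sha1) and the latest version (HEAD)

$git diff some-sha1 HEAD

Show the differences in a single file between some-sha1 and HEAD

$git diff some-sha1 HEAD -- somefile

Compare your current working tree with another branch, say foo

$git diff foo

Compare branch foo with another branch bar

$git diff foo bar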

Listing the files in the repo
List all files under git's control in the current HEAD

$ git ls-tree -r  HEAD

Other related tools
tig - text-mode interface for git
How to set up git ignore: http://help.github.com/ignore-files/

Saturday, July 17, 2010

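Jumping on the bandwagon: learning Objective-C

(unfinished)

With the iPhone and the rest of the Apple lineup being wildly popular, Objective-C has become important too. Objective-C is Apple's house language (much as C# is for Microsoft). If you want to develop an iPhone app, you need Objective-C. If you want to write Cocoa-based applications on Mac OS, you need Objective-C as well (Cocoa does have bindings for Python and other languages, but I tried them and didn't much like them).

The story goes that Steve Jobs, high priest of the fruit cult, was kicked out of Apple, the company he founded, by its board, and went on to start a company called NeXT. NeXT acquired Objective-C from someone else. Jobs spent about ten years at NeXT, working hard on hardware and software and honing his craft. Apple, meanwhile, stagnated, until in 1997 Jobs staged his restoration and took Apple back. He brought along the OpenStep operating system he had cultivated at NeXT (bundling Objective-C and the frameworks and tools that later became Cocoa and Xcode), which turned into the now well-known Mac OS X. When you use Objective-C you will run into a great many classes prefixed with NS (for NeXTSTEP), a memento of this history.

As the name suggests, Objective-C is C with object orientation added. You may ask: isn't C++ the object-oriented, beefed-up C? It is, but the C++ we see today is simply the most popular language to emerge from that round of reinvention. Objective-C and C++ both claim to descend from C and to be supersets of it, which holds as long as you stay away from the object-oriented features. Once you do use them, the syntax of the two differs considerably. The object-oriented part of Objective-C shows the influence of the much older Smalltalk everywhere. To set them apart, most keywords that Objective-C adds to C begin with @, such as @interface and @end.

Now to the main topic:
The first Objective-C program: Hello World

As with every other language, we can't escape starting with Hello World. Here is the Objective-C version: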

#import <stdio.h>

int main( int argc, const char *argv[] ) {
    printf( "hello world\n" );
    return 0;
}

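As you can see, the Objective-C Hello World is almost identical to the C version; the only difference is that #include has become #import. No surprise, since Objective-C is an extension of C.

Object orientation: creating a class

In Objective-C, you first declare a class's member variables and methods in @interface, then implement the methods in @implementation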


Objective-C (Foo.h):

//Foo.h
@interface Foo : NSObject 
{ 
@private
    double x;
@protected
    double y;
@public
    double z;
} 
-(int) f:(int)x; 
-(float) g:(int)x :(int)y; 
@end

C++ (Foo.h):

//Foo.h
class Foo 
{ 
private:
    double x;
protected:
    double y;
public:
    double z;
    int f(int x); 
    float g(int x, int y); 
};

Objective-C (Foo.m):

//Foo.m
#import "Foo.h"
@implementation Foo 
-(int) f:(int)x {...} 
-(float) g:(int)x :(int)y {...} 
@end

C++ (Foo.cpp):

//Foo.cpp
#include "Foo.h"
int Foo::f(int x) {...} 
float Foo::g(int x, int y) {...}

Calling methods

Objective-C	C++
[myObj method]	myObj.method()
[myObj method:para1 :para2]	myObj.method(para1, para2)
output = [myObj method:para1 :para2]	output = myObj.method(para1, para2)
[obj1 func1:[obj2 func2]]	obj1.func1(obj2.func2())

In Objective-C, methods are not invoked the traditional C/C++ way. Instead the language adopts a design close to Smalltalk's: message passing.

Instance Method vs Class Method

In Objective-C, methods come in two kinds, instance and class, declared with "-" and "+" respectively. In most other languages, such as C++ and Java, a class method is called a static method.

@interface MyClass : NSObject

+ (void)aClassMethod;
- (void)anInstanceMethod;

@end

Calling them:

[MyClass aClassMethod];

MyClass *object = [[MyClass alloc] init];
[object anInstanceMethod];

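Duck Typing

Also under Smalltalk's influence, Objective-C, like Python, can be described as duck typed: an instance can be sent the methods defined in its class, and it can also be sent methods the class never defined. In C++, if a class foo does not define a method bar, an instance of foo can never call bar: the code simply won't compile. In Objective-C (and Python) this gets past compilation, at least when the receiver is typed as id, but an exception is raised at run time.

References

From C/C++ to Objective-C
Learning Objective-C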

Thursday, July 08, 2010

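[Linux]readline

When using gnuplot, keys such as backspace or del always show up as characters like ~. Running gnuplot under rlwrap, which adds readline-style line editing, works around this:

rlwrap -a -c gnuplot

There is an article on this, Consistent BackSpace and Delete Configuration,
which I haven't had time to read carefully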

Tuesday, July 06, 2010

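Virtualbox

Guest Additions
After booting the guest OS, find the option to install the Guest Additions in the Devices menu.
The benefits are that the mouse moves freely between host and guest and the guest OS resolution adjusts properly.

Shared folders
Host OS: Mac OS X
Guest OS: Ubuntu 9.04
To access the host's file system from the guest OS, mount the shared host directory:

mount -t vboxsf [-o OPTIONS] sharename mountpoint
e.g. mount -t vboxsf -o uid=500,gid=500 sharename /mount/point/in/guest

Guest OS: Windows XP
A Windows guest accesses directories on the Mac OS host by mapping a network drive.
At the Windows command prompt, enter
net use x: \\vboxsvr\sharename
Here x is the drive letter you assign to the network drive in Windows, and sharename is the name of the shared folder you configured for the virtual machine (a name, not necessarily a path)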

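[Linux]Commands for dynamic libraries

Managing dynamic libraries on Linux
ldconfig: manage the system's dynamic library files

ldconfig searches a set of "specific paths" for shared libraries (named lib*.so* on Linux) and creates the links and cache file needed by the dynamic loader (ld.so). The "specific paths" are the default directories /lib and /usr/lib plus the directories and files listed in the configuration file /etc/ld.so.conf.

If programs still report that a library cannot be found right after you installed it, the dynamic linker cache may be stale. Run ldconfig to refresh the ld.so cache:

ldconfig

Process only the libraries in the directory foo (skipping the specific paths described above)

ldconfig -n foo

Print the libraries currently in the cache

ldconfig -p

ldd
Purpose: list the dynamic libraries an executable needs
Example: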

>ldd /bin/ls
 linux-vdso.so.1 =>  (0x00007fff549ff000)
 librt.so.1 => /lib/librt.so.1 (0x00007fd82409c000)
 libselinux.so.1 => /lib/libselinux.so.1 (0x00007fd823e7e000)
 libacl.so.1 => /lib/libacl.so.1 (0x00007fd823c75000)
 libc.so.6 => /lib/libc.so.6 (0x00007fd8238f2000)
 libpthread.so.0 => /lib/libpthread.so.0 (0x00007fd8236d5000)
 /lib64/ld-linux-x86-64.so.2 (0x00007fd8242c8000)
 libdl.so.2 => /lib/libdl.so.2 (0x00007fd8234d0000)
 libattr.so.1 => /lib/libattr.so.1 (0x00007fd8232cb000)

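Should you use the environment variable LD_LIBRARY_PATH?
The answer is no. Avoid setting this variable if you can. For the reasons see http://linuxmafia.com/faq/Admin/ld-lib-path.html

Dynamic linking on Mac
The counterpart of ldd is otool (run as otool -L)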
Ref
http://wiki.linuxquestions.org/wiki/Library
http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html

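[Latex]A few tricks to cut the page count of a Latex paper

Squeezing a paper into the page limit is often painful. The default Latex layout actually has plenty of slack, enough to cut the page count substantially without changing much content. Here are the tricks I use most.
Adjust the font and spacing of section titles

\usepackage[medium,compact]{titlesec}

The default spacing around section titles and the like is quite generous. So simple as this trick looks, it is actually brutal and saves a lot of space.

Use a smaller font for the bibliography
You can use \small, or go smaller with \footnotesize; some people even use \scriptsize. \tiny is too small, though, and reviewers will object.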

\footnotesize
\bibliography{ref}
\bibliographystyle{abbrvnat}

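The indentation of the standard itemize environment eats a lot of space. You can use the list environment to customize the margins and other parameters. The length settings go in the second argument of list, and the vertical space above the list is \topsep (\topmargin is the top margin of the page, not of the list):

\begin{list}{\labelitemi}{%
  \setlength{\leftmargin}{1em}%
  \setlength{\topsep}{0pt}%
  \setlength{\itemsep}{0pt}%
  \setlength{\parsep}{0pt}}
\item blablabla
\item blablabla
\end{list}

Here \labelitemi (a solid bullet) is the marker at the start of each item. You can also use the predefined \labelitemii (a dash), \quad (blank), or even something like $\star$.
Alternatively, set this globally: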

\usepackage{enumitem}
\setlist{itemsep=0pt,parsep=0pt}

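Adjust the space between an equation and the preceding text
Sometimes a large gap is left between an equation and the text before it. It can be reduced with \vspace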

Therefore,
\vspace*{-0.5\baselineskip}
\begin{eqnarray}
     A = B
\end{eqnarray}