sklearn: RandomForest

Posted on 2018-03-22 | Category: ai

Random forest: RandomForestClassifier

A classifier that trains multiple decision trees on the samples and combines their predictions.


Add a right-click menu entry to launch terminator

Posted on 2018-02-02 | Category: linux

Old approach: nautilus-actions

sudo apt install nautilus-actions
Action: display item in location context menu
Command: Path: terminator Parameters: --working-directory=%f
In the Mimetypes tab, set: Mimetype filter: inode/directory


sklearn: metrics

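Posted on 2018-01-17 | Category: ai

| Actual | Predicted positive | Predicted negative |
|:-:|:-:|:-:|
| Positive | TP (true positive) | FN (false negative) |
| Negative | FP (false positive) | TN (true negative) |

Recall, also called sensitivity or true positive rate (TPR); `recall_score` in sklearn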

TPR = TP / (TP + FN)
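
As a small illustration of the formula above, the sketch below counts TP and FN from two label lists and computes recall by hand. The function name `recall` and the 1/0 label encoding are assumptions made for this example; in practice `sklearn.metrics.recall_score` does the same job.

```python
def recall(y_true, y_pred, positive=1):
    """TPR = TP / (TP + FN), computed from two equal-length label lists."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("no positive samples in y_true")
    return tp / (tp + fn)


y_true = [1, 1, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0]
print(recall(y_true, y_pred))  # 2 of the 3 actual positives were found
```

False positives (the fourth sample here) do not enter the formula at all, which is why recall is usually reported together with precision.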


Batch-renaming files

Posted on 2018-01-03 | Category: linux

Goal:

rename every file X.pdf to X_0.pdf

rename.ul or mmv is recommended.

1. rename.ul

rename.ul -v -n ".pdf" "_0.pdf" *.pdf

2. bash

for f in *.pdf; do pre="${f%.pdf}"; echo mv -- "$f" "${pre}_0.pdf"; done

3. rename

rename -n 's/\.pdf$/_0$&/' *.pdf

1. ```\.pdf$``` matches .pdf at the end of the filename.
2. In the replacement, the match is prefixed with ```_0```: ```_0$&```

The -n flag (and the echo in the bash loop) only previews the result; drop it to actually rename.

4. mmv

sudo apt-get install mmv
mmv '*.pdf' '#1_0.pdf'
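
For comparison, the same rename can be sketched in Python with `pathlib`. This is only an illustration of the goal stated above, not one of the recommended tools; the function name `add_suffix` and its `dry_run` parameter are made up for the example, with `dry_run=True` playing the role of `-n`.

```python
from pathlib import Path
import tempfile


def add_suffix(directory, suffix="_0", ext=".pdf", dry_run=True):
    """Rename X.pdf to X_0.pdf; with dry_run=True only report the plan."""
    plan = []
    # sorted() materializes the listing before any file is renamed
    for src in sorted(Path(directory).glob("*" + ext)):
        dst = src.with_name(src.stem + suffix + ext)
        plan.append((src.name, dst.name))
        if not dry_run:
            src.rename(dst)
    return plan


# Try it on a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.pdf", "b.pdf", "notes.txt"):
        (Path(tmp) / name).touch()
    print(add_suffix(tmp))                  # preview only
    add_suffix(tmp, dry_run=False)          # perform the rename
    print(sorted(p.name for p in Path(tmp).iterdir()))
```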

Linux desktop notifications with notify-send

Posted on 2018-01-02 | Category: linux

notify-send -i 123.png -t 10000 "title" "content"

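Git command notes

Posted on 2018-01-01 | Category: linux

Adding a remote repository

On the server:

git init --bare repo

repo acts as a relay.

Once the remote is configured, the program on the server pushes to repo with git push,

and the local machine then pulls from repo.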

# local
git remote -v
git remote add origin biolab@biolab:~/zzp/repo
git push --set-upstream origin master
git config --global push.default simple

Removing a file from the repository history

See the official documentation.

git filter-branch --index-filter 'git rm -r --cached --ignore-unmatch path/to/your/file' HEAD
git push origin master --force
rm -rf .git/refs/original/
git reflog expire --expire=now --all
git gc --prune=now
git gc --aggressive --prune=now

Setting the global editor

git config --global core.editor vim

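Undoing changes:

Scenario 1

You have messed up a file in the working tree and want to discard those changes: use git checkout -- file.

Scenario 2

You have messed up a file and also added it to the staging area. Discarding takes two steps: first run git reset HEAD file, which brings you back to scenario 1, then proceed as in scenario 1.

Scenario 3

An unsuitable change has already been committed and you want to undo that commit.

The version HEAD points to is the current version, so Git lets you move back and forth through history with git reset --hard commit_id.

Before moving, use git log to view the commit history and decide which version to go back to.

To return to a newer version afterwards, use git reflog to view the command history and find the commit to return to.

git reset --hard HEAD^ goes back to the previous version.

In Git, HEAD denotes the current version.

The previous version is HEAD^ and the one before that is HEAD^^.

Writing 100 carets to go back 100 versions is easy to miscount, so that is written HEAD~100.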
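
The caret and tilde notation can be illustrated with a tiny parser. This is a sketch only: the function name `ancestor_depth` is hypothetical, and it handles just the forms mentioned here (HEAD, repeated ^, and ~N), not the full revision syntax Git accepts (for instance HEAD^2, which selects a second parent, is rejected).

```python
import re


def ancestor_depth(rev):
    """Return how many first-parent steps back from HEAD a revision points.

    Handles only HEAD, HEAD^, HEAD^^..., and HEAD~N.
    """
    match = re.fullmatch(r"HEAD(\^*|~(\d+))", rev)
    if match is None:
        raise ValueError("unsupported revision: %r" % rev)
    if match.group(2) is not None:
        return int(match.group(2))      # HEAD~N form
    return len(match.group(1))          # one step per caret


for rev in ["HEAD", "HEAD^", "HEAD^^", "HEAD~100"]:
    print(rev, ancestor_depth(rev))
```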

Rebuilding the index after changing .gitignore


git rm -r --cached .

git add .

git commit -m "update .gitignore"

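.gitignore settings

* matches anything


# Lines starting with '#' are comments.
# Ignore every file named foo.txt.
foo.txt
# Ignore all generated html files,
*.html
# except foo.html, which is maintained by hand.
!foo.html
# Ignore all .o and .a files.
*.[oa]
# Ignore *.b and *.B files, except my.b
*.[bB]
!my.b
# Ignore both files and directories named dbg
dbg
# Ignore only dbg directories, not dbg files
dbg/
# Ignore only dbg files, not dbg directories
dbg
!dbg/
# Ignore only the dbg file or directory next to this .gitignore; dbg in subdirectories is not ignored
/dbg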
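
The "later rule overrides earlier rule" behavior of the `!` patterns can be sketched with Python's `fnmatch`. This is a rough approximation for plain file names only, not Git's actual matcher; the function name `is_ignored` is an assumption for the example, and directory rules (`dbg/`), anchored rules (`/dbg`) and `**` are deliberately left out.

```python
from fnmatch import fnmatchcase


def is_ignored(filename, patterns):
    """Apply simple basename patterns in order; the last matching rule wins."""
    ignored = False
    for pattern in patterns:
        negate = pattern.startswith("!")
        glob = pattern[1:] if negate else pattern
        if fnmatchcase(filename, glob):
            ignored = not negate    # a '!' rule re-includes the file
    return ignored


rules = ["foo.txt", "*.html", "!foo.html", "*.[oa]", "*.[bB]", "!my.b"]
for name in ["index.html", "foo.html", "main.o", "my.b", "x.B", "main.c"]:
    print(name, is_ignored(name, rules))
```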

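Avoid typing the username and password on every push

git config --global credential.helper store

Use the following commands to overwrite local files with the state of the remote repository:

git fetch --all
git reset --hard origin/master

To reset to the state of some other branch instead, use:

git reset --hard origin/other_branch

Explanation:

git fetch downloads the latest updates from the remote without merging or rebasing any local files.

git reset then moves the master branch to what was just fetched. The --hard option changes all files in the working tree to match origin/master.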

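1. git clone

The first step of working with a remote is usually cloning a repository from a remote host, which is what git clone does.

$ git clone <repository URL>

For example, to clone the jQuery repository:

$ git clone https://github.com/jquery/jquery.git

This creates a directory on the local machine named after the remote repository. To use a different directory name, pass it as the second argument to git clone.

$ git clone <repository URL> <local directory>

git clone supports several protocols. Besides HTTP(S), it also supports SSH, Git, local file paths and others. Some examples: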

$ git clone http[s]://example.com/path/to/repo.git/

$ git clone ssh://example.com/path/to/repo.git/

$ git clone git://example.com/path/to/repo.git/

$ git clone /opt/git/project.git 

$ git clone file:///opt/git/project.git

$ git clone ftp[s]://example.com/path/to/repo.git/

$ git clone rsync://example.com/path/to/repo.git/

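SSH also has an alternative syntax.

$ git clone [user@]example.com:path/to/repo.git/

Generally speaking, the Git protocol downloads fastest, while SSH is used where user authentication is needed. For a detailed discussion of each protocol's trade-offs, see the official documentation.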

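2. git remote

For ease of management, Git requires every remote host to have a name. The git remote command manages these names.

Without options, git remote lists all remote hosts.

$ git remote
origin

With the -v option, it also shows the URL of each remote.

$ git remote -v
origin  git@github.com:jquery/jquery.git (fetch)
origin  git@github.com:jquery/jquery.git (push)

The output above shows that there is currently one remote host, named origin, together with its URL.

When cloning a repository, Git automatically names the remote host origin. To use a different name, pass the -o option to git clone.

$ git clone -o jQuery https://github.com/jquery/jquery.git
$ git remote
jQuery

The commands above name the remote host jQuery at clone time.

git remote show followed by a host name prints detailed information about that host.

$ git remote show <remote name>

git remote add adds a remote host.

$ git remote add <remote name> <URL>

git remote rm removes a remote host.

$ git remote rm <remote name>
$ git remote rm origin

git remote rename renames a remote host.

$ git remote rename <old name> <new name>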

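3. git fetch

Once the remote repository has updates (commits, in Git terms), they need to be brought back to the local machine with git fetch.

$ git fetch <remote name>

This fetches all updates from the given remote host.

git fetch is typically used to look at other people's progress, because the fetched code has no effect on your local development code.

By default git fetch retrieves updates for all branches. To fetch only one branch, name it.

$ git fetch <remote name> <branch name>

For example, to fetch the master branch of origin:

$ git fetch origin master

Fetched updates are referred to locally as "remote name/branch name". The master branch of origin, for instance, is read as origin/master.

The -r option of git branch lists remote branches, and -a lists all branches.

$ git branch -r
origin/master

$ git branch -a
* master
  remotes/origin/master

The output above shows that the current local branch is master and the remote branch is origin/master.

After fetching, you can create a new branch on top of the fetched one with git checkout.

$ git checkout -b newBranch origin/master

This creates a new branch based on origin/master.

Alternatively, use git merge or git rebase to merge the remote branch into a local branch.

$ git merge origin/master
# or
$ git rebase origin/master

These commands merge origin/master into the current branch.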

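4. git pull

git pull fetches the updates of a branch on a remote host and merges them into a given local branch. Its full form is a little involved.

$ git pull <remote name> <remote branch>:<local branch>

For example, to fetch the next branch of origin and merge it into the local master branch:

$ git pull origin next:master

If the remote branch is to be merged into the current branch, the part after the colon can be omitted.

$ git pull origin next

This fetches origin/next and merges it into the current branch. In effect it is the same as git fetch followed by git merge.

$ git fetch origin
$ git merge origin/next

In some situations Git automatically sets up a tracking relationship between a local branch and a remote branch. On git clone, for example, every local branch tracks the remote branch of the same name by default, so the local master branch automatically tracks origin/master.

Git also lets you set up tracking manually.

git branch --set-upstream-to=origin/next master

This makes the master branch track origin/next.

If the current branch tracks a remote branch, git pull can omit the remote branch name.

$ git pull origin

This merges the current branch with its remote-tracking branch on origin.

If the current branch has only one tracking branch, even the remote name can be omitted.

$ git pull

This merges the current branch with its only tracking branch.

To merge by rebasing, use the --rebase option.

$ git pull --rebase <remote name> <remote branch>:<local branch>

If a branch is deleted on the remote host, git pull by default does not delete the corresponding local branch when pulling. This prevents git pull from silently deleting local branches because someone else changed the remote.

You can change this behavior: with -p, the stale remote-tracking branches (such as origin/foo) for branches already deleted on the remote are removed locally.

$ git pull -p
# equivalent to
$ git fetch --prune origin
$ git fetch -p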

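5. git push

git push sends the updates of a local branch to a remote host. Its form mirrors git pull.

$ git push <remote name> <local branch>:<remote branch>

Note that the branch order is <source>:<destination>, so git pull uses <remote branch>:<local branch> while git push uses <local branch>:<remote branch>.

If the remote branch name is omitted, the local branch is pushed to the remote branch it tracks (usually of the same name); if that remote branch does not exist, it is created.

$ git push origin master

This pushes the local master branch to the master branch of origin, creating the latter if it does not exist.

If the local branch name is omitted, the named remote branch is deleted, because this amounts to pushing an empty local branch to it.

$ git push origin :master
# equivalent to
$ git push origin --delete master

This deletes the master branch on origin.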
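
The source:destination rules described so far can be sketched as a small parser. This is a toy model for illustration, not how Git itself resolves refspecs; the function name `parse_push_refspec` is hypothetical, and a bare branch name is simply treated as pushing to the branch of the same name.

```python
def parse_push_refspec(refspec):
    """Split '<local>:<remote>' and describe what a push would do.

    Returns (action, local_branch, remote_branch).
    """
    if ":" not in refspec:
        # 'master' is treated here as 'master:master'
        return ("update", refspec, refspec)
    local, remote = refspec.split(":", 1)
    if not remote:
        raise ValueError("missing remote branch in %r" % refspec)
    if not local:
        # an empty source means: delete the remote branch
        return ("delete", None, remote)
    return ("update", local, remote)


for spec in ["master", "dev:staging", ":master"]:
    print(spec, "->", parse_push_refspec(spec))
```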

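If the current branch tracks a remote branch, both the local and the remote branch can be omitted.

$ git push origin

This pushes the current branch to the corresponding branch on origin.

If the current branch has only one tracking branch, the host name can be omitted as well.

$ git push

If the current branch tracks branches on several hosts, use -u to pick a default host so that later git push calls need no arguments.

$ git push -u origin master

This pushes the local master branch to origin and sets origin as the default host, after which git push can be used without arguments.

A git push without arguments pushes only the current branch by default; this is called the simple mode. There is also a matching mode, which pushes every local branch that has a corresponding remote branch. Before Git 2.0 the default was matching; it is now simple. To change the setting, use git config.

$ git config --global push.default matching
# or
$ git config --global push.default simple

Another case is pushing all local branches to the remote host whether or not matching remote branches exist, which requires the --all option.

$ git push --all origin

This pushes all local branches to origin.

If the remote has commits that the local branch lacks, the push is rejected and Git asks you to run git pull locally to merge the differences before pushing again. If you must push anyway, use the --force option.

$ git push --force origin

With --force, the newer version on the remote host is overwritten. Avoid --force unless you are sure this is what you want.

Finally, git push does not push tags unless the --tags option is used.

$ git push origin --tags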

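6. git stash

The git stash stack: when switching branches, the current branch may contain unfinished changes that you do not want to commit, partly because the code is incomplete and partly because it would clutter the log with useless entries. Git may refuse to switch branches while such changes are uncommitted, so it provides a temporary store that can hold the most recent modifications.

The stash can hold several sets of changes, and they are still there after switching branches.

working tree -------- staging area -------- local repository
      \
       \---- git stash

git stash

Saves the current work in progress, recording the state of both the staging area and the working tree. Afterwards the working tree is back at the state of the last commit.

git stash list

Shows the list of saved entries. This makes it clear that git stash can save work several times and lets you choose which entry to restore.

git stash pop [--index] [<stash>]

Without arguments, restores the most recently saved entry and removes it from the stash list.

With a <stash> argument (taken from the output of git stash list), restores that entry, which is likewise removed from the stash afterwards.

git stash [save [--patch] [-k|--[no-]keep-index] [-q|--quiet] [<message>]]

This is the full form of the first git stash command.

With --patch, Git shows the differences between the working tree and HEAD and lets you choose interactively which changes go into the stash, so unrelated changes can be left out.

With -k or --keep-index, the staging area is not reset after saving. By default both the staging area and the working tree are reset.

git stash apply [--index] [<stash>]

Same as git stash pop, except that the restored entry is not deleted.

git stash drop [<stash>]

Deletes one saved entry, the most recent one by default.

git stash clear

Deletes all saved entries.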
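
The way these subcommands relate to each other can be modelled as a plain stack. The class below is a toy model for illustration only (the name `StashStack` and its methods are made up, and nothing here touches a real repository); it just mirrors the list, pop, apply, drop and clear behavior described above.

```python
class StashStack:
    """Toy model of the stash list: index 0 is always the newest entry."""

    def __init__(self):
        self._entries = []

    def save(self, changes):
        self._entries.insert(0, changes)

    def listing(self):
        return ["stash@{%d}: %s" % (i, e) for i, e in enumerate(self._entries)]

    def apply(self, index=0):
        return self._entries[index]        # restore, keep the entry

    def pop(self, index=0):
        return self._entries.pop(index)    # restore and remove the entry

    def drop(self, index=0):
        del self._entries[index]           # remove without restoring

    def clear(self):
        self._entries.clear()


stash = StashStack()
stash.save("WIP on master: fix typo")
stash.save("WIP on dev: new feature")
print(stash.listing())
print(stash.pop())        # the newest entry comes back first
print(stash.listing())
```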

fisa-vim-config

Posted on 2017-12-29 | Category: vim

install

sudo apt-get install curl vim exuberant-ctags git ack-grep
sudo pip3 install pep8 flake8 pyflakes isort yapf
# curl -O https://raw.githubusercontent.com/fisadev/fisa-vim-config/master/.vimrc

fisa-vim-config

fisa-vim-config/features.rst at master · fisadev/fisa-vim-config

Most important features include:

Plugins managed using Vim-plug! You can easily install or remove
plugins, and they are installed into .vim/plugged/. More info
here

| Command | Description |
| --- | --- |
| PlugInstall [name ...] [#threads] | Install plugins |
| PlugUpdate [name ...] [#threads] | Install or update plugins |
| PlugClean[!] | Remove unused directories (bang version will clean without prompt) |
| PlugUpgrade | Upgrade vim-plug itself |
| PlugStatus | Check the status of plugins |
| PlugDiff | Examine changes from the previous update and the pending changes |
| PlugSnapshot[!] [output path] | Generate script for restoring the current snapshot of the plugins |
Smart autocompletion as you type, sometimes using python
introspection (completion of module names, instance methods and
attributes) and sometimes text-based (used words) (from version 4.0,
it’s even more intelligent!). And with neocomplcache, it even can
autocomplete with typos, thanks to the fuzzy completion settings.
Fuzzy file, code and command finder (like TextMate or Sublime
Text 2):
- ,e = open file (like the original :e) but with recursive and
  fuzzy file name matching. Example: if you type “mopy” it will
  find a file named “models.py” placed on a subdirectory. And
  allows you to open the selected file on a new tab with Ctrl-t!
- ,g = fuzzy symbol finder (classes, methods, variables,
  functions, …) on the current file. Example: if you type “usr”
  it will find the User class definition on the current file. ,G
  does the same but on all opened files.
- ,c = fuzzy command finder (internal vim commands, or custom
  commands). Example: if you type “diff” it will find :GitDiff,
  :diffthis, and many other similar commands.
- ,f = fuzzy text finder on all the opened files. Example: if
  you type “ctm=6” it will find the line containing “current_time
  = 16”.
- ,m = fuzzy finder of most recently used files.
- ,we, ,wg, ,wc, ,wf and ,wm = same as ,e, ,g, ,c,
  ,f and ,m but initiate the search with the word under the
  cursor (also the upper case version of ,G, ,wG). It is useful
  to think about the ,wg as a “fuzzy go to definition” (if the
  definition is in the same file, or ,wG if the definition is on
  any of the opened files).
- ,pe = same as ,e but initiates the search with the path
  under the cursor.
Ropevim for really neat python goodies!:
- Go to definition with ,d, or open the definition on a new
  tab with ,D.
- Find occurrences with ,o.
Classes/module browser that lists classes, functions, methods,
and such of the current file, and navigates to them when ENTER is
pressed. Toggle it with F4.
Pending tasks browser pressing F2. This reads the current file
searching for comments that start with “TODO”, “FIXME”, and such,
and shows them on a list that allows navigation similar to the class
browser.
Error checking of code using Syntastic (it will detect unused
variables or imports, syntax errors, and such), for several
languages, highlighting the errors and warnings in the code. You can
open an errors list with \e. In python, the error checking
includes pep8 validation, and pylint.
Grep code recursively and navigate the results:
- ,r uses the ack command (a kind of grep optimized for code
  search), lists the found matches, and allows you to open them
  with ENTER.
- ,wr does the same, but searching the word under the cursor.
Some settings for better tabs and spaces handling.
Better file browser, toggle it with F3, or open it with your
current file selected using ,t.
Results count while searching text.
Search and read python documentation with the :Pydoc command.
Example: :Pydoc collections (also works over the current word with
vim’s default help keybinding: Shift-K).
Comment and uncomment code with \ci.
Easy tab navigation:
- tt = new tab and leaves the cursor waiting to specify the file
  path to open (leave blank to open an empty tab).
- tn or Ctrl-Shift-Right = next tab.
- tp or Ctrl-Shift-Left = previous tab.
- tm = move current tab to a specific position (or to the end if
  no position number is specified).
- tl = show a list of current tabs with their inner windows on a
  side pane. You can navigate them!
- ts = duplicate current tab.
The mappings starting with the t letter work only on command mode,
but the mappings with Ctrl-Shift work on both, command and insert
mode.
Easy window navigation using Alt-arrow keys.
Some vim goodies enabled by default:
- incremental search (moves to the first result while you are
  typing).
- highlighted search results.
- line numbers.
- keep cursor 3 lines away from screen border while scrolling.
- shell-like autocompletion of commands and paths
  (autocomplete the common part and show matching options).
- syntax highlighting on by default.
Python interpreter inside vim, or any other console. They are
opened as a buffer using the command :ConqueTerm. Examples:
:ConqueTerm python, :ConqueTerm bash.
Save current file as sudo using :w!!.
Navigate html/xml tags the same way that you navigate (), {} and
[]: using %.
Beautiful status line always visible, with colors, breadcrumbs
and useful information about file type, encoding and position. When
working with python files, it also displays the current python
function or class where the cursor is.
Automatically removes trailing spaces when saving python files.
Smart autoclosing of (, [, and {
Beautiful color schemes for vim with 256 colors (fisa
colorscheme) and gvim (wombat colorscheme).
Use of 256 colors when possible.
2 spaces indentation for html and javascript (can disable it
removing two lines from the .vimrc).
Thousands of code snippets for many languages with SnipMate.
Example, in python you can write cl and press tab (while in
insert mode), and it will insert the boilerplate code of a common
python class (then use tab to navigate the snippet fields).

Zen coding for html: generate lots of html code writing simple
and short expressions. Example:

1.  write `#books>ul>li.book*5>a`
2.  press `Ctrl-y ,`
3.  it will generate:

        <div id="books">
            <ul>
                <li class="book">
                    <a href=""></a>
                </li>
                <li class="book">
                    <a href=""></a>
                </li>
                <li class="book">
                    <a href=""></a>
                </li>
                <li class="book">
                    <a href=""></a>
                </li>
                <li class="book">
                    <a href=""></a>
                </li>
            </ul>
        </div>

Learn more on the plugin
site.

Git and other vcs integration, with commands such as:
:GitStatus, :GitDiff, :GitBlame, :GitLog, :GitCommit, or
simply :Git with your own command. Key mappings and syntax
highlighting for git displays. Displays icons on the side of each
line based on the result of a diff of the current file (example: if
you added a line and still didn’t commit the file, that line will
have a green + on its side). And finally, when on a changed file
you can jump through changes using \sn and \sp.
Better python indentation.
Really neat surround actions using the surround.vim plugin.
Learn how to use it here.
Indentation defined text objects for the editing language, named
i. For example, you can change an entire indented code block with
cii, or the indented block and its header line with cai (also
yank, delete, …).
Indentation based movements, move to the header of your current
python block with [-, to the end of the block with ]-, and more
(short reference
here).
Python class and method/function text objects for the editing
language, named C and M. For example, you can change an entire
function content with ciM, or delete a class including its header
with daC.
Run the current python file and display the output on a split
with \r.
Insert and remove ipdb breakpoints with \b.
Copy history navigation using the YankRing plugin, which allows
you to cycle the vim clipboard with Ctrl-p and Ctrl-n, and many
other features (described
here).
Automatically sort python imports using :Isort.
Persistent undos: modify file, exit vim, reopen file, and you can
undo changes done on the previous session.
Better paths for temporary swap files, backups, and persistent
undos (all of them stored under ~/.vim/dirs).
Drag visual blocks (blocks selected on Ctrl-v and Shift-v
visual modes) with Shift-Alt-arrows, or even duplicate them
with D.
Simple window chooser: press - and you will see big green
letters for each window. Just press the letter of the window you
want to activate.
Paint css color values with the actual color.
Format Python code using yapf (:YapfFullFormat formats the
whole file, and has other commands as well, explained
here. Works only if
you have a vim compiled with python 2, not python 3).
Custom configs by folder: add a .vim.custom file in the
project’s root folder with whatever configs you want to customize
for that project. For example, if you have a project tree like this
example and you want to exclude folder_x from FuzzyFinder, put
let g:ctrlp_custom_ignore["dir"] = g:ctrlp_custom_ignore["dir"] . '|\v[\/]folder_x$'
in the .vim.custom file.
```
project
├── folder_1
├── folder_2
├── folder_x
└── .vim.custom
```

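Uniform sampling from a sequence of unknown length using only k storage slots

Posted on 2017-12-26 | Category: algorithm, other

# coding:utf-8
import random


'''
How to draw a uniform sample of k items when the total number of items n is unknown
(with only k storage slots available).
If the items are all distinct, each one should end up selected with probability k/n.

Assumptions:
1. k<=n
2. The i-th item is accepted with probability p(i) when it arrives
3. If k items are already held, one of them is evicted at random to make room for the accepted item

Sampling rule:
1. i<=k, approve
2. i>k, approve with the probability of  k/i

Proof:
For any of the first k items, the probability of still being held at the end is:
k/n = [1-p(k+1)/k] * [1-p(k+2)/k] ... * [1-p(n)/k]   ... (1)
Note: an item survives step i with probability p(i)*((k-1)/k)+1-p(i), which simplifies to 1-p(i)/k

For any item with i>k, the probability of being held at the end is:
k/n = p(i) * [1-p(i+1)/k] * ... * [1-p(n)/k]         ... (2)

Solving (1) and (2) together gives
p(k+1)=k/(k+1)
...
p(n)=k/n

So the i-th item is accepted with probability:
    p(i) = k/i

With p(i) chosen this way, every item ends up in the sample with probability k/n, however many items arrive.

'''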


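class Sampling(object):
    """Reservoir sampler that keeps at most k items from a stream."""

    def __init__(self, k):
        self.samples = []   # the reservoir
        self.k = k          # reservoir capacity
        self.tick = 0       # number of items seen so far

    def sampling(self):
        return self.samples

    def read(self, sample):
        # feed one item from the stream into the sampler
        self.processing(sample)
        assert len(self.samples) <= self.k, "Overflow"

    def approve(self, sample):
        # evict a uniformly chosen slot and store the new item there
        idx = random.randint(0, self.k - 1)
        self.samples[idx] = sample

    def processing(self, sample):
        self.tick += 1
        if len(self.samples) < self.k:
            # the first k items are always kept
            self.samples.append(sample)
        elif random.randint(1, self.tick) <= self.k:
            # item number tick is accepted with probability k/tick
            self.approve(sample)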


class Stat(object):

    def __init__(self, T, N, k):
        """ Repeat T trials,
        each trial will read N characters and return k samples
        """
        self.T = T
        self.N = N
        assert 1 <= k <= 25
        assert k <= N
        self.k = k
        self.source = {}

    def stream(self):
        sampler = Sampling(self.k)
        for i in range(self.N):
            delta = random.randint(0, 10)
            c = chr(ord('A') + delta)
            if c not in self.source:
                self.source[c] = 1
            else:
                self.source[c] += 1
            sampler.read(c)
        return sampler

    def count(self):
        cnt = {}
        for t in range(self.T):
            sampler = self.stream()
            samples = sampler.sampling()
            for s in samples:
                if s in cnt:
                    cnt[s] += 1
                else:
                    cnt[s] = 1
        return cnt

    def statistic(self):

        cnt = self.count()
        total = sum(cnt.values())
        print("total: %d" % total)
        for k, v in sorted(cnt.items()):
            print("%c %d %0.3f" % (k, v, float(v) / total))

        total = sum(self.source.values())
        print("source total: %d" % total)
        for k, v in sorted(self.source.items()):
            print("%c %d %0.3f" % (k, v, float(v) / total))

if __name__ == '__main__':
    stat_char = Stat(100000, 10, 5)
    stat_char.statistic()
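
The telescoping step behind equations (1) and (2) in the docstring can be written out explicitly. This is a worked check under the docstring's own assumptions (acceptance probability p(j) = k/j for j > k, uniform eviction), not an additional result:

```latex
\begin{aligned}
1 - \frac{p(j)}{k} &= 1 - \frac{1}{j} = \frac{j-1}{j}, \\
\prod_{j=i+1}^{n} \frac{j-1}{j}
  &= \frac{i}{i+1}\cdot\frac{i+1}{i+2}\cdots\frac{n-1}{n} = \frac{i}{n}, \\
\Pr[\text{item } i \text{ is kept},\ i > k] &= \frac{k}{i}\cdot\frac{i}{n} = \frac{k}{n}, \\
\Pr[\text{item } i \text{ is kept},\ i \le k] &= \prod_{j=k+1}^{n} \frac{j-1}{j} = \frac{k}{n}.
\end{aligned}
```

Both cases give k/n, so the choice p(i) = k/i satisfies (1) and (2) for every n.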

Converting rst to md

Posted on 2017-12-14 | Category: linux

sudo apt install pandoc

for f in *.rst
do
  filename="${f%.*}"
  echo "Converting $f to $filename.md"
  pandoc "$f" -f rst -t markdown -o "$filename.md"
done

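vim registers

Posted on 2017-12-14 | Category: vim

Most text editors provide a clipboard for copy and paste, and Vim is no exception. The difference is that Vim offers 48 registers in 10 categories. The common y operation yanks into the default unnamed register, and you can also name the register to yank into.

In general, use ```"{register}y``` to yank into ```{register}``` and ```"{register}p``` to paste the contents of ```{register}```. For example, "ayy yanks the current line into register a, and "ap pastes the contents of register a.


Besides the 26 named registers a-z, Vim provides a number of special registers. Used well, they can be a big time saver. For example:

* ```"+p``` pastes the contents of the system clipboard,
* ```":p``` pastes the last Vim command (such as the regular expression you just painstakingly typed),
* ```"/p``` pastes the last search pattern (as you might guess, the one from the /foo search command in normal mode).

In Vim, ```:reg``` shows the current value of every register.

Register categories

Vim provides 10 kinds of registers; see :help registers inside Vim.

1. Unnamed register ""
2. Numbered registers "0 to "9
3. Small delete register "-
4. 26 named registers "a to "z
5. 3 read-only registers ":, "., "%
6. Alternate file register "#
7. Expression register "=
8. Selection and drop registers "*, "+, "~
9. Black hole register "_
10. Search pattern register "/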
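
As a quick sanity check of the "48 registers in 10 categories" figure, the counts from the list above can be tallied. The dictionary below is just a transcription of that list; the variable name is arbitrary.

```python
# Register counts per category, taken from the list above.
REGISTER_CATEGORIES = {
    "unnamed": 1,              # ""
    "numbered": 10,            # "0 to "9
    "small delete": 1,         # "-
    "named": 26,               # "a to "z
    "read-only": 3,            # ":, "., "%
    "alternate file": 1,       # "#
    "expression": 1,           # "=
    "selection and drop": 3,   # "*, "+, "~
    "black hole": 1,           # "_
    "search pattern": 1,       # "/
}

print(len(REGISTER_CATEGORIES), "categories,",
      sum(REGISTER_CATEGORIES.values()), "registers")
```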

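# 1. Unnamed register

Commands that delete text, such as d, c, s and x, put the deleted text into the unnamed register "". You can think of "" as a pointer to whichever register was written last.

As mentioned in the article "How to build an IDE with Vim?", on a Mac the following setting makes Vim share the system clipboard, and this is why it works: all deletes and yanks go to the unnamed register by default.

set clipboard=unnamed

A y command without a register stores the text in "0, and "" then holds the same value. This means p and ""p always give the same result.

# 2. Numbered registers

There are 10 numbered registers, "0 to "9. "0 holds the most recently yanked text, while "1 to "9 hold deleted text. The delete operators include s, c, d and x. The most recently deleted text goes into "1, the previous deletion moves to "2, and so on, so Vim keeps your last 9 deletions.

* Only deletions of whole lines, and deletions made with paragraph-level motions (%, (, ), /, `, ?, n, N, {, }), are put into "1.
* When the user names a register for a yank (for example "ayy), "0 is not written; deletions, however, are always written to "1.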
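
The shifting behavior of "1 to "9 can be modelled with a bounded deque. This is a simplified sketch, not Vim's implementation: the class name `NumberedRegisters` is made up, and the whole-line and named-register conditions from the two bullets above are ignored.

```python
from collections import deque


class NumberedRegisters:
    """Toy model: "0 holds the last yank, "1.."9 hold the last nine deletes."""

    def __init__(self):
        self.yank = None                 # register "0
        self.deletes = deque(maxlen=9)   # index 0 is "1, index 8 is "9

    def on_yank(self, text):
        self.yank = text

    def on_delete(self, text):
        # the new text enters "1; older entries shift down and "9 falls off
        self.deletes.appendleft(text)

    def get(self, n):
        if n == 0:
            return self.yank
        return self.deletes[n - 1] if n <= len(self.deletes) else None


regs = NumberedRegisters()
regs.on_yank("copied line")
for i in range(1, 11):
    regs.on_delete("deleted line %d" % i)
print(regs.get(0))   # the yank is untouched by the deletes
print(regs.get(1))   # most recent delete
print(regs.get(9))   # oldest delete still kept
```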

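>
    "0 is handy when you copy some text and use it to replace other text. At that point the default register "" holds the replaced text, so to keep replacing with the copied text you need "0p.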

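# 3. Small delete register

Deletions of less than one line go into the small delete register ("-). The delete operators that feed it again include s, c, d and x. For example:

dw # delete a word
d9l # delete 9 characters
cb # change back one word

As with "0, when the user names a register for the delete, "- is not written.

# 4. Named registers

There are 26 named registers, "a to "z, and they are used only when named explicitly. When recording a macro, all keystrokes are in fact stored in a register as a string. For example, record a macro into register "a that changes the current line (cc) to the string foo:

qaccfoo

Then run :reg to inspect the registers; register a holds ccfoo.

    Tip: using the lowercase letter overwrites the register's contents, while using the uppercase letter appends to them.

# 5. Read-only registers

There are 3 read-only registers. Their values are supplied by Vim and cannot be changed:

    ".: the text inserted last in insert mode. Remember that the . command repeats the last change, while ". stores the last insertion.
    "%: the name of the current file, neither the full path nor the bare file name but the path from Vim's current working directory to the file. In Harttle's Vim at the time of writing, for example, "%p gives _drafts/vim-registers.md.
    ":: the last command typed in command-line mode. Just as @a runs the macro in register "a, @: runs the last command.

# 6. Alternate file register

The alternate file register "# holds the alternate file of the current Vim window. The alternate file is the previously edited file in the buffer list, and Ctrl+^ switches between the alternate file and the current file.

    What is the difference between a window and a buffer? See the article "Vim multi-file editing: windows".

# 7. Expression register

The expression register "= is mainly used to evaluate a Vim script expression and insert the result into the text. After typing "= the cursor moves to the command line, where any Vim script expression can be entered. For example, type 3+2, press Enter and then p to get 5.

This is very useful when debugging Vim scripts, for instance to call a function and check its return value.

# 8. Selection and drop registers

The selection and drop registers are "*, "+ and "~. The behavior of these three registers depends on the GUI.

On Mac and Windows, "* and "+ both refer to the system clipboard, so "*yy copies the current line to the clipboard for pasting in other programs, and text copied in other programs is also available through these two registers. On X11 systems (most Linux distributions with a desktop environment) the two differ:

    "* is the X11 PRIMARY selection, i.e. the text selected with the mouse. On the desktop it is pasted with the middle mouse button.
    "+ is the X11 CLIPBOARD selection, i.e. the system clipboard. On the desktop it is pasted with Ctrl+V.

    The Mac setting set clipboard=unnamed mentioned above makes the system clipboard register "* and Vim's default unnamed register "" always hold the same value, so Vim and the system share one clipboard.

When text is dragged into Vim, the dropped text is stored in "~. By default Vim inserts the contents of "~ at the cursor position. You can of course map <DROP> to something else.

# 9. Black hole register

The black hole register is "_. Any text deleted or yanked into it simply disappears. It exists so that text can be deleted without affecting any register, and it is commonly used in Vim scripts.

# 10. Search register

The search register "/ stores the last search pattern. How does one search in Vim? In normal mode press / to enter search mode, type the pattern and press Enter.

This register is writable: :let @/ = "harttle" writes "harttle" into it, and the next search started without typing a pattern will search for "harttle".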
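
Pasting in command-line mode

It is worth mentioning that the contents of any register can be pasted into the command line.

For register "a, for example, paste with "ap in normal mode, or with <Ctrl-R>a in command-line mode. This carries a risk, because the register contents may have been copied from a web page.

If the string in the register contains an <Esc> or <CR> character, Vim returns to normal mode and goes on to execute the rest of the register as commands. To guard against clipboard hijacking, add the following Vim configuration: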

inoremap <C-r>+ <C-g>u<C-\><C-o>"+gP

For an explanation of this mapping, see http://vim.wikia.com/wiki/Pasting_registers