Notes on installing and using Python.

Contents

Environment Setup

WSL X Debian

  1. install from source
sudo apt update && sudo apt upgrade -y

sudo apt install wget build-essential libreadline-dev libncursesw5-dev libssl-dev libsqlite3-dev tk-dev libgdbm-dev libc6-dev libbz2-dev libffi-dev zlib1g-dev -y

PYTHON_VER=3.10.5 && PYTHON_SVER=3.10
wget https://www.python.org/ftp/python/$PYTHON_VER/Python-$PYTHON_VER.tgz

tar -xzvf Python-$PYTHON_VER.tgz && cd Python-$PYTHON_VER


./configure --enable-optimizations
sudo make altinstall  # recommended: installs python$PYTHON_SVER without overwriting the system python3
cd .. && sudo rm -rf Python-$PYTHON_VER && rm Python-$PYTHON_VER.tgz

whereis python$PYTHON_SVER
sudo ln -s /usr/local/bin/python$PYTHON_SVER /usr/bin/python$PYTHON_SVER
sudo ln -s /usr/local/bin/python$PYTHON_SVER-config /usr/bin/python$PYTHON_SVER-config
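
After a source build it can be worth confirming that the optional standard-library modules backed by the -dev packages above were actually compiled. The following is a minimal sketch meant to be run with the freshly built interpreter; the helper name `missing_modules` and the module list are illustrative assumptions, not part of the original notes.

```python
import importlib

# Standard-library modules that are only usable when the matching -dev
# package was present at ./configure time (illustrative, not exhaustive).
OPTIONAL_MODULES = ["ssl", "sqlite3", "bz2", "ctypes", "zlib",
                    "readline", "curses", "tkinter", "dbm.gnu"]


def missing_modules(names):
    """Return the names from `names` that this interpreter fails to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("missing:", missing_modules(OPTIONAL_MODULES) or "none")
```

If a module shows up as missing, installing the corresponding -dev package and rebuilding is usually the fix.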

PYENV | Python version management

1. Installation
# Ubuntu 18.04 LTS (after installing, add the environment variables as prompted)
curl https://pyenv.run | bash
2. Common commands
# List installed versions (the active one is marked with *)
pyenv versions
# List the Python versions available for installation
pyenv install -l
# Install the build dependencies for your system first (https://github.com/pyenv/pyenv/wiki/common-build-problems); otherwise the Python build may be incomplete
sudo apt install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python-openssl git
# Install
pyenv install x.x.x
# Uninstall
pyenv uninstall x.x.x
# Switch versions
## Priority: shell > local > global, looked up in that order
## Global (writes the version to ~/.pyenv/version)
pyenv global x.x.x
## Local (writes the version to .python-version in the current directory)
pyenv local x.x.x
## Shell (sets PYENV_VERSION in the current shell)
pyenv shell x.x.x
## Unset
pyenv shell --unset
# Rebuild the shims
pyenv rehash
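3. Issues
  1. Speeding up downloads with a mirror

    1. Download the tar.xz package from a mirror or from the official site
    2. Save it to the install cache directory (~/.pyenv/cache)
    3. Run pyenv install x.x.x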
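
The shell > local > global lookup described under the switching commands can be modelled in a few lines. This is a simplified sketch of the idea, not pyenv's actual implementation; the function name `resolve_pyenv_version` and its parameters are assumptions made for illustration.

```python
from pathlib import Path


def resolve_pyenv_version(cwd, environ, pyenv_root):
    """Return (version, origin) following the shell > local > global order."""
    # 1. shell: the PYENV_VERSION variable set by `pyenv shell`
    shell_version = environ.get("PYENV_VERSION")
    if shell_version:
        return shell_version, "shell"
    # 2. local: the nearest .python-version, searching cwd and then its parents
    cwd = Path(cwd).resolve()
    for directory in (cwd, *cwd.parents):
        candidate = directory / ".python-version"
        if candidate.is_file():
            return candidate.read_text().split()[0], "local"
    # 3. global: the file written by `pyenv global`
    global_file = Path(pyenv_root) / "version"
    if global_file.is_file():
        return global_file.read_text().split()[0], "global"
    # Nothing configured: fall back to the system interpreter
    return "system", "default"
```

A call such as `resolve_pyenv_version(Path.cwd(), os.environ, Path.home() / ".pyenv")` (with `import os`) would report which setting wins in the current directory.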

Pip

Faster downloads
  1. Change the package index
# Create pip/pip.ini under `%APPDATA%`
[global]
timeout = 6000
index-url = https://pypi.tuna.tsinghua.edu.cn/simple/
trusted-host = pypi.tuna.tsinghua.edu.cn
  1. Specify an index for a single command
pip install xxx -i https://pypi.tuna.tsinghua.edu.cn/simple/
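
The configuration file above can also be generated rather than typed by hand. Below is a small sketch using the standard `configparser` module; the helper name `build_pip_config` is hypothetical, and deriving `trusted-host` from the index URL is an assumption that matches the example values.

```python
import configparser
import io
from urllib.parse import urlparse


def build_pip_config(index_url, timeout=6000):
    """Render a pip.ini body that points pip at `index_url`."""
    config = configparser.ConfigParser()
    config["global"] = {
        "timeout": str(timeout),
        "index-url": index_url,
        # trusted-host is the bare host name of the index
        "trusted-host": urlparse(index_url).hostname,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


print(build_pip_config("https://pypi.tuna.tsinghua.edu.cn/simple/"))
```

Writing the returned text to `%APPDATA%\pip\pip.ini` gives the same result as the manual step.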

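PIPENV | Python dependency and package management

  • No need to use pip and virtualenv separately; pipenv combines them
  • pipenv uses the automatically managed Pipfile and Pipfile.lock instead of a hand-maintained requirements.txt
  • Hashes are used throughout for security
  • Encourages using up-to-date dependencies
  • Provides a dependency graph (pipenv graph)
  • Streamlines development by loading .env files
1. Installation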
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --user pipenv
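# If you get "pipenv: command not found", add the user bin directory to PATH in ~/.profile
vim ~/.profile
# Add this line
export PATH=$PATH:~/.local/bin
# Reload it in the current bash session; .profile is only read once at login, so a restart is needed for it to take full effect
#  Note the difference between .profile and .bashrc
source ~/.profile
2. Configuration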
cd project_dir
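# Initialize pipenv
  # 1.
pipenv --python 3.7 # or omit --python to use the default
pipenv install
  # 2. Or point directly at a Python installation (recommended)
pipenv --python C:/Utils/python/Python27/python.exe install [--skip-lock]
# Change the package index
  # Either edit the `url` under `source` in the `Pipfile`
  # or pass a mirror on the command line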
pipenv install --pypi-mirror https://pypi.tuna.tsinghua.edu.cn/simple
pipenv update --pypi-mirror https://pypi.tuna.tsinghua.edu.cn/simple
pipenv sync --pypi-mirror https://pypi.tuna.tsinghua.edu.cn/simple
pipenv lock --pypi-mirror https://pypi.tuna.tsinghua.edu.cn/simple
pipenv uninstall --pypi-mirror https://pypi.tuna.tsinghua.edu.cn/simple

https://pypi.org/simple -> https://pypi.tuna.tsinghua.edu.cn/simple
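3. Enter the environment
pipenv shell
4. Install packages
# A version can be pinned
pipenv install xxx[==x.xx]
# Install a prebuilt .whl file
  # 1. Find the URL of the file
  # 2. Install it
pipenv install url
5. Leave the environment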
exit
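
Editing the `url` under `source` in the Pipfile, as mentioned in the configuration step, could be scripted roughly as follows. This is only a sketch: the helper name `use_mirror` and the sample Pipfile text are illustrative assumptions, and it uses plain string replacement rather than a TOML parser.

```python
DEFAULT_INDEX = "https://pypi.org/simple"


def use_mirror(pipfile_text, mirror_url):
    """Return `pipfile_text` with the default index URL replaced by `mirror_url`."""
    lines = []
    for line in pipfile_text.splitlines():
        # Only touch `url = ...` lines so nothing else is rewritten
        if line.strip().startswith("url") and DEFAULT_INDEX in line:
            line = line.replace(DEFAULT_INDEX, mirror_url)
        lines.append(line)
    return "\n".join(lines) + "\n"


# Hypothetical Pipfile fragment, used only for the demonstration
sample = '''[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true
'''
print(use_mirror(sample, "https://pypi.tuna.tsinghua.edu.cn/simple"))
```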

Anaconda

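  1. Add an Anaconda Prompt entry to the right-click context menu

Method 1:

  1. Assume Anaconda is installed in C:\ProgramData\Anaconda3\
  2. Win+R $\to$ regedit
  3. Navigate to HKEY_CLASSES_ROOT\Directory\Background\shell\ and create a new key Anaconda3 whose default value is Anaconda3 Prompt Here. Under Anaconda3, create a string value Icon with the data C:\ProgramData\Anaconda3\Menu\Iconleak-Atrous-Console.ico. PS: to make the menu entry appear only on Shift + right-click rather than on a plain right-click, create a string value Extended under Anaconda3 and leave its data empty
  4. Under Anaconda3, create a new key command and set its default value to cmd.exe /s /k "title Anaconda3" && C:\ProgramData\Anaconda3\Scripts\activate.bat C:\ProgramData\Anaconda3

Method 2: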

Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\Directory\Background\shell\Anaconda3]
@="Anaconda3 Prompt Here"
"Icon"="C:\\ProgramData\\Anaconda3\\Menu\\Iconleak-Atrous-Console.ico"
[HKEY_CLASSES_ROOT\Directory\Background\shell\Anaconda3\command]
@="cmd.exe /s /k \"title Anaconda3\" && C:\\ProgramData\\Anaconda3\\Scripts\\activate.bat C:\\ProgramData\\Anaconda3"
# if install_dir = `C:\Users\tangzhiyong\AppData\Local\Continuum\anaconda3\`
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\Directory\Background\shell\Anaconda3]
@="Anaconda3 Prompt Here"
"Icon"="C:\\Users\\tangzhiyong\\AppData\\Local\\Continuum\\anaconda3\\Menu\\Iconleak-Atrous-Console.ico"
[HKEY_CLASSES_ROOT\Directory\Background\shell\Anaconda3\command]
@="cmd.exe /s /k \"title Anaconda3\" && C:\\Users\\tangzhiyong\\AppData\\Local\\Continuum\\anaconda3\\Scripts\\activate.bat C:\\Users\\tangzhiyong\\AppData\\Local\\Continuum\\anaconda3"
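
Since the two .reg files above differ only in the install directory, the text can be generated for any path. The sketch below mirrors the entries shown in Method 2; the function name `build_reg` and its `label` parameter are assumptions made for illustration.

```python
def build_reg(install_dir, label="Anaconda3"):
    """Render the .reg text for an Anaconda prompt context-menu entry."""
    root = install_dir.rstrip("\\")
    key = "HKEY_CLASSES_ROOT\\Directory\\Background\\shell\\" + label
    icon = root + "\\Menu\\Iconleak-Atrous-Console.ico"
    command = f'cmd.exe /s /k "title {label}" && {root}\\Scripts\\activate.bat {root}'

    def escape(value):
        # .reg string values escape backslashes and double quotes
        return value.replace("\\", "\\\\").replace('"', '\\"')

    return "\n".join([
        "Windows Registry Editor Version 5.00",
        f"[{key}]",
        f'@="{label} Prompt Here"',
        f'"Icon"="{escape(icon)}"',
        f"[{key}\\command]",
        f'@="{escape(command)}"',
        "",
    ])


print(build_reg("C:\\ProgramData\\Anaconda3\\"))
```

Saving the output to a `.reg` file and double-clicking it imports the same keys as the manual steps in Method 1.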

Pycharm

Spyder

VSCode

Language Features

Python2 vs Python3

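  1. Package imports
    Python 2 defaults to relative imports, while Python 3 defaults to absolute imports. As a rule of thumb, use absolute imports in modules that are run directly and relative imports inside packages.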
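
The difference between the two import styles is easiest to see on a tiny package. The sketch below builds a throwaway package in a temporary directory; the package name `demo_pkg` and its modules are hypothetical and exist only for this demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_imports():
    """Build a throwaway package and load it using both import styles."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "helper.py").write_text("VALUE = 42\n")
        # Inside the package: explicit relative import
        (pkg / "core.py").write_text("from .helper import VALUE\n")
        sys.path.insert(0, tmp)
        try:
            # From outside the package: absolute import by full dotted name
            core = importlib.import_module("demo_pkg.core")
            return core.VALUE
        finally:
            # Leave the interpreter state as it was found
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n.split(".")[0] == "demo_pkg"]:
                del sys.modules[name]


print(demo_imports())  # 42
```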

Python3

Python <3.7

Argument passing
Keyword-Only Arguments: *

Marks certain function parameters as accepting keyword arguments only.

# key must be passed as a keyword argument
def compare(a, b, *, key=None):
    ...
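
To show the effect, here is the same signature with a body filled in. The comparison logic is an assumption added for illustration, since the original leaves the body elided.

```python
def compare(a, b, *, key=None):
    """Return -1, 0 or 1, optionally applying `key` to both values first."""
    if key is not None:
        a, b = key(a), key(b)
    return (a > b) - (a < b)


print(compare("apple", "Banana", key=str.lower))  # -1, key passed by keyword

try:
    compare("apple", "Banana", str.lower)  # key passed positionally
except TypeError as exc:
    print("TypeError:", exc)
```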
Type annotations
def add(x:int, y:int) -> int:
    return x + y
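
Annotations are recorded on the function but not enforced at run time, which the short example below illustrates using the same `add` function.

```python
def add(x: int, y: int) -> int:
    return x + y


# The annotations are stored as metadata on the function object
print(add.__annotations__)

# Nothing checks them at call time: str + str still works
print(add("a", "b"))  # ab
```

Static checkers such as mypy read these annotations; the interpreter itself does not.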
Variable scope
with statement
contextlib

Provides utilities for common tasks involving the with statement.

  • Python 3.4

    • contextlib.redirect_stdout

      Context manager that temporarily redirects sys.stdout to a file or file-like object

  • Python 3.5

    • contextlib.redirect_stderr

      Context manager that temporarily redirects sys.stderr to a file or file-like object

    import io
    import sys
    from contextlib import redirect_stdout

    # redirect to string
    with redirect_stdout(io.StringIO()) as f:
        help(pow)
    s = f.getvalue()
    # redirect to file
    with open('help.txt', 'w') as f:
        with redirect_stdout(f):
            help(pow)
    # redirect to stderr (pay attention to the side effect)
    with redirect_stdout(sys.stderr):
        help(pow)
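
The examples above only exercise `redirect_stdout`; `redirect_stderr` works the same way. A minimal sketch follows, where `capture_stderr` and `warn` are hypothetical helpers introduced for the demonstration.

```python
import io
import sys
from contextlib import redirect_stderr


def capture_stderr(func, *args):
    """Call func(*args) and return whatever it wrote to sys.stderr."""
    buffer = io.StringIO()
    with redirect_stderr(buffer):
        func(*args)
    return buffer.getvalue()


def warn(message):
    # Stand-in for any code that reports problems on stderr
    print("warning:", message, file=sys.stderr)


captured = capture_stderr(warn, "disk almost full")
print(repr(captured))
```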

Python 3.7

Python 3.8

Features
Assignment expressions: :=

Assigns a value to a variable inside an expression.
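
Two typical uses of `:=` are shown below as an illustration; the function name `first_number` and the sample data are made up for this example.

```python
import re


def first_number(text):
    """Return the first integer found in `text`, or None."""
    # Bind the match object and test it in the same expression
    if (match := re.search(r"\d+", text)) is not None:
        return int(match.group())
    return None


# In a comprehension, := avoids computing len(word) twice
lengths = [n for word in ["a", "abcd", "abcdefg"] if (n := len(word)) > 3]

print(first_number("Python 3.8 released"))  # 3
print(lengths)                              # [4, 7]
```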

Positional-only parameters: /

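Marks certain function parameters as accepting positional arguments only.

# a, b are positional-only; c, d may be passed by position or by keyword; e, f are keyword-only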
def f(a, b, /, c, d, *, e, f):
    print(a, b, c, d, e, f)
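
The following sketch tries a few call shapes against that signature. The helper `is_valid_call` is hypothetical and simply reports whether the call raises `TypeError`.

```python
def f(a, b, /, c, d, *, e, f):
    print(a, b, c, d, e, f)


def is_valid_call(*args, **kwargs):
    """Return True when f accepts the call, False when it raises TypeError."""
    try:
        f(*args, **kwargs)
    except TypeError:
        return False
    return True


print(is_valid_call(10, 20, 30, d=40, e=50, f=60))      # True: c by position, d by keyword
print(is_valid_call(10, b=20, c=30, d=40, e=50, f=60))  # False: b is positional-only
print(is_valid_call(10, 20, 30, 40, 50, f=60))          # False: e is keyword-only
```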
f-strings support: =

Adds self-documenting expressions, which are handy for debugging.

from math import cos, radians
theta = 30
print(f'{theta=}  {cos(radians(theta))=:.3f}')
# >> theta=30  cos(radians(theta))=0.866

Python 3.9

Features
Dictionary Merge & Update Operators: | and |=

Merge and update operators for dict, complementing dict.update and {**d1, **d2}.
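
A short illustration of both operators next to their older equivalents; the dictionary contents are made-up sample data.

```python
defaults = {"timeout": 30, "retries": 3}
overrides = {"timeout": 60}

# | builds a new dict; on duplicate keys the right-hand operand wins
merged = defaults | overrides
unpacked = {**defaults, **overrides}   # pre-3.9 spelling, same result

# |= updates in place, like settings.update(overrides)
settings = dict(defaults)
settings |= overrides

print(merged)    # {'timeout': 60, 'retries': 3}
print(defaults)  # {'timeout': 30, 'retries': 3} (unchanged)
```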

New String Methods to Remove Prefixes and Suffixes: str.removeprefix(prefix) and str.removesuffix(suffix)

New string methods that remove a prefix or suffix if it is present.
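
The example below shows both methods and contrasts them with `str.lstrip`, which strips a set of characters rather than a prefix; the file names are arbitrary sample values.

```python
filename = "test_parser.py"

print(filename.removeprefix("test_"))   # parser.py
print(filename.removesuffix(".py"))     # test_parser
print(filename.removeprefix("spam_"))   # test_parser.py (prefix absent, unchanged)

# lstrip treats its argument as a character set, so it removes too much here
print("test_tests.py".lstrip("test_"))        # .py
print("test_tests.py".removeprefix("test_"))  # tests.py
```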

Python 3.10

Features
Parenthesized context managers

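Parenthesized context managers allow several context managers to be written across multiple lines, which makes the source easier to format.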


with (CtxManager() as example):
    ...

with (
    CtxManager1(),
    CtxManager2()
):
    ...

with (CtxManager1() as example,
      CtxManager2()):
    ...

with (CtxManager1(),
      CtxManager2() as example):
    ...

with (
    CtxManager1() as example1,
    CtxManager2() as example2
):
    ...
Structural Pattern Matching: match/case

Structural pattern matching pairs patterns with the actions to run on a match. It resembles the switch/case statement in C but is more powerful.

match subject:
    case <pattern_1>:
        <action_1>
    case <pattern_2>:
        <action_2>
    case <pattern_3>:
        <action_3>
    case _:
        <action_wildcard>
New Type Union Operator: |

A new operator for writing type unions.

# before (requires: from typing import Union)
def square(number: Union[int, float]) -> Union[int, float]:
    return number ** 2
# now
def square(number: int | float) -> int | float:
    return number ** 2
# also valid
isinstance(1, int | str)

Python 3.11

Compared with Python 3.10, performance improves by $10\% \sim 60\%$, with a 1.25x speedup on the standard benchmark suite.

Features

Mixed-Language Programming

Python <===> C/C++

pybind11

pybind11 is a lightweight header-only library that exposes C++ types in Python and vice versa.

Python ===> C/C++
Embedding Python in Another Application
C/C++ ===> Python
Extending Python with C or C++

In Practice

Web Crawling

1. Environment

CentOS 6

  1. Phantomjs
# No longer supported by Selenium; abandoned
wget https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-2.1.1-linux-x86_64.tar.bz2
tar jxvf phantomjs-2.1.1-linux-x86_64.tar.bz2
mv phantomjs-2.1.1-linux-x86_64 /usr/local/src/phantomjs
ln -sf /usr/local/src/phantomjs/bin/phantomjs /usr/local/bin/phantomjs
  1. Selenium
pip install selenium
  1. XVFB
yum install Xvfb
yum install xorg-x11-fonts*
Xvfb :2 -ac -screen 0 1024x768x16 &
  1. PyVirtualDisplay
pip install pyvirtualdisplay
  1. Install the geckodriver driver for Firefox
# Download from https://github.com/mozilla/geckodriver/releases
wget https://github.com/mozilla/geckodriver/releases/download/v0.24.0/geckodriver-v0.24.0-linux64.tar.gz
tar zxvf geckodriver-v0.24.0-linux64.tar.gz
mv geckodriver /usr/local/bin
  1. dbus
# Needed according to the errors reported in geckodriver.log
yum install dbus
dbus-uuidgen > /var/lib/dbus/machine-id
yum install -y dbus-x11
  1. CentOS 6 & Firefox test
from selenium import webdriver
from pyvirtualdisplay import Display

# Start a virtual X display so Firefox has a screen to render to
display = Display(visible=0, size=(800, 600))
display.start()
options = webdriver.FirefoxOptions()
# set_headless() and the firefox_options= keyword were removed in Selenium 4
options.add_argument('-headless')
# geckodriver is found on PATH (/usr/local/bin/geckodriver)
browser = webdriver.Firefox(options=options)
browser.get(r'https://www.baidu.com')
print(browser.title)

browser.quit()
display.stop()

Debian 9

  1. How to Setup Selenium with ChromeDriver on Debian 10/9/8
# Dependencies
apt update
apt install -y unzip xvfb libxi6 libgconf-2-4
apt install -y default-jdk
# Google Chrome
wget -q -O - https://dl.google.com/linux/linux_signing_key.pub | apt-key add -
echo "deb http://dl.google.com/linux/chrome/deb/ stable main" | tee /etc/apt/sources.list.d/google-chrome.list
# or
curl -sS -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
echo "deb [arch=amd64]  http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list
apt -y update
apt -y install google-chrome-stable
    # Required when running as root
    vim /opt/google/chrome/google-chrome
    # change `exec -a "$0" "$HERE/chrome" "$@"` to `exec -a "$0" "$HERE/chrome" "$@" --user-data-dir --no-sandbox`
# ChromeDriver (the version must match the installed Chrome)
wget https://chromedriver.storage.googleapis.com/2.41/chromedriver_linux64.zip
unzip chromedriver_linux64.zip
mv chromedriver /usr/bin/chromedriver
chown root:root /usr/bin/chromedriver
chmod +x /usr/bin/chromedriver
# firefox
apt update && apt install snapd
snap install firefox
ln -s /snap/bin/firefox /usr/bin/firefox
# install xvfb
apt install -y xvfb
# install pip
wget https://bootstrap.pypa.io/get-pip.py
# for pip3 
python3 get-pip.py
# for pip2
python get-pip.py
# install xdpyinfo
apt-get install x11-utils
# install chrome
  1. Create a non-root user
adduser yirami # passwd yirami.xyz
# deluser -r yirami
su
apt install sudo
adduser yirami sudo
chmod +w /etc/sudoers
vim /etc/sudoers  # below `%sudo ALL = (ALL:ALL) ALL`, add `yirami ALL = (ALL:ALL) ALL`
  1. Chrome test
from selenium import webdriver
from pyvirtualdisplay import Display
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
# binary_location must point at the Chrome browser, not at chromedriver
chrome_options.binary_location = r'/usr/bin/google-chrome'
# Chrome switch for the minimum log level (0 = INFO, the most verbose)
chrome_options.add_argument('--log-level=0')
chrome_options.add_argument('--headless')
chrome_options.add_argument("start-maximized")
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument("disable-infobars")
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-extensions')
chrome_options.add_argument('--disable-dev-shm-usage')

display = Display(visible=0, size=(800,600))
display.start()
# Selenium 3 style call; chromedriver is found on PATH (/usr/bin/chromedriver)
browser = webdriver.Chrome(options=chrome_options, service_args=['--verbose'], service_log_path="chromedriver.log")
browser.get(r'https://www.baidu.com')
print(browser.title)

browser.quit()
display.stop()

Pitfall Heatmap

References