Python Note 100 - File#

:date: 2017-02-13
:modified: 2024-03-04
:slug: python-note-100-file
:tags: python, note, file
:category: Development
:author: Dormouse Young
:summary: Python note series 100 - file

[1]:

import os
import stat
from collections import Counter
from datetime import datetime
from pathlib import Path

Creating files#

[2]:

my_file = Path('/tmp/first/firstone/tmp.txt')

# my_file.touch()
# touch() creates an empty file; the parent directory must already exist, otherwise the call fails
# ---------------------------------------------------------------------------
# FileNotFoundError                         Traceback (most recent call last)
# Cell In[9], line 2
#       1 my_file = Path('/tmp/first/firstone/tmp.txt')
# ----> 2 my_file.touch()
# ....

[3]:

my_path = Path('/tmp/first/firstone/')
my_path.mkdir(exist_ok=True, parents=True)
my_file.touch()
my_file.exists()

[3]:

True
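
The two steps above (create the missing parent directories, then touch the file) can be wrapped in a small helper. This is only a minimal sketch: the name `ensure_file` is hypothetical and not part of pathlib, and a temporary directory stands in for `/tmp/first` so the example leaves nothing behind.

```python
import tempfile
from pathlib import Path


def ensure_file(path: Path) -> Path:
    """Create an empty file at path, creating missing parent directories first."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.touch(exist_ok=True)  # an existing file keeps its content
    return path


with tempfile.TemporaryDirectory() as tmp:
    created = ensure_file(Path(tmp) / "first" / "firstone" / "tmp.txt")
    created_exists = created.exists()
    created_size = created.stat().st_size
```

Calling the helper a second time on the same path is harmless, because both `mkdir` and `touch` are told to accept an existing target.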

Breaking down a file name#

[4]:

my_file.name                          # the file name

[4]:

'tmp.txt'

[5]:

my_file.stem                          # the file name without its suffix

[5]:

'tmp'

[6]:

Path('tmp_file.tar.gz').stem  # only the last suffix is removed

[6]:

'tmp_file.tar'

[7]:

my_file.suffix                        # the file suffix

[7]:

'.txt'

[8]:

my_file.suffixes                 # all suffixes, as a list

[8]:

['.txt']

[9]:

Path('tmp_file.tar.gz').suffix  # the last suffix only

[9]:

'.gz'

[10]:

Path('tmp_file.tar.gz').suffixes # all suffixes, as a list

[10]:

['.tar', '.gz']
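
As the `tmp_file.tar.gz` outputs show, `stem` strips only the last suffix. When the name without any suffix is wanted, one possible approach is to cut the combined length of `suffixes` off the end of `name`. The helper name `full_stem` is an assumption made for illustration. Note that pathlib treats every trailing dot-separated part as a suffix, so a name such as `archive.v1.2.tar.gz` would be reduced to `archive`.

```python
from pathlib import Path


def full_stem(path: Path) -> str:
    """Return the file name with every suffix removed."""
    suffix_length = sum(len(s) for s in path.suffixes)
    return path.name[:-suffix_length] if suffix_length else path.name


two_suffixes = full_stem(Path("tmp_file.tar.gz"))
no_suffix_name = full_stem(Path("/tmp/first/README"))
```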

[11]:

my_file.parent                        # equivalent to os.path.dirname

[11]:

PosixPath('/tmp/first/firstone')

[12]:

# my_file.parents is an immutable sequence of all ancestor directories
list(my_file.parents)

[12]:

[PosixPath('/tmp/first/firstone'),
 PosixPath('/tmp/first'),
 PosixPath('/tmp'),
 PosixPath('/')]

[13]:

my_file.parts                     # split the path on the separator into a tuple

[13]:

('/', 'tmp', 'first', 'firstone', 'tmp.txt')

>>> desk = Path('C:/Users/Administrator/Desktop/')
>>> desk.parent
WindowsPath('C:/Users/Administrator')

>>> desk.parent.parent
WindowsPath('C:/Users')

>>> list(desk.parents)
[WindowsPath('C:/Users/Administrator'),
 WindowsPath('C:/Users'),
 WindowsPath('C:/')]

Replacing parts of a file name#

[14]:

# with_name(name) replaces the last component of the path and returns a new path
my_file.with_name('python.txt')

[14]:

PosixPath('/tmp/first/firstone/python.txt')

[15]:

# with_suffix(suffix) replaces the extension and returns a new path; passing the suffix the file already has returns an equal path
my_file.with_suffix('.txt')

[15]:

PosixPath('/tmp/first/firstone/tmp.txt')
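
A short sketch of other common uses of `with_suffix`: changing the extension, removing it with an empty string, and adding one to a name that has none. These are pure path computations and rename nothing on disk. The file names are made up for illustration, and `PurePosixPath` is used so the results look the same on any platform.

```python
from pathlib import PurePosixPath

report = PurePosixPath("/tmp/first/firstone/tmp.txt")

as_markdown = report.with_suffix(".md")   # replace the extension
no_suffix = report.with_suffix("")        # remove the extension
# Only the last suffix is replaced when there are several.
archive = PurePosixPath("backup.tar.gz").with_suffix(".bz2")
# A name without a suffix gets the new suffix appended.
added = PurePosixPath("README").with_suffix(".rst")
```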

File information#

[16]:

my_file.stat()                        # detailed file information

[16]:

os.stat_result(st_mode=33204, st_ino=30540243, st_dev=2050, st_nlink=1, st_uid=1000, st_gid=1000, st_size=0, st_atime=1721027775, st_mtime=1721027775, st_ctime=1721027775)

[17]:

my_file.stat().st_size                # file size in bytes

[17]:

0

[18]:

my_file.stat().st_ctime               # creation time on Windows; on Unix, time of the last metadata change

[18]:

1721027775.697604

[19]:

my_file.stat().st_mtime               # time of the last modification

[19]:

1721027775.697604

[20]:

# The older approach, using os and stat

my_file_str = str(my_file)
oct(stat.S_IMODE(os.lstat(my_file_str).st_mode))

[20]:

'0o664'

[21]:

oct(os.stat(my_file_str)[stat.ST_MODE])

[21]:

'0o100664'

[22]:

oct(os.stat(my_file_str).st_mode & 0o777)

[22]:

'0o664'

Common constants

Constant	Octal	Description
S_IRWXU	00700	mask for file owner permissions
S_IRUSR	00400	owner has read permission
S_IWUSR	00200	owner has write permission
S_IXUSR	00100	owner has execute permission
S_IRWXG	00070	mask for group permissions
S_IRGRP	00040	group has read permission
S_IWGRP	00020	group has write permission
S_IXGRP	00010	group has execute permission
S_IRWXO	00007	mask for permissions for others (not in group)
S_IROTH	00004	others have read permission
S_IWOTH	00002	others have write permission
S_IXOTH	00001	others have execute permission
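
The constants in the table can be combined with `st_mode` using a bitwise AND to test individual permission bits. The sketch below assumes a POSIX system (on Windows `chmod` only controls the read-only flag) and uses a temporary file so that the mode is known in advance.

```python
import stat
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.touch()
    sample.chmod(0o640)  # rw-r-----

    mode = sample.stat().st_mode
    owner_can_write = bool(mode & stat.S_IWUSR)
    group_can_write = bool(mode & stat.S_IWGRP)
    others_can_read = bool(mode & stat.S_IROTH)
    permission_bits = oct(stat.S_IMODE(mode))  # permission part only
    readable_form = stat.filemode(mode)        # same layout as `ls -l`
```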

Reading and writing files#

Writing a file#

[23]:

todo_string="""# TODO LIST
## Today

* Read book
* Buy milk

## Tomorrow

* Hike out
"""
todo_file_path = Path('/tmp/todo.md')
todo_file_path.write_text(todo_string, encoding="utf-8")

[23]:

Reading a file#

[24]:

content = todo_file_path.read_text(encoding="utf-8")
[line for line in content.splitlines() if line.startswith("*")]

[24]:

['* Read book', '* Buy milk', '* Hike out']
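
`write_text` replaces the whole file each time it is called. To add to an existing file, one option is to open it in append mode. The sketch below uses a temporary directory and passes `encoding="utf-8"` on every call so that reading and writing agree regardless of the platform default. The file content is invented for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    todo = Path(tmp) / "todo.md"
    # write_text returns the number of characters written.
    written = todo.write_text("* Read book\n", encoding="utf-8")

    # Append instead of overwriting.
    with todo.open("a", encoding="utf-8") as handle:
        handle.write("* Buy milk\n")

    items = todo.read_text(encoding="utf-8").splitlines()
```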

Copying a file#

[25]:

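# pathlib has no ready-made copy method here, so reading and writing the bytes is used instead.

# The older shutil module is another option; an example appears later in this note.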

my_file = Path('/tmp/first/firstone/tmp.txt')
des_file = my_file.with_name('python.txt')
des_file.write_bytes(my_file.read_bytes())
[line for line in des_file.read_bytes().splitlines()]

[25]:

[]
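
For comparison, here is the same copy done with `shutil`. `shutil.copy2` copies the content and also tries to preserve metadata such as the modification time. The paths below are placeholders inside a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "tmp.txt"
    src.write_text("hello\n", encoding="utf-8")

    dst = src.with_name("python.txt")
    shutil.copy2(src, dst)  # accepts Path objects as well as strings

    copied_text = dst.read_text(encoding="utf-8")
    source_still_exists = src.exists()
```

Unlike the `read_bytes`/`write_bytes` approach, `shutil` copies in chunks, which matters for files too large to hold in memory comfortably.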

Moving a file (including renaming)#

[26]:

source = Path('/tmp/first/firstone/tmp.txt')
destination = Path('/tmp/first/first_tmp.txt')

if not destination.exists():
    source.replace(destination)

# To avoid a race condition, open the destination in exclusive mode ("x"), which fails if it already exists
try:
    with destination.open(mode="xb") as file:
        file.write(source.read_bytes())
except FileExistsError:
    print(f"File {destination} exists already.")
else:
    source.unlink()

File /tmp/first/first_tmp.txt exists already.

File operation topics#

Iterating over files#

[27]:

paths = [
    '/tmp/iterfile/oneone.txt',
    '/tmp/iterfile/onetwo.txt',
    '/tmp/iterfile/twoone.py',
    '/tmp/iterfile/sub/subone.py',
]
for item in paths:
    my_file = Path(item)
    my_file.parent.mkdir(exist_ok=True, parents=True)
    if not my_file.exists():
        my_file.touch()
Counter(path.suffix for path in Path('/tmp/iterfile/').iterdir())
# Note: iterdir() does not descend into subdirectories. The subdirectory itself has no extension but is still counted.

[27]:

Counter({'.txt': 2, '': 1, '.py': 1})

[28]:

Counter(Path('/tmp/iterfile/').iterdir())

[28]:

Counter({PosixPath('/tmp/iterfile/oneone.txt'): 1,
         PosixPath('/tmp/iterfile/onetwo.txt'): 1,
         PosixPath('/tmp/iterfile/sub'): 1,
         PosixPath('/tmp/iterfile/twoone.py'): 1})

[29]:

# glob('*.*') matches only names that contain a dot, so the extensionless subdirectory is skipped
Counter(path.suffix for path in Path('/tmp/iterfile/').glob('*.*'))

[29]:

Counter({'.txt': 2, '.py': 1})

[30]:

# rglob searches subdirectories recursively
Counter(path.suffix for path in Path('/tmp/iterfile/').rglob('*.*'))

[30]:

Counter({'.txt': 2, '.py': 2})
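
If directories should never be counted, whether or not their names contain a dot, filtering with `is_file()` is more explicit than relying on the `*.*` pattern. A minimal sketch that recreates the same layout in a temporary directory:

```python
import tempfile
from collections import Counter
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for relative in ["oneone.txt", "onetwo.txt", "twoone.py", "sub/subone.py"]:
        target = root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()

    # Files directly inside root only.
    top_level = Counter(p.suffix for p in root.iterdir() if p.is_file())
    # Files anywhere below root.
    recursive = Counter(p.suffix for p in root.rglob("*") if p.is_file())
```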

Displaying a directory tree#

[31]:

def tree(directory):
    print(f"+ {directory}")
    for path in sorted(directory.rglob("*")):
        depth = len(path.relative_to(directory).parts)
        spacer = "    " * depth
        print(f"{spacer}+ {path.name}")
tree(Path('/tmp/iterfile/'))

+ /tmp/iterfile
    + oneone.txt
    + onetwo.txt
    + sub
        + subone.py
    + twoone.py

Finding the most recently modified file#

[32]:

my_dir = Path('/tmp/iterfile/')
time, file_path = max((f.stat().st_mtime, f) for f in my_dir.iterdir())
datetime.fromtimestamp(time), file_path

[32]:

(datetime.datetime(2024, 7, 15, 14, 54, 12, 373026),
 PosixPath('/tmp/iterfile/twoone.py'))
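
The tuple comparison above falls back to comparing the paths when two modification times are equal, and it also considers directories. A variant using `key=` and an `is_file()` filter is sketched below. The modification times are set explicitly with `os.utime` only to make the example deterministic, and the file names are invented.

```python
import os
import tempfile
from datetime import datetime
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for offset, name in enumerate(["old.txt", "newer.txt", "newest.py"]):
        path = root / name
        path.touch()
        # Set atime and mtime to fixed, increasing timestamps.
        os.utime(path, (1_700_000_000 + offset, 1_700_000_000 + offset))

    files = [p for p in root.iterdir() if p.is_file()]
    latest = max(files, key=lambda p: p.stat().st_mtime)
    latest_name = latest.name
    latest_time = datetime.fromtimestamp(latest.stat().st_mtime)
```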

The older approaches#

Opening a file#

with open("/tmp/foo.txt") as file:
    data = file.read()

line_number = 0  # must be initialized before it is incremented in the loop
with open('examples/favorite-people.txt', encoding='utf-8') as a_file:
    for a_line in a_file:
        line_number += 1
        print('{:>4} {}'.format(line_number, a_line.rstrip()))

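The string method format() prints the line number and the line itself. The format specifier {:>4} means "right-align this argument in a field at least four characters wide". The variable a_line holds the complete line, including its trailing newline. The string method rstrip() removes trailing whitespace, the newline included.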

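Writing a file#

with open(csvfile, 'w') as f:
    f.writelines(linelist)  # writelines() does not add line endings; each item must supply its own
# No f.close() is needed: the with block closes the file on exit.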

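About open modes#

The modes accepted by open are listed below:

Mode	Description
r	open for reading (the default)
w	open for writing; creates the file or truncates an existing one
a	open for appending (writes start at EOF; the file is created if needed)
r+	open for reading and writing; the file must already exist
w+	open for reading and writing (see w)
a+	open for reading and writing (see a)
rb	open for reading in binary mode
wb	open for writing in binary mode (see w)
ab	open for appending in binary mode (see a)
rb+	open for reading and writing in binary mode (see r+)
wb+	open for reading and writing in binary mode (see w+)
ab+	open for reading and writing in binary mode (see a+)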
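
The difference between `w`, `a`, and `r+` is easiest to see by running them against the same file. The sketch below also shows `x` (exclusive creation), which is not in the table but is the mode used in the move example earlier in this note. It works in a temporary directory with invented content.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "modes.txt"

    with open(path, "w", encoding="utf-8") as f:   # creates or truncates
        f.write("first\n")
    with open(path, "a", encoding="utf-8") as f:   # writes at the end of the file
        f.write("second\n")
    after_append = path.read_text(encoding="utf-8")

    with open(path, "r+", encoding="utf-8") as f:  # read and write, no truncation
        f.write("FIRST")                           # overwrites from position 0
    after_update = path.read_text(encoding="utf-8")

    with open(path, "w", encoding="utf-8") as f:   # existing content is discarded
        f.write("reset\n")
    after_truncate = path.read_text(encoding="utf-8")

    try:
        open(path, "x").close()                    # fails because the file exists
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True
```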

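shutil operations#

Copying a file:

shutil.copyfile("oldfile", "newfile"): oldfile and newfile must both be file paths.
shutil.copy("oldfile", "newfile"): oldfile must be a file; newfile may be a file or a destination directory.

Copying a directory:

shutil.copytree("olddir", "newdir"): olddir and newdir must both be directories, and newdir must not already exist (unless dirs_exist_ok=True is passed, Python 3.8+).

Moving a file or directory:

shutil.move("oldpos", "newpos")

Removing a directory:

shutil.rmtree("dir"): removes the directory whether it is empty or not.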
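
The calls listed above can be tried out safely inside a temporary directory. The following is a sketch under that assumption; the directory and file names (`olddir`, `target`, `moved`, `a.txt`) are placeholders chosen for the example.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    olddir = root / "olddir"
    olddir.mkdir()
    (olddir / "a.txt").write_text("a", encoding="utf-8")

    # copy(): the destination may be a directory; the file keeps its name.
    target_dir = root / "target"
    target_dir.mkdir()
    copied = Path(shutil.copy(olddir / "a.txt", target_dir))

    # copytree(): by default the destination must not exist yet.
    newdir = root / "newdir"
    shutil.copytree(olddir, newdir)

    # move(): works for files and directories alike.
    moved = Path(shutil.move(str(newdir), str(root / "moved")))

    # rmtree(): removes a directory together with its contents.
    shutil.rmtree(olddir)

    copied_name = copied.name
    moved_has_file = (moved / "a.txt").exists()
    olddir_gone = not olddir.exists()
    newdir_gone = not newdir.exists()
```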

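The os and os.path modules#

os.mkdir("dir"): create a directory
os.rmdir("dir"): remove a directory; it must be empty
os.listdir(dirname): list the files and directories inside dirname
os.getcwd(): return the current working directory
os.curdir: the string that denotes the current directory ('.')
os.chdir(dirname): change the working directory to dirname
os.remove("file"): delete a file
os.rename("oldname", "newname"): rename a file or a directory; the same call handles both
os.path.isdir(name): return True if name is an existing directory, otherwise False
os.path.isfile(name): return True if name is an existing file; also returns False when name does not exist
os.path.exists(name): check whether a file or directory called name exists
os.path.getsize(name): return the file size in bytes; for a directory the value is platform-dependent and is not the total size of its contents
os.path.abspath(name): return the absolute path
os.path.normpath(path): normalize the path string
os.path.split(name): split the path into directory and file name (if the path consists only of directories, the last directory is split off as if it were the file name; the function does not check whether anything exists)
os.path.splitext(): split the file name from its extension and return a tuple such as ("aaa", ".txt")
os.path.join(path, name): join a directory with a file name or another directory
os.path.basename(path): return the file name
os.path.dirname(path): return the directory part of the path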
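
Most of these functions have a direct counterpart in `pathlib`, which the first half of this note uses. The sketch below runs both styles side by side on a made-up path. It uses `posixpath` (the POSIX implementation behind `os.path`) and `PurePosixPath` so that the results are the same on any platform.

```python
import posixpath
from pathlib import PurePosixPath

name = "/tmp/first/firstone/tmp.txt"
p = PurePosixPath(name)

# Each pair holds (os.path style result, pathlib style result).
pairs = [
    (posixpath.basename(name), p.name),
    (posixpath.dirname(name), str(p.parent)),
    (posixpath.splitext(name)[1], p.suffix),
    (posixpath.join("/tmp", "first"), str(PurePosixPath("/tmp") / "first")),
    (posixpath.split(name), (str(p.parent), p.name)),
]
all_match = all(old == new for old, new in pairs)
```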

Getting files with the same extension#

import glob
for filename in glob.glob("*.xls"):
    print(filename)  # print is a function in Python 3
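
The `pathlib` counterpart is `Path.glob`, which yields `Path` objects instead of strings. The sketch below compares the two on a temporary directory; the file names are invented for the example.

```python
import glob
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ["a.xls", "b.xls", "notes.txt"]:
        (root / name).touch()

    # glob.glob returns strings; the pattern must include the directory.
    with_glob = sorted(
        os.path.basename(f) for f in glob.glob(os.path.join(tmp, "*.xls"))
    )
    # Path.glob is called on the directory and yields Path objects.
    with_pathlib = sorted(p.name for p in root.glob("*.xls"))
```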
