Python glob: How to search files
The glob module performs wildcard searches in directories and returns the paths that match a given pattern as a list. It is convenient in file-handling scripts, for example when batch processing files, searching for files with a particular extension, or exploring directories.
Basic Usage of the glob Function
Example of Basic Pattern Matching
Below is the basic usage of the glob function. In this example, it searches for all text files in the current directory.
import glob

files = glob.glob('*.txt')
print(files)
This code returns all filenames that match the *.txt pattern as a list.
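To see the same call against a known set of files, here is a minimal sketch that creates a throwaway directory, makes it the current directory, and runs the *.txt pattern there. The file names are made up for illustration. glob returns matches in arbitrary order, so the sketch sorts them before printing.

```python
import glob
import os
import tempfile

# Hypothetical file names, created only for this demonstration.
names = ['notes.txt', 'todo.txt', 'data.csv']

original_dir = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    os.chdir(tmp)  # make the temporary directory the current directory
    try:
        for name in names:
            open(name, 'w').close()  # create an empty file
        # glob does not guarantee any order, so sort for stable output.
        files = sorted(glob.glob('*.txt'))
    finally:
        os.chdir(original_dir)  # always restore the previous directory

print(files)  # ['notes.txt', 'todo.txt']
```

Because the pattern is relative, the returned names are relative to the current directory as well.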
How to Find Files in a Specific Directory
You can also find files in a specific directory by including the directory path in the pattern.
import glob

files = glob.glob('/path/to/directory/*.txt')
print(files)
This code searches for all text files within the /path/to/directory/ directory.
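When the directory comes from a variable rather than a literal string, one way to build the pattern is sketched below. The helper name find_text_files and the sample file names are hypothetical. The sketch assumes you want glob.escape() applied to the directory part, so that characters such as [ or * in the directory name are not treated as wildcards.

```python
import glob
import os
import tempfile


def find_text_files(directory):
    """Return the sorted paths of .txt files directly inside `directory`."""
    # glob.escape() protects wildcard characters in the directory name itself.
    pattern = os.path.join(glob.escape(directory), '*.txt')
    return sorted(glob.glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample files for the demonstration.
    for name in ('a.txt', 'b.txt', 'c.log'):
        open(os.path.join(tmp, name), 'w').close()
    found = [os.path.basename(p) for p in find_text_files(tmp)]

print(found)  # ['a.txt', 'b.txt']
```

With a directory in the pattern, each result includes that directory, which is why the sketch uses os.path.basename() to show only the file names.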
How to Search for Files with a Specific File Type (Extension)
To search for files with a specific extension, include that extension in the pattern. The following example searches for all Python files with a .py extension.
import glob

files = glob.glob('*.py')
print(files)
Advanced Usage of the glob Module
How to Find Files Recursively
To search through a directory tree recursively, use **. With recursive=True, this pattern matches zero or more directories and subdirectories. Without recursive=True, ** behaves like a plain * and matches only a single path component.
import glob

files = glob.glob('**/*.txt', recursive=True)
print(files)
This code searches for text files in the current directory and all its subdirectories.
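The effect of recursive=True is easiest to see side by side. The following sketch builds a small, made-up directory tree in a temporary directory and runs the same ** pattern with and without the flag.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical tree: top.txt, sub/mid.txt, sub/deep/low.txt
    os.makedirs(os.path.join(tmp, 'sub', 'deep'))
    for rel in ('top.txt',
                os.path.join('sub', 'mid.txt'),
                os.path.join('sub', 'deep', 'low.txt')):
        open(os.path.join(tmp, rel), 'w').close()

    pattern = os.path.join(glob.escape(tmp), '**', '*.txt')

    # Without recursive=True, ** acts like *, so exactly one directory level matches.
    flat = sorted(os.path.basename(p) for p in glob.glob(pattern))

    # With recursive=True, ** matches zero or more directory levels.
    deep = sorted(os.path.basename(p) for p in glob.glob(pattern, recursive=True))

print(flat)  # ['mid.txt']
print(deep)  # ['low.txt', 'mid.txt', 'top.txt']
```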
How to Search for Multiple File Types at Once
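To search for files with multiple extensions at once, run one pattern per extension and combine the results. Python's glob does not support shell-style brace expansion such as *.{txt,csv}; the braces are matched as literal characters.
import glob

# glob has no brace expansion, so search each extension separately
files = []
for ext in ('txt', 'csv'):
    files.extend(glob.glob(f'*.{ext}'))
print(files)
This code searches for both text files and CSV files.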
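Note that Python's glob treats curly braces as literal characters, so a pattern like *.{txt,csv} does not expand the way it does in a shell. One way to cover several extensions is a small helper that runs one pattern per extension and merges the results. The helper name glob_extensions and the sample file names below are hypothetical, and this is only a sketch of the approach.

```python
import glob
import os
import tempfile


def glob_extensions(directory, extensions):
    """Hypothetical helper: sorted paths in `directory` with any of `extensions`."""
    matches = []
    for ext in extensions:
        pattern = os.path.join(glob.escape(directory), f'*.{ext}')
        matches.extend(glob.glob(pattern))
    return sorted(matches)


with tempfile.TemporaryDirectory() as tmp:
    for name in ('report.txt', 'table.csv', 'image.png'):
        open(os.path.join(tmp, name), 'w').close()

    # The braces are matched literally, so this finds nothing here.
    brace_result = glob.glob(os.path.join(glob.escape(tmp), '*.{txt,csv}'))

    selected = [os.path.basename(p) for p in glob_extensions(tmp, ['txt', 'csv'])]

print(brace_result)  # []
print(selected)      # ['report.txt', 'table.csv']
```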
Performance and Efficiency of the glob Module
Performance of the glob Module
The glob module can become slow when a pattern has to examine a large number of files or directories, especially in recursive searches, because a ** pattern visits every directory below its starting point. When searching large file systems, make the directory and file type in the pattern as specific as possible.
Efficient Use of the glob Module
To use the glob module efficiently, consider the following points:
- Avoid unnecessary recursive searches
- Name the directory and file type explicitly in the pattern
- Keep wildcards as narrow as possible so that fewer entries have to be examined
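A related option, when only some of the matches are needed, is glob.iglob(), which returns an iterator instead of building the full result list. The sketch below uses made-up file names and a temporary directory. How much this helps in practice depends on the directory tree, so treat it as an illustration rather than a guaranteed speed-up.

```python
import glob
import itertools
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create 100 hypothetical files for the demonstration.
    for i in range(100):
        open(os.path.join(tmp, f'file_{i:03d}.txt'), 'w').close()

    pattern = os.path.join(glob.escape(tmp), '*.txt')

    # Take only the first five matches; the full result list is never stored.
    first_five = list(itertools.islice(glob.iglob(pattern), 5))

    # Count all matches one at a time without keeping them in memory.
    total = sum(1 for _ in glob.iglob(pattern))

print(len(first_five))  # 5
print(total)            # 100
```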