This post will discuss how to iterate over files in a directory in Python.
1. Using os.listdir() function
A simple solution to iterate over files in a directory is using the os.listdir() function. It returns a list of the names of the files and subdirectories present in the specified directory, in arbitrary order. To get only the files, you can filter the list with the os.path.isfile() function:
    import os

    directory = 'path/to/dir'

    for filename in os.listdir(directory):
        # os.listdir() returns bare names, so build the full path first
        f = os.path.join(directory, filename)
        if os.path.isfile(f):
            print(f)
To get files of a specific extension, say .txt, you can add a condition to check the file extension:
    import os

    directory = 'path/to/dir'

    for filename in os.listdir(directory):
        f = os.path.join(directory, filename)
        # keep regular files whose name ends with .txt
        if os.path.isfile(f) and filename.endswith('.txt'):
            print(f)
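Since os.listdir() makes no promise about the order of the names it returns, it can help to wrap the loop above in a small helper that sorts its results. The following is only a minimal sketch; the function name list_files and its suffix parameter are hypothetical, chosen for illustration.

```python
import os


def list_files(directory, suffix=''):
    """Return sorted paths of regular files in `directory` whose names end with `suffix`.

    Hypothetical helper for illustration; an empty suffix matches every file.
    """
    paths = []
    for filename in os.listdir(directory):
        path = os.path.join(directory, filename)
        # os.path.isfile() is False for directories, even ones named like 'x.txt'
        if os.path.isfile(path) and filename.endswith(suffix):
            paths.append(path)
    # os.listdir() makes no ordering guarantee, so sort for stable output
    return sorted(paths)
```

Calling list_files('path/to/dir', '.txt') would then return the same paths the loop above prints, in sorted order.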
2. Using os.scandir() function
Starting with Python 3.5, consider using the os.scandir() function when you need file type or file attribute information. It returns an iterator of os.DirEntry objects that carry this information along with each name, which can give significantly better performance than os.listdir() followed by a separate check for every entry.
    import os

    directory = 'path/to/dir'

    for entry in os.scandir(directory):
        # entry.name is the bare name; entry.path includes the directory
        if entry.is_file() and entry.name.endswith('.txt'):
            print(entry.path)
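Because each entry produced by os.scandir() is an os.DirEntry object, attribute information such as the file size is available through entry.stat(). Here is a minimal sketch, assuming Python 3.6 or later (where the scandir iterator can be used as a context manager); the function name file_sizes is hypothetical.

```python
import os


def file_sizes(directory, suffix='.txt'):
    """Map each matching file name in `directory` to its size in bytes.

    Hypothetical helper for illustration; requires Python 3.6+ for the
    `with` statement on os.scandir().
    """
    sizes = {}
    # The context manager closes the underlying directory handle promptly
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.endswith(suffix):
                # DirEntry.stat() caches its result on the entry object
                sizes[entry.name] = entry.stat().st_size
    return sizes
```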
3. Using pathlib module
Starting with Python 3.4, you can also use the pathlib module. To iterate over files in a directory, use the Path.glob(pattern) method, which globs the given relative pattern in the specified directory and yields the matching paths.
The following example shows how to filter and display text files present in a directory.
    from pathlib import Path

    directory = 'path/to/dir'

    # glob() returns a generator of Path objects matching the pattern
    pathlist = Path(directory).glob('*.txt')
    for path in pathlist:
        print(path)
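Path.glob() matches directory entries of any type, so a subdirectory that happens to be named something like archive.txt would also be yielded. If only regular files are wanted, one option is to add an is_file() check, as in this sketch (the function name glob_files is hypothetical):

```python
from pathlib import Path


def glob_files(directory, pattern='*.txt'):
    """Yield regular files in `directory` that match `pattern`.

    Hypothetical helper for illustration.
    """
    for path in Path(directory).glob(pattern):
        # glob() can yield directories too, so keep regular files only
        if path.is_file():
            yield path
```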
Alternatively, you can use the Path.iterdir() method, which yields path objects of the directory contents. To get the extension of a file, use the suffix property:
    from pathlib import Path

    directory = 'path/to/dir'

    for path in Path(directory).iterdir():
        # suffix is the final extension, including the leading dot
        if path.is_file() and path.suffix == '.txt':
            print(path)
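Note that the suffix comparison above is case-sensitive, so a file named NOTES.TXT would be skipped. One possible variation, shown here only as a sketch, lowercases the suffix before comparing; the function name text_files is hypothetical.

```python
from pathlib import Path


def text_files(directory, suffix='.txt'):
    """Return Path objects for files whose suffix matches `suffix`, ignoring case.

    Hypothetical helper for illustration.
    """
    return sorted(
        path
        for path in Path(directory).iterdir()
        # Path.suffix is only the last extension, e.g. '.gz' for 'a.tar.gz'
        if path.is_file() and path.suffix.lower() == suffix.lower()
    )
```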
4. Using os.walk() function
If you need to search subdirectories as well, consider using the os.walk() function. It yields a 3-tuple (dirpath, dirnames, filenames) for each directory in the tree rooted at the specified directory (including the directory itself), where dirpath is the path to the directory, dirnames is a list of the names of the subdirectories in dirpath, and filenames is a list of the names of the non-directory files in dirpath.
    import os

    directory = 'path/to/dir'

    for root, dirs, files in os.walk(directory):
        for file in files:
            if file.endswith('.txt'):
                # names in files are relative to root, so join them
                print(os.path.join(root, file))
As of Python 3.5, os.walk() calls os.scandir() instead of os.listdir(), making it faster by reducing the total number of calls to os.stat().
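When walking top-down (the default), the dirnames list can be modified in place to stop os.walk() from descending into certain subdirectories. The sketch below uses this to skip some directories while collecting .txt files; the function name find_files and the default skip list are assumptions made for illustration.

```python
import os


def find_files(directory, suffix='.txt', skip_dirs=('.git', '__pycache__')):
    """Return sorted paths of files under `directory` ending with `suffix`.

    Hypothetical helper; `skip_dirs` is an illustrative default, not a
    convention of os.walk() itself.
    """
    matches = []
    for root, dirs, files in os.walk(directory):
        # Slice assignment edits the list in place, which prunes the walk
        dirs[:] = [d for d in dirs if d not in skip_dirs]
        for name in files:
            if name.endswith(suffix):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

Rebinding the name instead (dirs = [...]) would have no effect on the traversal, since os.walk() keeps using the original list object.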
5. Using glob module
Finally, you can use the glob.iglob() function, which returns an iterator over the pathnames that match the specified pattern.
    import glob

    directory = 'path/to/dir'

    # the pattern is the directory followed by a wildcard for .txt names
    for path in glob.iglob(f'{directory}/*.txt'):
        print(path)
Python 3.5 added support for recursive globs using **. When recursive=True is passed, the ** pattern matches any files and zero or more directories, subdirectories, and symbolic links to directories.
    import glob

    directory = 'path/to/dir'

    # ** only recurses into subdirectories when recursive=True is passed
    for path in glob.iglob(f'{directory}/**/*.txt', recursive=True):
        print(path)
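If the directory name itself may contain glob metacharacters such as [ or *, interpolating it directly into the pattern can produce unexpected matches. A possible safeguard, sketched below, is to pass the directory through glob.escape() before building the pattern; the function name find_recursive is hypothetical.

```python
import glob
import os


def find_recursive(directory, pattern='*.txt'):
    """Return sorted paths under `directory` matching `pattern`, searched recursively.

    Hypothetical helper for illustration.
    """
    # glob.escape() neutralizes '*', '?' and '[' in the directory part only
    search = os.path.join(glob.escape(directory), '**', pattern)
    # With recursive=True, '**' also matches zero directories, so files
    # directly inside `directory` are included
    return sorted(glob.iglob(search, recursive=True))
```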
That’s all about iterating over files in a directory in Python.