Introduction
In Python, binary types are used to handle sequences of bytes, which are essential for working with binary data. Python provides three main binary types: bytes, bytearray, and memoryview. Each type serves a specific purpose and offers different functionality for managing and manipulating binary data.
Bytes
The bytes type represents an immutable sequence of bytes. This means that once a bytes object is created, its contents cannot be modified. The bytes type is often used to handle binary data that should not be altered, such as data read from files or received over a network.
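A minimal sketch of what immutability means in practice. The variable names here (payload, updated, error_message) are only illustrative: item assignment on a bytes object raises TypeError, so any "change" has to produce a new object.

```python
# Hypothetical example data; the names are for illustration only.
payload = b"Hello"

try:
    payload[0] = 74  # try to overwrite the first byte with ord("J")
except TypeError as exc:
    error_message = str(exc)
    print("Cannot modify bytes:", error_message)

# "Changing" a bytes object really means building a new one.
updated = b"J" + payload[1:]
print(updated)   # b'Jello'
print(payload)   # b'Hello' (the original is untouched)
```

Because the object can never change, a bytes value is hashable and can safely be used as a dictionary key or shared between parts of a program.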
Creating a Bytes Object
You can create a bytes object by prefixing a string literal with b or by using the bytes() constructor. A bytes literal may contain only ASCII characters; to convert a string that contains other characters, encode it explicitly with an encoding such as UTF-8.
Example:
my_bytes = b"Hello"
print(my_bytes)
print(my_bytes[2])  # indexing returns an int: 108 is the ASCII code for "l"
Output:
b'Hello'
108
In this example, my_bytes is a bytes object containing the byte sequence corresponding to the string "Hello".
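The literal form is only one option. The following sketch, with illustrative variable names, shows three equivalent ways to build the same bytes object and how non-ASCII text behaves when it is encoded.

```python
# Three common ways to build the same bytes object.
from_literal = b"Hello"                      # literal: ASCII characters only
from_text = "Hello".encode("utf-8")          # encode a str explicitly
from_ints = bytes([72, 101, 108, 108, 111])  # iterable of integers in 0-255

print(from_literal == from_text == from_ints)  # True

# Non-ASCII text must be encoded; one character may become several bytes.
cafe = "café".encode("utf-8")
print(cafe)        # b'caf\xc3\xa9'
print(len(cafe))   # 5

# bytes(n) creates n zero bytes, which is easy to confuse with bytes([n]).
print(bytes(3))    # b'\x00\x00\x00'
```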
Bytearray
The bytearray type represents a mutable sequence of bytes. Unlike bytes objects, bytearray objects can be modified after they are created. This makes bytearray useful for scenarios where you need to update or manipulate binary data.
Creating a Bytearray Object
You can create a bytearray object by passing an iterable of integers, each in the range 0-255, to the bytearray() constructor.
Example:
my_bytearray = bytearray([1, 2, 3])
print(my_bytearray)
Output:
bytearray(b'\x01\x02\x03')
In this example, my_bytearray is a bytearray object containing the byte sequence \x01\x02\x03. You can modify this bytearray using methods such as append(), extend(), and remove().
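As a short sketch of those methods in action (the names buf and frozen are illustrative), the following modifies a bytearray in place and then takes an immutable snapshot of the result.

```python
buf = bytearray([1, 2, 3])

buf.append(4)          # add one byte at the end
buf.extend([5, 6])     # add several bytes at once
buf.remove(2)          # delete the first occurrence of the value 2
buf[0] = 255           # overwrite a byte in place by index

print(buf)             # bytearray(b'\xff\x03\x04\x05\x06')
print(list(buf))       # [255, 3, 4, 5, 6]

frozen = bytes(buf)    # immutable snapshot once editing is done
```

Note that remove() takes a byte value, not an index; to delete by position, use del buf[i] or pop(i) instead.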
Memoryview
The memoryview type provides a way to access the internal data of an object without copying it. This allows for efficient handling of large binary data, especially when working with file I/O or data buffers. A memoryview works with any object that supports the buffer protocol, such as bytes or bytearray.
Creating a Memoryview Object
You can create a memoryview object by passing a binary object (such as a bytes or bytearray) to the memoryview() constructor.
Example:
my_bytearray = bytearray([1, 2, 3, 4, 5])
my_memoryview = memoryview(my_bytearray)
print(my_memoryview)
Output:
<memory at 0x1187b38>
In this example, my_memoryview is a memoryview object that provides access to the data in my_bytearray (the memory address in the output will vary between runs). This allows you to manipulate the underlying byte data without making a copy, which is useful for performance when dealing with large amounts of binary data.
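To make the "without a copy" point concrete, here is a minimal sketch with illustrative names. It shows that a slice of a memoryview is itself a view, so writing through it changes the original bytearray.

```python
data = bytearray([1, 2, 3, 4, 5])
view = memoryview(data)

# Slicing a memoryview yields another view, not a copy of the bytes.
tail = view[2:]
tail[0] = 99            # writes through to the underlying bytearray

print(data)             # bytearray(b'\x01\x02c\x04\x05')  (99 is ASCII "c")
print(list(view))       # [1, 2, 99, 4, 5]
print(tail.tolist())    # [99, 4, 5]

# A view over an immutable bytes object is read-only.
readonly_view = memoryview(b"abc")
print(readonly_view.readonly)   # True

# Release views when finished so the bytearray can be resized again.
tail.release()
view.release()
data.append(6)
```

While any view is still active, operations that resize the bytearray (such as append()) raise BufferError, which is why the views are released first.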
Example: Efficient file reading with memoryview
"""
Efficient file reading using memoryview and readinto().

Goal: Read file contents directly into a pre-allocated buffer
without creating new Python objects for every read.

Why this matters:
- Avoids unnecessary data copying.
- Reuses the same memory buffer each time.
- Useful for reading large files or streaming data.
"""

# Step 1: Create a bytearray as a writable buffer.
buffer = bytearray(4096)  # 4 KB reusable memory block

# Step 2: Wrap it with a memoryview.
mv = memoryview(buffer)

# Step 3: Open the source file for reading in binary mode.
with open("source.txt", "rb") as f:
    while True:
        # Step 4: Read directly into the memoryview.
        bytes_read = f.readinto(mv)
        if bytes_read == 0:  # End of file reached
            break

        # Step 5: Work with the data directly in memory
        # (here, we just print it as bytes, but you could process it)
        chunk = mv[:bytes_read]  # slice to valid portion only (still a view)
        print(chunk.tobytes())   # tobytes() makes a copy, used here only for display

print("File reading complete with zero-copy I/O.")
Example: Efficient file writing with memoryview
"""
Efficient file writing using memoryview and write().

Goal: Write data from memory to disk without creating
intermediate bytes objects.

Why this matters:
- When sending data to files, sockets, or pipes, memoryview lets
  write() access the same memory block directly.
"""

# Step 1: Create some mutable binary data
data = bytearray(b"Hello from memoryview!\nThis was written efficiently.\n")

# Step 2: Wrap it with a memoryview (no copy is made)
mv = memoryview(data)

# Step 3: Open a target file for writing in binary mode
with open("output.txt", "wb") as f:
    # Step 4: Write directly from memoryview to file
    f.write(mv)

print("✅ File writing complete using memoryview (no intermediate bytes object created).")
Conclusion
Python's binary types (bytes, bytearray, and memoryview) offer different ways to work with binary data. bytes provides an immutable sequence of bytes, bytearray offers a mutable sequence, and memoryview allows efficient access to the underlying data without copying. Understanding these types and their functionality is crucial for effectively managing binary data in Python.