
Input and Output

NumPy provides several ways to save and load array data. Binary formats (.npy, .npz) preserve dtype and shape exactly and are fast to read and write, which makes them well suited to intermediate results and model checkpoints. Text formats (CSV, TSV) are human-readable and interoperable with other tools, but they are slower and can lose floating-point precision when values are written with too few digits. These I/O options matter in data pipelines, where you need to persist processed datasets, share data between scripts, and checkpoint long-running computations.

from __future__ import print_function
import numpy as np

author = "kyubyong. https://github.com/Kyubyong/numpy_exercises"

np.__version__

from datetime import date
print(date.today())

NumPy Binary Files (NPY, NPZ)

np.save() writes a single array to a .npy file, preserving its shape, dtype, and data exactly. np.savez() saves multiple arrays into a single uncompressed .npz archive (a zip of .npy files); np.savez_compressed() does the same with compression. np.load() reads both back: it returns an array for a .npy file and a dictionary-like object keyed by array name for a .npz file. Binary formats are the fastest way to persist NumPy arrays and are preferred for saving preprocessed datasets, trained model weights, and intermediate computation results.
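The round trip described above can be sketched as follows. This is a minimal illustration, not the exercise solution: the array names (`weights`, `labels`) and file names are hypothetical, and a temporary directory is assumed so that nothing is left on disk.

```python
import os
import tempfile

import numpy as np

# Hypothetical example arrays; the names are illustrative only.
weights = np.linspace(0.0, 1.0, 5, dtype=np.float32)
labels = np.array([0, 1, 1, 0], dtype=np.int64)

with tempfile.TemporaryDirectory() as tmpdir:
    npy_path = os.path.join(tmpdir, "weights.npy")
    npz_path = os.path.join(tmpdir, "bundle.npz")

    # One array per .npy file; dtype and shape are stored in the file header.
    np.save(npy_path, weights)
    weights_loaded = np.load(npy_path)

    # Several arrays per .npz archive; the keyword names become the keys.
    np.savez(npz_path, weights=weights, labels=labels)
    with np.load(npz_path) as bundle:
        keys = sorted(bundle.files)
        labels_loaded = bundle["labels"]

print(weights_loaded.dtype, weights_loaded.shape)
print(keys)
```

Opening the .npz archive in a `with` block closes the underlying file once the arrays have been read.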

Q1. Save x into temp.npy and load it.

x = np.arange(10)
...

# Check that the 'temp.npy' file exists.
import os
if os.path.exists('temp.npy'):
    x2 = ...
    print(np.array_equal(x, x2))

Q2. Save x and y into a single file ‘temp.npz’ and load it.

x = np.arange(10)
y = np.arange(11, 20)
...

with ... as data:
    x2 = data['x']
    y2 = data['y']
    print(np.array_equal(x, x2))
    print(np.array_equal(y, y2))

Text Files

np.savetxt() writes 1-D and 2-D arrays to text files (CSV, TSV) with configurable delimiters, headers, and number formatting. np.loadtxt() reads them back, and np.genfromtxt() does the same while also handling missing values. Text files are useful for sharing data with non-Python tools, inspecting data manually, and interoperating with spreadsheet software. The trade-off is slower I/O and, if the fmt argument writes fewer digits than the data needs, loss of floating-point precision compared to binary formats.
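To make the precision trade-off concrete, here is a small sketch that writes the same array twice: once with the default format and once with a deliberately coarse one. The array `table`, the file name, and the header text are hypothetical, chosen only for illustration.

```python
import os
import tempfile

import numpy as np

# Hypothetical 2x2 table; 1/3 cannot be written exactly with two decimals.
table = np.array([[0.1, 0.25], [1.0 / 3.0, 2.0]])

with tempfile.TemporaryDirectory() as tmpdir:
    csv_path = os.path.join(tmpdir, "table.csv")

    # Default fmt is '%.18e', which keeps enough digits for float64.
    np.savetxt(csv_path, table, delimiter=",", header="a,b")
    full = np.loadtxt(csv_path, delimiter=",")

    # A coarse format rounds the values before they reach the file.
    np.savetxt(csv_path, table, delimiter=",", header="a,b", fmt="%.2f")
    rounded = np.genfromtxt(csv_path, delimiter=",")

print(np.allclose(full, table, rtol=0, atol=1e-15))
print(rounded)
```

The header line is written with a leading `#`, which both readers treat as a comment by default, so it does not need to be skipped explicitly.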

Q3. Save x to ‘temp.txt’ in string format and load it.

x = np.arange(10).reshape(2, 5)
header = 'num1 num2 num3 num4 num5'
...
...

Q4. Save x, y, and z to ‘temp.txt’ in string format line by line, then load it.

x = np.arange(10)
y = np.arange(11, 21)
z = np.arange(22, 32)
...
...

Q5. Convert x into bytes, and load it as an array.

x = np.array([1, 2, 3, 4])
x_bytes = ...
x2 = ...
print(np.array_equal(x, x2))

Q6. Convert a into an ndarray and then convert it into a list again.

a = [[1, 2], [3, 4]]
x = ...
a2 = ...
print(a == a2)

String Formatting

np.array_str() and np.array2string() convert arrays to string representations, while np.fromstring(), called with a non-empty sep, parses a flat string of numbers back into a 1-D array (the brackets must be removed first, and the result reshaped if needed). These are useful for logging, debugging, and serializing arrays into formats that can be embedded in text files or transmitted over networks.
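A minimal sketch of the string round trip on a 1-D array follows. The array `seq` is a hypothetical example; the 2-D case in the exercise additionally needs a reshape after parsing.

```python
import numpy as np

# Hypothetical 1-D example array.
seq = np.array([3, 1, 4, 1, 5])

# Array -> string, using the same format that print() uses.
text = np.array_str(seq)
print(text)  # [3 1 4 1 5]

# String -> array: remove the brackets, then parse whitespace-separated numbers.
stripped = text.strip("[]")
restored = np.fromstring(stripped, dtype=seq.dtype, sep=" ")

# array2string exposes formatting controls such as precision and separator.
pretty = np.array2string(np.array([1.23456, 2.0]), precision=2, separator=", ")
print(pretty)
```

Passing a non-empty `sep` selects text parsing; the dtype has to be supplied because the string itself carries no type information.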

Q7. Convert x to a string, and revert it.

x = np.arange(10).reshape(2,5)
x_str = ...
print(x_str, "\n", type(x_str))
x_str = x_str.replace("[", "") # [] must be stripped
x_str = x_str.replace("]", "")
x2 = ...
assert np.array_equal(x, x2)

Text Formatting Options

np.set_printoptions() controls how arrays are displayed in the console: precision (digits after the decimal point), threshold (the element count above which output is summarized with "..."), linewidth, and suppress (print small floats in fixed-point notation instead of scientific notation). Configuring these options makes debugging large arrays much easier by showing the most relevant information without overwhelming the screen.
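The effect of these options can be sketched with the np.printoptions() context manager, which accepts the same arguments as np.set_printoptions() but restores the previous settings on exit. The array `values` is a hypothetical example chosen to span several orders of magnitude.

```python
import numpy as np

# Hypothetical values spanning several orders of magnitude.
values = np.array([1.23456789e-5, 1.5, 1234.56789])

# suppress=True: fixed-point notation, so the tiny value shows as 0.
with np.printoptions(precision=3, suppress=True):
    fixed = str(values)

# suppress=False: NumPy falls back to scientific notation for this range.
with np.printoptions(precision=3, suppress=False):
    scientific = str(values)

# A low threshold forces summarization with "..." even for short arrays.
with np.printoptions(threshold=5):
    summarized = str(np.arange(10))

print(fixed)
print(scientific)
print(summarized)
```

Using the context manager keeps the change local, whereas np.set_printoptions() stays in effect for the rest of the session.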

Q8. Print x such that all elements are displayed with precision=1, no suppress.

x = np.random.uniform(size=[10,100])
np.set_printoptions(...)
print(x)

Base-n Representations

np.binary_repr() and np.base_repr() convert integers to their binary, hexadecimal, or other base representations as strings. These are useful in low-level data processing, working with bit masks, encoding categorical features as binary vectors, and understanding how numbers are stored at the hardware level.
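As a quick illustration of both functions, the sketch below uses numbers other than the ones in the exercises. The variable names are hypothetical.

```python
import numpy as np

# Plain binary string, no prefix.
bits = np.binary_repr(10)
print(bits)  # 1010

# width pads with leading zeros.
padded = np.binary_repr(10, width=8)
print(padded)  # 00001010

# With width given, negative numbers use two's complement.
twos = np.binary_repr(-3, width=5)
print(twos)  # 11101

# base_repr covers other bases, using digits 0-9 and then A-Z.
hexed = np.base_repr(255, base=16)
print(hexed)  # FF

# Python's int() parses the string back given the same base.
print(int(hexed, 16))  # 255
```

Both functions return plain Python strings, so the results can be compared, padded, or sliced like any other string.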

Q9. Convert 12 into a binary number in string format.

Q10. Convert 12 into a hexadecimal number in string format.