String Operations
NumPy's np.char module provides vectorized string operations that work element-wise on arrays of strings. Functions like np.char.add(), np.char.multiply(), np.char.capitalize(), np.char.lower(), np.char.strip(), and np.char.replace() mirror Python's built-in string methods but operate on entire arrays at once. While Pandas is more commonly used for string manipulation in data science, these NumPy functions are useful for lightweight text processing without the overhead of a DataFrame, for example cleaning label arrays or formatting output strings.
from __future__ import print_function
import numpy as np
author = "kyubyong. https://github.com/Kyubyong/numpy_exercises"
np.__version__
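Before the exercises, here is a minimal sketch of the label-cleaning use case mentioned above. The `labels` array and the `label_` / `class-` strings are made up for illustration and are not part of the exercises.

```python
import numpy as np

# Hypothetical label array with inconsistent whitespace and casing.
labels = np.array(['  Cat ', 'DOG', ' bird\n'])

# Strip surrounding whitespace, then lowercase, element-wise.
cleaned = np.char.lower(np.char.strip(labels))

# Prepend the same prefix to every element.
tagged = np.char.add('label_', cleaned)

# Replace a substring in every element.
renamed = np.char.replace(tagged, 'label_', 'class-')

print("cleaned =", cleaned)
print("tagged =", tagged)
print("renamed =", renamed)
```

Each call returns a new array; the input array is left unchanged.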
Q1. Concatenate x1 and x2.
x1 = np.array(['Hello', 'Say'], dtype=str)  # np.str was removed in NumPy 1.24; use the built-in str
x2 = np.array([' world', ' something'], dtype=str)
Q2. Repeat x three times element-wise.
x = np.array(['Hello ', 'Say '], dtype=str)
Q3-1. Capitalize the first letter of x element-wise.
Q3-2. Lowercase x element-wise.
Q3-3. Uppercase x element-wise.
Q3-4. Swapcase x element-wise.
Q3-5. Title-case x element-wise.
x = np.array(['heLLo woRLd', 'Say sOmething'], dtype=str)
capitalized = ...
lowered = ...
uppered = ...
swapcased = ...
titlecased = ...
print("capitalized =", capitalized)
print("lowered =", lowered)
print("uppered =", uppered)
print("swapcased =", swapcased)
print("titlecased =", titlecased)
Q4. Pad each element to a length of 20 with underscores (_) so that the string is centered / left-justified / right-justified.
x = np.array(['hello world', 'say something'], dtype=str)
centered = ...
left = ...
right = ...
print("centered =", centered)
print("left =", left)
print("right =", right)
Q5. Encode x in cp500 and decode it again.
x = np.array(['hello world', 'say something'], dtype=str)
encoded = ...
decoded = ...
print("encoded =", encoded)
print("decoded =", decoded)
Q6. Insert a space between characters of x.
x = np.array(['hello world', 'say something'], dtype=str)
Q7-1. Remove the leading and trailing whitespace of x element-wise.
Q7-2. Remove the leading whitespace of x element-wise.
Q7-3. Remove the trailing whitespace of x element-wise.
x = np.array([' hello world ', '\tsay something\n'], dtype=str)
stripped = ...
lstripped = ...
rstripped = ...
print("stripped =", stripped)
print("lstripped =", lstripped)
print("rstripped =", rstripped)
Q8. Split the element of x on spaces.
x = np.array(['Hello my name is John'], dtype=str)
Q9. Split the element of x into multiple lines.
x = np.array(['Hello\nmy name is John'], dtype=str)
Q10. Make x a numeric string of 4 digits with zeros on its left.
x = np.array(['34'], dtype=str)
Q11. Replace 'John' with 'Jim' in x.
x = np.array(['Hello nmy name is John'], dtype=str)
Comparison
NumPy provides element-wise string comparison functions in np.char.equal() and np.char.not_equal(). These return boolean arrays indicating where strings match or differ, which is useful for validating data, finding mismatches between predicted and expected labels, or filtering arrays of categorical data.
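As a rough illustration of the label-mismatch use case, the sketch below compares two made-up arrays, `predicted` and `expected` (hypothetical names and data, not taken from the exercises), and uses the resulting boolean arrays for counting and filtering.

```python
import numpy as np

# Hypothetical predicted and expected labels.
predicted = np.array(['cat', 'dog', 'bird', 'cat'])
expected = np.array(['cat', 'dog', 'fish', 'dog'])

# Boolean arrays marking where the labels agree or differ.
matches = np.char.equal(predicted, expected)
mismatches = np.char.not_equal(predicted, expected)

# The mean of a boolean array is the fraction of True values.
accuracy = matches.mean()

# Positions and values of the wrong predictions.
wrong_idx = np.flatnonzero(mismatches)
wrong_pred = predicted[mismatches]

print("matches =", matches)
print("accuracy =", accuracy)
print("wrong_idx =", wrong_idx)
print("wrong_pred =", wrong_pred)
```

Because the results are ordinary boolean arrays, they can be used directly as masks, as in `predicted[mismatches]`.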
Q12. Return x1 == x2, element-wise.
x1 = np.array(['Hello', 'my', 'name', 'is', 'John'], dtype=str)
x2 = np.array(['Hello', 'my', 'name', 'is', 'Jim'], dtype=str)
Q13. Return x1 != x2, element-wise.
x1 = np.array(['Hello', 'my', 'name', 'is', 'John'], dtype=str)
x2 = np.array(['Hello', 'my', 'name', 'is', 'Jim'], dtype=str)
String Information
These functions inspect string content element-wise: np.char.count() counts substring occurrences, np.char.find() locates substrings, and functions like np.char.isdigit(), np.char.islower(), and np.char.isupper() test character properties. These are useful for data validation, such as checking whether a column of strings contains only numeric data or verifying formatting consistency across a dataset.
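The validation idea can be sketched as follows. The `values` array is made-up sample data standing in for a column of strings; it is not one of the exercise inputs.

```python
import numpy as np

# Hypothetical column of strings that should hold years.
values = np.array(['2021', 'n/a', '1999', 'abc'])

# True where the element consists of digits only.
is_numeric = np.char.isdigit(values)

# Number of non-overlapping occurrences of 'a' in each element.
a_count = np.char.count(values, 'a')

# Lowest index of '/' in each element, or -1 if it is absent.
slash_pos = np.char.find(values, '/')

# Keep only the valid entries and convert them to integers.
years = values[is_numeric].astype(int)

print("is_numeric =", is_numeric)
print("a_count =", a_count)
print("slash_pos =", slash_pos)
print("years =", years)
```

Note that np.char.find() signals "not found" with -1 rather than raising an error, so the result should be checked before it is used as an index.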
Q14. Count the number of 'l' in x, element-wise.
x = np.array(['Hello', 'my', 'name', 'is', 'Lily'], dtype=str)
Q15. Find the lowest index of 'l' in x, element-wise.
x = np.array(['Hello', 'my', 'name', 'is', 'Lily'], dtype=str)
Q16-1. Check if each element of x is composed of digits only.
Q16-2. Check if each element of x is composed of lower case letters only.
Q16-3. Check if each element of x is composed of upper case letters only.
x = np.array(['Hello', 'I', 'am', '20', 'years', 'old'], dtype=str)
out1 = ...
out2 = ...
out3 = ...
print("Digits only =", out1)
print("Lower cases only =", out2)
print("Upper cases only =", out3)
Q17. Check if each element of x starts with 'hi'.
x = np.array(['he', 'his', 'him', 'his'], dtype=str)