numpy.fromregex

numpy.fromregex(file, regexp, dtype, encoding=None)

Construct an array from a text file, using regular expression parsing.

The returned array is always a structured array, and is constructed from all matches of the regular expression in the file. Groups in the regular expression are converted to fields of the structured array.
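As a minimal sketch of how groups map to fields, the snippet below parses an in-memory text stream. The sample data, the pattern, and the field names `name` and `value` are made up for illustration; the only assumption about the library is that groups and fields are paired by position, as described above.

```python
import io
import numpy as np

# Hypothetical sample data: one "<name>=<value>" pair per line.
text = io.StringIO("alpha=1.5\nbeta=2.25\ngamma=-3.0")

# Two groups in the pattern, two fields in the dtype (paired by position).
pattern = r"(\w+)=(-?\d+\.\d+)"
dtype = [("name", "U8"), ("value", np.float64)]

result = np.fromregex(text, pattern, dtype)

names = result["name"].tolist()    # first group of every match
values = result["value"].tolist()  # second group of every match
print(names, values)
```

Each match contributes one record, so the array length equals the number of matches found in the text.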

Parameters:

file : str or file

File name or file object to read.

regexp : str or regexp

Regular expression used to parse the file. Groups in the regular expression correspond to fields in the dtype.

dtype : dtype or list of dtypes

Dtype for the structured array.

encoding : str, optional

Encoding used to decode the input file. Does not apply when file is an already open stream.

New in version 1.14.0.
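The following sketch shows one way the encoding parameter might be used when file is given as a name. The temporary path, the file contents, and the field names `city` and `code` are illustrative assumptions, not part of the original documentation.

```python
import os
import tempfile
import numpy as np

# Write a small Latin-1 encoded file; path and contents are illustrative.
path = os.path.join(tempfile.mkdtemp(), "cities.txt")
with open(path, "w", encoding="latin-1") as f:
    f.write("Zürich 8001\nMálaga 29001\n")

# `encoding` is honored here because `file` is a name, not an open stream.
cities = np.fromregex(path, r"(\S+)\s+(\d+)",
                      [("city", "U10"), ("code", np.int64)],
                      encoding="latin-1")

city_names = cities["city"].tolist()
city_codes = cities["code"].tolist()
print(city_names, city_codes)
```

When passing an already open file object instead, the stream's own encoding applies and this parameter has no effect.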

Returns:

output : ndarray

The output array, containing the part of the content of file that was matched by regexp. output is always a structured array.

Raises:

TypeError

When dtype is not a valid dtype for a structured array.
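A short sketch of the failure mode, assuming a single-group pattern and a plain (non-structured) dtype. The sample text and the field name `n` are hypothetical.

```python
import io
import numpy as np

text = io.StringIO("10\n20\n30")

# A plain dtype such as np.int64 has no field names, so it is rejected.
try:
    np.fromregex(text, r"(\d+)", np.int64)
    raised = False
except TypeError:
    raised = True

# Wrapping the same type in a one-field structured dtype is accepted.
text.seek(0)  # rewind in case the stream was consumed before the error
ok = np.fromregex(text, r"(\d+)", [("n", np.int64)])
print(raised, ok["n"].tolist())
```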

See also

fromstring, loadtxt

Notes

Dtypes for structured arrays can be specified in several forms, but all forms specify at least the data type and field name. For details see doc.structured_arrays.
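To illustrate the point about equivalent forms, this sketch builds the same two-field dtype from a list of tuples and from a dictionary. The field names `num` and `key` are chosen only for illustration.

```python
import numpy as np

# Two spellings of the same two-field layout (field names are illustrative).
as_list = np.dtype([("num", "i8"), ("key", "S3")])
as_dict = np.dtype({"names": ["num", "key"], "formats": ["i8", "S3"]})

same = as_list == as_dict   # both describe identical fields and offsets
field_names = as_list.names
print(same, field_names)
```

Either form can be passed as the dtype argument, since both name every field and give its data type.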

Examples

>>> import numpy as np
>>> with open('test.dat', 'w') as f:
...     _ = f.write("1312 foo\n1534  bar\n444   qux")
...
>>> regexp = r"(\d+)\s+(...)"  # match [digits, whitespace, any three characters]
>>> output = np.fromregex('test.dat', regexp,
...                       [('num', np.int64), ('key', 'S3')])
>>> output
array([(1312, b'foo'), (1534, b'bar'), ( 444, b'qux')],
      dtype=[('num', '<i8'), ('key', 'S3')])
>>> output['num']
array([1312, 1534,  444])
