Read an arff file.

The data is returned as a record array, which can be accessed much like a dictionary of numpy arrays. For example, if one of the attributes is called ‘pressure’, then its first 10 data points can be accessed from the data record array like so: data['pressure'][0:10]


f : file-like or str

File-like object to read from, or filename to open.


data : record array

The data of the arff file, accessible by attribute names.

meta : MetaData

Contains information about the arff file such as name and type of attributes, the relation (name of the dataset), etc...



This is raised if the given file is not ARFF-formatted.


The ARFF file has an attribute which is not supported yet.


This function should be able to read most arff files. Not implemented functionality include:

  • date type attributes
  • string type attributes

It can read files with numeric and nominal attributes. It cannot read files with sparse data ({} in the file). However, this function can read files with missing data (? in the file), representing the data points as NaNs.


>>> from import arff
>>> from cStringIO import StringIO
>>> content = """
... @relation foo
... @attribute width  numeric
... @attribute height numeric
... @attribute color  {red,green,blue,yellow,black}
... @data
... 5.0,3.25,blue
... 4.5,3.75,green
... 3.0,4.00,red
... """
>>> f = StringIO(content)
>>> data, meta = arff.loadarff(f)
>>> data
array([(5.0, 3.25, 'blue'), (4.5, 3.75, 'green'), (3.0, 4.0, 'red')],
      dtype=[('width', '<f8'), ('height', '<f8'), ('color', '|S6')])
>>> meta
Dataset: foo
    width's type is numeric
    height's type is numeric
    color's type is nominal, range is ('red', 'green', 'blue', 'yellow', 'black')