Read Digits From File Into a Numpy Array

11. Reading and Writing Data Files: ndarrays

By Bernd Klein. Last modified: 01 Feb 2022.

There are many ways to read from files and write to data files in NumPy. We will discuss the different ways and the corresponding functions in this chapter:

  • savetxt
  • loadtxt
  • tofile
  • fromfile
  • save
  • load
  • genfromtxt

Saving textfiles with savetxt

The first two functions we will cover are savetxt and loadtxt.

In the following simple example, we define an array x and save it as a textfile with savetxt:

    import numpy as np

    x = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]], np.int32)
    np.savetxt("test.txt", x)

The file "test.txt" is a textfile and its content looks like this:

    $ more test.txt
    1.000000000000000000e+00 2.000000000000000000e+00 3.000000000000000000e+00
    4.000000000000000000e+00 5.000000000000000000e+00 6.000000000000000000e+00
    7.000000000000000000e+00 8.000000000000000000e+00 9.000000000000000000e+00

Attention: The above output has been created on the Linux command prompt!

It's also possible to print the array in a special format, for example with three decimal places or as integers padded to a fixed width of four characters (with leading zeros in the example below). For this purpose we assign a format string to the third parameter 'fmt'. We saw in our first example that the default delimiter is a blank. We can change this behaviour by assigning a string to the parameter "delimiter". In most cases this string will consist solely of a single character, but it can be a sequence of characters, like a smiley " :-) ", as well:

    np.savetxt("test2.txt", x, fmt="%2.3f", delimiter=",")
    np.savetxt("test3.txt", x, fmt="%04d", delimiter=" :-) ")

The newly created files look like this:

    $ more test2.txt
    1.000,2.000,3.000
    4.000,5.000,6.000
    7.000,8.000,9.000
    $ more test3.txt
    0001 :-) 0002 :-) 0003
    0004 :-) 0005 :-) 0006
    0007 :-) 0008 :-) 0009

The complete syntax of savetxt looks like this:

savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='', footer='', comments='# ')        
Parameter Meaning
X array_like Data to be saved to a text file.
fmt str or sequence of strs, optional
A single format (%10.5f), a sequence of formats, or a multi-format string, e.g. 'Iteration %d -- %10.5f', in which case 'delimiter' is ignored. For complex 'X', the legal options for 'fmt' are:
a) a single specifier, "fmt='%.4e'", resulting in numbers formatted like "' (%s+%sj)' % (fmt, fmt)"
b) a full string specifying every real and imaginary part, e.g. "' %.4e %+.4ej %.4e %+.4ej %.4e %+.4ej'" for 3 columns
c) a list of specifiers, one per column - in this case, the real and imaginary part must have separate specifiers, e.g. "['%.3e + %.3ej', '(%.15e%+.15ej)']" for 2 columns
delimiter A string used for separating the columns.
newline A string (e.g. "\n", "\r\n" or ",\n") which will end a line instead of the default line ending
header A string that will be written at the beginning of the file.
footer A string that will be written at the end of the file.
comments A string that will be prepended to the 'header' and 'footer' strings, to mark them as comments. The hash tag '#' is used as the default.
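
The header, footer and comments parameters can be tried out with a small sketch like the following. It writes into an in-memory io.StringIO buffer instead of a real file, and the column names "a;b;c" and the footer text are made up for illustration:

```python
import io
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]], np.int32)

# In-memory text buffer standing in for a file on disk (assumption for this sketch)
buf = io.StringIO()
np.savetxt(buf, x, fmt="%d", delimiter=";",
           header="a;b;c",          # hypothetical column names
           footer="end of data",    # hypothetical footer text
           comments="# ")           # prefix put in front of header and footer
text = buf.getvalue()
print(text)

# loadtxt skips lines starting with '#', so header and footer are ignored
y = np.loadtxt(io.StringIO(text), delimiter=";")
print(y.shape)
```

Because header and footer are written as comment lines, the file can be read back with loadtxt without any extra arguments besides the delimiter.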

Loading Textfiles with loadtxt

We will now read in the file "test.txt", which we wrote in the previous subchapter:

    y = np.loadtxt("test.txt")
    print(y)

OUTPUT:

    [[ 1.  2.  3.]
     [ 4.  5.  6.]
     [ 7.  8.  9.]]

    y = np.loadtxt("test2.txt", delimiter=",")
    print(y)

OUTPUT:

    [[ 1.  2.  3.]
     [ 4.  5.  6.]
     [ 7.  8.  9.]]

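Nothing new happens if we read in the file in which we used a smiley as separator. (Note that NumPy 1.23 and later accept only single-character delimiters in loadtxt, so this call needs an older NumPy version.)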

    y = np.loadtxt("test3.txt", delimiter=" :-) ")
    print(y)

OUTPUT:

    [[ 1.  2.  3.]
     [ 4.  5.  6.]
     [ 7.  8.  9.]]

It's also possible to choose the columns by index:

    y = np.loadtxt("test3.txt", delimiter=" :-) ", usecols=(0, 2))
    print(y)

OUTPUT:

    [[ 1.  3.]
     [ 4.  6.]
     [ 7.  9.]]

In our next example we will read in the file "times_and_temperatures.txt", which we created in the chapter on generators of our Python tutorial. Every line contains a time in the format "hh:mm:ss" and a random temperature between 10.0 and 25.0 degrees. We have to convert the time string into float numbers. The time will be in minutes, with the seconds expressed as a decimal fraction of a minute. We first define a function which converts "hh:mm:ss" into minutes:

    def time2float_minutes(time):
        if type(time) == bytes:
            time = time.decode()
        t = time.split(":")
        # seconds * 0.05 / 3 is the same as seconds / 60
        minutes = float(t[0]) * 60 + float(t[1]) + float(t[2]) * 0.05 / 3
        return minutes

    for t in ["06:00:10", "06:27:45", "12:59:59"]:
        print(time2float_minutes(t))

OUTPUT:

    360.1666666666667
    387.75
    779.9833333333333

You might have noticed that we check whether time is of type bytes. The reason for this is the use of our function "time2float_minutes" in loadtxt in the following example. The keyword parameter converters contains a dictionary which can hold a function for a column (the index of the column corresponds to the key of the dictionary) to convert the string data of this column into a float. Depending on the NumPy version and the encoding setting, the string data may be passed as a byte string. That is why we convert it into a unicode string in our function:

    y = np.loadtxt("times_and_temperatures.txt",
                   converters={0: time2float_minutes})
    print(y)

OUTPUT:

    [[  360.     20.1]
     [  361.5    16.1]
     [  363.     16.9]
     ...,
     [ 1375.5    22.5]
     [ 1377.     11.1]
     [ 1378.5    15.2]]

    # delimiter = ";" , # i.e. use ";" as delimiter instead of whitespace
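
Since the file "times_and_temperatures.txt" is not included here, the following sketch shows the converters mechanism on a small made-up sample held in an io.StringIO buffer. The three sample lines are an assumption chosen to mimic the layout described above; the converter accepts both bytes and str, because what loadtxt passes can differ between NumPy versions:

```python
import io
import numpy as np

def time2float_minutes(time):
    # loadtxt may pass bytes or str, depending on NumPy version and encoding
    if isinstance(time, bytes):
        time = time.decode()
    hours, minutes, seconds = time.split(":")
    return float(hours) * 60 + float(minutes) + float(seconds) / 60

# Hypothetical sample data in the "hh:mm:ss temperature" layout
sample = io.StringIO("06:00:00 20.1\n06:01:30 16.1\n06:03:00 16.9\n")

# Column 0 is run through the converter, column 1 is parsed as a float as usual
y = np.loadtxt(sample, converters={0: time2float_minutes})
print(y)
```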

tofile

tofile is a method for writing the content of an array to a file, either in binary format, which is the default, or in text format.

A.tofile(fid, sep="", format="%s")

The data of the ndarray A is always written in 'C' order, regardless of the order of A.

The data file written by this method can be reloaded with the function fromfile().

Parameter Meaning
fid can be either an open file object, or a string containing a filename.
sep The string 'sep' defines the separator between array items for text output. If it is empty (''), a binary file is written, equivalent to file.write(a.tobytes()).
format Format string for text file output. Each entry in the array is formatted to text by first converting it to the closest Python type, and then using 'format' % item.

Remark:

Information on endianness and precision is lost. Therefore it may not be a good idea to use this function to archive data or transport data between machines with different endianness. Some of these problems can be overcome by outputting the data as text files, at the expense of speed and file size.
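
The remark can be illustrated with a short round trip. This is only a sketch: the file names "raw.bin" and "raw.txt" are made up, and the files live in a temporary directory. It shows that the binary file carries no type information, so reading it with a different dtype silently yields a different array, whereas the text variant stores readable numbers:

```python
import os
import tempfile
import numpy as np

a = np.arange(6, dtype=np.int32)

with tempfile.TemporaryDirectory() as tmp:
    bin_path = os.path.join(tmp, "raw.bin")   # hypothetical file name
    txt_path = os.path.join(tmp, "raw.txt")   # hypothetical file name

    # Binary output: raw bytes only, no dtype or byte-order header
    a.tofile(bin_path)
    same = np.fromfile(bin_path, dtype=np.int32)    # correct dtype
    wrong = np.fromfile(bin_path, dtype=np.int16)   # wrong dtype, no error raised

    # Text output: a non-empty sep switches tofile to text mode
    a.tofile(txt_path, sep=",", format="%d")
    with open(txt_path) as fh:
        text = fh.read()
    from_text = np.fromfile(txt_path, dtype=np.int32, sep=",")

print(same)
print(wrong.size)   # twice as many items, because each one is half as wide
print(text)
print(from_text)
```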

    dt = np.dtype([('time', [('min', int), ('sec', int)]),
                   ('temp', float)])
    x = np.zeros((1,), dtype=dt)
    x['time']['min'] = 10
    x['temp'] = 98.25
    print(x)

    fh = open("test6.txt", "bw")
    x.tofile(fh)
    fh.close()  # make sure the data is flushed to disk

OUTPUT:

    [((10, 0), 98.25)]

fromfile

fromfile is used to read in data which has been written with the tofile function. It's possible to read binary data if the data type is known. It's also possible to parse simply formatted text files. The data from the file is turned into an array.

The general syntax looks like this:

numpy.fromfile(file, dtype=float, count=-1, sep='')

Parameter Meaning
file 'file' can be either a file object or the name of the file to read.
dtype defines the data type of the array, which will be constructed from the file data. For binary files, it is used to determine the size and byte-order of the items in the file.
count defines the number of items which will be read. -1 means all items will be read.
sep The string 'sep' defines the separator between the items, if the file is a text file. If it is empty (''), the file will be treated as a binary file. A space (" ") in a separator matches zero or more whitespace characters. A separator consisting solely of spaces has to match at least one whitespace.

    fh = open("test4.txt", "rb")
    np.fromfile(fh, dtype=dt)

OUTPUT:

    array([((4294967296, 12884901890), 1.0609978957e-313),
           ((30064771078, 38654705672), 2.33419537056e-313),
           ((55834574860, 64424509454), 3.60739284543e-313),
           ((81604378642, 90194313236), 4.8805903203e-313),
           ((107374182424, 115964117018), 6.1537877952e-313),
           ((133143986206, 141733920800), 7.42698527006e-313),
           ((158913789988, 167503724582), 8.70018274493e-313),
           ((184683593770, 193273528364), 9.9733802198e-313)],
          dtype=[('time', [('min', '<i8'), ('sec', '<i8')]), ('temp', '<f8')])

    import numpy as np
    import os

    # platform dependent: difference between Linux and Windows
    # data = np.arange(50, dtype=np.int)
    data = np.arange(50, dtype=np.int32)
    data.tofile("test4.txt")

    fh = open("test4.txt", "rb")
    # skip the first 32 items: 4 bytes * 32 = 128
    fh.seek(128, os.SEEK_SET)
    x = np.fromfile(fh, dtype=np.int32)
    print(x)

OUTPUT:

[32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49]            
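
The hard-coded offset of 128 bytes can also be derived from the item size of the dtype, which avoids miscounting when the dtype changes. The following is a sketch under that assumption; the file name "test4.bin" and the variable names skip_items and offset_bytes are made up, and count is used to read only a few items:

```python
import os
import tempfile
import numpy as np

data = np.arange(50, dtype=np.int32)

skip_items = 32
# itemsize is the number of bytes per item (4 for int32), so 32 * 4 = 128
offset_bytes = skip_items * data.dtype.itemsize

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test4.bin")   # hypothetical file name
    data.tofile(path)
    with open(path, "rb") as fh:
        fh.seek(offset_bytes, os.SEEK_SET)
        # read only the next five items instead of the rest of the file
        tail = np.fromfile(fh, dtype=np.int32, count=5)

print(offset_bytes)
print(tail)
```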

Attention:

It can cause problems to use tofile and fromfile for data storage, because the binary files generated are not platform independent. There is no byte-order or data-type information saved by tofile. Data can be stored in the platform independent .npy format using save and load instead.

Best Practice to Load and Save Data

The recommended way to store and load data with NumPy in Python consists in using load and save. We also use a temporary file in the following example:

    import numpy as np

    print(x)  # x still holds the array from the previous example

    from tempfile import TemporaryFile

    outfile = TemporaryFile()
    x = np.arange(10)
    np.save(outfile, x)
    outfile.seek(0)  # Only needed here to simulate closing & reopening file
    np.load(outfile)

OUTPUT:

    [32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49]
    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
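
In contrast to tofile, the .npy format stores the data type together with the data. A minimal sketch of this, assuming an in-memory io.BytesIO buffer in place of a file and a made-up structured dtype with explicitly big-endian integer fields:

```python
import io
import numpy as np

# Hypothetical structured dtype; ">i4" forces big-endian 32-bit integers
dt = np.dtype([("time", [("min", ">i4"), ("sec", ">i4")]), ("temp", "<f8")])
x = np.zeros(2, dtype=dt)
x["time"]["min"] = [10, 11]
x["temp"] = [98.25, 97.5]

buffer = io.BytesIO()   # stands in for a file opened in binary mode
np.save(buffer, x)
buffer.seek(0)          # rewind, as if the file had been closed and reopened
y = np.load(buffer)

# No dtype has to be passed to load: it is read from the .npy header
print(y.dtype == x.dtype)
print(y)
```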

and yet another way: genfromtxt

There is yet another way to read tabular input from file to create arrays. As the name implies, the input file is supposed to be a text file. The text file can be in the form of an archive file as well. genfromtxt can process the archive formats gzip and bzip2. The type of the archive is determined by the extension of the file, i.e. '.gz' for gzip and '.bz2' for bzip2.

genfromtxt is slower than loadtxt, but it is capable of coping with missing data. It processes the file data in two passes. At first it converts the lines of the file into strings. Thereupon it converts the strings into the requested data type. loadtxt on the other hand works in one go, which is the reason why it is faster.
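
How genfromtxt copes with missing data can be seen in the following sketch. The CSV text is made up and read from an io.StringIO buffer; the middle row has an empty second field. By default a missing float becomes nan, and filling_values lets us pick a replacement (the value -1 is an arbitrary choice here):

```python
import io
import numpy as np

# Hypothetical CSV data with one missing entry in the second row
text = "1,2,3\n4,,6\n7,8,9\n"

# Default behaviour: the missing float is filled with nan
a = np.genfromtxt(io.StringIO(text), delimiter=",")

# filling_values replaces missing entries with a value of our choice
b = np.genfromtxt(io.StringIO(text), delimiter=",", filling_values=-1)

print(a)
print(b)
```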

recfromcsv(fname, **kwargs)

This is not really another way to read in csv data. 'recfromcsv' is basically a shortcut for

np.genfromtxt(filename, delimiter=",", dtype=None)
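
A sketch of what such a call looks like in practice, using made-up CSV text in an io.StringIO buffer. Two arguments are added here as assumptions beyond the line above: names=True, so that the first row supplies the field names, and encoding="utf-8", so that text columns come back as ordinary strings:

```python
import io
import numpy as np

# Hypothetical CSV content with a header row
text = "city,minute,temp\nBerlin,10,98.25\nParis,12,97.5\n"

# dtype=None lets genfromtxt guess a type for each column individually
rec = np.genfromtxt(io.StringIO(text), delimiter=",", dtype=None,
                    names=True, encoding="utf-8")

print(rec.dtype.names)
print(rec["city"])
print(rec["temp"])
```

The result is a structured array, so the columns are accessed by field name rather than by index.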


Source: https://python-course.eu/numerical-programming/reading-and-writing-data-files-ndarrays.php